Juju solutions for big data

When exploring big data solutions, one of the most daunting tasks users face is setting up and configuring these usually complex tools. Setup can take anywhere from hours to days: time you could, and should, spend testing, evaluating, and putting your big data solutions to work for your business.

Why use Juju for big data?

  • Speed

    Reduce the time to deploy Hadoop and other solutions from days to minutes.

  • Agility

    Experiment with different configurations and solutions to choose what works for you.

  • Flexibility

    Port your solution from one infrastructure to another quickly and seamlessly.

  • Expertise

    Charms encapsulate best practices, allowing you to focus on your work.

Core processing

Apache MapReduce

MapReduce is a software framework for easily writing applications that process vast amounts of data in parallel on large clusters of machines. Use this bundle if you are after a vanilla Apache MapReduce that is modelled, configured and ready to work.
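To make the programming model concrete, here is a minimal single-process sketch of the map, shuffle and reduce steps, using word counting as the example. It is an illustration only: it does not use the Hadoop API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical names chosen for this sketch. On a real cluster the framework runs the map and reduce steps in parallel across machines and handles the grouping for you.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle step: group all emitted values by key.

    In a real MapReduce framework this grouping happens between the
    map and reduce phases; here it is a plain in-memory dictionary.
    """
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce step: combine all values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big charms"]))
```

Because each call to `map_phase` depends only on its own line, and each call to `reduce_phase` only on its own key, both steps can be spread across many machines, which is what lets the model scale to large clusters.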

Apache Spark

Extend the core MapReduce model to include the Apache Spark execution engine and take advantage of a fast, general-purpose engine for large-scale data processing.

Key Big Data charms

Data ingestion

Apache Flume/Spark/Zeppelin

An end-to-end Big Data solution that enables ingestion, processing, and visualization of log data. The ingestion component highlighted here is the Apache Flume service.

Key charms included in the bundle

Apache Spark Streaming

Leverage Spark’s built-in ingestion support for Twitter, local data and more. This model features the Apache Zeppelin service to make interacting with Spark quick and easy.

Key charms included in the bundle

Kafka Data Ingest Bundle

Key charms included in the bundle

Data analysis

Use Pig Latin to run analytics on your data by connecting Pig to Hadoop Core.

Connect Hive to Hadoop Core for SQL-like data analysis with a MySQL data warehouse.

Data visualization

Web-based notebooks are a great way to visualise job results from Hadoop processing engines. Choose between two notebooks by extending the Spark processing bundle with either Zeppelin or IPython.

Use the bundles above, alter and extend them, or create your own. Get involved and find out more by visiting the Big data community.
