Berkeley Data Analytics Stack

2.c) How does Berkeley Data Analytics Stack help in analytics tasks?

Answer:

Berkeley Data Analytics Stack (BDAS):

The importance of Big Data lies in what one does with it rather than in how large it is. A useful check is whether the gathered data can help achieve the following:

  1. cost reduction,
  2. time reduction,
  3. new product planning and development,
  4. smart decision making using predictive analytics and
  5. knowledge discovery.

Big Data analytics needs innovative as well as cost-effective techniques. BDAS is an open-source data analytics stack for complex computations on Big Data. It supports efficient, large-scale in-memory data processing, and thus enables user applications to meet three fundamental processing requirements: accuracy, time and cost.
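The benefit of in-memory processing can be illustrated with a minimal Python sketch. It is a toy, not BDAS or Spark code: the class `InMemoryDataset` and the loader `load_from_disk` are hypothetical names introduced only for this illustration. The idea shown is that the data is read from the slow source once and then reused from memory by several analytics passes.

```python
from typing import Callable, Iterable, List, Optional


class InMemoryDataset:
    """Toy stand-in for a cached dataset (hypothetical, not a BDAS class)."""

    def __init__(self, loader: Callable[[], Iterable[float]]) -> None:
        self._loader = loader
        self._cache: Optional[List[float]] = None
        self.load_count = 0  # how many times the slow source was read

    def records(self) -> List[float]:
        # Read from the slow source only on first access, then reuse the cache.
        if self._cache is None:
            self._cache = list(self._loader())
            self.load_count += 1
        return self._cache


def load_from_disk() -> Iterable[float]:
    # Hypothetical loader; a real one would read from distributed storage.
    return [12.0, 7.5, 3.0, 9.5]


dataset = InMemoryDataset(load_from_disk)

# Three separate analytics passes over the same data.
total = sum(dataset.records())
maximum = max(dataset.records())
mean = total / len(dataset.records())

print(total, maximum, mean, dataset.load_count)
```

Although the records are accessed several times, `load_count` stays at 1, which is the saving in time and cost that in-memory processing aims for.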

Berkeley Data Analytics Stack (BDAS) consists of data processing, data management and resource management layers. These are described as follows:

  1. Applications such as AMP-Genomics and Carat run on BDAS. The data processing software component provides in-memory processing, which processes the data efficiently across the frameworks. AMP stands for Berkeley’s Algorithms, Machines and People Laboratory.
  2. Data processing combines batch, streaming and interactive computations.
  3. The resource management software component enables sharing of the infrastructure across various frameworks.
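The point about combining batch, streaming and interactive computations can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the BDAS or Spark API: the function names `count_words`, `run_batch` and `run_streaming` are hypothetical. The sketch shows one piece of processing logic reused for a batch job, for a stream consumed in small micro-batches, and for an ad hoc (interactive) query on the result.

```python
from collections import Counter
from typing import Iterable, List


def count_words(lines: Iterable[str]) -> Counter:
    """Shared processing logic: count the words in a collection of lines."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def run_batch(lines: List[str]) -> Counter:
    # Batch mode: process the whole dataset in one pass.
    return count_words(lines)


def run_streaming(micro_batches: Iterable[List[str]]) -> Counter:
    # Streaming mode: apply the same logic to each small batch as it arrives
    # and merge the partial counts into a running state.
    state: Counter = Counter()
    for batch in micro_batches:
        state.update(count_words(batch))
    return state


lines = [
    "spark runs in memory",
    "mesos shares the cluster",
    "spark streams data",
]

batch_result = run_batch(lines)
stream_result = run_streaming([lines[:1], lines[1:]])

# Interactive mode: an ad hoc query against the computed result.
top_word = batch_result.most_common(1)[0][0]

print(batch_result == stream_result, top_word)
```

Because both modes call the same `count_words` function, the batch and streaming results agree, and the interactive query needs no separate engine.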

Figure 1.10 shows a four-layer architecture for the Big Data stack, which consists of Hadoop, MapReduce, Spark Core and Spark SQL, Streaming, R, GraphX, MLlib, Mahout, Arrow and Kafka.

