Posts

Showing posts from October, 2016

Big Data Intro

Big Data is a term for the processing of large or complex data sets. Data is growing at a very fast rate, and keeping track of such large data sets was becoming a tedious task. With the traditional approach, it was not feasible to process unstructured data. Big Data came into the picture to cope with this huge amount of data and to process both structured and unstructured data. The three V's included in Big Data architecture are:

Volume: the quantity of generated and stored data.
Variety: the type and nature of the data.
Velocity: the speed at which data is generated and processed to meet demand.

Hadoop Learning

This blog is for those who want to learn the concepts of Big Data and Hadoop. Big Data is one of the emerging technologies and is very interesting to learn. There is a lot of scope in the field of Big Data, and it will grow as time progresses. It is a challenging as well as an interesting field. I will take you from the very basic to the advanced concepts of Hadoop and its ecosystem, which will help you grow in the field of Big Data.

Let's walk through Hadoop. Hadoop is an open-source software framework for distributed storage and processing of very large data sets on clusters built from commodity hardware. Commodity hardware means low-cost computers that are easily available. Hadoop basically processes data by splitting it into multiple smaller data sets. The Hadoop ecosystem has different components, such as HDFS, MapReduce (MR), Hive, Impala, Oozie, Kafka, Flume, Spark, Pig, etc. We will lear...
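To make the idea of splitting data into smaller sets more concrete, here is a minimal sketch in plain Python. It does not use Hadoop or any Hadoop API. It only imitates the general split, map, group and reduce pattern on a single machine, using word counting as the example. All function names (split_into_chunks, map_chunk, shuffle, reduce_groups, word_count) and the chunk_size parameter are made up for illustration.

```python
from collections import defaultdict


def split_into_chunks(records, chunk_size):
    """Split the input into smaller lists, loosely like input splits."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def map_chunk(chunk):
    """Map step: emit a (word, 1) pair for every word in the chunk."""
    pairs = []
    for line in chunk:
        for word in line.lower().split():
            pairs.append((word, 1))
    return pairs


def shuffle(mapped_chunks):
    """Group step: collect all values emitted for the same key."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_groups(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines, chunk_size=2):
    """Run the whole pipeline; here the chunks are processed one after another."""
    chunks = split_into_chunks(lines, chunk_size)
    mapped = [map_chunk(chunk) for chunk in chunks]
    return reduce_groups(shuffle(mapped))


if __name__ == "__main__":
    sample = ["big data", "big hadoop", "data data"]
    print(word_count(sample))
```

In a real cluster each chunk would be handled by a different machine in parallel. Here they run one after another, which is enough to show how the partial results are combined at the end.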