What Is Hadoop?

Hadoop is a term you will hear over and over again in discussions of big data processing. You might also have seen the yellow elephant image, which is Hadoop’s logo (Hadoop was the name of a toy elephant belonging to the son of Doug Cutting, Hadoop’s co-creator). In another post, I broke down the idea of MapReduce in the most easily digestible way possible; here is the same for Hadoop.

A little history… Hadoop was born out of a need to process big data as the amount of generated data continued to increase rapidly. As the Web generated more and more information, indexing its content became quite challenging, so Google developed MapReduce and described it in a 2004 paper. Hadoop, created by Doug Cutting and Mike Cafarella and developed heavily at Yahoo!, followed as an open-source implementation of the MapReduce model. Hadoop is now an open-source Apache project.

Overall, Hadoop enables applications to work with huge amounts of data stored on various servers. Hadoop’s functions allow the existing data to be pulled from vario...