What is Hadoop?

Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation. Hadoop makes it possible to run applications on systems with thousands of nodes involving thousands of terabytes. Its distributed file system facilitates rapid data transfer rates among nodes and allows the system to continue operating uninterrupted in case of a node failure. This approach lowers the risk of catastrophic system failure, even if a significant number of nodes become inoperative. Hadoop was inspired by Google's MapReduce, a software framework in which an application is broken down into numerous small parts. Any of these parts (also called fragments or blocks) can be run on any node in the cluster. Doug Cutting, Hadoop's creator, named the framework after his child's stuffed toy elephant. T...
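
To make the MapReduce idea above more concrete, here is a minimal sketch in plain Python of a word count split into fragments. It does not use Hadoop or its API; the function names (`split_into_fragments`, `map_fragment`, `reduce_counts`, `word_count`) are hypothetical, and a local thread pool stands in for the nodes of a cluster. It only illustrates the pattern of breaking a job into independent parts, running each part wherever a worker is free, and merging the results.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_into_fragments(lines, fragment_size):
    """Break the input into small, independent fragments."""
    return [lines[i:i + fragment_size]
            for i in range(0, len(lines), fragment_size)]


def map_fragment(fragment):
    """Map step: emit a (word, 1) pair for every word in one fragment."""
    pairs = []
    for line in fragment:
        for word in line.lower().split():
            pairs.append((word, 1))
    return pairs


def reduce_counts(all_pairs):
    """Reduce step: sum the counts emitted for each word."""
    totals = defaultdict(int)
    for word, count in all_pairs:
        totals[word] += count
    return dict(totals)


def word_count(lines, fragment_size=2, workers=4):
    """Run the map step on each fragment in parallel, then reduce."""
    fragments = split_into_fragments(lines, fragment_size)
    # Assumption: worker threads play the role of cluster nodes here.
    # Any fragment can be handled by any worker.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_fragment, fragments))
    return reduce_counts(pair for pairs in mapped for pair in pairs)


if __name__ == "__main__":
    sample = ["the elephant", "the cluster", "the node"]
    print(word_count(sample))
```

Because each fragment is processed independently, a failed fragment could in principle be rerun on another worker without restarting the whole job, which is the property the paragraph above attributes to Hadoop. This sketch does not implement that retry logic.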