Apache Hadoop

Apache Hadoop
Original author(s)Doug Cutting, Mike Cafarella
Developer(s)Apache Software Foundation
Initial releaseApril 1, 2006 (2006-04-01)
Stable release
2.10.x2.10.2 / May 31, 2022 (2022-05-31)
3.4.x3.4.0 / March 17, 2024 (2024-03-17)
RepositoryHadoop Repository
Written inJava
Operating systemCross-platform
TypeDistributed file system
LicenseApache License 2.0
Websitehadoop.apache.org

Apache Hadoop ( /həˈdp/) is a collection of open-source software utilities for reliable, scalable, distributed computing. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. Hadoop was originally designed for computer clusters built from commodity hardware, which is still the common use. It has since also found use on clusters of higher-end hardware. All the modules in Hadoop are designed with a fundamental assumption that hardware failures are common occurrences and should be automatically handled by the framework.