hadoop

Description:
Framework for distributed processing of large data sets

Type: Formula | Tracked Since: Dec 28, 2025

Links: Homepage | @ApacheHadoop | formulae.brew.sh

Category: Devops

Tags: big-data hadoop distributed-systems hdfs mapreduce yarn

Install: brew install hadoop

About:
Apache Hadoop is an open-source framework for distributed processing of large datasets across clusters of computers. It provides a software ecosystem for reliable, scalable, distributed computing. HDFS handles storage, YARN manages cluster resources and job scheduling, and MapReduce provides the batch-processing model that runs on YARN.
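
To make the MapReduce model concrete, here is a minimal single-process sketch of its three stages (map, shuffle, reduce) as a word count. It is plain Python and does not use any Hadoop API; the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count` are hypothetical and chosen only for illustration. On a real cluster the framework would run many map and reduce tasks in parallel across nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group values by key, mimicking the step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three stages in sequence on an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big clusters", "data on clusters"]
    print(word_count(sample))
```

Because each map call sees only one line and each reduce call sees only one key, both stages can be distributed without coordination. That independence is the property the model relies on to scale.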

Key Features:

HDFS for high-throughput distributed file storage
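MapReduce for scalable parallel batch processing, with YARN for cluster resource management and job scheduling
Fault tolerance through HDFS block replication and automatic re-execution of failed tasks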
Ecosystem tools like Hive, Pig, and HBase
Runs on commodity hardware clusters
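
HDFS gets much of its fault tolerance by splitting files into blocks and storing several copies of each block on different nodes. The arithmetic can be sketched as below, assuming the commonly documented defaults of a 128 MB block size and a replication factor of 3. Both are configurable per cluster, so treat the constants as assumptions; the helper name `hdfs_footprint` is hypothetical.

```python
import math

# Assumed defaults; real clusters may override both settings.
BLOCK_SIZE_MB = 128
REPLICATION = 3


def hdfs_footprint(file_size_mb, block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Estimate block count, replica count, and raw storage for one file."""
    # The final block may be partially filled and only occupies its actual size.
    blocks = math.ceil(file_size_mb / block_size_mb)
    replicas = blocks * replication
    raw_storage_mb = file_size_mb * replication
    return blocks, replicas, raw_storage_mb


if __name__ == "__main__":
    blocks, replicas, raw_mb = hdfs_footprint(1000)
    print(f"{blocks} blocks, {replicas} block replicas, {raw_mb} MB raw storage")
```

Under these assumptions a 1000 MB file occupies 8 blocks, stored as 24 block replicas, and consumes about 3000 MB of raw disk across the cluster.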

Use Cases:

Big data analytics and batch processing
Data warehousing and ETL pipelines
Large-scale log aggregation and analysis

Alternatives:

Apache Spark – In-memory processing engine that is generally faster than Hadoop MapReduce for iterative workloads
Apache Flink – Stream-processing framework optimized for real-time data pipelines

Version History

Detected	Change	Commit
Oct 4, 2025 10:58am	VERSION_BUMP	ce3ae9c2
Oct 28, 2024 9:53am	VERSION_BUMP	0aa17e9a
Sep 17, 2023 8:08pm	VERSION_BUMP	a8c7cfba
Jan 27, 2023 6:17pm	VERSION_BUMP	271a48b4
Jan 27, 2023 6:17pm	VERSION_BUMP	61cc7f1f
Jan 11, 2023 7:04pm	VERSION_BUMP	a4ca49a3