hadoop
Description:
Framework for distributed processing of large data sets
Type: Formula  |  Tracked Since: Dec 28, 2025
Links: Homepage  |  @ApacheHadoop  |  formulae.brew.sh
Category: Devops
Tags: big-data hadoop distributed-systems hdfs mapreduce yarn
Install: brew install hadoop
About:
Apache Hadoop is an open-source framework for reliable, scalable, distributed processing of large datasets across clusters of computers. Its core components divide the work: HDFS handles distributed storage, YARN manages cluster resources and job scheduling, and MapReduce provides the batch processing model that runs on top of YARN.
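To make the MapReduce model mentioned above concrete, here is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to a word count. It is plain Python and does not use Hadoop's API; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical and chosen only for illustration. On a real cluster the framework runs the map and reduce steps in parallel across nodes and performs the shuffle itself.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle step: group all values by key between map and reduce."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce step: collapse all counts for one word into a total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run the three stages in sequence on an in-memory list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big clusters", "data nodes"]))
```

Because each map call sees only one line and each reduce call sees only one key, both steps can be distributed across machines without coordination, which is what lets the model scale.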
Key Features:
  • HDFS for high-throughput distributed file storage
  • MapReduce for scalable parallel batch processing, with YARN for resource management and job scheduling
  • Fault tolerance through data block replication and automatic re-execution of failed tasks
  • Ecosystem tools like Hive, Pig, and HBase
  • Runs on commodity hardware clusters
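As a rough illustration of how HDFS storage and fault tolerance translate into numbers, the sketch below estimates how many blocks a file occupies and how much raw disk it consumes once replicated. It assumes the stock defaults of a 128 MB block size (`dfs.blocksize`) and a replication factor of 3 (`dfs.replication`); a given cluster may override both. The helper name `hdfs_footprint` is hypothetical and not part of any Hadoop API.

```python
import math

BLOCK_SIZE_MB = 128  # assumed default dfs.blocksize (Hadoop 2.x and later)
REPLICATION = 3      # assumed default dfs.replication


def hdfs_footprint(file_size_mb: float,
                   block_size_mb: int = BLOCK_SIZE_MB,
                   replication: int = REPLICATION) -> dict:
    """Estimate block count and raw storage for one file stored in HDFS."""
    # A file is split into fixed-size blocks; the last block holds only the
    # remaining data, so it does not pad out to a full block on disk.
    blocks = math.ceil(file_size_mb / block_size_mb)
    return {
        "blocks": blocks,
        "block_replicas": blocks * replication,
        "raw_storage_mb": file_size_mb * replication,
    }


if __name__ == "__main__":
    # A 1000 MB file: 8 blocks, 24 block replicas, 3000 MB of raw disk.
    print(hdfs_footprint(1000))
```

With three copies of every block on different nodes, losing a single machine leaves two replicas of each affected block available, and the cluster can re-replicate from those.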
Use Cases:
  • Big data analytics and batch processing
  • Data warehousing and ETL pipelines
  • Large-scale log aggregation and analysis
Alternatives:
  • Apache Spark – In-memory processing engine that is generally faster than Hadoop MapReduce for iterative workloads
  • Apache Flink – Stream-processing framework optimized for real-time data pipelines
Version History:
Detected  |  Version  |  Rev  |  Change  |  Commit
Oct 4, 2025 10:58am  |    |  0  |  VERSION_BUMP  |  ce3ae9c2
Oct 28, 2024 9:53am  |    |  0  |  VERSION_BUMP  |  0aa17e9a