apache-spark
Description:
Engine for large-scale data processing
Type: Formula  |  Latest Version: 4.1.1@0  |  Tracked Since: Dec 17, 2025
Links: Homepage  |  @ApacheSpark  |  formulae.brew.sh
Category: DevOps
Tags: big-data analytics distributed-computing etl machine-learning
Install: brew install apache-spark
About:
Apache Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Java, Scala, Python, and R, and an optimized engine that supports general computation graphs. It scales from a single machine to large clusters and keeps intermediate data in memory where possible, which typically makes it faster than disk-based frameworks such as Hadoop MapReduce for iterative workloads.
Key Features:
  • In-memory computation for faster performance
  • Support for SQL, streaming, and machine learning
  • Fault-tolerant distributed data processing
  • Rich APIs in Java, Scala, Python, and R
Use Cases:
  • Big data ETL pipelines and data warehousing
  • Real-time stream processing and analytics
  • Large-scale machine learning model training
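
To make the ETL use case concrete, here is a minimal sketch in plain Python of the partition-then-combine shape such a pipeline usually takes. It does not use the Spark API. The input rows and the names `partitions`, `parse`, `process_partition`, and `merge` are hypothetical, chosen only for illustration. In Spark, the per-partition work would roughly correspond to `map` and `filter` transformations and the combine step to something like `reduceByKey`, with the engine distributing partitions across machines.

```python
from collections import Counter
from functools import reduce

# Hypothetical input: two "partitions" of raw "user,bytes" lines.
partitions = [
    ["alice,120", "bob,300", "bad-line"],
    ["alice,80", "carol,50", "bob,-1"],
]

def parse(line):
    """Extract: turn 'user,bytes' into (user, int), or None if malformed."""
    parts = line.split(",")
    if len(parts) != 2:
        return None
    user, raw = parts
    try:
        return user, int(raw)
    except ValueError:
        return None

def process_partition(lines):
    """Transform: parse, drop bad rows, and pre-aggregate one partition."""
    totals = Counter()
    for record in map(parse, lines):
        if record is None or record[1] < 0:
            continue  # skip malformed or negative rows
        totals[record[0]] += record[1]
    return totals

def merge(a, b):
    """Combine: merge two partial aggregates into a new Counter."""
    out = Counter(a)
    out.update(b)
    return out

# Each partition is processed independently, then partial results are merged.
result = dict(reduce(merge, map(process_partition, partitions), Counter()))
print(result)
```

Because each partition is processed without reference to the others, the same logic can in principle run on one machine or many. That independence is what a distributed engine relies on to scale out and to recompute a lost partition.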
Alternatives:
  • hadoop – Hadoop MapReduce writes intermediate results to disk between stages, which typically makes it slower than Spark's in-memory processing, especially for iterative jobs.
  • flink – Flink processes streams record by record and so generally offers lower streaming latency, while Spark is generally easier to use for batch processing.
License: Apache-2.0
Dependencies: openjdk@21
Bottles available for: all
Version History
Detected  |  Version  |  Rev  |  Change  |  Commit
Jan 9, 2026 10:54am  |  4.1.1  |  0  |  VERSION_BUMP  |  3206653e
Dec 16, 2025 2:20pm  |  4.1.0  |  0  |  VERSION_BUMP  |  6190c704
Dec 16, 2025 1:56pm  |  4.1.0  |  0  |  VERSION_BUMP  |  b4afb3a2
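
As a small illustration of how the history above could be read programmatically, the sketch below classifies the most recent bump. It assumes the version strings follow a `major.minor.patch` scheme, which the tracker does not state. The names `history`, `parse_version`, and `bump_kind` are hypothetical. Only one of the two 4.1.0 rows is included, since they carry the same version.

```python
# Rows copied from the version history: (detected date, version, rev).
history = [
    ("Jan 9, 2026", "4.1.1", 0),
    ("Dec 16, 2025", "4.1.0", 0),
]

def parse_version(text):
    """Split '4.1.1' into a tuple of ints, assuming dot-separated numbers."""
    return tuple(int(part) for part in text.split("."))

def bump_kind(old, new):
    """Name the first component that differs between two versions."""
    names = ("major", "minor", "patch")
    for name, a, b in zip(names, parse_version(old), parse_version(new)):
        if a != b:
            return name
    return "none"

newest, previous = history[0], history[1]
kind = bump_kind(previous[1], newest[1])
print(f"{previous[1]} -> {newest[1]}: {kind} release")
```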