drake
Description:
Data workflow tool meant to be 'make for data'
Type: Formula  |  Tracked Since: Oct 11, 2024
Links: Homepage  |  GitHub  |  formulae.brew.sh
Stars: 1,483  |  Forks: 107  |  Language: Clojure  |  Category: Devops
Tags: data-workflow etl clojure automation pipeline
Install: brew install drake
About:
Drake is a command-line data workflow tool that automates and manages data processing pipelines. It uses a Make-like syntax to declare dependencies between data steps, so that only the steps whose inputs have changed are rerun. This makes it suited to reproducible, incremental data processing for ETL, data science, and analytics tasks.
Key Features:
  • Make-like dependency management for data files
  • Incremental execution to avoid redundant work
  • Workflow definitions in a simple Drakefile format
  • Support for local and HDFS (Hadoop) file systems
  • Integration with various data sources and formats
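
The incremental execution listed above can be illustrated with a timestamp comparison between a step's inputs and its output. The following is a minimal Python sketch of that idea only, not Drake's own code or API: the function names needs_rebuild and run_step are hypothetical, and the rule shown (rebuild when the output is missing or older than any input) is a simplified assumption about how a Make-like tool decides what to rerun.

```python
import os


def needs_rebuild(output_path, input_paths):
    """Return True if the output is missing or older than any input.

    Hypothetical helper: a simplified, timestamp-based staleness rule.
    """
    if not os.path.exists(output_path):
        return True
    output_mtime = os.path.getmtime(output_path)
    return any(os.path.getmtime(p) > output_mtime for p in input_paths)


def run_step(output_path, input_paths, build):
    """Call build(input_paths, output_path) only when the output is stale.

    Returns True if the step ran, False if it was skipped.
    """
    if needs_rebuild(output_path, input_paths):
        build(input_paths, output_path)
        return True
    return False
```

Calling run_step twice in a row with unchanged inputs runs the build function once and skips it the second time; modifying an input afterwards makes the step run again.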
Use Cases:
  • Building reproducible ETL (Extract, Transform, Load) pipelines
  • Managing data science workflows and model training steps
  • Automating data processing and report generation
Alternatives:
  • make – Drake targets data workflows, with support for multiple inputs and outputs per step and for HDFS, whereas Make is a general-purpose build tool.
  • Apache Airflow – Airflow is a larger-scale orchestration platform with scheduling UI; Drake is a simpler, file-focused CLI tool.
Version History
Detected  |  Version  |  Rev  |  Change  |  Commit
Oct 11, 2024 11:12pm  |    |  0  |  VERSION_BUMP  |  9785eaa8