Tag: big-data (23 packages with this tag)
Package | Type | Description | Version
go-parquet-tools | formula 175 | Utility to deal with Parquet data | 1.41.0
alluxio | formula | Open Source Memory Speed Virtual Distributed Storage | 2.9.5
apache-arrow | formula | Columnar in-memory analytics layer designed to accelerate big data | 22.0.0
apache-drill | formula | Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud Storage | 1.22.0
apache-flink | formula | Scalable batch and stream data processing | 2.2.0
apache-flink@1 | formula | Scalable batch and stream data processing | 1.20.3
apache-spark | formula | Engine for large-scale data processing | 4.1.1
avro-tools | formula | Avro command-line tools and utilities | 1.12.1
c-blosc | formula | Blocking, shuffling and loss-less compression library | 1.21.6
datafusion | formula | Apache Arrow DataFusion and Ballista query engines | 51.0.0
flintrock | formula | Tool for launching Apache Spark clusters | 2.1.0
hadoop | formula | Framework for distributed processing of large data sets |
hbase | formula | Hadoop database: a distributed, scalable, big data store | 2.6.4
hdf5 | formula | File format designed to store large amounts of data | 1.14.4.3
hive | formula | Hadoop-based data summarization, query, and analysis |
libstxxl | formula | C++ implementation of STL for extra large data sets |
mahout | formula | Library to help build scalable machine learning libraries | 0.13.0
pig | formula | Platform for analyzing large data sets |
prestodb | formula | Distributed SQL query engine for big data | 0.296
traildb | formula | Blazingly-fast database for log-structured data |
trino | formula | Distributed SQL query engine for big data |
vespa-cli | formula | Command-line tool for Vespa.ai | 8.620.35
xlearn | formula | High performance, easy-to-use, and scalable machine learning package |