text-embeddings-inference
Description:
Blazing fast inference solution for text embeddings models
Type: Formula  |  Latest Version: 1.8.3@0  |  Tracked Since: Oct 30, 2025
Links: Homepage  |  @huggingface  |  formulae.brew.sh
Category: AI/ML
Tags: ai machine-learning nlp inference embeddings huggingface
Install: brew install text-embeddings-inference
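Once installed, the formula provides a server binary that loads a model from the Hugging Face Hub and exposes an HTTP API. A minimal quick-start sketch follows; the binary name, model ID, and port are taken from TEI's upstream documentation, but verify them against your installed version.

```shell
# Start a TEI server (model ID and port are example values).
# --model-id accepts a Hugging Face Hub repo such as BAAI/bge-base-en-v1.5.
text-embeddings-router --model-id BAAI/bge-base-en-v1.5 --port 8080

# In another terminal, request an embedding (assumes the server above is running):
curl 127.0.0.1:8080/embed \
    -X POST \
    -d '{"inputs": "What is deep learning?"}' \
    -H 'Content-Type: application/json'
```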
About:
Text Embeddings Inference (TEI) is a toolkit for deploying and serving text embedding (dense feature extraction) models. It leverages optimized Rust and CUDA kernels to maximize throughput and minimize latency for popular transformer architectures such as BERT, and includes features like token-based dynamic batching.
Key Features:
  • Optimized Rust/CUDA kernels for high performance
  • Support for popular open-source models (BERT, BGE, etc.)
  • Token-based dynamic batching
  • Easy-to-use API for integration
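The API mentioned above is a plain HTTP interface: the server accepts a JSON body with an `inputs` field and returns one embedding vector per input. A minimal client sketch, using only the Python standard library; the endpoint URL and port are assumptions for a default local deployment.

```python
import json
from urllib import request

# Assumed default: TEI's router listening locally on port 8080.
TEI_URL = "http://localhost:8080/embed"

def build_embed_request(texts):
    """Build the JSON payload for the /embed endpoint: {"inputs": [...]}."""
    return {"inputs": texts}

def embed(texts, url=TEI_URL):
    """POST a batch of texts and return their embedding vectors."""
    payload = json.dumps(build_embed_request(texts)).encode("utf-8")
    req = request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        # The response is a JSON array of float vectors, one per input text.
        return json.loads(resp.read())
```

Because the server batches requests dynamically, sending texts in one call (rather than one request per text) lets TEI pack them into efficient token-based batches.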
Use Cases:
  • Production deployment of embedding models for semantic search
  • Building RAG (Retrieval-Augmented Generation) pipelines
  • Real-time feature extraction for NLP applications
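For the semantic-search and RAG use cases above, the embeddings returned by the server are typically compared by cosine similarity to rank documents against a query. A self-contained sketch of that ranking step, using illustrative vectors in place of real TEI output:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, doc_vecs):
    """Return document indices sorted by similarity to the query, best first."""
    scores = [(cosine(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    return [i for _, i in sorted(scores, reverse=True)]

# Toy 2-D vectors standing in for embeddings fetched from TEI.
docs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
order = rank([1.0, 0.0], docs)  # documents 0 and 2 point the same way as the query
```

In a real pipeline the vectors would come from the `/embed` endpoint, and at scale the ranking would be delegated to a vector index rather than a linear scan.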
Alternatives:
  • FastAPI + Transformers – TEI offers significantly higher throughput and lower latency out of the box compared to a standard Python implementation.
  • vLLM – vLLM is optimized for LLM text generation, whereas TEI is specifically optimized for feature extraction and embedding tasks.
Version History
Detected              Version Rev  Change        Commit
Oct 30, 2025 11:29am  0            VERSION_BUMP  e57b3df9
Sep 13, 2025 5:30am   0            VERSION_BUMP  e3368c12