libtextcat
Description:
N-gram-based text categorization library
Type: Formula  |  Tracked Since: Dec 28, 2025
Links: Homepage  |  formulae.brew.sh
Category: Developer tools
Tags: nlp language-detection text-processing library c
Install: brew install libtextcat
About:
libtextcat is a C library for language identification and text categorization based on statistical n-gram frequency profiles. It compares input text against pre-built language models and reports the most likely language or genre. Developers can use it to add language detection to applications that need automated content analysis or routing.
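To make the profile comparison concrete, here is a minimal Python sketch of rank-order n-gram matching. It is not libtextcat's code or API: the function names (ngram_profile, out_of_place, classify), the tiny SAMPLES texts, and the parameters max_n and top_k are all hypothetical, chosen only for illustration. It assumes each text is reduced to a ranked list of its most frequent character n-grams, and that the closest model is the one whose ranks differ least from the input's.

```python
from collections import Counter


def ngram_profile(text, max_n=3, top_k=300):
    """Return the top_k most frequent character n-grams (sizes 1..max_n), best first."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"  # underscores mark word boundaries
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending count, then alphabetically so ranks are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams absent from lang_profile get a fixed penalty."""
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


def classify(text, models):
    """Return the name of the model whose profile is closest to the text's profile."""
    doc = ngram_profile(text)
    return min(models, key=lambda name: out_of_place(doc, models[name]))


# Toy training texts (hypothetical); real models are built from much larger corpora.
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and then the dog sleeps in the sun",
    "german": "der schnelle braune fuchs springt über den faulen hund und dann schläft der hund in der sonne",
}
MODELS = {name: ngram_profile(text) for name, text in SAMPLES.items()}

if __name__ == "__main__":
    print(classify("the dog jumps over the fox", MODELS))
    print(classify("der hund schläft in der sonne", MODELS))
```

A lower out_of_place score means a closer match, so classify simply picks the minimum. With training texts this small the result is only illustrative; the approach depends on profiles built from representative corpora.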
Key Features:
  • N-gram based statistical analysis
  • Fast and lightweight language identification
  • Support for custom language profiles
  • C library with bindings for other languages
Use Cases:
  • Automatic language detection for content management systems
  • Filtering or routing text based on identified language
  • Spam detection and text classification tasks
Alternatives:
  • cld2 – Google's Compact Language Detector 2, often considered faster but released under different license terms
  • fasttext – Facebook's library, offers identification based on learned word and subword embeddings but is heavier and more complex
Version History:
Detected  |  Version  |  Rev  |  Change  |  Commit
Sep 14, 2024 9:07pm  |  (blank)  |  0  |  VERSION_BUMP  |  37f35e23