libtextcat

Description:
N-gram-based text categorization library

Type: Formula | Tracked Since: Dec 28, 2025

Links: Homepage | formulae.brew.sh

Category: Developer tools

Tags: nlp language-detection text-processing library c

Install: brew install libtextcat

About:
libtextcat is a C library for language identification and text categorization based on statistical n-gram frequency profiles. It builds a profile of the input text and compares it against pre-built language models to determine the most likely language or category. Developers can use it to add language detection to applications that analyze or route content automatically.
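
The profile comparison described above can be sketched in a few lines. The following is a minimal Python illustration of the rank-order ("out-of-place") n-gram technique that this style of categorizer is generally based on; it is not libtextcat's code or API. The function names, the 300-entry profile size, and the toy training strings are all assumptions made for the example.

```python
from collections import Counter


def ngram_profile(text, max_n=3, top_k=300):
    """Rank character n-grams (length 1..max_n) by frequency, most frequent first.

    Hypothetical helper: the padding character and top_k value are
    illustrative choices, not values taken from libtextcat.
    """
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"  # mark word boundaries
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams missing from the model get a fixed penalty."""
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


def classify(text, profiles):
    """Return the name of the model whose profile is closest to the text."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda name: out_of_place(doc, profiles[name]))


# Toy training data (assumed for illustration; real models use far more text).
samples = {
    "english": "the quick brown fox jumps over the lazy dog and the cat",
    "german": "der schnelle braune fuchs springt über den faulen hund und die katze",
}
profiles = {name: ngram_profile(text) for name, text in samples.items()}

print(classify("the lazy dog and the quick cat", profiles))
```

A lower distance means a closer match, so the text is assigned to the model with the smallest total. With training text this small the result is only indicative; practical language models are built from much larger corpora.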

Key Features:

N-gram based statistical analysis
Fast and lightweight language identification
Support for custom language profiles
C library with bindings for other languages

Use Cases:

Automatic language detection for content management systems
Filtering or routing text based on identified language
Spam detection and text classification tasks

Alternatives:

cld2 – Google's Compact Language Detector 2, often considered faster but distributed under different license terms
fasttext – Facebook's fastText library, offers machine learning-based identification with pre-trained models but is heavier and more complex

Version History

Detected	Version	Rev	Change	Commit
Sep 14, 2024 9:07pm		0	VERSION_BUMP	37f35e23