htmlcxx
Description:
Non-validating CSS1 and HTML parser for C++
Type: Formula  |  Tracked Since: Dec 28, 2025
Links: Homepage  |  formulae.brew.sh
Category: Developer tools
Tags: c++ html parser library web-development
Install: brew install htmlcxx
About:
htmlcxx is a lightweight, non-validating HTML and CSS1 parser written in C++. It parses markup into a simple DOM that can be traversed and modified, and it tolerates the malformed content commonly found on real web pages. It is aimed at processing real-world HTML without the overhead of a fully standards-compliant parser.
Key Features:
  • Handles malformed HTML gracefully
  • Lightweight C++ library with minimal dependencies
  • Provides a simple DOM API for traversal and modification
  • Includes CSS1 parser support
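
To make "handles malformed HTML gracefully" concrete, here is a minimal, standard-library-only sketch of lenient tokenization. It is not htmlcxx's implementation and does not use its API: the `Token` struct and the `lenientTokenize` function are hypothetical names chosen for illustration. The sketch ignores attributes, comments, and entities. It only shows the core idea of a non-validating parser, which is to accept unbalanced or unterminated tags instead of rejecting the input.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// One lexical token: a tag (opening or closing) or a run of text.
struct Token {
    enum Kind { OpenTag, CloseTag, Text } kind;
    std::string value;  // lower-cased tag name, or the raw text
};

// Hypothetical helper, not part of htmlcxx: splits markup into tokens
// without ever rejecting the input. Unbalanced or unterminated tags are
// kept as best-effort tokens instead of raising an error.
std::vector<Token> lenientTokenize(const std::string& html) {
    std::vector<Token> out;
    std::size_t pos = 0;
    while (pos < html.size()) {
        if (html[pos] != '<') {
            // Text runs until the next '<' or the end of the input.
            std::size_t next = html.find('<', pos);
            if (next == std::string::npos) next = html.size();
            out.push_back({Token::Text, html.substr(pos, next - pos)});
            pos = next;
            continue;
        }
        // A tag ends at '>' or, if '>' is missing, at the end of the input.
        std::size_t end = html.find('>', pos);
        std::size_t stop = (end == std::string::npos) ? html.size() : end;
        std::size_t i = pos + 1;
        bool closing = (i < stop && html[i] == '/');
        if (closing) ++i;
        std::string name;
        while (i < stop && std::isalnum(static_cast<unsigned char>(html[i]))) {
            name += static_cast<char>(
                std::tolower(static_cast<unsigned char>(html[i])));
            ++i;
        }
        if (name.empty()) {
            // A stray '<' (as in "a < b") is treated as literal text.
            out.push_back({Token::Text, "<"});
            ++pos;
            continue;
        }
        out.push_back({closing ? Token::CloseTag : Token::OpenTag, name});
        pos = (end == std::string::npos) ? html.size() : end + 1;
        assert(pos <= html.size());  // the scan never runs past the input
    }
    return out;
}
```

Calling `lenientTokenize("<p>Hello <b>world</p><br")` yields six tokens: the `<b>` that is never closed and the `<br` that is never terminated are both kept as opening tags rather than reported as errors. A real parser such as htmlcxx goes further and builds a tree from the tokens, deciding where the unclosed elements end.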
Use Cases:
  • Web scraping and data extraction from legacy websites
  • Building custom HTML renderers or analyzers
  • Processing and sanitizing user-generated HTML content
Alternatives:
  • libxml2 – More comprehensive and standards-compliant, but heavier and stricter than Htmlcxx.
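  • Gumbo – Google's HTML5 parser, written in C; it implements the HTML5 parsing algorithm, including its error-recovery rules, whereas htmlcxx is a simpler parser that does not aim for spec conformance.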
Version History
Detected  |  Version  |  Rev  |  Change  |  Commit
Sep 11, 2025 5:45am  |    |  0  |  VERSION_BUMP  |  f7773147