Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased
9.2.10 - 2026-02-07
Changed
- Remove VERSION files and use `importlib.metadata` for dynamic versioning (#950, #951)
- Use Rust TextClassifier with `.bin` models for classification (#935)
- Update sentiment models to use underthesea_core TextClassifier (#946)
- Consolidate classification into single module (#935)
Added
- Add pure Rust FastText inference to underthesea_core (#947)
- Add TextPreprocessor to underthesea_core for Vietnamese text preprocessing (#942)
- Add underthesea_core API documentation (#948)
- Add workflow to publish underthesea_core to crates.io (#943)
Documentation
- Separate sidebars for Technical Reports, API Reference, Datasets, and Changelog (#945)
- Add blog posts for Rust-powered text classification and CRF (#934, #935)
- Rename docusaurus folder to docs (#933)
Security
9.2.0 - 2026-01-31
Added
- Add Agent class with custom tools support using OpenAI function calling (GH-712)
- Add default tools: calculator, datetime, web_search, wikipedia, shell, python, file operations (GH-712)
Changed
- Upgrade underthesea_core to 2.0.0 with L-BFGS optimizer (#899)
  - 10x faster feature lookup with flat data structure
  - 1.24x faster than python-crfsuite for word segmentation
  - L-BFGS with OWL-QN for L1 regularization
9.1.5 - 2026-01-29
Added
- Add Agent API with OpenAI and Azure OpenAI support (GH-745, #890)
- Add ParserTrainer for dependency parsing (GH-392, #880)
- Add POS tagger training pipeline (GH-423, #883)
Documentation
- Add Vietnamese News Dataset (UVN) documentation (GH-885, #888, #889)
- Add UVB dataset documentation (GH-720, #887)
- Add UUD-v0.1 dataset documentation (#886)
- Add UTS Dictionary dataset documentation (GH-622, #884)
9.1.4 - 2026-01-24
Added
Changed
- Remove NLTK dependency (#879)