Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased​
Added​
- `underthesea chat`: local chat TUI bridging the `claude` CLI (#1021)
  - Spawns `claude --print --output-format=stream-json --verbose` per turn and parses the JSONL event stream (system / assistant / result), so the user's existing `claude login` subscription provides the LLM with no API key or per-token cost
  - Tracks `session_id` from the first system event and replays it as `--resume <id>` on subsequent turns to keep conversation context
  - Each invocation starts a fresh timestamped session (`YYYY-MM-DD_HH-MM-SS`) under `~/.underthesea/assistant/<name>.md` (format compatible with the upcoming MarkdownMemory abstraction); pass `--session <name>` to resume a saved chat
  - Textual chat UI inspired by OpenClaw: no header bar, no widget borders, a full-width slate bar (`#2a3447`) for user input, plain text for assistant replies streamed token-by-token, a single-line dim status footer showing `state | workspace/session | model | session_id | tokens/200k`
  - `ClaudeBridge` is async-iterable and can be used independently of the TUI; `Session` provides atomic markdown round-trip
  - New `[assistant]` optional extra (`textual>=0.50`, `rich>=13.0`); the rest of the assistant module is import-safe without it
  - 19 unit tests (`tests/agent/assistant/test_bridge.py` + `test_session.py`) and 3 visual snapshot tests via `pytest-textual-snapshot` (`tests/agent/assistant/test_tui_snapshot.py`) under a new `[test-tui]` extra
  - End-to-end terminal tests live in a sibling `workspace_underthesea/e2e_tests/` Node project driven by `@microsoft/tui-test`, covering both the underthesea chat TUI and the upstream `openclaw chat` against a parallel persona workspace
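As a rough illustration of the per-turn flow described in the chat entry above, the sketch below builds the `claude` argv, parses JSONL lines, and picks up the session id for `--resume`. It is not the `ClaudeBridge` implementation: the helper names (`build_command`, `parse_events`, `first_session_id`) are hypothetical, and both the event shape (`{"type": "system", "session_id": ...}`) and the prompt being passed as a trailing positional argument are assumptions.

```python
import json
from typing import Iterable, Iterator, List, Optional

# Flags taken from the changelog entry; everything else here is illustrative.
BASE_ARGS = ["claude", "--print", "--output-format=stream-json", "--verbose"]


def build_command(prompt: str, session_id: Optional[str] = None) -> List[str]:
    """Build the argv for one turn; add --resume once a session id is known."""
    args = list(BASE_ARGS)
    if session_id:
        args += ["--resume", session_id]
    args.append(prompt)  # assumption: the prompt is a trailing positional argument
    return args


def parse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per JSONL line, skipping blank and malformed lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(event, dict):
            yield event


def first_session_id(events: Iterable[dict]) -> Optional[str]:
    """Return the session id carried by the first system event, if any."""
    for event in events:
        # assumption: system events expose the id under the "session_id" key
        if event.get("type") == "system" and event.get("session_id"):
            return event["session_id"]
    return None
```

In a real bridge the lines would come from the subprocess's stdout; keeping the parsing as pure functions makes it testable without the CLI installed.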
Changed​
- Rename CLI commands so the verb form matches the upstream OpenClaw surface (#1021)
  - `underthesea chat`: was `underthesea assistant` (the chat TUI introduced in this release)
  - `underthesea agent`: was `underthesea chat` (the Next.js + Node web app already shipped in v9.4)
Documentation​
- Pivot the `docs/research/personal-ai-os/` notes to a global growth positioning (#1020)
  - Roadmap rewritten as three milestones: M1 Smallest Agent Framework + Free TUI Assistant, M2 Personal AI OS in Pure Python, M3 Growth Amplification
  - Drop the multi-agent orchestrator phase from the 6-month plan; drop Vietnamese-specific channel work (Zalo) in favor of global ones (Telegram / Slack / Email / Discord); add explicit non-goal about not using Vietnamese identity as a marketing lever
9.5.0 - 2026-05-17
Added​
- A2A-compatible HTTP server adapter `underthesea.agent.server` (#1016)
  - `make_app(agent, path=...)`: raw ASGI callable (no web framework dependency at the library level); plug into uvicorn / hypercorn / daphne / granian
  - `serve(agent, ui=True)`: one-line uvicorn entrypoint with bundled chat UI at `{path}/ui`, JSON-RPC `message/stream` over HTTP+SSE, and a discoverable `AgentCard` at `{path}/.well-known/agent-card.json`
  - Per-session `Agent` clone per A2A `contextId` so each conversation keeps isolated history
  - Tool calls stream live as `tool_call` artifacts via a ContextVar hook around each `Tool.func`; the `Agent` class itself is unchanged
  - Sync `Agent.__call__` offloaded via `asyncio.to_thread` to keep the event loop responsive
  - Bundled chat UI is generic: title from `AgentCard`, RPC URL auto-discovered from `window.location`, modal-based agent card viewer
  - New `[agent-server]` extra (uvicorn + starlette + httpx, version ranges aligned with google-adk-python)
  - 17 tests via `httpx.ASGITransport`; new `ci-agent` workflow runs them on every PR
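To make "raw ASGI callable" and the `asyncio.to_thread` offload concrete, here is a minimal sketch under stated assumptions. It is not the `make_app` implementation and does not speak A2A or JSON-RPC: `make_echo_app`, the `/agent` default path, and the `{"reply": ...}` response shape are all hypothetical.

```python
import asyncio
import json


def make_echo_app(agent, path="/agent"):
    """Return a bare ASGI callable that forwards the request body to a sync `agent`."""

    async def app(scope, receive, send):
        if scope["type"] != "http":
            return  # this sketch ignores lifespan and websocket scopes
        if scope["path"] != path:
            await send({"type": "http.response.start", "status": 404, "headers": []})
            await send({"type": "http.response.body", "body": b""})
            return

        # Collect the request body, which may arrive in several chunks.
        body = b""
        while True:
            message = await receive()
            body += message.get("body", b"")
            if not message.get("more_body", False):
                break

        # Run the blocking agent call in a worker thread so the event loop stays free.
        reply = await asyncio.to_thread(agent, body.decode("utf-8"))

        payload = json.dumps({"reply": reply}).encode("utf-8")
        await send({
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"application/json")],
        })
        await send({"type": "http.response.body", "body": payload})

    return app
```

Because the callable depends only on the ASGI interface, any ASGI server can host it, which is the design choice the entry above describes.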
Fixed​
- `agent/wiki.py` ruff lint errors (unused `os` import, two `B904` missing `raise ... from`) that previously blocked CI on any PR (#1016)
9.4.0 - 2026-04-11​
Added​
- Agent tracing with local file and Langfuse support (#1003)
  - `LocalTracer`: auto-saves JSON traces to `~/.underthesea/traces/` with datetime-prefixed filenames
  - `LangfuseTracer`: sends traces to Langfuse v4 with full hierarchy (agent → generation → tool), token usage, and cost tracking
  - `@trace` decorator with `contextvars`-based nesting: inner calls become child spans automatically
  - Agent auto-traces by default (like Claude Code); disable with `UNDERTHESEA_TRACE_DISABLED=1`
  - Agent inherits `@trace()` context: no explicit tracer needed when called inside a decorated function
  - 48 tests covering LocalTracer, LangfuseTracer, decorator, Agent integration, context inheritance, env var toggle
- Add Chat App for AI Agent (#893)
Changed​
- Rebrand Underthesea as Agentic AI Toolkit from v9.3.0 (#997)
Documentation​
Security​
- Bump Next.js 16.1.6 → 16.2.3 to resolve vulnerabilities (#1001)
- Resolve Dependabot vulnerabilities (#993)
9.3.0 - 2026-04-11​
Added​
- Multi-provider agent harness with zero external LLM SDK dependencies (#988)
  - Provider classes: `OpenAI`, `AzureOpenAI`, `Anthropic`, `Gemini`, each using raw HTTP (urllib + json)
  - `LLM` auto-detect: selects provider from env vars (Azure > OpenAI > Anthropic > Gemini)
  - Streaming: `Agent.stream()` + `chat_stream()` across all providers via SSE
  - Session manager: multi-session agents with context reset and structured handoff
  - 101 tests all passing
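The priority order in the auto-detect entry above (Azure > OpenAI > Anthropic > Gemini) could be expressed along these lines. This is only a sketch: the changelog does not name the environment variables, so the variable names below and the `detect_provider` helper are assumptions.

```python
import os
from typing import Mapping, Optional

# Checked in priority order. The variable names are assumed, not taken from the library.
PROVIDER_ENV_VARS = [
    ("azure", "AZURE_OPENAI_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("gemini", "GEMINI_API_KEY"),
]


def detect_provider(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first provider whose key is set and non-empty, or None."""
    env = os.environ if env is None else env
    for name, var in PROVIDER_ENV_VARS:
        if env.get(var):
            return name
    return None
```

Passing the environment as a parameter keeps the function easy to test without mutating `os.environ`.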
- Add address converter module for post-merger admin units (#959)
Changed​
- Switch license from GPL-3.0 to Apache-2.0 (#956)
- Improve `sent_tokenize` with trained Punkt parameters (#970)
  - Replace hand-crafted abbreviation list (368) with data-trained parameters (1032 abbrevs, 378 sent_starters)
  - Add numeric period handling (`1.500.000`, `15.12.2025`)
  - Add ellipsis handling (`...` mid-sentence)
  - Add Vietnamese academic abbreviations (`PGS.`, `TS.`, `BS.`, `GS.`)
  - Accuracy on sentence-segmentation-1: 57.7% → 60.0%
Documentation​
- Restructure README to focus on Agent, move NLP details to NLP.md
- Rewrite README in English
- Add technical reports for Language Identification and POS Tagging (#954)
- Adopt Docusaurus-style versioning with version dropdown (#968)
Security​
- Bump dependency versions (#969, #971, #972, #973, #974, #976, #977, #978, #979, #980, #981, #982, #986)
9.2.10 - 2026-02-07​
Changed​
- Remove VERSION files and use `importlib.metadata` for dynamic versioning (#950, #951)
- Use Rust TextClassifier with `.bin` models for classification (#935)
.binmodels for classification (#935) - Update sentiment models to use underthesea_core TextClassifier (#946)
- Consolidate classification into single module (#935)
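For readers unfamiliar with dynamic versioning via `importlib.metadata`, the general pattern looks like the sketch below. The `get_version` helper and its fallback value are hypothetical and may differ from what the project does.

```python
from importlib.metadata import PackageNotFoundError, version


def get_version(dist_name: str, fallback: str = "0.0.0") -> str:
    """Read the installed distribution's version, or a fallback if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # e.g. running from a source checkout that was never pip-installed
        return fallback
```

A package would typically call something like `get_version("underthesea")` once at import time, so the version lives only in the packaging metadata rather than in a separate VERSION file.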
Added​
- Add pure Rust FastText inference to underthesea_core (#947)
- Add TextPreprocessor to underthesea_core for Vietnamese text preprocessing (#942)
- Add underthesea_core API documentation (#948)
- Add workflow to publish underthesea_core to crates.io (#943)
Documentation​
- Separate sidebars for Technical Reports, API Reference, Datasets, and Changelog (#945)
- Add blog posts for Rust-powered text classification and CRF (#934, #935)
- Rename docusaurus folder to docs (#933)
Security​
9.2.0 - 2026-01-31​
Added​
- Add Agent class with custom tools support using OpenAI function calling (GH-712)
- Add default tools: calculator, datetime, web_search, wikipedia, shell, python, file operations (GH-712)
Changed​
- Upgrade underthesea_core to 2.0.0 with L-BFGS optimizer (#899)
- 10x faster feature lookup with flat data structure
- 1.24x faster than python-crfsuite for word segmentation
- L-BFGS with OWL-QN for L1 regularization
9.1.5 - 2026-01-29​
Added​
- Add Agent API with OpenAI and Azure OpenAI support (GH-745, #890)
- Add ParserTrainer for dependency parsing (GH-392, #880)
- Add POS tagger training pipeline (GH-423, #883)
Documentation​
- Add Vietnamese News Dataset (UVN) documentation (GH-885, #888, #889)
- Add UVB dataset documentation (GH-720, #887)
- Add UUD-v0.1 dataset documentation (#886)
- Add UTS Dictionary dataset documentation (GH-622, #884)
9.1.4 - 2026-01-24​
Added​
Changed​
- Remove NLTK dependency (#879)
Security​
9.1.3 - 2026-01-24​
Added​
- Add dependency tree visualization (#867)
Changed​
- Support PyTorch v2 for dependency parsing (#871)
- Update CP_Vietnamese-VLC README with HuggingFace dataset (#872)
Fixed​
- Fix ValueError when loading DependencyParser from non-existent path (#873)
- Fix KeyError in Sentence.getattr (#870)
- Fix TTS UnicodeDecodeError on Windows (#869)
- Fix underthesea[voice] installation (#868)
9.1.2 - 2026-01-24​
Added​
- Add `labels` property to `classify` and `sentiment` functions (#865)
Fixed​
- Fix sklearn >= 1.5 compatibility for loaded models (#866)
9.1.1 - 2026-01-24​
Fixed​
- Fix VERSION file to match pyproject.toml
9.1.0 - 2026-01-24​
Added​
- Vietnamese-English translation module with `translate()` function (#856)
- English to Vietnamese translation example in README (#858)
Changed​
- Support Python 3.14, deprecate Python 3.9 (#862)
- Migrate from Flake8/Pylint to Ruff for linting (#857)
Fixed​
- Fix missing sdist (tar.gz) on PyPI for underthesea_core (#859)
8.3.0 - 2025-09-28​
Added​
- Train text classification model for dataset VNTC2017_BANK (#819)
- Add datasets UTS2017_Bank (#822)
- Add bank model (#824)
- Build wheels for macOS x86-64 (#820)
Removed​
- Remove flake8 as runtime dependency (#818)
8.2.0 - 2025-09-21​
Changed​
- Update project structure, create extensions/lab folder (#812)
- Create Sonar Core 1 - System Card (#813)
- Update output format of model sonar_core_1 (#815)
8.1.0 - 2025-09-21​
Fixed​
- Fix missing .pkl files (#809)
8.0.1 - 2025-09-21​
Fixed​
- Fix missing .txt files (#806)
Changed​
- Update publish distribution to PyPI workflow (#805)
Security​
- Security updates for dependencies
8.0.0 - 2025-09-20​
Added​
- Underthesea Languages v2 (#748)
- Interactive Page for Most Frequently Used Vietnamese Words (#756)
- Support Python 3.12, 3.13 (#777)
Changed​
Fixed​
6.8.4 - 2024-06-22​
Added​
- Add lang_detect module (#733)
Changed​
6.8.0 - 2023-09-23​
Added​
Changed​
- Code refactoring (#713)
Fixed​
- Fix permission errors on removing downloaded models (#715)
6.7.0 - 2023-07-28​
Added​
- Zero shot classification with OpenAI API (#700)
6.6.0 - 2023-07-27​
Fixed​
- Fix bug word_tokenize (#697)
6.5.0 - 2023-07-14​
Fixed​
- Fix text_normalizer token rules
6.4.0 - 2023-07-14​
Fixed​
- Fix fixed_words regex
6.3.0 - 2023-06-28​
Added​
- Support MacOS ARM
6.2.0 - 2023-03-04​
Added​
- Add Text to Speech API (#668)
- Provide training script for word segmentation, pos tagging, and NER (#666)
- Create UTS_Dictionary v1.0 datasets (#663)
6.1.4 - 2023-02-26​
Added​
- Support underthesea_core with Python 3.11 (#659)
6.1.2 - 2023-02-15​
Added​
- Add option fixed_words to tokenize and word_tokenize API (#649)
6.0.0 - 2023-01-01​
Changed​
- Version bump for 2023
1.4.1 - 2022-12-17​
Added​
- Create underthesea app
- Add viet2ipa module
- Training NER model with VLSP2016 dataset using BERT
Removed​
- Remove unidecode as a dependency
1.3.5 - 2022-10-31​
Added​
- Add Text Normalization module
- Release underthesea_core version 0.0.5a2
- Support GLIBC_2.17
Changed​
- Update resources path
Fixed​
- Fix function word_tokenize
1.3.4 - 2022-01-08​
Added​
- Demo chatbot with rasa
- Lite version of underthesea
- Add build for Windows
Changed​
- Increase word_tokenize speed 1.5 times
1.3.3 - 2021-09-02​
Changed​
- Update torch and transformer dependency
1.3.2 - 2021-08-04​
Added​
- Publish two ABSA open datasets
- Add pipeline folder
Changed​
- Migrate from travis-ci to github actions
- Update ParserTrainer
1.3.1 - 2021-01-11​
Added​
- Add ClassifierTrainer
- Add 3 new datasets
Changed​
- Compatible with newer version of scikit-learn
- Retrain classification and sentiment models
1.3.0 - 2020-12-11​
Added​
- Dependency Parsing
Removed​
- Remove languageflow dependency
- Remove tabulate dependency
1.0.0 - 2017-03-01​
Added​
- First release on PyPI
- First release on ReadTheDocs