Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Unreleased​

Added​

  • underthesea chat — local chat TUI bridging the claude CLI (#1021)
    • Spawns claude --print --output-format=stream-json --verbose per turn and parses the JSONL event stream (system / assistant / result), so the user's existing claude login subscription supplies the LLM; no API key or per-token cost is needed
    • Tracks session_id from the first system event and replays it as --resume <id> on subsequent turns to keep conversation context
    • Each invocation starts a fresh timestamped session (YYYY-MM-DD_HH-MM-SS) under ~/.underthesea/assistant/<name>.md (format compatible with the upcoming MarkdownMemory abstraction); pass --session <name> to resume a saved chat
    • Textual chat UI inspired by OpenClaw: no header bar and no widget borders; a full-width slate bar (#2a3447) for user input; plain-text assistant replies streamed token by token; and a single-line dim status footer showing state | workspace/session | model | session_id | tokens/200k
    • ClaudeBridge is async-iterable and can be used independently of the TUI; Session provides an atomic Markdown round-trip
    • New [assistant] optional extra (textual>=0.50, rich>=13.0); the rest of the assistant module is import-safe without it
    • 19 unit tests (tests/agent/assistant/test_bridge.py + test_session.py) and 3 visual snapshot tests via pytest-textual-snapshot (tests/agent/assistant/test_tui_snapshot.py) under a new [test-tui] extra
    • End-to-end terminal tests live in a sibling workspace_underthesea/e2e_tests/ Node project driven by @microsoft/tui-test, covering both the underthesea chat TUI and the upstream openclaw chat against a parallel persona workspace
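
The per-turn bridge flow described above (spawn the CLI, parse JSONL events, capture session_id, replay it with --resume) can be sketched roughly as follows. This is a minimal illustration, not the actual ClaudeBridge code: the helper names, the sample events, the "type" and "session_id" field names, and passing the prompt as a final positional argument are all assumptions. Only the CLI flags are taken from the entry above.

```python
import json


def build_argv(prompt, session_id=None):
    """Build the per-turn command line (flags as listed in the changelog)."""
    argv = ["claude", "--print", "--output-format=stream-json", "--verbose"]
    if session_id:
        # Later turns replay the captured id to keep conversation context.
        argv += ["--resume", session_id]
    # Assumption: the prompt is passed as the last positional argument.
    argv.append(prompt)
    return argv


def parse_events(lines):
    """Yield one dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def first_session_id(events):
    """Return the session id of the first system event, if any."""
    for event in events:
        # Assumption: events carry "type" and "session_id" keys.
        if event.get("type") == "system" and "session_id" in event:
            return event["session_id"]
    return None


# Made-up sample stream for illustration only; real events carry more fields.
sample_stream = [
    '{"type": "system", "session_id": "abc-123"}',
    '{"type": "assistant", "text": "Hello"}',
    '{"type": "result", "session_id": "abc-123"}',
]

session_id = first_session_id(parse_events(sample_stream))
next_turn = build_argv("And then?", session_id)
```

Keeping the argv builder and the parser as pure functions makes this part testable without spawning the CLI; the subprocess call itself is left out here.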

Changed​

  • Rename CLI commands so the verb form matches the upstream OpenClaw surface (#1021)
    • underthesea chat — was underthesea assistant (the chat TUI introduced in this release)
    • underthesea agent — was underthesea chat (the Next.js + Node web app already shipped in v9.4)

Documentation​

  • Pivot the docs/research/personal-ai-os/ notes to a global growth positioning (#1020)
    • Roadmap rewritten as three milestones — M1 Smallest Agent Framework + Free TUI Assistant, M2 Personal AI OS in Pure Python, M3 Growth Amplification
    • Drop the multi-agent orchestrator phase from the 6-month plan; drop Vietnamese-specific channel work (Zalo) in favor of global channels (Telegram / Slack / Email / Discord); add an explicit non-goal: Vietnamese identity is not to be used as a marketing lever

9.5.0 - 2026-05-17

Added​

  • A2A-compatible HTTP server adapter underthesea.agent.server (#1016)
    • make_app(agent, path=...) — raw ASGI callable (no web framework dep at the library level); plug into uvicorn / hypercorn / daphne / granian
    • serve(agent, ui=True) — one-line uvicorn entrypoint with bundled chat UI at {path}/ui, JSON-RPC message/stream over HTTP+SSE, and a discoverable AgentCard at {path}/.well-known/agent-card.json
    • Per-session Agent clone per A2A contextId so each conversation keeps isolated history
    • Tool calls stream live as tool_call artifacts via a ContextVar hook around each Tool.func — the Agent class itself is unchanged
    • Sync Agent.__call__ offloaded via asyncio.to_thread to keep the event loop responsive
    • Bundled chat UI is generic: title from AgentCard, RPC URL auto-discovered from window.location, modal-based agent card viewer
    • New [agent-server] extra (uvicorn + starlette + httpx, version ranges aligned with google-adk-python)
    • 17 tests via httpx.ASGITransport; new ci-agent workflow runs them on every PR

Fixed​

  • Fix ruff lint errors in agent/wiki.py (an unused os import and two B904 missing raise ... from violations) that previously blocked CI on every PR (#1016)

9.4.0 - 2026-04-11​

Added​

  • Agent tracing with local file and Langfuse support (#1003)
    • LocalTracer: auto-saves JSON traces to ~/.underthesea/traces/ with datetime-prefixed filenames
    • LangfuseTracer: sends traces to Langfuse v4 with full hierarchy (agent → generation → tool), token usage, and cost tracking
    • @trace decorator with contextvars-based nesting — inner calls become child spans automatically
    • Agent auto-traces by default (like Claude Code); disable with UNDERTHESEA_TRACE_DISABLED=1
    • Agent inherits @trace() context — no explicit tracer needed when called inside a decorated function
    • 48 tests covering LocalTracer, LangfuseTracer, decorator, Agent integration, context inheritance, env var toggle
  • Add Chat App for AI Agent (#893)

Changed​

  • Rebrand Underthesea as Agentic AI Toolkit from v9.3.0 (#997)

Documentation​

  • Add agent harness current state assessment and roadmap (#1002)
  • Update sponsor count (#998)

Security​

  • Bump Next.js 16.1.6 → 16.2.3 to resolve vulnerabilities (#1001)
  • Resolve Dependabot vulnerabilities (#993)

9.3.0 - 2026-04-11​

Added​

  • Multi-provider agent harness with zero external LLM SDK dependencies (#988)
    • Provider classes: OpenAI, AzureOpenAI, Anthropic, Gemini — each using raw HTTP (urllib + json)
    • LLM auto-detect: selects provider from env vars (Azure > OpenAI > Anthropic > Gemini)
    • Streaming: Agent.stream() + chat_stream() across all providers via SSE
    • Session manager: multi-session agents with context reset and structured handoff
    • 101 tests all passing
  • Add address converter module for post-merger admin units (#959)
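
The provider auto-detect order listed above (Azure > OpenAI > Anthropic > Gemini) amounts to a priority scan over environment variables. The sketch below shows that idea only. The variable names are assumptions based on common conventions, and the variables the library actually checks may differ.

```python
import os

# Priority order from the changelog entry. The env var names are assumptions.
PROVIDER_ENV_VARS = [
    ("azure", "AZURE_OPENAI_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("gemini", "GEMINI_API_KEY"),
]


def detect_provider(env=None):
    """Return the first provider whose variable is set and non-empty."""
    env = os.environ if env is None else env
    for provider, var in PROVIDER_ENV_VARS:
        if env.get(var):
            return provider
    return None


# Both keys present: Azure wins because it comes first in the list.
choice = detect_provider({"OPENAI_API_KEY": "sk-x", "AZURE_OPENAI_API_KEY": "az-x"})
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment.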

Changed​

  • Switch license from GPL-3.0 to Apache-2.0 (#956)
  • Improve sent_tokenize with trained Punkt parameters (#970)
    • Replace hand-crafted abbreviation list (368) with data-trained parameters (1032 abbrevs, 378 sent_starters)
    • Add numeric period handling (1.500.000, 15.12.2025)
    • Add ellipsis handling (... mid-sentence)
    • Add Vietnamese academic abbreviations (PGS., TS., BS., GS.)
    • Accuracy on sentence-segmentation-1: 57.7% → 60.0%
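
To make the three special cases above concrete (numeric periods, mid-sentence ellipses, academic abbreviations), here is a deliberately simplified rule-based splitter. It is a hand-written heuristic for illustration and does not use the trained Punkt parameters that sent_tokenize relies on; the function name and the tiny abbreviation set are assumptions.

```python
import re

# Small illustrative set taken from the entry above; the real tokenizer uses
# a much larger, data-trained abbreviation list.
ABBREVIATIONS = {"PGS.", "TS.", "BS.", "GS."}


def split_sentences(text):
    sentences = []
    start = 0
    # Candidate boundary: a period followed by whitespace and more text.
    # Periods inside numbers (1.500.000, 15.12.2025) are followed by a digit,
    # so they are never candidates.
    for match in re.finditer(r"\.(?=\s+\S)", text):
        end = match.end()
        token = text[start:end].split()[-1]   # word that ends at this period
        next_char = text[end:].lstrip()[0]    # first character after the gap
        if token in ABBREVIATIONS:
            continue                          # "TS. Nguyễn" is not a boundary
        if token.endswith("...") and next_char.islower():
            continue                          # ellipsis in mid-sentence
        sentences.append(text[start:end].strip())
        start = end
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


example = "PGS. TS. Nguyễn Văn A nhận 1.500.000 đồng ngày 15.12.2025. Ông ấy rất vui."
parts = split_sentences(example)
```

A rule list like this breaks down quickly on real text, which is the motivation for replacing hand-crafted rules with parameters learned from data.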

Documentation​

  • Restructure README to focus on Agent, move NLP details to NLP.md
  • Rewrite README in English
  • Add technical reports for Language Identification and POS Tagging (#954)
  • Adopt Docusaurus-style versioning with version dropdown (#968)

9.2.10 - 2026-02-07​

Changed​

  • Remove VERSION files and use importlib.metadata for dynamic versioning (#950, #951)
  • Use Rust TextClassifier with .bin models for classification (#935)
  • Update sentiment models to use underthesea_core TextClassifier (#946)
  • Consolidate classification into single module (#935)
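
Dynamic versioning through importlib.metadata, as mentioned in the first item above, generally looks like the sketch below. The helper name and the fallback string are assumptions, and the project's own code may handle the missing-package case differently.

```python
from importlib.metadata import PackageNotFoundError, version


def get_version(dist_name, default="0.0.0+unknown"):
    """Read the installed distribution's version from its metadata."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Not installed (for example, running from a bare source checkout).
        return default


# A package would typically expose this as __version__.
missing = get_version("surely-not-an-installed-distribution-xyz")
```

The version then has a single source of truth in the package metadata, so there is no separate VERSION file to keep in sync.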

Added​

  • Add pure Rust FastText inference to underthesea_core (#947)
  • Add TextPreprocessor to underthesea_core for Vietnamese text preprocessing (#942)
  • Add underthesea_core API documentation (#948)
  • Add workflow to publish underthesea_core to crates.io (#943)

Documentation​

  • Separate sidebars for Technical Reports, API Reference, Datasets, and Changelog (#945)
  • Add blog posts for Rust-powered text classification and CRF (#934, #935)
  • Rename docusaurus folder to docs (#933)

Security​

  • Bump jsonpath, django, @isaacs/brace-expansion dependencies (#938, #939, #940, #944)

9.2.0 - 2026-01-31​

Added​

  • Add Agent class with custom tools support using OpenAI function calling (GH-712)
  • Add default tools: calculator, datetime, web_search, wikipedia, shell, python, file operations (GH-712)
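
For readers unfamiliar with OpenAI function calling, the general wire shape of a tool definition and a tool call is sketched below. This is not the underthesea Agent API: the add tool, the TOOLS registry, and the dispatch helper are hypothetical, and the sample tool call is hand-written rather than a real model response.

```python
import json


def add(a, b):
    """A trivial custom tool."""
    return a + b


# Tool definition in the format used by OpenAI chat completions: a JSON
# Schema describes the arguments the model may supply.
ADD_TOOL_SPEC = {
    "type": "function",
    "function": {
        "name": "add",
        "description": "Add two numbers.",
        "parameters": {
            "type": "object",
            "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
            "required": ["a", "b"],
        },
    },
}

# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"add": add}


def dispatch(tool_call):
    """Run the requested tool; arguments arrive as a JSON-encoded string."""
    func = TOOLS[tool_call["function"]["name"]]
    kwargs = json.loads(tool_call["function"]["arguments"])
    return json.dumps(func(**kwargs))


# Hand-written example of what a model-issued tool call looks like.
sample_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "add", "arguments": '{"a": 2, "b": 3}'},
}
tool_output = dispatch(sample_call)
```

The string returned by dispatch is what would be sent back to the model as the tool message content on the next request.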

Changed​

  • Upgrade underthesea_core to 2.0.0 with L-BFGS optimizer (#899)
    • 10x faster feature lookup with flat data structure
    • 1.24x faster than python-crfsuite for word segmentation
    • L-BFGS with OWL-QN for L1 regularization

9.1.5 - 2026-01-29​

Added​

  • Add Agent API with OpenAI and Azure OpenAI support (GH-745, #890)
  • Add ParserTrainer for dependency parsing (GH-392, #880)
  • Add POS tagger training pipeline (GH-423, #883)

Documentation​

  • Add Vietnamese News Dataset (UVN) documentation (GH-885, #888, #889)
  • Add UVB dataset documentation (GH-720, #887)
  • Add UUD-v0.1 dataset documentation (#886)
  • Add UTS Dictionary dataset documentation (GH-622, #884)

9.1.4 - 2026-01-24​

Added​

  • Implement Logistic Regression library in Rust (#878)
  • Implement CRF library in Rust (#876)

Changed​

  • Remove NLTK dependency (#879)

Security​

  • Fix Dependabot security vulnerabilities (#874, #875)

9.1.3 - 2026-01-24​

Added​

  • Add dependency tree visualization (#867)

Changed​

  • Support PyTorch v2 for dependency parsing (#871)
  • Update CP_Vietnamese-VLC README with HuggingFace dataset (#872)

Fixed​

  • Fix ValueError when loading DependencyParser from non-existent path (#873)
  • Fix KeyError in Sentence.getattr (#870)
  • Fix TTS UnicodeDecodeError on Windows (#869)
  • Fix underthesea[voice] installation (#868)

9.1.2 - 2026-01-24​

Added​

  • Add labels property to classify and sentiment functions (#865)

Fixed​

  • Fix sklearn >= 1.5 compatibility for loaded models (#866)

9.1.1 - 2026-01-24​

Fixed​

  • Fix VERSION file to match pyproject.toml

9.1.0 - 2026-01-24​

Added​

  • Vietnamese-English translation module with translate() function (#856)
  • English to Vietnamese translation example in README (#858)

Changed​

  • Support Python 3.14, deprecate Python 3.9 (#862)
  • Migrate from Flake8/Pylint to Ruff for linting (#857)

Fixed​

  • Fix missing sdist (tar.gz) on PyPI for underthesea_core (#859)

8.3.0 - 2025-09-28​

Added​

  • Train text classification model for dataset VNTC2017_BANK (#819)
  • Add datasets UTS2017_Bank (#822)
  • Add bank model (#824)
  • Build wheels for macOS x86-64 (#820)

Removed​

  • Remove flake8 as runtime dependency (#818)

8.2.0 - 2025-09-21​

Changed​

  • Update project structure, create extensions/lab folder (#812)
  • Create Sonar Core 1 - System Card (#813)
  • Update output format of model sonar_core_1 (#815)

8.1.0 - 2025-09-21​

Fixed​

  • Fix missing .pkl files (#809)

8.0.1 - 2025-09-21​

Fixed​

  • Fix missing .txt files (#806)

Changed​

  • Update publish distribution to PyPI workflow (#805)

Security​

  • Security updates for dependencies

8.0.0 - 2025-09-20​

Added​

  • Underthesea Languages v2 (#748)
  • Interactive Page for Most Frequently Used Vietnamese Words (#756)
  • Support Python 3.12, 3.13 (#777)

Changed​

  • Update PyO3 API usage (#768)
  • Update project structure (#790)

Fixed​

  • Fix wrong global var in sent_tokenize (#764)
  • Fix logo in Readme.rst (#761)

6.8.4 - 2024-06-22​

Added​

  • Add lang_detect module (#733)

Changed​

  • Optimize imports (#741)
  • Remove issue-manager workflow (#726)

6.8.0 - 2023-09-23​

Added​

  • Release Source Distribution for underthesea_core (#708)
  • Create docker image for underthesea (#711)

Changed​

  • Code refactoring (#713)

Fixed​

  • Fix permission errors on removing downloaded models (#715)

6.7.0 - 2023-07-28​

Added​

  • Zero shot classification with OpenAI API (#700)

6.6.0 - 2023-07-27​

Fixed​

  • Fix bug in word_tokenize (#697)

6.5.0 - 2023-07-14​

Fixed​

  • Fix text_normalizer token rules

6.4.0 - 2023-07-14​

Fixed​

  • Fix fixed_words regex

6.3.0 - 2023-06-28​

Added​

  • Support macOS ARM

6.2.0 - 2023-03-04​

Added​

  • Add Text to Speech API (#668)
  • Provide training script for word segmentation, pos tagging, and NER (#666)
  • Create UTS_Dictionary v1.0 datasets (#663)

6.1.4 - 2023-02-26​

Added​

  • Support underthesea_core with Python 3.11 (#659)

6.1.2 - 2023-02-15​

Added​

  • Add option fixed_words to tokenize and word_tokenize API (#649)

6.0.0 - 2023-01-01​

Changed​

  • Version bump for 2023

1.4.1 - 2022-12-17​

Added​

  • Create underthesea app
  • Add viet2ipa module
  • Training NER model with VLSP2016 dataset using BERT

Removed​

  • Remove unidecode as a dependency

1.3.5 - 2022-10-31​

Added​

  • Add Text Normalization module
  • Release underthesea_core version 0.0.5a2
  • Support GLIBC_2.17

Changed​

  • Update resources path

Fixed​

  • Fix word_tokenize function

1.3.4 - 2022-01-08​

Added​

  • Demo chatbot with rasa
  • Lite version of underthesea
  • Add build for Windows

Changed​

  • Increase word_tokenize speed 1.5 times

1.3.3 - 2021-09-02​

Changed​

  • Update torch and transformers dependencies

1.3.2 - 2021-08-04​

Added​

  • Publish two ABSA open datasets
  • Add pipeline folder

Changed​

  • Migrate from travis-ci to github actions
  • Update ParserTrainer

1.3.1 - 2021-01-11​

Added​

  • Add ClassifierTrainer
  • Add 3 new datasets

Changed​

  • Add compatibility with newer versions of scikit-learn
  • Retrain classification and sentiment models

1.3.0 - 2020-12-11​

Added​

  • Dependency Parsing

Removed​

  • Remove languageflow dependency
  • Remove tabulate dependency

1.0.0 - 2017-03-01​

Added​

  • First release on PyPI
  • First release on ReadTheDocs