Version: 9.2.11

tts

Convert Vietnamese text to speech.

Note (requires additional setup): This function requires extra dependencies and a model download:

pip install "underthesea[voice]"
underthesea download-model VIET_TTS_V0_4_1

Usage​

from underthesea.pipeline.tts import tts

text = "Xin chào Việt Nam"
tts(text)
# Generates sound.wav in current directory

Function Signature​

def tts(text: str, outfile: str = "sound.wav", play: bool = False) -> tuple[int, np.ndarray]

Parameters​

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | str | (required) | The Vietnamese text to convert to speech |
| outfile | str | "sound.wav" | Output audio file path |
| play | bool | False | Whether to play audio after generation |

Returns​

| Type | Description |
| --- | --- |
| tuple[int, np.ndarray] | Sample rate (16000) and audio waveform array |
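Because the sample rate is returned together with the waveform, the clip length can be computed without reopening the file. The following is a minimal sketch, assuming the waveform is a one-dimensional (mono) array; `duration_seconds` is a hypothetical helper and not part of underthesea.

```python
def duration_seconds(sample_rate, waveform):
    """Clip length in seconds: number of samples divided by samples per second.

    Assumption: `waveform` is one-dimensional (mono), so len() counts samples.
    """
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return len(waveform) / sample_rate


# Intended use (needs the voice extras and the downloaded model):
#   from underthesea.pipeline.tts import tts
#   sample_rate, waveform = tts("Xin chào Việt Nam")
#   print(f"{duration_seconds(sample_rate, waveform):.2f} s")
```

The helper only relies on `len()`, so it works on a NumPy array or a plain list of samples.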

Examples​

Basic Usage​

from underthesea.pipeline.tts import tts

# Generate speech
tts("Xin chào Việt Nam")
# Creates sound.wav

# Custom output file
tts("Hà Nội là thủ đô của Việt Nam", outfile="hanoi.wav")

# Generate and play immediately
tts("Xin chào", play=True)

Command Line Usage​

underthesea tts "Xin chào Việt Nam"
# Creates sound.wav and plays it

Generating Multiple Audio Files​

from underthesea.pipeline.tts import tts

sentences = [
    ("Xin chào", "hello.wav"),
    ("Tạm biệt", "goodbye.wav"),
    ("Cảm ơn", "thanks.wav"),
]

for text, filename in sentences:
    tts(text, outfile=filename)
    print(f"Generated: {filename}")
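For larger batches it can help to collect the files in one directory and skip those that already exist. The sketch below is one possible way to do that; `synthesize_batch` and its parameters are hypothetical names. The synthesis function is passed in as an argument and is assumed to accept `(text, outfile=...)` as in the signature documented above.

```python
from pathlib import Path


def synthesize_batch(items, synthesize, out_dir="audio", overwrite=False):
    """Call synthesize(text, outfile=...) for each (text, filename) pair.

    Returns the paths generated in this run. Files that already exist
    are skipped unless overwrite=True.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    generated = []
    for text, filename in items:
        target = out_path / filename
        if target.exists() and not overwrite:
            continue  # keep the earlier result, avoid re-running the model
        synthesize(text, outfile=str(target))
        generated.append(target)
    return generated
```

With the model installed, a call would look like `synthesize_batch(sentences, tts, out_dir="audio")`. Passing the function in rather than importing it inside the helper keeps the batching logic testable without loading the model.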

Playing Audio (with an external player)

from underthesea.pipeline.tts import tts
import subprocess

# Generate audio
tts("Xin chào Việt Nam")

# Play audio (macOS)
subprocess.run(["afplay", "sound.wav"])

# Play audio (Linux with aplay)
# subprocess.run(["aplay", "sound.wav"])
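The player command can also be chosen at runtime instead of by commenting lines in and out. This is a minimal sketch covering only the two players mentioned above (`afplay` on macOS, `aplay` on Linux); `player_command` and `play` are hypothetical helper names, and other platforms are deliberately left unhandled.

```python
import shutil
import subprocess
import sys


def player_command(path, platform=None):
    """Return the argv list for a command-line player, or None if unknown."""
    platform = sys.platform if platform is None else platform
    if platform == "darwin":
        return ["afplay", path]
    if platform.startswith("linux"):
        return ["aplay", path]
    return None


def play(path):
    """Play `path` with the platform's player; return False if none is available."""
    command = player_command(path)
    if command is None or shutil.which(command[0]) is None:
        return False
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return True
```

After `tts("Xin chào Việt Nam")`, calling `play("sound.wav")` plays the file where a supported player is installed and returns `False` otherwise.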

Notes​

  • Uses the VietTTS model for high-quality Vietnamese speech synthesis
  • Output format is WAV audio at a 16 kHz sample rate
  • First call may take longer due to model loading
  • Requires downloading the TTS model before first use
  • Credits: Based on NTT123/vietTTS
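To confirm the format of a generated file, the header can be read with Python's standard `wave` module. This sketch assumes the output is uncompressed PCM, which is the only WAV encoding `wave` reads; `wav_info` is a hypothetical helper name.

```python
import wave


def wav_info(path):
    """Return (sample_rate, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        sample_rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.getnframes()
    return sample_rate, channels, frames / sample_rate


# Intended use after generating a file:
#   sample_rate, channels, seconds = wav_info("sound.wav")
#   print(sample_rate, channels, round(seconds, 2))
```

If the file is not PCM-encoded, `wave.open` raises `wave.Error`, in which case a third-party audio library would be needed instead.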

Troubleshooting​

Model Not Found​

If you get a "model not found" error, download the model:

underthesea download-model VIET_TTS_V0_4_1

Audio Quality Issues​

  • Ensure input text is in Vietnamese
  • Longer sentences produce smoother audio
  • Punctuation affects prosody