
Fast Mode (LMDeploy)

GPU-accelerated inference using LMDeploy for maximum throughput.

Requirements

  • NVIDIA GPU with 4GB+ VRAM
  • NVIDIA CUDA Toolkit 12.8+

Usage

from vieneu import Vieneu

tts = Vieneu(
    mode="fast",
    backbone_repo="pnnbao-ump/VieNeu-TTS",  # or VieNeu-TTS-0.3B
)

audio = tts.infer(text="Xin chào bạn")
tts.save(audio, "output.wav")

When to Use

  • High-volume batch processing
  • Server-side deployment
  • Maximum inference speed on GPU
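For the batch-processing case, a minimal sketch of a sequential synthesis loop. It assumes only the `infer` and `save` calls shown in the Usage section; the `synthesize_batch` helper name and the output naming scheme are hypothetical, not part of the library.

```python
from pathlib import Path


def synthesize_batch(tts, texts, out_dir="out"):
    """Synthesize each text in sequence and save numbered WAV files.

    `tts` is a constructed Vieneu instance (fast mode, as above).
    Returns the list of written file paths.
    """
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    paths = []
    for i, text in enumerate(texts):
        audio = tts.infer(text=text)           # GPU inference via LMDeploy
        path = f"{out_dir}/utt_{i:03d}.wav"    # e.g. out/utt_000.wav
        tts.save(audio, path)
        paths.append(path)
    return paths
```

Usage would be `synthesize_batch(tts, ["Xin chào bạn", "Hẹn gặp lại"])`. Keeping a single `Vieneu` instance across the whole batch avoids reloading the model between utterances.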

For CPU-only usage, use standard mode with GGUF models instead.