Standard Mode

The default mode. It supports both GGUF quantized models (CPU-optimized) and full-precision PyTorch models (GPU/CPU).

GGUF (CPU Optimized)

from vieneu import Vieneu

# Default: loads 0.3B Q4 GGUF on CPU
tts = Vieneu()

# Or explicitly:
tts = Vieneu(
    backbone_repo="pnnbao-ump/VieNeu-TTS-0.3B-q4-gguf",
    backbone_device="cpu",
)

Available GGUF Models

| Model | Repo ID |
|---|---|
| 0.3B Q4 (default) | pnnbao-ump/VieNeu-TTS-0.3B-q4-gguf |
| 0.3B Q8 | pnnbao-ump/VieNeu-TTS-0.3B-q8-gguf |
| 0.5B Q4 | pnnbao-ump/VieNeu-TTS-q4-gguf |
| 0.5B Q8 | pnnbao-ump/VieNeu-TTS-q8-gguf |
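
Any of these repo IDs can be passed to the constructor using the same pattern shown above; for example, a sketch loading the 0.5B Q8 model on CPU:

tts = Vieneu(
    backbone_repo="pnnbao-ump/VieNeu-TTS-q8-gguf",
    backbone_device="cpu",
)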

GGUF on GPU

tts = Vieneu(
    backbone_repo="pnnbao-ump/VieNeu-TTS-0.3B-q8-gguf",
    backbone_device="cuda",
)

PyTorch (Full Precision)

tts = Vieneu(
    backbone_repo="pnnbao-ump/VieNeu-TTS-0.3B",
    backbone_device="cuda",  # or "cpu", "mps"
)

Batch Processing (PyTorch only)

texts = ["Câu một.", "Câu hai.", "Câu ba."]
audios = tts.infer_batch(texts)
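
To save the batch results to disk, here is a minimal sketch assuming each item in audios is a NumPy waveform and that the codec decodes at 24 kHz; both are assumptions, so verify against the model card:

import soundfile as sf

# Assumption: infer_batch returns one waveform array per input text.
SAMPLE_RATE = 24_000  # assumed output rate; check the model card

for i, audio in enumerate(audios):
    sf.write(f"output_{i}.wav", audio, SAMPLE_RATE)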

Codec Options

| Codec | Repo ID | Notes |
|---|---|---|
| DistillNeuCodec (default) | neuphonic/distill-neucodec | Lightweight |
| NeuCodec | neuphonic/neucodec | Full quality |
| ONNX Int8 | neuphonic/neucodec-onnx-decoder-int8 | CPU only |

tts = Vieneu(
    codec_repo="neuphonic/neucodec-onnx-decoder-int8",
    codec_device="cpu",
)