Skip to main content

System Requirements

Minimum Requirements

ComponentRequirement
Python3.10+
RAM2 GB (GGUF Q4)
Disk~500 MB (model auto-downloaded)
eSpeak NGRequired
ComponentRecommendation
CPUModern i5/i7/M1+
RAM4 GB+
ModelGGUF Q4 or Q8

Streaming latency: under 300ms first chunk on modern i3/i5.

ComponentRecommendation
GPUNVIDIA with 4GB+ VRAM
Driver>= 570.65 (CUDA 12.8+)
ModelPyTorch 0.5B or 0.3B
BackendLMDeploy for maximum speed

Intel Arc GPU

Supported via PyTorch XPU (tested on Arc B580, A770 on Windows):

run setup_xpu_uv.bat
run run_xpu.bat

Tip: Intel Arc has high memory bandwidth — keep batch size high and minimize characters per chunk.

Model Sizes

ModelDiskRAM Usage
0.5B PyTorch~2 GB~3 GB
0.3B PyTorch~1.2 GB~2 GB
0.3B GGUF Q8~350 MB~500 MB
0.3B GGUF Q4~200 MB~300 MB

Models are cached at ~/.cache/huggingface/hub/ after first download.