llm + tts
Language model and speech synthesis running side by side in your browser.
No server, no API key. Nothing leaves your machine.
Mobile mode: a smaller model runs on the CPU via WebAssembly, so expect slower responses. Try a desktop browser for a much faster experience with WebGPU acceleration.
LLM: SmolLM2-1.7B-Instruct, single-turn chat running on WebGPU via WebLLM.
TTS: Quantized Kyutai Pocket-TTS model, custom Rust inference compiled to WebAssembly.
LLM: SmolLM2-360M-Instruct, Q4_K_M quantization (~271 MB), via llama.cpp compiled to WebAssembly with wllama.
TTS: Quantized Kyutai Pocket-TTS model, custom Rust inference compiled to WebAssembly.
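The split between the two configurations above comes down to one capability check: whether the browser exposes WebGPU. A minimal sketch of that selection logic might look like the following; the function names and config shape are illustrative assumptions, not the demo's actual code, and only the feature-detection idiom (`"gpu" in navigator`) and the model names are taken from the source.

```typescript
// Hypothetical sketch: choose an LLM backend based on WebGPU support.
// Model identifiers mirror the two configurations described above.

interface BackendConfig {
  backend: "webgpu" | "wasm";
  model: string;
}

// Pure selection logic, kept separate from browser APIs so it is testable.
function pickBackend(hasWebGPU: boolean): BackendConfig {
  return hasWebGPU
    ? { backend: "webgpu", model: "SmolLM2-1.7B-Instruct" } // WebLLM path
    : { backend: "wasm", model: "SmolLM2-360M-Instruct-Q4_K_M" }; // wllama path
}

// In a browser, feature-detect WebGPU via the navigator object:
// const config = pickBackend("gpu" in navigator);
```

Keeping the decision in a pure function makes it easy to unit-test outside the browser, while the one-line feature check stays at the call site.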