A lightweight, formant-based text-to-speech engine written entirely in TypeScript.
tts-synth models the human vocal tract by simulating its resonant frequencies (formants) to generate speech. Unlike modern neural-network-based TTS, it requires no machine learning weights, has a near-zero memory footprint, and generates audio in real time.
- 🧩 Zero Dependencies: Core synthesis logic is pure TypeScript/JavaScript.
- 🎙️ Formant Synthesis: Models human speech using 5-formant cascades (F1–F5).
- 📏 Compact & Fast: Generates WAV audio in milliseconds; ideal for low-resource environments.
- 🎚️ Deep Customization: Adjust pitch (f0), speaking rate, and vowel/consonant characteristics.
- 🌐 Browser & Node.js Ready: Works anywhere Node.js or a modern browser is available.
- 📄 Advanced Phonetic Engine: Robust Letter-to-Sound (LTS) rules for English pronunciation.
# Clone the repository
git clone https://github.com/raghav/tts-synth.git
cd tts-synth
# Install dependencies (only for development/demos)
npm install
# Build the project
npm run build

import { toWav } from 'tts-synth';
import { writeFileSync } from 'node:fs';
// Simple one-liner to generate a WAV buffer
const wav = toWav('Hello, this is a test of the formant synthesizer.');
writeFileSync('output.wav', wav);

Customize your voice by passing an options object to the synthesizer:
import { synthesize, toWav } from 'tts-synth';
// 1. Adjusting Pitch & Rate
const slowHighPitch = toWav('I am speaking slowly with a high voice.', {
pitch: 200, // Hz (default: 120)
rate: 0.6 // multiplier (default: 1.0)
});
// 2. Getting raw PCM data (Float32Array)
const pcm = synthesize('Extracting raw audio data.', {
pitch: 110,
rate: 1.2
});
console.log(`Generated ${pcm.length} samples at 22050Hz.`);

The project includes several built-in scripts to help you get started:
| Script | Description |
|---|---|
| `npm run build` | Transpile TypeScript to the `dist` directory. |
| `npm run dev` | Watch mode for development. |
| `npm run demo` | Runs the Node.js example to generate several WAV files in the `samples/` directory. |
| `npm run typecheck` | Run the TypeScript compiler in `noEmit` mode. |
- Node.js Example: Run `npm run demo` to see the engine in action. It saves `demo-output.wav`, `demo-high-pitch.wav`, and `demo-slow.wav` in the `samples/` directory.
- Browser Example: Open `examples/browser.html` in any modern web browser to use the interactive synthesis UI.
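
If you work with the raw PCM returned by `synthesize` rather than the ready-made output of `toWav`, you may want to wrap the samples in a WAV container yourself. The sketch below shows one way to do that for mono 16-bit PCM. The helper name `encodeWav16` is hypothetical and not part of tts-synth, and the sine tone merely stands in for synthesized samples; the library's own encoder may differ in detail.

```typescript
// Hypothetical helper (not part of tts-synth): wraps mono float PCM in the
// range [-1, 1] in a canonical 44-byte-header, 16-bit PCM WAV container.
function encodeWav16(samples: Float32Array, sampleRate: number): Uint8Array {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, 'RIFF');
  view.setUint32(4, 36 + dataSize, true);                // remaining file size
  writeAscii(8, 'WAVE');
  writeAscii(12, 'fmt ');
  view.setUint32(16, 16, true);                          // fmt chunk size
  view.setUint16(20, 1, true);                           // format 1 = linear PCM
  view.setUint16(22, 1, true);                           // one channel (mono)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true);              // block align
  view.setUint16(34, 16, true);                          // bits per sample
  writeAscii(36, 'data');
  view.setUint32(40, dataSize, true);

  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1] before scaling so out-of-range values do not wrap.
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * bytesPerSample, Math.round(clamped * 32767), true);
  }
  return new Uint8Array(buffer);
}

// Stand-in for synthesized audio: 0.1 s of a 220 Hz sine at 22050 Hz.
const demoRate = 22050;
const demoPcm = new Float32Array(2205);
for (let i = 0; i < demoPcm.length; i++) {
  demoPcm[i] = 0.5 * Math.sin((2 * Math.PI * 220 * i) / demoRate);
}
const demoWav = encodeWav16(demoPcm, demoRate);
```

In Node.js the resulting `Uint8Array` can be passed directly to `writeFileSync`; in a browser it can be wrapped in a `Blob` with type `audio/wav`.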
Formant synthesis models the human vocal tract as a series of resonators.
- Excitation: A buzz (glottal pulse) or hiss (noise) is generated.
- Filtering: The signal passes through several band-pass filters (resonators) representing the oral and nasal cavities.
- Formants: By shifting the frequencies ($F_1, F_2, F_3$) of these filters, the engine transitions between different vowels and consonants.
This project implements a simplified version of the Klatt synthesis model, using rules to map textual phonemes to specific formant trajectories over time.
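
The excitation-and-filtering idea above can be sketched in a few lines. This is a minimal illustration, not the code used by tts-synth: it drives a cascade of second-order digital resonators (the recurrence commonly used in Klatt-style synthesizers) with an impulse train at the pitch frequency. The names `Formant`, `makeResonator`, and `synthVowel` are hypothetical, the formant frequencies are rough textbook values for an "ah"-like vowel, and the bandwidths are illustrative guesses.

```typescript
// Hypothetical types and helpers for illustration; not the tts-synth API.
interface Formant {
  freq: number;      // centre frequency in Hz
  bandwidth: number; // bandwidth in Hz
}

// Second-order resonator: y[n] = a*x[n] + b*y[n-1] + c*y[n-2].
// Choosing a = 1 - b - c gives unity gain at 0 Hz.
function makeResonator(f: Formant, sampleRate: number): (x: number) => number {
  const T = 1 / sampleRate;
  const c = -Math.exp(-2 * Math.PI * f.bandwidth * T);
  const b = 2 * Math.exp(-Math.PI * f.bandwidth * T) * Math.cos(2 * Math.PI * f.freq * T);
  const a = 1 - b - c;
  let y1 = 0; // previous output
  let y2 = 0; // output two samples ago
  return (x: number): number => {
    const y = a * x + b * y1 + c * y2;
    y2 = y1;
    y1 = y;
    return y;
  };
}

// Excitation (impulse train at f0) passed through the resonators in cascade.
function synthVowel(
  formants: Formant[],
  f0: number,
  durationSec: number,
  sampleRate = 22050,
): Float32Array {
  const n = Math.round(durationSec * sampleRate);
  const out = new Float32Array(n);
  const resonators = formants.map((f) => makeResonator(f, sampleRate));
  const period = Math.round(sampleRate / f0); // samples per glottal pulse
  for (let i = 0; i < n; i++) {
    let s = i % period === 0 ? 1 : 0; // "buzz": one pulse per pitch period
    for (const r of resonators) {
      s = r(s); // each resonator shapes the output of the previous one
    }
    out[i] = s;
  }
  return out;
}

// Illustrative "ah"-like vowel at a 120 Hz pitch, 0.3 s long.
const ah: Formant[] = [
  { freq: 730, bandwidth: 60 },
  { freq: 1090, bandwidth: 90 },
  { freq: 2440, bandwidth: 120 },
];
const vowelPcm = synthVowel(ah, 120, 0.3);
```

Changing the `freq` values over time is what moves the output from one vowel to another. A full synthesizer would also add a noise source for hiss-like sounds, a more realistic glottal pulse shape, and gain normalization, all of which this sketch omits.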
This project is licensed under the MIT License - feel free to use it for whatever purpose you like!