Category: Speech synthesis
speech synthesis
artificial production of human speech
Daisy Bell
song written and composed by Harry Dacre
VoiceXML
VoiceXML (VXML) is a digital document standard for specifying interactive media and voice dialogs between humans and computers. It is used for developing audio and voice response applications, such as banking systems and automated customer service portals. VoiceXML applications are developed and deployed in a manner analogous to how a web browser interprets and visually renders the Hypertext Markup Language (HTML) it receives from a web server. VoiceXML documents are interpreted by a voice browser, and in common deployment architectures users interact with voice browsers via the public switche

15.ai
15.ai is a free non-commercial web application and research project that uses artificial intelligence to generate text-to-speech voices of fictional characters from popular media. Created by a pseudonymous artificial intelligence researcher known as 15, who began developing the technology as a freshman during their undergraduate research at the Massachusetts Institute of Technology (MIT), the application allows users to make characters from video games, television shows, and movies speak custom text with emotional inflections. The platform is able to generate convincing voice output using mini
ElevenLabs
ElevenLabs Inc. is a software company that specializes in developing natural-sounding speech synthesis software using deep learning.
Speech Synthesis Markup Language
XML-based markup language
Source–filter model
Represents speech as a combination of a sound source and a linear filter
Google Wavenet
WaveNet is a deep neural network for generating raw audio. It was created by researchers at London-based AI firm DeepMind. The technique, outlined in a paper in September 2016, is able to generate relatively realistic-sounding human-like voices by directly modelling waveforms using a neural network method trained with recordings of real speech. Tests with US English and Mandarin reportedly showed that the system outperforms Google's best existing text-to-speech (TTS) systems, although as of 2016 its text-to-speech synthesis was still less convincing than actual human speech. WaveNet's ability
speech-generating device
electronic device that supplements or replaces speech
Wolfgang von Kempelen's Speaking Machine
18th-century invention
Adobe VoCo
Adobe prototype program for editing and generating audio in any voice
PSOLA
Figure: Oscillograms, spectrograms and intonograms of the Polish expressions (a) "jajem" [egg], (b) "ja jem" [I'm eating], (c) "nawóz" [fertiliser], (d) "na wóz" [on a cart].
PSOLA (Pitch Synchronous Overlap and Add) is a digital signal processing technique used for speech processing and more specifically speech synthesis. It can be used to modify the pitch and duration of a speech signal. It was invented around 1986.
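The pitch modification mentioned above can be illustrated with a heavily simplified time-domain sketch. It is not a reference implementation: it assumes the pitch period is already known and constant (real systems estimate pitch marks from the signal), and the names `hann`, `psola_pitch_shift`, `period`, and `factor` are hypothetical, chosen only for this illustration. The gain normalization at the end is likewise a design choice of this sketch.

```python
import math


def hann(n):
    # Periodic Hann window of length n; copies spaced n/2 apart sum to 1.
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def psola_pitch_shift(signal, period, factor):
    """Simplified TD-PSOLA for a signal with a constant, known pitch period.

    period: pitch period in samples (assumed known; not estimated here).
    factor: pitch scaling factor, e.g. 2.0 raises the pitch by an octave.
    The output has the same length as the input, so duration is preserved
    by reusing (or skipping) grains.
    """
    # Analysis pitch marks: one per period, leaving room for a full grain.
    marks = list(range(period, len(signal) - period, period))
    window = hann(2 * period)
    new_period = max(1, round(period / factor))
    out = [0.0] * len(signal)

    t = period  # current synthesis pitch mark
    while t < len(signal) - period:
        # Take the two-period grain centred on the nearest analysis mark.
        m = min(marks, key=lambda mark: abs(mark - t))
        for i in range(2 * period):
            out[t - period + i] += signal[m - period + i] * window[i]
        t += new_period  # grains are re-spaced at the new pitch period

    # Assumption of this sketch: compensate for the changed window overlap.
    gain = new_period / period
    return [x * gain for x in out]
```

For example, calling `psola_pitch_shift(signal, 100, 2.0)` on a pulse train with one pulse every 100 samples yields pulses every 50 samples over the same total length, while a factor of 1.0 reproduces the interior of the input.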
Euphonia
Musical instrument
Phase vocoder
vocoder algorithm
MBROLA
MBROLA is speech synthesis software developed as a worldwide collaborative project. The MBROLA project web page provides diphone databases for many spoken languages.
Mockingboard
Figure: Mockingboard v1 clone
Figure: Korean Mockingboard clone