Text-to-speech (TTS) technology has matured into an essential tool for anyone who works with digital media. Typical applications include scripts for digital assistants and websites whose content is read aloud by a virtual assistant. In this article, we will explore how to convert text to speech in Python.
Python Speech Engines
Several speech engines can be used from Python, ranging from online services that produce natural-sounding audio to synthesizers that run entirely offline. The following are some of the most commonly used:
1. Google Text-to-Speech (gTTS)
gTTS (Google Text-to-Speech) is a Python library and command-line tool that interfaces with the text-to-speech API of Google Translate. It supports several languages and writes the natural-sounding speech it receives from the service as MP3 data.
2. pyttsx3
pyttsx3 is a cross-platform text-to-speech library that runs on Windows, macOS, and Linux. It works offline by driving the speech synthesizer available on each platform, and it integrates easily into a Python program, making it a popular choice for many developers.
3. Festival
Festival is a multilingual speech synthesis system developed at the Centre for Speech Technology Research (CSTR) at the University of Edinburgh. It is written in C++ on top of the Edinburgh Speech Tools library, and it can be used from any programming language through its command-line interface.
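As a rough illustration of the command-line route, the sketch below pipes text to Festival from Python. It assumes the festival executable is installed and on the PATH; the helper names build_festival_command and speak_with_festival are hypothetical and exist only for this example.

```python
import shutil
import subprocess


def build_festival_command(executable="festival"):
    """Return the argument list that makes Festival speak text read from stdin."""
    return [executable, "--tts"]


def speak_with_festival(text, executable="festival"):
    """Pipe text to Festival; return False when the executable cannot be found."""
    if shutil.which(executable) is None:
        return False  # Festival is not installed or not on the PATH
    subprocess.run(
        build_festival_command(executable),
        input=text,   # sent to Festival on standard input
        text=True,
        check=True,   # raise CalledProcessError if Festival exits with an error
    )
    return True
```

Calling speak_with_festival("Hello from Python") should read the sentence aloud on a machine where Festival is available, and return False otherwise.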
Converting Text-To-Speech in Python
Now that we have a good understanding of the most commonly used speech engines, let’s explore how they can be integrated with Python to convert text into speech. Here is a sample Python code snippet that uses the gTTS and pyttsx3 Python libraries:
from gtts import gTTS
import os
import pyttsx3

# Converting text to speech using gTTS (requires an internet connection)
mytext = 'Python is becoming an increasingly popular programming language!'
language = 'en'
# slow=False requests normal reading speed
myobj = gTTS(text=mytext, lang=language, slow=False)
# Save the synthesized speech as an MP3 file
myobj.save("welcome.mp3")
# Play the file; this assumes the mpg321 command-line player is installed
os.system("mpg321 welcome.mp3")

# Converting text to speech using pyttsx3 (works offline)
engine = pyttsx3.init()
# Queue the text, then block until it has been spoken
engine.say(mytext)
engine.runAndWait()
The first part of the code above uses gTTS to convert text to an audio file in MP3 format and save it so that it can be played later. The second part uses pyttsx3 to speak the same sentence directly, without creating a file.
pyttsx3 also lets you configure the speech output, for example the speaking rate, the volume, and the voice. The engine takes text as input and generates audio in real time, which is played through your speakers or another audio device connected to the computer.
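The configuration mentioned above might look like the following sketch. It uses the setProperty() method of pyttsx3 with the "rate" and "volume" properties; the helper names and the default values (150 words per minute, volume 0.8) are assumptions picked for illustration, not recommended settings.

```python
def engine_settings(rate=150, volume=0.8):
    """Build the property values to apply, keeping volume within 0.0 to 1.0."""
    return {
        "rate": int(rate),                             # words per minute
        "volume": max(0.0, min(1.0, float(volume))),   # clamp to the valid range
    }


def configure_engine(engine, rate=150, volume=0.8):
    """Apply the speaking rate and volume to a pyttsx3 engine and return it."""
    for name, value in engine_settings(rate, volume).items():
        engine.setProperty(name, value)
    return engine


def speak(text, rate=150, volume=0.8):
    """Speak text through the default audio device with the given settings."""
    import pyttsx3  # third-party package: pip install pyttsx3

    engine = configure_engine(pyttsx3.init(), rate, volume)
    engine.say(text)
    engine.runAndWait()  # block until the queued text has been spoken
```

For example, speak("Hello", rate=120, volume=0.5) should read the word more slowly and more quietly than the defaults. A different voice can be chosen in a similar way by passing one of the identifiers from engine.getProperty("voices") to the "voice" property.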
Q: Do we need an internet connection for using the gTTS engine?
A: Yes, gTTS is an online text-to-speech engine that uses the Google Text-to-Speech service to generate audio from your text. It requires an active internet connection to function.
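Because gTTS depends on the network, one option is to fall back to pyttsx3 when the online request fails. The sketch below shows that idea under a few assumptions: the function names are hypothetical, the online route saves an MP3 file while the offline route speaks the text directly, and any exception from the online route triggers the fallback unless online_errors is narrowed.

```python
def synthesize_with_fallback(text, online, offline, online_errors=(Exception,)):
    """Try the online synthesizer first and use the offline one if it fails.

    Returns the name of the route that handled the text: "online" or "offline".
    """
    try:
        online(text)
        return "online"
    except online_errors:
        offline(text)
        return "offline"


def save_with_gtts(text, filename="welcome.mp3", lang="en"):
    """Online route: needs network access to reach the Google service."""
    from gtts import gTTS  # third-party package: pip install gTTS

    gTTS(text=text, lang=lang).save(filename)


def speak_with_pyttsx3(text):
    """Offline route: speaks through the synthesizer installed on the machine."""
    import pyttsx3  # third-party package: pip install pyttsx3

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
```

A call such as synthesize_with_fallback(mytext, save_with_gtts, speak_with_pyttsx3) would then work with or without a connection. gTTS reports failed requests through its gTTSError exception, so passing online_errors=(gTTSError,) is one way to avoid hiding unrelated errors.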
Q: Can we use a sound file to test the pyttsx3 engine?
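A: No. The engine.say() function accepts the text to be spoken, not the path to a sound file. To test the engine, pass it a short string and listen to the output. To write the speech to an audio file instead of playing it, use engine.save_to_file() followed by engine.runAndWait().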
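For writing pyttsx3 output to a file, a minimal sketch could look like the one below. It relies on the save_to_file() and runAndWait() methods of pyttsx3; the helper names clean_text and save_speech and the optional engine parameter are assumptions made for this example, and the audio format actually written can depend on the platform's speech driver.

```python
def clean_text(text):
    """Return text without surrounding whitespace, rejecting empty input."""
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("text must contain at least one non-space character")
    return cleaned


def save_speech(text, filename, engine=None):
    """Write the spoken form of text to filename instead of playing it."""
    if engine is None:
        import pyttsx3  # third-party package: pip install pyttsx3

        engine = pyttsx3.init()
    engine.save_to_file(clean_text(text), filename)
    engine.runAndWait()  # the file is written while the queue is processed
    return filename
```

Calling save_speech("Python is becoming an increasingly popular programming language!", "welcome.wav") should produce a file that any audio player can open.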
Conclusion
There is a growing demand for easy-to-use speech synthesis solutions that meet the needs of a wide range of industries, from gaming to web development. With Python libraries such as gTTS and pyttsx3, we can convert text to speech in a few lines of code and bring our text to life, thereby providing an enhanced user experience.