🎙️ Build Your Own AI Voice Generator with edge-tts
Have you ever wanted your Python app to speak like a human? Not just robotic beeps or lifeless monotone voices—but actual realistic, human-like speech?
In this post, I’ll show you how to use a powerful library called edge-tts to convert any text into ultra-realistic AI speech, using just 5 lines of Python code. It’s fast, free, and ridiculously simple.
Let’s get started!
What You’ll Need
Before we jump in, here’s what you need:
Python 3.7+
Internet connection (since the TTS happens via Microsoft’s online engine)
A few seconds of your time
Step 1: Install the edge-tts Library
We’re using the edge-tts package, which is a wrapper around Microsoft Edge’s neural TTS (Text-to-Speech) service. It gives you access to the same ultra-realistic voices used in Azure, but with zero API keys or cost.
Open your terminal and run:
pip install edge-tts
That’s it. You’re ready to go!
✨ Step 2: Write the Magic Code
Here’s the full working code:
import asyncio
import edge_tts

async def main():
    tts = edge_tts.Communicate("This is a test", "en-US-JennyNeural")
    await tts.save("test.mp3")

asyncio.run(main())
Let’s break this down:
edge_tts.Communicate(text, voice) creates the speech object.
await tts.save("filename.mp3") generates and saves the speech as an MP3 file.
asyncio.run(main()) kicks off the process.
Save the file as main.py and run:
python main.py
💡 Output: You’ll get a file called test.mp3 in the same folder. Open it and listen—it’s incredibly realistic!
🎤 Step 3: Customize the Voice
Microsoft offers dozens of voices, covering different languages, regions, genders, and tones.
To list all available voices, use:
python -m edge_tts --list-voices
This will output a long list of voice names, including the en-US-JennyNeural voice used above. Pass any of them as the second argument to Communicate().
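If you'd rather pick a voice from Python than scroll through terminal output, here is a minimal sketch. It assumes edge_tts.list_voices() returns a list of dicts with "ShortName" and "Locale" keys, which is what recent versions of the library return. The filter_voices helper is my own name for illustration, not part of edge-tts.

```python
import asyncio


def filter_voices(voices, locale_prefix):
    """Return the sorted ShortName of every voice whose Locale starts with locale_prefix."""
    return sorted(
        v["ShortName"]
        for v in voices
        if v.get("Locale", "").startswith(locale_prefix)
    )


async def main():
    # Imported here so the helper above works even without the package installed.
    import edge_tts

    # Assumption: each entry is a dict with "ShortName" and "Locale" keys.
    voices = await edge_tts.list_voices()  # queries Microsoft's online service
    for name in filter_voices(voices, "en-"):
        print(name)


# Uncomment to run against the live service:
# asyncio.run(main())
```

Change "en-" to another locale prefix (for example "de-" or "fr-") to narrow the list to a different language.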
This simple TTS engine opens up tons of possibilities:
🎧 Narrate articles or blogs
📚 Generate audiobooks
🤖 Voice assistants
🎙️ Podcast automation
📢 Alert systems or voice UIs
Want to generate speech from long scripts or text files? Simply read from a file:
with open("script.txt", "r", encoding="utf-8") as f:
    text = f.read()
Then pass text to edge_tts.Communicate() in place of the hard-coded string.
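Putting the file read and the speech call together might look like the sketch below. The load_script and narrate helpers, and the script.mp3 output name, are hypothetical names chosen for this example. The empty-file check is my own addition, since an empty string gives the service nothing to speak.

```python
import asyncio
from pathlib import Path


def load_script(path):
    """Read a UTF-8 text file and return its contents without surrounding whitespace."""
    text = Path(path).read_text(encoding="utf-8").strip()
    if not text:
        raise ValueError(f"{path} contains no text to speak")
    return text


async def narrate(script_path, voice, out_path):
    """Convert the text in script_path to speech and save it as an MP3 at out_path."""
    import edge_tts  # imported lazily; needs `pip install edge-tts`

    tts = edge_tts.Communicate(load_script(script_path), voice)
    await tts.save(out_path)


# Uncomment to run against the live service:
# asyncio.run(narrate("script.txt", "en-US-JennyNeural", "script.mp3"))
```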
🛠️ Common Errors
If you see this error:
NoAudioReceived: No audio was received. Please verify that your parameters are correct.
Here are the usual fixes:
Check your internet connection.
Use a valid voice name (--list-voices is your friend).
Upgrade edge-tts to the latest version:
pip install --upgrade edge-tts
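If the error only shows up now and then (a flaky connection, say), you can also catch it and try again. This is a rough sketch, not an official recipe: retry_async is a hypothetical helper I wrote for this post, and I'm assuming the exception is importable from edge_tts.exceptions, which is where recent versions define it. Retrying won't help with an invalid voice name, so fix that first.

```python
import asyncio


async def retry_async(make_attempt, retry_on, attempts=3, delay=1.0):
    """Await make_attempt() up to `attempts` times, retrying only on `retry_on`."""
    for attempt in range(1, attempts + 1):
        try:
            return await make_attempt()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            await asyncio.sleep(delay)  # short pause before the next try


async def main():
    import edge_tts
    # Assumption: the exception class lives in edge_tts.exceptions.
    from edge_tts.exceptions import NoAudioReceived

    async def attempt():
        tts = edge_tts.Communicate("This is a test", "en-US-JennyNeural")
        await tts.save("test.mp3")

    try:
        await retry_async(attempt, NoAudioReceived)
    except NoAudioReceived:
        print("Still no audio: check the voice name and your connection.")


# Uncomment to run against the live service:
# asyncio.run(main())
```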
📦 Wrapping Up
In just a few lines of Python, you’ve built a fully working AI voice generator using Microsoft’s neural speech engine. The quality is good enough for production-level use—no joke.
Now that you’ve unlocked the power of speech, what will you build next?
💬 Got questions or cool ideas?
Drop a comment below or share this post with your fellow devs! And if you’re into Python AI projects like this, stay tuned—more tutorials are on the way.