🎙️ Build Your Own AI Voice Generator with edge-tts
Have you ever wanted your Python app to speak like a human? Not just robotic beeps or lifeless monotone voices—but actual realistic, human-like speech?
In this post, I’ll show you how to use a powerful library called edge-tts to convert any text into ultra-realistic AI speech, using just 5 lines of Python code. It’s fast, free, and ridiculously simple.
Let’s get started!
What You’ll Need
Before we jump in, here’s what you need:
Python 3.7+
Internet connection (since the TTS happens via Microsoft’s online engine)
A few seconds of your time
Step 1: Install the edge-tts Library
We’re using the edge-tts package, which is a wrapper around Microsoft Edge’s neural TTS (Text-to-Speech) service. It gives you access to the same ultra-realistic voices used in Azure, but with zero API keys or cost.
Open your terminal and run:
pip install edge-tts
That’s it. You’re ready to go!
✨ Step 2: Write the Magic Code
Here’s the full working code:
import asyncio import edge_tts
async def main(): tts = edge_tts.Communicate(“This is a test”, “en-US-JennyNeural”) await tts.save(“test.mp3”)
asyncio.run(main())
Let’s break this down:
edge_tts.Communicate(text, voice) creates the speech object.
await tts.save("filename.mp3") generates and saves the speech as an MP3 file.
asyncio.run(main()) kicks off the process.
Save the file as main.py and run:
python main.py
💡 Output: You’ll get a file called test.mp3 in the same folder. Open it and listen—it’s incredibly realistic!
🎤 Step 3: Customize the Voice
Microsoft offers dozens of voices, covering different languages, regions, genders, and tones.
To list all available voices, use:
python -m edge_tts –list-voices
This will output a huge list. Some popular ones include:
This simple TTS engine opens up tons of possibilities:
🎧 Narrate articles or blogs
📚 Generate audiobooks
🤖 Voice assistants
🎙️ Podcast automation
📢 Alert systems or voice UIs
Want to generate speech from long scripts or text files? Simply read from a file:
with open(“script.txt”, “r”, encoding=”utf-8″) as f: text = f.read()
Then pass text to the Communicate() function.
🛠️ Common Errors
If you see this error:
NoAudioReceived: No audio was received. Please verify that your parameters are correct.
Here are the usual fixes:
Check your internet connection.
Use a valid voice name (--list-voices is your friend).
Upgrade edge-tts to the latest version:
pip install –upgrade edge-tts
📦 Wrapping Up
In just a few lines of Python, you’ve built a fully working AI voice generator using Microsoft’s neural speech engine. The quality is good enough for production-level use—no joke.
Now that you’ve unlocked the power of speech, what will you build next?
💬 Got questions or cool ideas?
Drop a comment below or share this post with your fellow devs! And if you’re into Python AI projects like this, stay tuned—more tutorials are on the way.