Voice Sample Recording
If you don’t have existing high-quality audio for voice cloning, follow these instructions to record a new voice sample.
Recording Setup
Before you begin:
- Use a good microphone and keep a consistent distance from it throughout
- Record in a quiet environment with no background sounds or music
- Avoid long pauses or breaks
- Speak at your natural pace and volume
- Use your authentic speaking style and personality
- Read through the entire script once before recording
- Stick to the script as closely as possible; if you stumble on a word or phrase, skip it or substitute your own wording and keep going
Estimated speaking time: 5 minutes at a natural pace (for a Standard Clone)
Equipment Recommendations
Minimum
- Smartphone with good microphone
- Quiet room
- Pop filter (or improvise with a sock over the mic)
Recommended
- USB condenser microphone (Blue Yeti, Audio-Technica AT2020)
- Closed room with soft furnishings
- Microphone stand
- Pop filter
Professional
- Studio-quality condenser microphone
- Treated recording space
- Audio interface
- Professional editing software
Recording Environment
- Quiet Location: No background noise, traffic, or HVAC sounds
- Minimal Echo: Avoid empty rooms; use soft furnishings to absorb sound
- Consistent Acoustics: Record all audio in the same location
- Eliminate Interruptions: Turn off notifications, close windows, etc.
Recording Technique
- Consistent Distance: Stay 6 to 12 inches (about 15 to 30 cm) from the microphone
- Natural Speaking: Use your normal conversational voice
- Steady Pace: Keep an even tempo, neither rushed nor dragging
- Clear Articulation: Enunciate words clearly without over-pronouncing
- Varied Emotion: Include questions, statements, emphasis, and warmth
Voice Cloning Script
Use this script to record your voice sample. It’s designed to capture your tone, rhythm, and natural style across different speaking contexts.
Section 1: Introduction and Natural Conversation
Hello, and welcome to this voice recording session. My name is [speaker states name], and I’m excited to be part of this digital twin project. Today, I’ll read a script designed to capture not just my words, but my tone, rhythm, and natural style.
[Pause]
When I first heard about AI voice technology, I was curious—and a little skeptical. How could a computer capture the details of speech? The pauses, the emphasis, the subtle changes in tone? [Slight chuckle] But the truth is, the technology is now advanced enough to reflect not just what we say, but how we say it—whether our voice brightens with excitement, softens with care, or speeds up with passion.
Section 2: Professional and Technical Topics
A core concept in artificial intelligence is pattern recognition. Computers are excellent at finding patterns in massive amounts of data—faces in photos, text on a page, or the pitch and timing of speech.
[Thoughtful pause]
This is why industries like healthcare, finance, and education are using AI voices to deliver natural, helpful experiences—whether in customer support or online learning.
But with these possibilities come responsibilities: permission, honesty, and transparency. The goal is to enhance communication, not replace it.
Section 3: Emotional Range and Storytelling
[Warm, conversational tone]
Let me share a story. Years ago, I was part of a team struggling with a project. We had good skills and careful plans, but progress stalled.
[Building energy]
Then one teammate spoke with real passion—not about deadlines, but about the impact we could make. Their voice lit up, and suddenly the room shifted.
[Reflective]
That moment taught me that information alone is just data. But when emotion enters, communication becomes connection.
[Gentle, caring tone]
I think about this when mentoring. Often, the most valuable thing isn’t advice—it’s listening, showing care, and being fully present.
Section 4: Questions and Reflection
Here are a few questions I often ask: What does success really mean? Is it money, happiness, service to others? [Pause] Maybe it’s all of these.
How do we balance new technology with old wisdom? These aren’t easy questions, but the best answers come when we’re willing to hold different perspectives together.
Section 5: Practical Examples
[Helpful tone]
“Thanks for calling. The number I have is 532-467-4819—correct?”
[Professional]
“Our support line is 1-800-555-0123, extension 456, weekdays 9 to 6.”
[Casual]
“Call me back when you can. My direct line is 415-789-2634.”
Numbers and business terms matter too:
“The budget is $2,400,000.”
“Quarterly revenue rose 16%, reaching $847,000 in Q3.”
“Key launch date: March 15th, 2025.”
Technical phrases: machine learning, cloud computing, digital banking, patient care, investment management.
Section 6: Closing and Vision
[Warm, forward-looking]
We’ve covered technical ideas, emotional stories, numbers, and natural conversation. Each adds richness to this voice model.
[Hopeful]
Imagine education delivered globally while keeping a teacher’s real personality. Picture customer service that feels genuinely human, even when powered by AI.
[Balanced]
With this power comes responsibility. We must use it to enhance—not erode—connection.
[Personal, reflective]
For me, this project is about extending my ability to share ideas and support others beyond time and place.
[Warm conclusion]
Thank you for joining me today. I hope this recording helps create a voice clone that represents my style authentically. The future of AI and human collaboration is bright.
[Final pause]
This ends our session. Thank you.
After Recording
Once you have your recording:
- Upload to Dashboard: Go to dashboard.steno.ai and upload your audio file
- Create Voice Clone: The Steno Twin Engine will process your audio and create your voice clone
- Test: Review the voice clone in your dashboard testing area
Technical Specifications
Supported Audio Formats: MP3, WAV, M4A, FLAC
Sample Rate: Minimum 16 kHz (44.1 kHz or 48 kHz recommended)
Bit Depth: Minimum 16-bit (24-bit recommended)
File Size: No strict limit, but larger files take longer to process
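If you want to check a WAV recording against the minimums above before uploading, a small script along these lines may help. This is a minimal sketch, not part of any Steno tooling: the function and constant names are illustrative, and it relies only on Python's standard-library wave module, which reads uncompressed PCM WAV files. MP3, M4A, and FLAC files would need a different tool. It also includes a rough size estimate for uncompressed audio, since larger files take longer to process.

```python
import wave
from dataclasses import dataclass

# Minimums taken from the Technical Specifications above.
MIN_SAMPLE_RATE_HZ = 16_000
MIN_BIT_DEPTH = 16


@dataclass
class WavInfo:
    """Basic properties of a PCM WAV file (illustrative helper type)."""
    sample_rate_hz: int
    bit_depth: int
    channels: int
    duration_s: float


def inspect_wav(path: str) -> WavInfo:
    """Read sample rate, bit depth, channel count, and duration from a WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return WavInfo(
            sample_rate_hz=rate,
            bit_depth=wav.getsampwidth() * 8,  # sample width is reported in bytes
            channels=wav.getnchannels(),
            duration_s=wav.getnframes() / rate,
        )


def check_wav(info: WavInfo) -> list[str]:
    """Return a list of problems; an empty list means both minimums are met."""
    problems = []
    if info.sample_rate_hz < MIN_SAMPLE_RATE_HZ:
        problems.append(
            f"sample rate {info.sample_rate_hz} Hz is below {MIN_SAMPLE_RATE_HZ} Hz"
        )
    if info.bit_depth < MIN_BIT_DEPTH:
        problems.append(
            f"bit depth {info.bit_depth}-bit is below {MIN_BIT_DEPTH}-bit"
        )
    return problems


def estimated_wav_bytes(sample_rate_hz: int, bit_depth: int,
                        channels: int, seconds: float) -> int:
    """Approximate size of the uncompressed audio data, excluding the file header."""
    return int(sample_rate_hz * (bit_depth // 8) * channels * seconds)
```

For example, check_wav(inspect_wav("my_sample.wav")) returns an empty list when the file meets both minimums (the file name here is a placeholder). As a size reference, estimated_wav_bytes(48_000, 24, 1, 300) gives 43,200,000 bytes, or roughly 43 MB, for a 5-minute mono recording at the recommended settings.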
Need Help?
Questions about recording your voice sample? Contact support@steno.ai.