Voice Cloning
Steno.ai uses advanced voice cloning technology to create a digital replica of your voice, so your AI Twin speaks in your own voice during conversations.
How Voice Cloning Works
Voice cloning analyzes audio samples of your voice to create a digital model that can generate natural-sounding speech. The AI Twin uses this voice clone during real-time voice conversations with users.
What Makes a Good Voice Sample
For the best results, your audio sample should be:
- High Quality: Clear audio with minimal background noise
- Single Speaker: Only your voice, no other speakers
- Natural Speech: Conversational tone, not scripted reading
- Varied Content: Different phrases, questions, emotions
- Consistent Audio: Same recording environment and equipment
Voice Clone Types
Standard Voice Clone
Audio Required: 2 minutes of clean audio
Quality: High-fidelity voice replication suitable for most use cases
Included In: All plans (Starter, Growth, Scale, Enterprise)
Best For:
- Most conversational AI applications
- Text-to-speech with your voice
- General customer engagement
- Initial launches and testing
Characteristics:
- Accurate voice replication
- Natural conversational tone
- Standard expressiveness
- Fast generation time
Professional Voice Clone
Audio Required: 1 hour of clean audio
Quality: Greater expressiveness and emotional range than the Standard clone
Included In: Scale and Enterprise plans
Best For:
- Premium customer experiences
- Emotionally nuanced conversations
- Coaching and personal development applications
- Longer conversation sessions
Characteristics:
- Enhanced emotional range
- More natural prosody and intonation
- Better handling of complex sentences
- Subtle voice variations for emphasis
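To make the comparison between the two clone types concrete, here is a minimal Python sketch that encodes the audio requirements and plan availability listed above. The `CloneType` class and `available_clone_types` function are hypothetical names used only for illustration, not part of any Steno.ai API, and the sketch assumes the stated audio lengths are minimums.

```python
from dataclasses import dataclass

# Plan names and audio requirements come from this page. The data structure
# and function below are hypothetical, for illustration only.
PLANS = ("Starter", "Growth", "Scale", "Enterprise")


@dataclass(frozen=True)
class CloneType:
    name: str
    min_audio_seconds: int  # assumption: the stated audio length is a minimum
    plans: tuple


STANDARD = CloneType("Standard", 2 * 60, PLANS)
PROFESSIONAL = CloneType("Professional", 60 * 60, ("Scale", "Enterprise"))


def available_clone_types(plan, audio_seconds):
    """Return the clone types the plan includes and the audio length supports."""
    if plan not in PLANS:
        raise ValueError(f"unknown plan: {plan!r}")
    return [
        clone.name
        for clone in (STANDARD, PROFESSIONAL)
        if plan in clone.plans and audio_seconds >= clone.min_audio_seconds
    ]


print(available_clone_types("Growth", 3600))  # ['Standard']
print(available_clone_types("Scale", 3600))   # ['Standard', 'Professional']
```

With an hour of audio, a Growth plan still maps only to the Standard clone, because the Professional clone is limited to the Scale and Enterprise plans.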
Voice Sample Preparation
If You Have Existing Audio
Great sources for voice samples include:
- Podcast Episodes: High-quality conversational audio
- Video Recordings: Extract audio from YouTube videos or courses
- Webinar Recordings: Clear speaking with varied content
- Professional Recordings: Studio-quality audio from books or courses
If You Don’t Have Audio
No problem! Our team will help you create a suitable voice sample:
For Standard Clone (2 minutes):
- We’ll provide a script to read
- Record on your phone or computer
- Use a quiet room with minimal echo
- Follow our recording guidelines
For Professional Clone (1 hour):
- We’ll provide varied scripts and prompts
- May require multiple recording sessions
- More emphasis on recording quality
- We may recommend a professional recording setup
Recording Best Practices
Equipment
Minimum:
- Smartphone with good microphone
- Quiet room
- Pop filter (or improvise with a sock over the mic)
Recommended:
- USB condenser microphone (Blue Yeti, Audio-Technica AT2020)
- Closed room with soft furnishings
- Microphone stand
- Pop filter
Professional:
- Studio-quality condenser microphone
- Treated recording space
- Audio interface
- Professional editing software
Recording Environment
- Quiet Location: No background noise, traffic, or HVAC sounds
- Minimal Echo: Avoid empty rooms; use soft furnishings to absorb sound
- Consistent Acoustics: Record all audio in the same location
- Eliminate Interruptions: Turn off notifications, close windows, etc.
Recording Technique
- Consistent Distance: Stay 6-12 inches (about 15-30 cm) from the microphone
- Natural Speaking: Use your normal conversational voice
- Steady Pace: Not too fast or too slow
- Clear Articulation: Enunciate words clearly without over-pronouncing
- Varied Emotion: Include questions, statements, emphasis, warmth
Voice Clone Delivery
Timeline
- Standard Clone: Ready within your 10-day demo delivery timeline
- Professional Clone: May add 2-3 days to demo delivery
Testing Your Voice Clone
When you receive your demo AI Twin, test the voice clone for:
- Accuracy: Does it sound like you?
- Naturalness: Does it sound conversational and not robotic?
- Clarity: Are words clear and easy to understand?
- Emotion: Does it convey appropriate warmth and tone?
If the voice clone doesn’t meet your expectations, we’ll iterate based on your feedback.
Voice Clone Updates
When to Update Your Voice Clone
Consider updating your voice clone if:
- Your voice has changed significantly
- You want to adjust the tone or energy level
- The original recording quality was poor
- You’re upgrading from Standard to Professional clone
How to Request an Update
Contact support@steno.ai with:
- Description of what you’d like to change
- New audio samples (if applicable)
- Timeline for the update
Updates may incur additional fees depending on the scope of changes required.
Multilingual Capabilities
Voice Clone Language Flexibility
Even if your voice clone is created from English audio, your AI Twin can speak in over 23 supported languages.
How It Works:
- Voice characteristics transfer across languages
- AI maintains your vocal tone and style
- Accent may vary by language
- Quality is consistent across languages
Example: Your English voice clone can speak fluent Mandarin while maintaining your vocal characteristics.
Voice Clone Ownership
Your Rights
You own the recordings you provide for voice cloning. The voice clone itself is created as part of our service to you.
Our Commitments
- We will never use your voice clone for other customers
- Your voice data is stored securely and separately from other customers
- Upon contract termination, we will delete your voice clone per our data deletion policy
Technical Specifications
Supported Audio Formats: MP3, WAV, M4A, FLAC
Sample Rate: Minimum 16 kHz (44.1 kHz or 48 kHz recommended)
Bit Depth: Minimum 16-bit (24-bit recommended)
File Size: No strict limit, but larger files take longer to process
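If you want to check a recording against these minimums before sending it, the following sketch uses Python's standard `wave` module. It reads only uncompressed PCM WAV files, so MP3, M4A, and FLAC files would need a different tool. The `check_wav` and `write_silent_wav` helpers are hypothetical names for illustration, and this is not an official Steno.ai validator.

```python
import os
import tempfile
import wave

MIN_SAMPLE_RATE_HZ = 16_000  # minimum sample rate from this page
MIN_BIT_DEPTH = 16           # minimum bit depth from this page


def check_wav(path):
    """Return a list of problems; an empty list means the minimums are met."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        bit_depth = wav.getsampwidth() * 8  # sample width is reported in bytes
    problems = []
    if sample_rate < MIN_SAMPLE_RATE_HZ:
        problems.append(f"sample rate {sample_rate} Hz is below {MIN_SAMPLE_RATE_HZ} Hz")
    if bit_depth < MIN_BIT_DEPTH:
        problems.append(f"bit depth {bit_depth}-bit is below {MIN_BIT_DEPTH}-bit")
    return problems


def write_silent_wav(path, sample_rate, sample_width_bytes, seconds=1):
    """Write a mono WAV of zero-valued samples, used only to demonstrate check_wav."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_width_bytes)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00" * sample_width_bytes * sample_rate * seconds)


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.wav")
    bad = os.path.join(tmp, "bad.wav")
    write_silent_wav(good, 48_000, 2)  # 48 kHz, 16-bit
    write_silent_wav(bad, 8_000, 1)    # 8 kHz, 8-bit
    print(check_wav(good))  # []
    print(check_wav(bad))   # lists both problems
```

The check covers only sample rate and bit depth. It says nothing about background noise, echo, or speaker consistency, which still need a listen.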
Questions about voice cloning? Contact support@steno.ai.