Voice

The Voice section lets you configure how your Twin sounds during voice conversations. You can clone your own voice, design a custom voice, or choose from the voice library.

Voice Options

Instant Clone

Quick voice cloning from audio samples. Upload existing audio or record directly in the dashboard.

Requirements: 2-4 minutes of clean audio

Best for: Most use cases, fast setup

Pro Voice

Advanced voice training that produces higher quality and a wider emotional range. Requires identity verification.

Requirements: 1-2 hours of clean audio

Best for: Premium experiences, emotionally nuanced conversations

Availability: Add-on or included with Enterprise plans

Design Voice

AI-generated voices with custom parameters. Create a unique voice without recording samples.

Voice Library

Pre-built professional voices ready to use. Preview and select from available options.

Saved Voices

Your previously created voices. You can save up to 5 voices and switch between them.

Voice Settings

Fine-tune your voice with these controls:

Setting	Range	Description
Stability	0-100	How consistent the voice sounds
Similarity Boost	0-100	How close to the original voice
Style	0-100	Expressiveness level
Speed	Adjustable	Speaking pace
Speaker Boost	On/Off	Enhanced voice clarity
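
If you keep these settings in your own scripts or notes, the ranges in the table can be checked with a small helper. This is a minimal sketch, not the dashboard's actual API: the `VoiceSettings` class and its field names are hypothetical. Because the table only lists Speed as "Adjustable", the sketch assumes no more than that the value is a positive number.

```python
from dataclasses import dataclass


@dataclass
class VoiceSettings:
    """Hypothetical container for the settings in the table (not an official API)."""
    stability: int          # 0-100: how consistent the voice sounds
    similarity_boost: int   # 0-100: how close to the original voice
    style: int              # 0-100: expressiveness level
    speed: float            # speaking pace; range is not documented, assumed positive
    speaker_boost: bool     # on/off: enhanced voice clarity

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means every value is in range."""
        problems = []
        for name in ("stability", "similarity_boost", "style"):
            value = getattr(self, name)
            if not 0 <= value <= 100:
                problems.append(f"{name} must be between 0 and 100, got {value}")
        if self.speed <= 0:
            problems.append(f"speed must be positive, got {self.speed}")
        return problems


settings = VoiceSettings(stability=60, similarity_boost=80, style=20,
                         speed=1.0, speaker_boost=True)
print(settings.validate())  # prints []
```

Returning a list of problems rather than raising on the first one lets you report every out-of-range value at once.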

Preview

Listen to any voice before selecting. Test how it sounds with sample text to ensure it matches your expectations.

Voice Sample Requirements

For the best clone results, your audio should be:

High quality — Clear audio with minimal background noise
Single speaker — Only your voice, no other speakers
Natural speech — Conversational tone, not stiff scripted reading
Varied content — Different phrases, questions, emotions
Consistent audio — Same recording environment throughout

Supported Formats

MP3, WAV, M4A, FLAC

Recommended: 44.1 kHz or 48 kHz sample rate, 24-bit depth
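
Before uploading, you may want to confirm that a recording matches the recommendation. The sketch below uses Python's standard `wave` module, so it only covers uncompressed PCM WAV files; MP3, M4A, and FLAC would need a third-party library. The function names `inspect_wav` and `check_recommendations` are illustrative, not part of any Steno tooling.

```python
import wave

RECOMMENDED_SAMPLE_RATES = (44_100, 48_000)  # Hz, from the recommendation in this guide
RECOMMENDED_BIT_DEPTH = 24                   # bits, from the recommendation in this guide


def inspect_wav(path: str) -> dict:
    """Read basic properties of an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        return {
            "sample_rate": sample_rate,
            "bit_depth": wav.getsampwidth() * 8,  # sample width is reported in bytes
            "channels": wav.getnchannels(),
            "duration_seconds": wav.getnframes() / sample_rate,
        }


def check_recommendations(info: dict) -> list[str]:
    """Return warnings for properties that differ from the recommended values."""
    warnings = []
    if info["sample_rate"] not in RECOMMENDED_SAMPLE_RATES:
        warnings.append(f"sample rate is {info['sample_rate']} Hz; 44100 or 48000 Hz is recommended")
    if info["bit_depth"] != RECOMMENDED_BIT_DEPTH:
        warnings.append(f"bit depth is {info['bit_depth']}; 24-bit is recommended")
    return warnings
```

Calling `check_recommendations(inspect_wav("my_recording.wav"))` on a hypothetical file returns an empty list when both recommendations are met. The `duration_seconds` value can be compared against the audio length required for the voice option you chose.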

Multilingual Capabilities

Your voice clone works across all 23+ supported languages. Even if created from English audio, your Twin can speak fluently in other languages while maintaining your vocal characteristics.

Supported languages: Arabic, Chinese, Dutch, English, French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Korean, Polish, Portuguese, Romanian, Russian, Spanish, Swedish, Tamil, Turkish, Ukrainian, Vietnamese, and more.

Voice Clone Ownership

You own the recordings you provide
Your voice clone is never used for other customers
Voice data is stored securely and separately
Upon contract termination, voice clones are deleted per our data policy

Recording a Voice Sample

If you don’t have existing high-quality audio for voice cloning, follow these instructions to record a new voice sample.

Recording Setup

Before you begin:

Use a good microphone and keep a consistent distance from it throughout
Record in a quiet environment with no background sounds or music
Avoid long pauses or breaks
Speak at your natural pace and volume
Use your authentic speaking style and personality
Read through the entire script once before recording
Stick to the script as much as possible; if a passage gives you trouble, skip it or say something similar in your own words

Estimated speaking time: about 5 minutes at a natural pace (for an Instant Clone)
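
If you adapt the script or write your own, you can estimate its speaking time as word count divided by speaking rate. This is a rough sketch: the default of 130 words per minute is an assumed conversational pace, not a figure from this guide, and the function name is hypothetical.

```python
def estimate_speaking_minutes(script: str, words_per_minute: float = 130.0) -> float:
    """Estimate reading time in minutes as word count divided by speaking rate."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    word_count = len(script.split())  # splits on any whitespace
    return word_count / words_per_minute


sample = "Hello, and welcome to this voice recording session."
print(f"{estimate_speaking_minutes(sample) * 60:.1f} seconds")
```

Stage directions such as [Pause] are counted as words here, so the result is only an approximation. Time yourself on a short passage to find your own rate.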

Equipment Recommendations

Minimum:

Smartphone with good microphone
Quiet room
Pop filter (or improvise with a sock over the mic)

Recommended:

USB condenser microphone (Blue Yeti, Audio-Technica AT2020)
Closed room with soft furnishings
Microphone stand
Pop filter

Professional:

Studio-quality condenser microphone
Treated recording space
Audio interface
Professional editing software

Recording Environment

Quiet Location: No background noise, traffic, or HVAC sounds
Minimal Echo: Avoid empty rooms; use soft furnishings to absorb sound
Consistent Acoustics: Record all audio in the same location
Eliminate Interruptions: Turn off notifications, close windows, etc.

Recording Technique

Consistent Distance: Stay 6-12 inches (about 15-30 cm) from the microphone
Natural Speaking: Use your normal conversational voice
Steady Pace: Not too fast or too slow
Clear Articulation: Enunciate words clearly without over-pronouncing
Varied Emotion: Include questions, statements, emphasis, warmth

After Recording

Once you have your recording:

Upload to Dashboard: Go to dashboard.steno.ai and upload your audio file
Create Voice Clone: The Steno Twin Engine™ will process your audio and create your voice clone
Test: Review the voice clone in your dashboard testing area

Voice Cloning Script

Use this script to record your voice sample. It’s designed to capture your tone, rhythm, and natural style across different speaking contexts.

Section 1: Introduction and Natural Conversation

Hello, and welcome to this voice recording session. My name is [speaker states name], and I’m excited to be part of this digital twin project. Today, I’ll read a script designed to capture not just my words, but my tone, rhythm, and natural style.

[Pause]

When I first heard about AI voice technology, I was curious—and a little skeptical. How could a computer capture the details of speech? The pauses, the emphasis, the subtle changes in tone? [Slight chuckle] But the truth is, the technology is now advanced enough to reflect not just what we say, but how we say it—whether our voice brightens with excitement, softens with care, or speeds up with passion.

Section 2: Professional and Technical Topics

A core concept in artificial intelligence is pattern recognition. Computers are excellent at finding patterns in massive amounts of data—faces in photos, text on a page, or the pitch and timing of speech.

[Thoughtful pause]

This is why industries like healthcare, finance, and education are using AI voices to deliver natural, helpful experiences—whether in customer support or online learning.

But with these possibilities come responsibilities: permission, honesty, and transparency. The goal is to enhance communication, not replace it.

Section 3: Emotional Range and Storytelling

[Warm, conversational tone]

Let me share a story. Years ago, I was part of a team struggling with a project. We had good skills and careful plans, but progress stalled.

[Building energy]

Then one teammate spoke with real passion—not about deadlines, but about the impact we could make. Their voice lit up, and suddenly the room shifted.

[Reflective]

That moment taught me that information alone is just data. But when emotion enters, communication becomes connection.

[Gentle, caring tone]

I think about this when mentoring. Often, the most valuable thing isn’t advice—it’s listening, showing care, and being fully present.

Section 4: Questions and Reflection

Here are a few questions I often ask: What does success really mean? Is it money, happiness, service to others? [Pause] Maybe it’s all of these.

How do we balance new technology with old wisdom? These aren’t easy questions, but the best answers come when we’re willing to hold different perspectives together.

Section 5: Practical Examples

[Helpful tone]

“Thanks for calling. The number I have is 532-467-4819—correct?”

[Professional]

“Our support line is 1-800-555-0123, extension 456, weekdays 9 to 6.”

[Casual]

“Call me back when you can. My direct line is 415-789-2634.”

Numbers and business terms matter too:

“The budget is $2,400,000.”

“Quarterly revenue rose 16%, reaching $847,000 in Q3.”

“Key launch date: March 15th, 2025.”

Technical phrases: machine learning, cloud computing, digital banking, patient care, investment management.

Section 6: Closing and Vision

[Warm, forward-looking]

We’ve covered technical ideas, emotional stories, numbers, and natural conversation. Each adds richness to this voice model.

[Hopeful]

Imagine education delivered globally while keeping a teacher’s real personality. Picture customer service that feels genuinely human, even when powered by AI.

[Balanced]

With this power comes responsibility. We must use it to enhance—not erode—connection.

[Personal, reflective]

For me, this project is about extending my ability to share ideas and support others beyond time and place.

[Warm conclusion]

Thank you for joining me today. I hope this recording helps create a voice clone that represents my style authentically. The future of AI and human collaboration is bright.

[Final pause]

This ends our session. Thank you.

Questions? Contact support@steno.ai.