Voice

The Voice section lets you configure how your Twin sounds during voice conversations. You can clone your own voice, design a custom voice, or choose from the voice library.

Voice Options

Instant Clone

Quick voice cloning from audio samples. Upload existing audio or record directly in the dashboard.

Requirements: 2-4 minutes of clean audio

Best for: Most use cases, fast setup

Pro Voice

Advanced voice training with higher quality and emotional range. Requires identity verification.

Requirements: 1-2 hours of clean audio

Best for: Premium experiences, emotionally nuanced conversations

Availability: Add-on or included with Enterprise plans

Design Voice

AI-generated voices with custom parameters. Create a unique voice without recording samples.

Voice Library

Pre-built professional voices ready to use. Preview and select from available options.

Saved Voices

Your previously created voices. You can save up to 5 voices and switch between them.

Voice Settings

Fine-tune your voice with these controls:

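| Setting          | Range      | Description                       |
|------------------|------------|-----------------------------------|
| Stability        | 0-100      | How consistent the voice sounds   |
| Similarity Boost | 0-100      | How close to the original voice   |
| Style            | 0-100      | Expressiveness level              |
| Speed            | Adjustable | Speaking pace                     |
| Speaker Boost    | On/Off     | Enhanced voice clarity            |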
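
If you keep a record of your settings outside the dashboard, the ranges above can be captured in a small validation helper. This is a minimal sketch only: the `VoiceSettings` class and its field names are hypothetical and not part of any documented Steno API, the default values are placeholders, and Speed is assumed to be a positive multiplier because the table lists it only as "Adjustable".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical container mirroring the voice controls listed above."""

    stability: int = 50          # 0-100: how consistent the voice sounds
    similarity_boost: int = 75   # 0-100: how close to the original voice
    style: int = 0               # 0-100: expressiveness level
    speed: float = 1.0           # assumption: a positive pace multiplier
    speaker_boost: bool = True   # on/off: enhanced voice clarity

    def __post_init__(self) -> None:
        # Reject any slider value outside the documented 0-100 range.
        for name in ("stability", "similarity_boost", "style"):
            value = getattr(self, name)
            if not 0 <= value <= 100:
                raise ValueError(f"{name} must be between 0 and 100, got {value}")
        if self.speed <= 0:
            raise ValueError(f"speed must be positive, got {self.speed}")
```

For example, `VoiceSettings(stability=40, style=20)` is accepted, while `VoiceSettings(style=101)` raises a `ValueError`.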

Preview

Listen to any voice before selecting. Test how it sounds with sample text to ensure it matches your expectations.

Voice Sample Requirements

For the best clone results, your audio should be:

  • High quality — Clear audio with minimal background noise
  • Single speaker — Only your voice, no other speakers
  • Natural speech — Conversational tone, not stiff scripted reading
  • Varied content — Different phrases, questions, emotions
  • Consistent audio — Same recording environment throughout

Supported Formats

MP3, WAV, M4A, FLAC

Recommended: 44.1kHz or 48kHz sample rate, 24-bit depth
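
Before uploading, you may want to check a file against the formats and recommended values above. The following is a rough sketch using only the Python standard library: `check_sample` is a hypothetical helper, not a Steno tool, and it can inspect sample rate and bit depth for WAV files only, since the standard library does not read MP3, M4A, or FLAC headers.

```python
import wave
from pathlib import Path

# Formats and recommended values taken from the list above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac"}
RECOMMENDED_SAMPLE_RATES = {44_100, 48_000}  # Hz
RECOMMENDED_BIT_DEPTH = 24


def check_sample(path: str) -> list[str]:
    """Return warnings for an audio file; an empty list means none were found."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        return [f"unsupported format: {suffix or 'no extension'}"]
    if suffix != ".wav":
        # The standard library cannot read MP3, M4A, or FLAC headers.
        return []

    try:
        with wave.open(path, "rb") as wav:
            sample_rate = wav.getframerate()
            bit_depth = wav.getsampwidth() * 8
    except wave.Error as exc:
        # Raised for WAV variants the wave module does not understand.
        return [f"could not read WAV header: {exc}"]

    warnings = []
    if sample_rate not in RECOMMENDED_SAMPLE_RATES:
        warnings.append(
            f"sample rate is {sample_rate} Hz; 44.1 kHz or 48 kHz is recommended"
        )
    if bit_depth != RECOMMENDED_BIT_DEPTH:
        warnings.append(f"bit depth is {bit_depth}-bit; 24-bit is recommended")
    return warnings
```

Calling `check_sample("my_recording.wav")` returns a list of human-readable warnings. A warning does not mean the file will be rejected, because the sample rate and bit depth are recommendations rather than requirements.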

Multilingual Capabilities

Your voice clone works in every supported language, more than 23 in total. Even if it was created from English audio, your Twin can speak fluently in other languages while keeping your vocal characteristics.

Supported languages: Arabic, Chinese, Dutch, English, French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Korean, Polish, Portuguese, Romanian, Russian, Spanish, Swedish, Tamil, Turkish, Ukrainian, Vietnamese, and more.

Voice Clone Ownership

  • You own the recordings you provide
  • Your voice clone is never used for other customers
  • Voice data is stored securely and separately
  • Upon contract termination, voice clones are deleted per our data policy

Recording a Voice Sample

If you don’t have existing high-quality audio for voice cloning, follow these instructions to record a new voice sample.

Recording Setup

Before you begin:

  • Use a good microphone and keep a consistent distance from it throughout
  • Record in a quiet environment with no background sounds or music
  • Avoid long pauses or breaks
  • Speak at your natural pace and volume
  • Use your authentic speaking style and personality
  • Read through the entire script once before recording
  • If you stumble over a passage, skip it or rephrase it, but otherwise stick to the script as closely as possible

Estimated speaking time: about 5 minutes at a natural pace (for an Instant Clone)
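
If you adapt the script or write your own, you can estimate its speaking time from the word count. This is a rough sketch: `estimate_speaking_minutes` is a hypothetical helper, and the default of 150 words per minute is an assumed conversational pace, not a figure from this page. Bracketed stage directions such as [Pause] are excluded because they are not read aloud.

```python
import re


def estimate_speaking_minutes(text: str, words_per_minute: int = 150) -> float:
    """Estimate how long a script takes to read aloud, in minutes."""
    # Drop bracketed stage directions like "[Pause]" before counting words.
    spoken = re.sub(r"\[[^\]]*\]", " ", text)
    word_count = len(spoken.split())
    return word_count / words_per_minute
```

Adjust `words_per_minute` to match your own pace, since natural speaking speed varies from person to person.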

Equipment Recommendations

Minimum:

  • Smartphone with good microphone
  • Quiet room
  • Pop filter (or improvise with a sock over the mic)

Recommended:

  • USB condenser microphone (Blue Yeti, Audio-Technica AT2020)
  • Closed room with soft furnishings
  • Microphone stand
  • Pop filter

Professional:

  • Studio-quality condenser microphone
  • Treated recording space
  • Audio interface
  • Professional editing software

Recording Environment

  • Quiet Location: No background noise, traffic, or HVAC sounds
  • Minimal Echo: Avoid empty rooms; use soft furnishings to absorb sound
  • Consistent Acoustics: Record all audio in the same location
  • Eliminate Interruptions: Turn off notifications, close windows, etc.

Recording Technique

  • Consistent Distance: Stay 6-12 inches from the microphone
  • Natural Speaking: Use your normal conversational voice
  • Steady Pace: Not too fast or too slow
  • Clear Articulation: Enunciate words clearly without over-pronouncing
  • Varied Emotion: Include questions, statements, emphasis, warmth

After Recording

Once you have your recording:

  1. Upload to Dashboard: Go to dashboard.steno.ai and upload your audio file
  2. Create Voice Clone: The Steno Twin Engine™ will process your audio and create your voice clone
  3. Test: Review the voice clone in your dashboard testing area

Voice Cloning Script

Use this script to record your voice sample. It’s designed to capture your tone, rhythm, and natural style across different speaking contexts.

Section 1: Introduction and Natural Conversation

Hello, and welcome to this voice recording session. My name is [speaker states name], and I’m excited to be part of this digital twin project. Today, I’ll read a script designed to capture not just my words, but my tone, rhythm, and natural style.

[Pause]

When I first heard about AI voice technology, I was curious—and a little skeptical. How could a computer capture the details of speech? The pauses, the emphasis, the subtle changes in tone? [Slight chuckle] But the truth is, the technology is now advanced enough to reflect not just what we say, but how we say it—whether our voice brightens with excitement, softens with care, or speeds up with passion.

Section 2: Professional and Technical Topics

A core concept in artificial intelligence is pattern recognition. Computers are excellent at finding patterns in massive amounts of data—faces in photos, text on a page, or the pitch and timing of speech.

[Thoughtful pause]

This is why industries like healthcare, finance, and education are using AI voices to deliver natural, helpful experiences—whether in customer support or online learning.

But with these possibilities come responsibilities: permission, honesty, and transparency. The goal is to enhance communication, not replace it.

Section 3: Emotional Range and Storytelling

[Warm, conversational tone]

Let me share a story. Years ago, I was part of a team struggling with a project. We had good skills and careful plans, but progress stalled.

[Building energy]

Then one teammate spoke with real passion—not about deadlines, but about the impact we could make. Their voice lit up, and suddenly the room shifted.

[Reflective]

That moment taught me that information alone is just data. But when emotion enters, communication becomes connection.

[Gentle, caring tone]

I think about this when mentoring. Often, the most valuable thing isn’t advice—it’s listening, showing care, and being fully present.

Section 4: Questions and Reflection

Here are a few questions I often ask: What does success really mean? Is it money, happiness, service to others? [Pause] Maybe it’s all of these.

How do we balance new technology with old wisdom? These aren’t easy questions, but the best answers come when we’re willing to hold different perspectives together.

Section 5: Practical Examples

[Helpful tone]

“Thanks for calling. The number I have is 532-467-4819—correct?”

[Professional]

“Our support line is 1-800-555-0123, extension 456, weekdays 9 to 6.”

[Casual]

“Call me back when you can. My direct line is 415-789-2634.”

Numbers and business terms matter too:

“The budget is $2,400,000.”

“Quarterly revenue rose 16%, reaching $847,000 in Q3.”

“Key launch date: March 15th, 2025.”

Technical phrases: machine learning, cloud computing, digital banking, patient care, investment management.

Section 6: Closing and Vision

[Warm, forward-looking]

We’ve covered technical ideas, emotional stories, numbers, and natural conversation. Each adds richness to this voice model.

[Hopeful]

Imagine education delivered globally while keeping a teacher’s real personality. Picture customer service that feels genuinely human, even when powered by AI.

[Balanced]

With this power comes responsibility. We must use it to enhance—not erode—connection.

[Personal, reflective]

For me, this project is about extending my ability to share ideas and support others beyond time and place.

[Warm conclusion]

Thank you for joining me today. I hope this recording helps create a voice clone that represents my style authentically. The future of AI and human collaboration is bright.

[Final pause]

This ends our session. Thank you.


Questions? Contact support@steno.ai.
