Voice Cloning

The Steno Twin Engine uses voice cloning to create a digital replica of your voice. Upload a few minutes of audio to your dashboard and a Standard voice clone is created instantly. Your AI Twin then speaks in your voice during real-time conversations.

How Voice Cloning Works

Voice cloning analyzes audio samples of your voice to build a digital model that generates natural-sounding speech. Your AI Twin uses this voice clone during real-time voice conversations with users, so it sounds like you and not like a robot.

What Makes a Good Voice Sample

For the best results, your audio sample should be:

  • High Quality: Clear audio with minimal background noise
  • Single Speaker: Only your voice, no other speakers
  • Natural Speech: Conversational tone, not stiff scripted reading
  • Varied Content: Different phrases, questions, emotions
  • Consistent Audio: Same recording environment and equipment

Voice Clone Types

Standard Voice Clone

Audio Required: 2-4 minutes of clean audio

Quality: High-fidelity voice replication suitable for most use cases

Included In: All plans (Starter, Growth, Scale, Enterprise)

Best For:

  • Most conversational AI applications
  • Text-to-speech with your voice
  • General customer engagement
  • Initial launches and testing

Characteristics:

  • Accurate voice replication
  • Natural conversational tone
  • Standard expressiveness
  • Instant creation through dashboard

Pro Voice (Professional Clone)

Audio Required: 1-2 hours of clean audio

Quality: Higher level of expressiveness and emotion

Availability: Available as an add-on, or included with Enterprise plans

Best For:

  • Premium customer experiences
  • Emotionally nuanced conversations
  • Coaching and personal development applications
  • Longer conversation sessions

Characteristics:

  • Enhanced emotional range
  • More natural prosody and intonation
  • Better handling of complex sentences
  • Subtle voice variations for emphasis
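
As a quick pre-upload check, the sketch below compares a recording's length against the two "Audio Required" ranges listed above. The function name and return values are hypothetical and chosen only for illustration. This page does not say how lengths outside those ranges are handled, so the sketch simply reports that no documented range matches.

```python
from typing import Optional

# Duration ranges from the "Audio Required" lines above, in seconds.
STANDARD_RANGE = (2 * 60, 4 * 60)      # 2-4 minutes
PRO_RANGE = (1 * 3600, 2 * 3600)       # 1-2 hours


def matching_clone_type(duration_seconds: float) -> Optional[str]:
    """Return "standard" or "pro" if the duration falls inside a documented range.

    Returns None when the duration matches neither range; what the service
    does with such recordings is not specified here.
    """
    if STANDARD_RANGE[0] <= duration_seconds <= STANDARD_RANGE[1]:
        return "standard"
    if PRO_RANGE[0] <= duration_seconds <= PRO_RANGE[1]:
        return "pro"
    return None


print(matching_clone_type(3 * 60))     # standard
print(matching_clone_type(90 * 60))    # pro
print(matching_clone_type(30))         # None
```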

Voice Sample Preparation

If You Have Existing Audio

Great sources for voice samples include:

  • Podcast Episodes: High-quality conversational audio
  • Video Recordings: Extract audio from YouTube videos or courses
  • Webinar Recordings: Clear speaking with varied content
  • Professional Recordings: Studio-quality audio from books or courses
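
If your source is a video file, one way to pull out the audio track is with the ffmpeg command-line tool. The sketch below is not part of the Steno product: it assumes ffmpeg is installed and on your PATH, and the function names and file names are hypothetical. It writes a 16-bit WAV file at 48 kHz. Resampling cannot add quality that the source recording lacks, so start from the best original you have.

```python
import subprocess
from typing import List


def build_extract_command(video_path: str, wav_path: str,
                          sample_rate_hz: int = 48000) -> List[str]:
    """Build an ffmpeg command that writes a video's audio track to a WAV file."""
    return [
        "ffmpeg",
        "-n",                          # never overwrite an existing output file
        "-i", video_path,              # input video
        "-vn",                         # drop the video stream
        "-ar", str(sample_rate_hz),    # output sample rate in Hz
        "-c:a", "pcm_s16le",           # 16-bit PCM audio
        wav_path,
    ]


def extract_audio(video_path: str, wav_path: str) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg reports a failure."""
    subprocess.run(build_extract_command(video_path, wav_path), check=True)


# Show the command without running it.
print(" ".join(build_extract_command("webinar.mp4", "webinar.wav")))
```

Calling `extract_audio("webinar.mp4", "webinar.wav")` runs the command and produces a WAV file you can then upload.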

Upload your audio directly to your dashboard at dashboard.steno.ai.

If You Don’t Have Audio

No problem! If you need to record a new sample, we provide detailed instructions and a script you can use.

See Voice Sample Recording Instructions →

Testing Your Voice Clone

After uploading your audio and creating your voice clone, test it for:

  1. Accuracy: Does it sound like you?
  2. Naturalness: Does it sound conversational and not robotic?
  3. Clarity: Are words clear and easy to understand?
  4. Emotion: Does it convey appropriate warmth and tone?

Use the testing features in your dashboard to refine and adjust as needed.

Voice Clone Updates

When to Update Your Voice Clone

Consider updating your voice clone if:

  • Your voice has changed significantly
  • You want to adjust the tone or energy level
  • The original recording quality was poor
  • You’re upgrading from Standard to Pro Voice

How to Request an Update

Upload a new audio sample through your dashboard, or contact support@steno.ai for assistance.

Multilingual Capabilities

Voice Clone Language Flexibility

Even if your voice clone is created from English audio, your AI Twin can speak in over 23 supported languages. It detects the language of each incoming message (text or voice) and responds fluently in that language, with no configuration required.

How It Works:

  • Voice characteristics transfer across languages
  • AI Twin maintains your vocal tone and style
  • Accent may vary by language
  • Quality is consistent across languages

Example: Your English voice clone can speak fluent Mandarin while maintaining your vocal characteristics.

Voice Clone Ownership

Your Rights

You own the recordings you provide for voice cloning. The voice clone itself is created as part of our service to you.

Our Commitments

  • We will never use your voice clone for other customers
  • Your voice data is stored securely and separately from other customers
  • Upon contract termination, we will delete your voice clone per our data deletion policy

Technical Specifications

Supported Audio Formats: MP3, WAV, M4A, FLAC

Sample Rate: Minimum 16 kHz (44.1 kHz or 48 kHz recommended)

Bit Depth: Minimum 16-bit (24-bit recommended)

File Size: No strict limit, but larger files take longer to process
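
You can check a file against these minimums before uploading. The sketch below uses only Python's standard library, so it can read the header of uncompressed (PCM) WAV files only. For MP3, M4A, and FLAC it checks just the file extension. The function name and messages are hypothetical, and the dashboard may run its own validation independently of this.

```python
import wave
from pathlib import Path
from typing import List

SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac"}
MIN_SAMPLE_RATE_HZ = 16_000
MIN_BIT_DEPTH = 16


def check_sample(path: str) -> List[str]:
    """Return a list of problems; an empty list means no problem was detected."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        return [f"unsupported format: {suffix or '(no extension)'}"]
    if suffix != ".wav":
        # The standard library cannot inspect MP3, M4A, or FLAC headers.
        return []

    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        bit_depth = wav.getsampwidth() * 8   # bytes per sample -> bits

    problems: List[str] = []
    if sample_rate < MIN_SAMPLE_RATE_HZ:
        problems.append(f"sample rate {sample_rate} Hz is below {MIN_SAMPLE_RATE_HZ} Hz")
    if bit_depth < MIN_BIT_DEPTH:
        problems.append(f"bit depth {bit_depth}-bit is below {MIN_BIT_DEPTH}-bit")
    return problems
```

For example, `check_sample("my_sample.wav")` returns an empty list for a 44.1 kHz, 16-bit WAV file and one message per failed check otherwise.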

Questions about voice cloning? Contact support@steno.ai.
