Build a Text-to-Speech App in JavaScript Using the Speech Synthesis API

In today’s digital world, user interaction has evolved thanks to innovations such as voice recognition and speech synthesis. JavaScript plays a key role in this shift by letting developers integrate voice-based features into websites. Whether you’re building a text-to-speech feature or improving accessibility with spoken output, the browser’s Speech Synthesis API provides the tools for creating engaging, interactive, and inclusive web applications.

Understanding Voice in JavaScript

When we talk about “voice in JavaScript” or “JavaScript voice”, we are primarily referring to the Speech Synthesis API, which is part of the Web Speech API. This API enables text-to-speech functionality directly in the browser, allowing developers to create web applications that can convert text to spoken voice output without relying on third-party services. This feature is particularly valuable for applications that aim to improve accessibility or provide hands-free interaction.

How does JavaScript voice synthesis work?

The Speech Synthesis API lets developers add spoken output to their applications. It is useful for creating applications such as virtual assistants, text readers, and interactive learning tools. To use it, you define the text you want spoken, optionally select a voice, and start speech synthesis. Here is a description of its main components:

SpeechSynthesis Interface: The main interface, accessible through window.speechSynthesis, serves as a controller for listing voices and for starting, pausing, resuming, and cancelling speech.
SpeechSynthesisUtterance: Represents a single piece of text to be spoken and carries its settings for voice, language, pitch, rate, and volume.
Voice Selection: Different voices are available based on language preferences and browser capabilities, allowing custom experiences based on users’ linguistic needs.
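
The three steps described above (define the text, pick a voice, start synthesis) can be sketched in a few lines. This is only a minimal illustration, not the converter built later in this article: the function name speakText and its two injected parameters are hypothetical, added so the snippet can also run outside a browser without throwing.

```javascript
// Minimal sketch of the text -> voice -> speak flow.
// `speakText` and its injected parameters are illustrative names, not part of the API.
function speakText(
    text,
    synth = globalThis.speechSynthesis,
    Utterance = globalThis.SpeechSynthesisUtterance
) {
    // Feature detection: return null where the API is missing (for example in Node.js).
    if (!synth || typeof Utterance !== "function") {
        return null;
    }

    const utterance = new Utterance(text); // 1. define the text to speak
    const voices = synth.getVoices();      // 2. look up the available voices
    const preferred = voices.find((v) => v.default) || voices[0];
    if (preferred) {
        utterance.voice = preferred;
    }
    synth.speak(utterance);                // 3. queue the utterance for playback
    return utterance;
}
```

In a browser, calling speakText("Hello there") from a click handler should speak with the default voice if one has loaded. Many browsers expect speech to be triggered by a user action, so calling it on page load may stay silent.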

Creating a Text-to-Voice Converter in JavaScript

To explain how JavaScript voice synthesis works, let’s create a simple text-to-voice converter using the Speech Synthesis API. The tool enables users to input text, select a voice, and convert text to speech.

HTML Structure and Design

The basic HTML structure uses Bootstrap for styling and layout. Here’s an example setup:

				
<link data-minify="1" rel="stylesheet" href="https://codingtutorials.in/wp-content/cache/min/1/npm/bootstrap@4.0.0/dist/css/bootstrap.min.css?ver=1732262914" crossorigin="anonymous">
<div class="container">
    <div class="row justify-content-center">
        <div class="col-lg-5">
            <div class="card" style="padding: 15px;">
                <h1>Text to Voice Converter</h1>
                <form>
                    <div class="form-group">
                        <label for="text">Enter your text:</label>
                        <textarea id="text" name="text" class="content form-control form-control-lg" rows="6"></textarea>
                    </div>
                    <div class="form-group">
                        <label for="voices">Choose a voice:</label>
                        <select id="voices" class="select-voices form-control form-control-lg" name="voices"></select>
                    </div>
                    <button type="button" class="convert btn btn-primary">🔊 Convert Text to Voice</button>
                </form>
            </div>
        </div>
    </div>
</div>

JavaScript Functionality

The JavaScript code below uses the Speech Synthesis API to fill the voice dropdown and speak the entered text:

				
const optionsContainer = document.querySelector(".select-voices");
const convertBtn = document.querySelector(".convert");

const synthesis = window.speechSynthesis;

function populateVoices() {
    const voices = synthesis.getVoices();
    optionsContainer.innerHTML = "";
    voices.forEach((voice) => {
        const option = document.createElement("option");
        option.value = voice.name;
        option.textContent = `${voice.name} (${voice.lang})`;
        optionsContainer.appendChild(option);
    });
}

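// Only touch the API when it exists; some browsers load voices asynchronously
// and fire "voiceschanged" once the list is ready.
if (synthesis) {
    synthesis.addEventListener("voiceschanged", populateVoices);
    populateVoices();
}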

convertBtn.addEventListener("click", function () {
    const convertText = document.querySelector(".content").value;

    if (convertText.trim() === "") {
        alert("Please provide some text");
        return;
    }

    const selectedVoice = optionsContainer.value;
    convertToSpeech(convertText, selectedVoice);
});

function convertToSpeech(text, voiceName) {
    if (!("speechSynthesis" in window)) {
        alert("Your browser does not support speech synthesis");
        return;
    }

    const utterance = new SpeechSynthesisUtterance(text);
    const voice = synthesis.getVoices().find(v => v.name === voiceName);
    if (voice) {
        utterance.voice = voice;
    }
    synthesis.speak(utterance);
}
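
The dropdown above lists every voice by name, which can be a long list. As a possible refinement, the voices could first be narrowed to the user's language. The sketch below is an assumption-laden helper, not part of the Speech Synthesis API: findVoicesForLanguage is a hypothetical name, and it works on any array of objects with a lang property, such as the one returned by getVoices(). It also treats underscores and hyphens in language tags as equivalent, on the assumption that some platforms may report tags like en_US.

```javascript
// Hypothetical helper: return the voices whose language matches `langTag`.
// Works on any array of objects shaped like SpeechSynthesisVoice ({ name, lang }).
function findVoicesForLanguage(voices, langTag) {
    // Normalise "en_US" to "en-us" so both separator styles compare equal.
    const normalise = (tag) => String(tag).replace(/_/g, "-").toLowerCase();
    const wanted = normalise(langTag);
    const primary = wanted.split("-")[0];

    // Prefer voices that match the full tag, for example "en-GB".
    const exact = voices.filter((v) => normalise(v.lang) === wanted);
    if (exact.length > 0) {
        return exact;
    }
    // Otherwise fall back to voices sharing the primary subtag ("en" for "en-GB").
    return voices.filter((v) => normalise(v.lang).split("-")[0] === primary);
}
```

In the converter, this could be called as findVoicesForLanguage(synthesis.getVoices(), navigator.language) inside populateVoices, falling back to the full list when the result is empty.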

Customizing Voice Properties

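The SpeechSynthesisUtterance interface also lets you control the voice characteristics:

Pitch: Adjusts the tone of the voice, from 0 (lowest) to 2 (highest); the default is 1.
Rate: Controls the speaking speed, from 0.1 to 10; the default is 1, so 2 is twice as fast and 0.5 is half as fast.
Volume: Sets the volume level, from 0 (silent) to 1 (loudest); the default is 1.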

				
utterance.pitch = 1.5; // Range 0 to 2; values above 1 raise the pitch
utterance.rate = 1; // 1 is normal speed; higher is faster, lower is slower
utterance.volume = 0.8; // Range 0 to 1
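
If these values come from user input such as sliders, it may be worth keeping them inside the ranges the API defines (pitch 0 to 2, rate 0.1 to 10, volume 0 to 1). The following is a small sketch under that assumption; applyVoiceSettings, clamp, and LIMITS are hypothetical names introduced here, and the helper works on any object, so it can be tried without a browser.

```javascript
// Allowed ranges for the three utterance properties.
const LIMITS = { pitch: [0, 2], rate: [0.1, 10], volume: [0, 1] };

// Restrict a number to the inclusive range [min, max].
function clamp(value, min, max) {
    return Math.min(max, Math.max(min, value));
}

// Hypothetical helper: copy pitch, rate, and volume from `settings` onto
// `utterance`, converting to numbers and clamping each into its range.
// Missing or non-numeric settings are skipped, so defaults stay in place.
function applyVoiceSettings(utterance, settings = {}) {
    for (const [name, [min, max]] of Object.entries(LIMITS)) {
        const value = Number(settings[name]);
        if (settings[name] !== undefined && Number.isFinite(value)) {
            utterance[name] = clamp(value, min, max);
        }
    }
    return utterance;
}
```

For example, applyVoiceSettings(utterance, { pitch: 5, rate: "0.5" }) would set the pitch to 2 and the rate to 0.5 while leaving the volume untouched.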

Use Cases for JavaScript Voice

Integrating voice features opens a world of possibilities:

Accessibility: Voice output can assist visually impaired users by reading website content aloud.
Learning Tools: Educational websites can use text-to-speech for language learning or reading assistance.
Voice-Driven Apps: Interactive applications like virtual assistants and smart devices benefit from voice capabilities.
E-commerce and Customer Service: Businesses can use voice synthesis to guide customers, improving the user experience.

Limitations and Browser Support

While voice synthesis in JavaScript is incredibly useful, there are a few limitations:

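Browser Support: Browsers implement the Speech Synthesis API with differing behavior, and some offer only a small set of voices, so test on the browsers you plan to support.
Dependence on System Voices: The available voices depend on the user’s operating system and browser, and in some browsers getVoices() returns an empty list until the voiceschanged event fires.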
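
Because voice availability varies, code that depends on the voice list may want to wait until at least one voice is reported. The sketch below shows one way to do that; whenVoicesReady is a hypothetical helper name, and it assumes the browser supports addEventListener on speechSynthesis, as the converter code earlier in this article does.

```javascript
// Hypothetical helper: call `callback` with the voice list once it is non-empty.
// If the API is missing, the callback receives an empty array straight away.
function whenVoicesReady(callback, synth = globalThis.speechSynthesis) {
    if (!synth) {
        callback([]);
        return;
    }

    const voices = synth.getVoices();
    if (voices.length > 0) {
        // Voices were already loaded, so no need to wait for an event.
        callback(voices);
        return;
    }

    const onChange = () => {
        const loaded = synth.getVoices();
        if (loaded.length === 0) {
            return; // still empty, keep listening
        }
        synth.removeEventListener("voiceschanged", onChange);
        callback(loaded);
    };
    synth.addEventListener("voiceschanged", onChange);
}
```

A page could use it as whenVoicesReady((voices) => console.log(voices.length + " voices available")). On a system with no installed voices the callback would never fire, so a real application might pair this with a timeout and a message to the user.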

Conclusion

“JavaScript Voice” or “Voice in JavaScript” capabilities bring voice synthesis directly into web applications. Using the Speech Synthesis API, developers can create interactive experiences that improve accessibility and user engagement. Whether you want to build a simple text-to-speech app or a more complex voice-driven application, the Speech Synthesis API provides a solid foundation, and because it is built into the browser, a basic converter needs only a few dozen lines of JavaScript.