Speech to Text Conversion in React Native – Voice Recognition

React Native Speech to Text Conversion

This example shows how to do speech to text conversion (voice recognition) in React Native. Customers have been asking for this feature a lot since the success of voice assistants like Google Home and Amazon Alexa. One way to set your app apart from others is to add voice recognition to its search filters. Many applications now offer voice search alongside text input search, so the user can either type the query or speak it.

Here we are going to see how to convert voice to text and get the recognized text as a result.

Voice Recognition in React Native

For voice recognition in React Native, that is, for speech to text conversion, we are going to use the Voice component provided by the react-native-voice library. It provides methods to start and stop voice recognition and a number of events that report the status of the recognition.

When we initialize the screen we set some event callbacks in the constructor, as in the code snippet given below (in the functional component used later in this post, the same assignments go inside useEffect). You can see we are providing a function for SpeechStart, for SpeechEnd and so on. These callbacks are called automatically when the corresponding event happens.

Voice.onSpeechStart = this.onSpeechStart;
Voice.onSpeechEnd = this.onSpeechEnd;
Voice.onSpeechError = this.onSpeechError;
Voice.onSpeechResults = this.onSpeechResults;
Voice.onSpeechPartialResults = this.onSpeechPartialResults;
Voice.onSpeechVolumeChanged = this.onSpeechVolumeChanged;
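
The same wiring can be sketched as a small helper. This is only an illustration: it assumes Voice accepts plain function assignments as in the snippet above, and the names attachVoiceHandlers and demoVoice are hypothetical and not part of react-native-voice. In an app you would pass the imported Voice module instead of the stand-in object.

```javascript
// The callback properties used in this post.
const VOICE_EVENTS = [
  'onSpeechStart',
  'onSpeechEnd',
  'onSpeechError',
  'onSpeechResults',
  'onSpeechPartialResults',
  'onSpeechVolumeChanged',
];

// Hypothetical helper: assigns only the known callbacks and returns a
// function that replaces them with no-ops again.
function attachVoiceHandlers(voice, handlers) {
  const attached = [];
  for (const name of VOICE_EVENTS) {
    if (typeof handlers[name] === 'function') {
      voice[name] = handlers[name];
      attached.push(name);
    }
  }
  return () => {
    for (const name of attached) {
      voice[name] = () => {};
    }
  };
}

// Demo with a stand-in object; in an app, pass the imported Voice module.
const demoVoice = {};
const detach = attachVoiceHandlers(demoVoice, {
  onSpeechStart: () => console.log('recognition started'),
  onSpeechResults: (e) => console.log('final results:', e.value),
});
demoVoice.onSpeechResults({value: ['hello world']}); // simulate an event
detach();
```

Returning the detach function keeps setup and cleanup in one place, which fits the return value of a useEffect callback.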

So these were the callback events to get the status of voice recognition. Now let’s see how to start, stop, cancel and destroy the voice recognition process.

Start Voice Recognition

const startRecognizing = async () => {
  this.setState({
    pitch: '',
    error: '',
    started: '',
    results: [],
    partialResults: [],
    end: '',
  });

  try {
    await Voice.start('en-US');
  } catch (e) {
    //eslint-disable-next-line
    console.error(e);
  }
};

Stop Voice Recognition

const stopRecognizing = async () => {
  try {
    await Voice.stop();
  } catch (e) {
    //eslint-disable-next-line
    console.error(e);
  }
};

Cancel Voice Recognition

const cancelRecognizing = async () => {
  try {
    await Voice.cancel();
  } catch (e) {
    //eslint-disable-next-line
    console.error(e);
  }
};

Destroy the session of Voice Recognition

const destroyRecognizer = async () => {
  try {
    await Voice.destroy();
  } catch (e) {
    //eslint-disable-next-line
    console.error(e);
  }
  this.setState({
    pitch: '',
    error: '',
    started: '',
    results: [],
    partialResults: [],
    end: '',
  });
};
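
The four functions above repeat the same try/catch pattern. One possible way to factor it out is sketched below. runVoiceCommand is a hypothetical helper, not something the library provides, and fakeVoice is a stand-in so the sketch can run outside an app.

```javascript
// Hypothetical helper: runs one Voice call and reports whether it succeeded.
async function runVoiceCommand(label, command) {
  try {
    await command();
    return true;
  } catch (e) {
    console.error(`Voice ${label} failed:`, e);
    return false;
  }
}

// Stand-in for the Voice module so the sketch is runnable on its own.
const fakeVoice = {
  start: async (locale) => console.log('listening in', locale),
  stop: async () => {
    throw new Error('recognizer not running');
  },
};

runVoiceCommand('start', () => fakeVoice.start('en-US'));
runVoiceCommand('stop', () => fakeVoice.stop());
```

In the component, a call would look like runVoiceCommand('start', () => Voice.start('en-US')), and the boolean result can decide whether to reset the state.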

That is enough about the library; now let’s move on to the code. In this example, we are going to make a screen with a microphone icon that acts as a clickable button, and clicking it starts voice recognition. Throughout the process, the callback functions we have set report the status of everything. We can also stop the speech to text conversion using the Stop, Cancel and Destroy buttons.

One more point: we get two types of results, during and after voice recognition:

  1. Result: It comes when the speech recognizer finishes recognition.
  2. Partial Results: These come while the results are being computed, so they are the words the voice recognizer has recognized before the final result. A single recognition can produce many partial results, as each one is an intermediate result.
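
Both kinds of events carry their text in e.value, which the App.js code in this post stores directly with setResults(e.value) and setPartialResults(e.value). A minimal sketch of picking a single string to display follows. bestTranscript is a hypothetical helper, and it assumes e.value is an array of candidate transcripts with the likeliest one first, which you should verify on your target devices.

```javascript
// Hypothetical helper: pick one string to display from a results event.
// Assumes event.value is an array of candidate transcripts, likeliest first.
function bestTranscript(event) {
  const candidates = event && Array.isArray(event.value) ? event.value : [];
  return candidates.length > 0 ? candidates[0] : '';
}

// Simulated events, shaped like the ones the callbacks receive.
const partialEvent = {value: ['where is']};
const finalEvent = {value: ['where is my phone', 'where is my foam']};

console.log(bestTranscript(partialEvent));
console.log(bestTranscript(finalEvent));
console.log(bestTranscript({})); // no candidates yet: empty string
```

The same helper works for both callbacks, so the screen can show the best partial result while the user is speaking and swap in the best final result at the end.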

Now let’s get started with the example and see how to convert speech to text.

To Make a React Native App

The Getting started with React Native guide explains the ways you can make a React Native project in more detail. We are going to use react-native init to make our React Native app. Assuming that you have Node installed, you can use npm to install the react-native-cli command line utility. Open the terminal, go to your workspace and run

npm install -g react-native-cli

Run the following command to create a new React Native project

react-native init ProjectName

If you want to start a new project with a specific React Native version, you can use the --version argument:

react-native init ProjectName --version X.XX.X
react-native init ProjectName --version react-native@next

This will make a project structure with an index file named App.js in your project directory.

Installation of Dependency

To use the Voice component we have to install the react-native-voice dependency. To install it, open the terminal and jump into your project

cd ProjectName

Now install the dependency

npm install react-native-voice --save

CocoaPods Installation

React Native 0.60 introduced autolinking, so we no longer need to link libraries manually, but we still need to install the pods. In this example, we need to install the pods for react-native-voice.

Please use the following command to install the pods

cd ios && pod install && cd ..

Permission to use microphone and speech recognition for iOS

Follow the steps below to add the permissions the iOS project needs to use the microphone and speech recognition.

Open the project SpeechToTextExample -> ios -> yourprj.xcworkspace in Xcode.

1. After opening the project in Xcode, click on the project in the left sidebar and you will see multiple options in the right workspace.

2. Select the Info tab, which shows the contents of Info.plist.

3. Now add two permission keys, “Privacy - Microphone Usage Description” (raw key NSMicrophoneUsageDescription) and “Privacy - Speech Recognition Usage Description” (raw key NSSpeechRecognitionUsageDescription). For each key, set a value explaining why the app needs the permission; this text is shown when the permission dialog pops up.
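
The steps above cover iOS only. On Android, microphone access is governed by the RECORD_AUDIO runtime permission. react-native-voice may already request it for you, so treat the following as an optional sketch for apps that want to check explicitly before calling Voice.start. ensureMicrophonePermission and the two fake objects are hypothetical names used for illustration; PermissionsAndroid.request, PERMISSIONS.RECORD_AUDIO and RESULTS.GRANTED are React Native APIs.

```javascript
// Hypothetical helper. Platform and PermissionsAndroid are passed in as
// arguments so the sketch runs outside an app; in a real app both are
// imported from 'react-native'.
async function ensureMicrophonePermission(Platform, PermissionsAndroid) {
  if (Platform.OS !== 'android') {
    return true; // iOS relies on the Info.plist keys instead
  }
  const status = await PermissionsAndroid.request(
    PermissionsAndroid.PERMISSIONS.RECORD_AUDIO,
    {
      title: 'Microphone permission',
      message: 'This app needs the microphone to convert speech to text.',
      buttonPositive: 'OK',
    },
  );
  return status === PermissionsAndroid.RESULTS.GRANTED;
}

// Stand-ins that mimic the shape of the react-native exports.
const fakePlatform = {OS: 'android'};
const fakePermissionsAndroid = {
  PERMISSIONS: {RECORD_AUDIO: 'android.permission.RECORD_AUDIO'},
  RESULTS: {GRANTED: 'granted', DENIED: 'denied'},
  request: async () => 'granted', // pretend the user tapped "Allow"
};

ensureMicrophonePermission(fakePlatform, fakePermissionsAndroid).then(
  (granted) => console.log('microphone permission granted:', granted),
);
```

If the helper resolves to false, the app can show a message instead of calling Voice.start.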

Code to Convert Speech to Text

Now open App.js in any code editor and replace the code with the following code

App.js

// Speech to Text Conversion in React Native – Voice Recognition
// https://aboutreact.com/speech-to-text-conversion-in-react-native-voice-recognition/

// import React in our code
import React, {useState, useEffect} from 'react';

// import all the components we are going to use
import {
  SafeAreaView,
  StyleSheet,
  Text,
  View,
  Image,
  TouchableHighlight,
  ScrollView,
} from 'react-native';

// import Voice
import Voice from 'react-native-voice';

const App = () => {
  const [pitch, setPitch] = useState('');
  const [error, setError] = useState('');
  const [end, setEnd] = useState('');
  const [started, setStarted] = useState('');
  const [results, setResults] = useState([]);
  const [partialResults, setPartialResults] = useState([]);

  useEffect(() => {
    //Setting callbacks for the process status
    Voice.onSpeechStart = onSpeechStart;
    Voice.onSpeechEnd = onSpeechEnd;
    Voice.onSpeechError = onSpeechError;
    Voice.onSpeechResults = onSpeechResults;
    Voice.onSpeechPartialResults = onSpeechPartialResults;
    Voice.onSpeechVolumeChanged = onSpeechVolumeChanged;

    return () => {
      //destroy the process after switching the screen
      Voice.destroy().then(Voice.removeAllListeners);
    };
  }, []);

  const onSpeechStart = (e) => {
    //Invoked when .start() is called without error
    console.log('onSpeechStart: ', e);
    setStarted('√');
  };

  const onSpeechEnd = (e) => {
    //Invoked when SpeechRecognizer stops recognition
    console.log('onSpeechEnd: ', e);
    setEnd('√');
  };

  const onSpeechError = (e) => {
    //Invoked when an error occurs.
    console.log('onSpeechError: ', e);
    setError(JSON.stringify(e.error));
  };

  const onSpeechResults = (e) => {
    //Invoked when SpeechRecognizer is finished recognizing
    console.log('onSpeechResults: ', e);
    setResults(e.value);
  };

  const onSpeechPartialResults = (e) => {
    //Invoked when any results are computed
    console.log('onSpeechPartialResults: ', e);
    setPartialResults(e.value);
  };

  const onSpeechVolumeChanged = (e) => {
    //Invoked when pitch that is recognized changed
    console.log('onSpeechVolumeChanged: ', e);
    setPitch(e.value);
  };

  const startRecognizing = async () => {
    //Starts listening for speech for a specific locale
    //Reset the previous session's state before starting, so a
    //fast onSpeechStart callback is not overwritten afterwards
    setPitch('');
    setError('');
    setStarted('');
    setResults([]);
    setPartialResults([]);
    setEnd('');
    try {
      await Voice.start('en-US');
    } catch (e) {
      //eslint-disable-next-line
      console.error(e);
    }
  };

  const stopRecognizing = async () => {
    //Stops listening for speech
    try {
      await Voice.stop();
    } catch (e) {
      //eslint-disable-next-line
      console.error(e);
    }
  };

  const cancelRecognizing = async () => {
    //Cancels the speech recognition
    try {
      await Voice.cancel();
    } catch (e) {
      //eslint-disable-next-line
      console.error(e);
    }
  };

  const destroyRecognizer = async () => {
    //Destroys the current SpeechRecognizer instance
    try {
      await Voice.destroy();
      setPitch('');
      setError('');
      setStarted('');
      setResults([]);
      setPartialResults([]);
      setEnd('');
    } catch (e) {
      //eslint-disable-next-line
      console.error(e);
    }
  };

  return (
    <SafeAreaView style={styles.container}>
      <View style={styles.container}>
        <Text style={styles.titleText}>
          Speech to Text Conversion in React Native |
          Voice Recognition
        </Text>
        <Text style={styles.textStyle}>
          Press mike to start Recognition
        </Text>
        <View style={styles.headerContainer}>
          <Text style={styles.textWithSpaceStyle}>
            {`Started: ${started}`}
          </Text>
          <Text style={styles.textWithSpaceStyle}>
            {`End: ${end}`}
          </Text>
        </View>
        <View style={styles.headerContainer}>
          <Text style={styles.textWithSpaceStyle}>
            {`Pitch: \n ${pitch}`}
          </Text>
          <Text style={styles.textWithSpaceStyle}>
            {`Error: \n ${error}`}
          </Text>
        </View>
        <TouchableHighlight onPress={startRecognizing}>
          <Image
            style={styles.imageButton}
            source={{
              uri:
                'https://raw.githubusercontent.com/AboutReact/sampleresource/master/microphone.png',
            }}
          />
        </TouchableHighlight>
        <Text style={styles.textStyle}>
          Partial Results
        </Text>
        <ScrollView>
          {partialResults.map((result, index) => {
            return (
              <Text
                key={`partial-result-${index}`}
                style={styles.textStyle}>
                {result}
              </Text>
            );
          })}
        </ScrollView>
        <Text style={styles.textStyle}>
          Results
        </Text>
        <ScrollView style={{marginBottom: 42}}>
          {results.map((result, index) => {
            return (
              <Text
                key={`result-${index}`}
                style={styles.textStyle}>
                {result}
              </Text>
            );
          })}
        </ScrollView>
        <View style={styles.horizontalView}>
          <TouchableHighlight
            onPress={stopRecognizing}
            style={styles.buttonStyle}>
            <Text style={styles.buttonTextStyle}>
              Stop
            </Text>
          </TouchableHighlight>
          <TouchableHighlight
            onPress={cancelRecognizing}
            style={styles.buttonStyle}>
            <Text style={styles.buttonTextStyle}>
              Cancel
            </Text>
          </TouchableHighlight>
          <TouchableHighlight
            onPress={destroyRecognizer}
            style={styles.buttonStyle}>
            <Text style={styles.buttonTextStyle}>
              Destroy
            </Text>
          </TouchableHighlight>
        </View>
      </View>
    </SafeAreaView>
  );
};

export default App;

const styles = StyleSheet.create({
  container: {
    flex: 1,
    flexDirection: 'column',
    alignItems: 'center',
    padding: 5,
  },
  headerContainer: {
    flexDirection: 'row',
    justifyContent: 'space-between',
    paddingVertical: 10,
  },
  titleText: {
    fontSize: 22,
    textAlign: 'center',
    fontWeight: 'bold',
  },
  buttonStyle: {
    flex: 1,
    justifyContent: 'center',
    marginTop: 15,
    padding: 10,
    backgroundColor: '#8ad24e',
    marginRight: 2,
    marginLeft: 2,
  },
  buttonTextStyle: {
    color: '#fff',
    textAlign: 'center',
  },
  horizontalView: {
    flexDirection: 'row',
    position: 'absolute',
    bottom: 0,
  },
  textStyle: {
    textAlign: 'center',
    padding: 12,
  },
  imageButton: {
    width: 50,
    height: 50,
  },
  textWithSpaceStyle: {
    flex: 1,
    textAlign: 'center',
    color: '#B0171F',
  },
});
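
App.js passes a fixed 'en-US' to Voice.start. If the user should be able to choose the language, one possible approach is to keep a list of locales and fall back when the requested one is missing. The sketch below is an assumption-laden illustration: pickLocale and SUPPORTED_LOCALES are hypothetical names, and the list is made up; which locales the device recognizer actually supports has to be checked on the device.

```javascript
// Hypothetical list; replace it with the locales your recognizer supports.
const SUPPORTED_LOCALES = ['en-US', 'hi-IN', 'fr-FR'];

// Hypothetical helper: exact match first, then any locale with the same
// language part, otherwise the fallback.
function pickLocale(requested, supported, fallback = 'en-US') {
  if (typeof requested !== 'string' || requested.length === 0) {
    return fallback;
  }
  if (supported.includes(requested)) {
    return requested;
  }
  const language = requested.split('-')[0].toLowerCase();
  const sameLanguage = supported.find(
    (tag) => tag.split('-')[0].toLowerCase() === language,
  );
  return sameLanguage || fallback;
}

console.log(pickLocale('hi-IN', SUPPORTED_LOCALES));
console.log(pickLocale('fr-CA', SUPPORTED_LOCALES));
console.log(pickLocale('ja-JP', SUPPORTED_LOCALES));
```

In startRecognizing, the call would then become await Voice.start(pickLocale(userChoice, SUPPORTED_LOCALES)), where userChoice is whatever value your language picker stores.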

To Run the React Native App

Open the terminal again and jump into your project using

cd ProjectName

To run the project on an Android Virtual Device or on a real device with debugging enabled

react-native run-android

or on the iOS Simulator by running (macOS only)

react-native run-ios


Output Screenshots

[Screenshots: the app running on Android and on iOS]

This is how you can do speech to text conversion (voice recognition) in React Native. If you have any doubts or want to share something about the topic, you can comment below or contact us. There will be more posts coming soon. Stay tuned!

Hope you liked it. 🙂

19 thoughts on “Speech to Text Conversion in React Native – Voice Recognition”

  1. Hey,
    Can you please update this code for the latest React Native version? As far as I can see, this code is old and uses a class component, not a functional component.

      • Thanks.
        Can you help me to achieve my requirement?
        I am creating an app similar to Find My Phone.
        I want my app to always run in the background with the microphone continuously reading input; when the microphone picks up the exact phrase, it should stop listening.
        In the above example, I notice that it starts listening when I click on the microphone (in my case it should start as soon as I open the app, without clicking on the microphone), but when I don’t say anything it stops listening. I don’t want it to stop listening until it receives the exact input, for example “where is my phone”; only then should it stop listening.

        Thanks in advance

        • Haha, sorry. For this you should go for native development (although I am not sure whether you can do that in native either, you can give it a try).

          Running a service continuously in the background is the most difficult task. I don’t think that will be a simple development, but let’s see how people out there respond to your query.

          Does anybody have any idea about it?

          • Oh! Actually I haven’t raised this question anywhere else, so I was just asking you. I am absolutely new to React Native as I come from a Microsoft .NET background. Let me know if you find anything similar or any trick.

            Two more questions:
            1. Do you have a YouTube channel related to React Navigation 5?
            2. Any plan to create a tutorial project in React Native which covers all the aspects like
            *Firebase (I saw some articles for this on your website, but I am asking about the database part, both Realtime and non-Realtime)
            *AdMob
            *Payment Gateway
            *Redux
            *Animations and UI

            I mean a kind of shopping app development in which you can cover all this and many more topics at once. As I found your website cool I stopped searching elsewhere because you have a good collection, but now I am hoping you will make an app like the one I mentioned. It will be a great help for all of us.

            Cheers!!!!

  2. I tried to run this code on my macOS. The app runs fine on the iOS simulator after making the configuration changes in Xcode as you suggested. But when I tried to run it on the Android emulator (also on macOS), I got a “message:9 insufficient permission” error. Would you please help?

  3. Hi,
    I am making a similar app, but I want my app to listen to different languages. Instead of Voice.start(‘en-US’), Voice.start should use a different language each time and convert the speech to text in the selected language, so that I can copy the text and send it as a message to anyone using an application like Instagram, Facebook, WhatsApp etc. Can you please help me add this functionality to my application?

  4. I would like to show the exact word or sentence from my speech (not more than 2 to 3 words) out of the resulting words or sentences. I am able to see the correct word or sentence in the result list. Is there any way to show only that correct word? Please advise. Thank you

    • Every word is correct according to the system. It works by identifying the voice and returning words that sound similar to what you said, so an identified word may not be the correct word in your context.

  5. Hi Snehal,

    I tried the above code. The app loads perfectly but is not able to record the speech; it keeps throwing this error:
    {“message”: “2/Network error”}.
    Do let me know your inputs.

