Cosc 5/4730 Text to Speech And Speech To Text
Android TEXT TO SPEECH
Text to Speech
• In Android 1.6+ there is a native text-to-speech engine built into the Android OS.
  – In 4.x+: Settings -> Language and Input -> Text-to-speech output
    • You can use “Listen to an example” to hear how it works.
  – In Android 10/11: System -> Languages & input -> Text-to-speech output
How it works
• Text-to-Speech (TTS) uses the Pico engine.
  – It sends the “speech” to the audio output.
• There is only one TTS engine and it is shared across all the activities on the device.
  – Other activities may be using it.
  – The user may have overridden the settings in the preferences as well.
Using the TTS
• First we need to check if the TTS engine is available.
  – We can do this with an Intent that uses the action TextToSpeech.Engine.ACTION_CHECK_TTS_DATA.
• Using startActivityForResult, we then find out if the TTS engine is working and available.
  – In API 30+ (Android 11) you need to add a <queries> element to the manifest file:
    <queries>
      <intent>
        <action android:name="android.intent.action.TTS_SERVICE" />
      </intent>
    </queries>
android.speech.tts
• To use the TTS we need to get access to it using the constructor.
• TextToSpeech(Context context, TextToSpeech.OnInitListener listener)
  – The constructor for the TextToSpeech class.
    mTts = new TextToSpeech(this, this);
  – First this: the context of our application.
    • The language is likely US English (en-US).
  – Second this: the listener.
    • … Activity implements OnInitListener
      – @Override public void onInit(int status)
OnInitListener
• onInit(int status)
  – Called to signal the completion of the TextToSpeech engine initialization.
  – Status is either
    • TextToSpeech.SUCCESS: you can use it.
    • TextToSpeech.ERROR: failure, you can't use it.
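Using the TTS
• To have it speak words
  – speak(CharSequence text, int queueMode, Bundle params, String utteranceId)
  – The older speak(String text, int queueMode, HashMap<String, String> params) is deprecated as of API 21.
• To stop, call stop()
• Call shutdown() to release everything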
Example
• mTts.speak("Test", TextToSpeech.QUEUE_ADD, null, utteranceId);
• Where utteranceId is a string, like "myid"
  – It can be used with setOnUtteranceProgressListener(UtteranceProgressListener)
  – You should hear the word “test” spoken.
  – Use TextToSpeech.QUEUE_FLUSH if you want to drop whatever is queued or playing and say only the new text.
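The difference between the two queue modes can be sketched without a device. The class below is a hypothetical plain-Java model, not an Android class: SpeechQueueModel and its methods are names invented for illustration, and it only tracks which utterances are waiting. It assumes the constants keep the same numeric values as TextToSpeech.QUEUE_FLUSH and TextToSpeech.QUEUE_ADD. On a real device a flush also interrupts the utterance that is currently playing, which this model does not try to show.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of the TTS utterance queue; not part of the Android SDK.
class SpeechQueueModel {
    // Assumed to mirror TextToSpeech.QUEUE_FLUSH and TextToSpeech.QUEUE_ADD.
    static final int QUEUE_FLUSH = 0;
    static final int QUEUE_ADD = 1;

    private final Deque<String> pending = new ArrayDeque<>();

    // Mimics speak(text, queueMode, ...): a flush drops everything that is
    // waiting, then the new text is appended in either mode.
    void speak(String text, int queueMode) {
        if (queueMode == QUEUE_FLUSH) {
            pending.clear();
        }
        pending.addLast(text);
    }

    // Snapshot of the utterances that would be spoken, in order.
    List<String> pending() {
        return new ArrayList<>(pending);
    }

    public static void main(String[] args) {
        SpeechQueueModel tts = new SpeechQueueModel();
        tts.speak("one", QUEUE_ADD);
        tts.speak("two", QUEUE_ADD);
        System.out.println(tts.pending());
        tts.speak("three", QUEUE_FLUSH);
        System.out.println(tts.pending());
    }
}
```

In an app, the same choice is made through the queueMode argument of mTts.speak: add when the sentences should be read one after another, flush when new text should replace whatever was waiting.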
Other methods
• You can change the pitch and speech rate with
  – setPitch(float pitch)
  – setSpeechRate(float speechRate)
• To find out if “it” is still speaking
  – boolean isSpeaking()
• To have the speech written to a file
  – synthesizeToFile(String text, HashMap<String, String> params, String filename)
  – Remember the permission for writing to the file system.
Note
• In the onPause() method
  – You should put at least a stop() call.
  – Your app has lost focus.
Example code
• text2speech example on GitHub
  – Simple text box and button. Type in the words you want spoken and then press play.
  – If you are running the example on a phone
    • For fun, use the voice input (microphone on the keyboard) for the input and then have it read back to you.
Android SPEECH TO TEXT
Speech To Text
• Like text to speech, we are going to call on another piece of Google software, its voice recognition.
  – android.speech package
  – The simple version uses an intent, and there is a dialog box so the user knows when to speak.
    • RecognizerIntent
      – With an onActivityResult
• Note: speech recognition doesn’t work in the emulators.
Simple version code
• First get the recognize intent
    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
• Specify the calling package to identify your application (this one is generic for any class you use)
    intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, getClass().getPackage().getName());
• Display a hint to the user in the dialog box
    intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Say Something!");
• Give a hint to the recognizer about what the user is going to say
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
• Specify how many results you want to receive. The results are sorted so that the first result is the one with the highest confidence. In this case, a max of 5 results
    intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 5);
• Now launch the activity for a result
    startActivityForResult(intent, VOICE_RECOGNITION_REQUEST_CODE);
Simple version code (2)
• When the recognition is done, results are returned to onActivityResult
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
      if (requestCode == VOICE_RECOGNITION_REQUEST_CODE && resultCode == RESULT_OK) {
• Fill the list view with the strings the recognizer thought it could have heard; there should be at most 5, based on the call
        ArrayList<String> matches = data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
• Now you deal with the results in the matches list.
      }
• Lastly, send other results to the super since we are not dealing with them.
      super.onActivityResult(requestCode, resultCode, data);
    }
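One way to "deal with" the matches list is to look for the first candidate that your app understands. The sketch below is plain Java with no Android dependency. MatchPicker and firstKnownCommand are hypothetical names, and the command list is assumed to be stored in lower case. It relies only on the ordering described above, where the first result has the highest confidence.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper; not part of the Android SDK.
class MatchPicker {
    // Returns the first recognizer candidate (highest confidence first) that
    // equals one of the known lower-case commands, ignoring case and
    // surrounding whitespace. Empty if nothing matches or matches is null.
    static Optional<String> firstKnownCommand(List<String> matches, List<String> commands) {
        if (matches == null) {
            return Optional.empty();
        }
        for (String candidate : matches) {
            String normalized = candidate.trim().toLowerCase(Locale.ROOT);
            if (commands.contains(normalized)) {
                return Optional.of(normalized);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> commands = Arrays.asList("play", "stop");
        List<String> matches = Arrays.asList("Plays", "Play ", "pray");
        System.out.println(firstKnownCommand(matches, commands));
    }
}
```

Inside onActivityResult, the matches list from EXTRA_RESULTS could be passed straight to a helper like this. The null check is there because the extra may be missing from the returned intent.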
SpeechRecognizer class
• A second version is more complex, but also removes the dialog box
  – Many people want to implement their own dialog or just not have one.
• You will need the RECORD_AUDIO and INTERNET permissions.
  – <uses-permission android:name="android.permission.RECORD_AUDIO"/>
  – <uses-permission android:name="android.permission.INTERNET"/>
• Get the speech recognizer and a RecognitionListener
  – This still uses an intent as well.
• Remember the recognition is done by Google's “cloud”.
SpeechRecognizer
• First get the recognizer
    sr = SpeechRecognizer.createSpeechRecognizer(this);
• Set your listener
    sr.setRecognitionListener(new Recognitionlistener());
  – The listener is on the next slide.
RecognitionListener
• Create a class that implements RecognitionListener and implement the following methods
  – void onBeginningOfSpeech()
    • The user has started to speak.
  – void onBufferReceived(byte[] buffer)
    • More sound has been received.
  – void onEndOfSpeech()
    • Called after the user stops speaking.
  – void onError(int error)
    • A network or recognition error occurred.
    • The error codes are the ERROR_ constants in SpeechRecognizer.
  – void onEvent(int eventType, Bundle params)
    • Reserved for adding future events.
  – void onPartialResults(Bundle partialResults)
    • Called when partial recognition results are available.
  – void onReadyForSpeech(Bundle params)
    • Called when the endpointer is ready for the user to start speaking.
  – void onResults(Bundle results)
    • Called when recognition results are ready.
  – void onRmsChanged(float rmsdB)
    • The sound level in the audio stream has changed.
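The int passed to onError is easier to debug once it is turned into a message. The sketch below is plain Java so it can run off-device. RecognitionErrors and describe are hypothetical names, and the numeric values are assumed to mirror the ERROR_ constants in android.speech.SpeechRecognizer. In an app, compare against the named constants (for example SpeechRecognizer.ERROR_NO_MATCH) rather than raw numbers.

```java
// Hypothetical helper; the numbers are assumed to match the ERROR_ constants
// in android.speech.SpeechRecognizer.
class RecognitionErrors {
    static String describe(int error) {
        switch (error) {
            case 1: return "Network operation timed out";           // ERROR_NETWORK_TIMEOUT
            case 2: return "Other network error";                   // ERROR_NETWORK
            case 3: return "Audio recording error";                 // ERROR_AUDIO
            case 4: return "Server sent an error status";           // ERROR_SERVER
            case 5: return "Other client side error";               // ERROR_CLIENT
            case 6: return "No speech input";                       // ERROR_SPEECH_TIMEOUT
            case 7: return "No recognition result matched";         // ERROR_NO_MATCH
            case 8: return "Recognition service is busy";           // ERROR_RECOGNIZER_BUSY
            case 9: return "Insufficient permissions";              // ERROR_INSUFFICIENT_PERMISSIONS
            default: return "Unknown error code " + error;
        }
    }

    public static void main(String[] args) {
        // Example: what a listener might log from onError(int error).
        System.out.println(describe(7));
        System.out.println(describe(42));
    }
}
```

A listener's onError(int error) could log describe(error) or show it to the user. Codes 6 and 7 usually just mean the user should try speaking again.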
RecognitionListener (2)
• onResults method
  – This is where you would pull the results out of the bundle
    ArrayList<String> matches = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
Start the recognition
• As in the simple version, we need an intent to start the recognition, but we send the intent through the SpeechRecognizer object we declared in the beginning.
  – Get the recognize intent
    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
  – Specify the calling package to identify your application
    intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, getClass().getPackage().getName());
  – Give a hint to the recognizer about what the user is going to say
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
  – Specify the max number of results
    intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 5);
  – Use our SpeechRecognizer to send the intent.
    sr.startListening(intent);
• The listener will now get the results.
Code Examples
• Speak2Text demo shows you more information on using other languages for voice recognition, plus it will speak the results back to you.
• speech2txtDemo is simplified voice recognition.
• speech2txtDemo2 uses the RecognitionListener.
iSpeech
• They have an SDK and API for BlackBerry, Android, and iPhone as well.
  – Text to speech
    • With many voice options as well
  – Speech to text
    • The demo key is limited to 100 words per application launch.
      – A license key removes the 100 word limit.
• http://www.ispeech.org/
References
• http://developer.android.com/resources/samples/ApiDemos/src/com/example/android/apis/app/TextToSpeechActivity.html
• http://developer.android.com/reference/android/speech/SpeechRecognizer.html
• http://developer.android.com/resources/samples/ApiDemos/src/com/example/android/apis/app/VoiceRecognition.html
• http://stackoverflow.com/questions/6316937/how-can-i-use-speech-recognition-without-the-annoying-dialog-in-android-phones
Q&A