I want to create a Virtual Keyboard that can catch whatever key you 'speak' and send the keystroke to the active application. The Virtual Keyboard part and linking it to Speech Recognition will be done easily, but the problem I am having is that the Speech Recognition is unreliable.

For example, I say 'c' and it takes it as 'v' or something. This is extremely irritating, and though it works with the microphone on my Logitech headset, it still doesn't recognize what I am saying sometimes. It's worse with the default microphone on my Lenovo laptop.

But it is weird that the Google speech recognition on the Google search engine works perfectly, with or without a mic. Is there a way to improve my program?

    using System;
    using System.Speech.Recognition;
    using System.Speech.Synthesis;
    using System.Windows.Forms;

    public partial class Form1 : Form
    {
        SpeechSynthesizer sSynth = new SpeechSynthesizer();
        PromptBuilder pBuilder = new PromptBuilder();
        SpeechRecognitionEngine sRecognize = new SpeechRecognitionEngine();

        private void Form1_Load(object sender, EventArgs e)
        {
            // (no body preserved in the source)
        }

        private void button1_Click(object sender, EventArgs e)
        {
            // (no body preserved in the source)
        }

        private void button2_Click(object sender, EventArgs e)
        {
            // sList (the spoken choices) is defined in a part of the code
            // that was not preserved in the source.
            Grammar gr = new Grammar(new GrammarBuilder(sList));
            // A grammar must be loaded before RecognizeAsync is called;
            // this call does not appear in the surviving text.
            sRecognize.LoadGrammar(gr);
            sRecognize.SpeechRecognized += sRecognize_SpeechRecognized;
            sRecognize.SetInputToDefaultAudioDevice();
            sRecognize.RecognizeAsync(RecognizeMode.Multiple);
        }

        void sRecognize_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
        {
            // Append each recognized phrase to the text box.
            textBox1.Text = textBox1.Text + " " + e.Result.Text;
        }

        private void button3_Click(object sender, EventArgs e)
        {
            // (no body preserved in the source)
        }
    }

Basically, is there a way to create a program that can improve the application's UNDERSTANDING of what I am speaking? Like Windows Speech Recognition does by making us read text and then learning how I speak words or whatever, except that that is too tedious :P

The quality of speech recognition depends on many parameters:

Microphone: as you noted, a headset microphone is better than the one in your laptop. Studio microphones will give the best results, I imagine.

Environment: you'll have a hard time making speech recognition work in a noisy environment compared to a quiet one (ideally a studio).

Pronunciation: for instance, I'm not a native English speaker and have a poor accent, and when I tried to use Google's speech recognition, half of the time Google understood something else. At the same time, it understands practically everything when my girlfriend is speaking.

Dictionary: if you pronounce words which actually exist, the speech recognition engine may improve its process by using a dictionary of words. For example, if you pronounce "elephant", it has a good chance of getting it right.

Could you please try the below? Let me know if you have more questions.

Please set:

    speech_config.request_word_level_timestamps()

in the speech config of the Azure SDK. This will allow you to get the transcripts along with the timestamps for each word.

    speech_config.output_format = speechsdk.OutputFormat(1)

This statement allows you to get the detailed JSON object from the Azure SDK.

Below is some sample code. Some error handling might be needed at places where speech to text could fail.

    # audio_filepath and the import of speechsdk are defined in a part of
    # the sample that was not preserved in the source.
    audio_config = speechsdk.audio.AudioConfig(filename=audio_filepath)
    locale = "en-US"  # Change as per requirement
    # The subscription and region values were not shown in the source.
    speech_config = speechsdk.SpeechConfig(subscription="<subscription-key>", region="<region>")
    speech_config.request_word_level_timestamps()
    speech_config.speech_recognition_language = locale
    speech_config.output_format = speechsdk.OutputFormat(1)

    # Creates a recognizer with the given settings
    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

    # The lines below come from the result-handling callback. The enclosing
    # function and the expressions applied to `response` were lost in extraction.
    #   transcript_display_list.append(response ...)
    #   confidence_list_temp = ...
    #   max_confidence_index = confidence_list_temp.index(max(confidence_list_temp))
    #   confidence_list.append(response ...)

    # Service callback that stops continuous recognition upon receiving an event `evt`
    # (the enclosing function definition was lost in extraction)
    #   speech_recognizer.stop_continuous_recognition()

    # Do something with the combined responses

    # Connect callbacks to the events fired by the speech recognizer
    # (the logger name was lost in extraction; logging.debug is assumed here)
    speech_recognizer.recognizing.connect(lambda evt: logging.debug('RECOGNIZING: {}'.format(evt)))

    # stop continuous recognition on either session stopped or canceled events
    # (the remaining calls on speech_recognizer were truncated in the source)
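To illustrate what can be done with the detailed JSON object once `request_word_level_timestamps()` and the detailed output format are set, here is a minimal sketch in plain Python. It does not call the Azure SDK. The sample payload is hypothetical, and its shape (an `NBest` list whose entries carry `Confidence` and a `Words` list with `Word`, `Offset` and `Duration` in 100-nanosecond ticks) is an assumption to check against the JSON you actually receive. The function name `best_word_timings` is also made up for this example.

```python
import json

# Hypothetical payload, shaped like a detailed recognition result (assumption).
SAMPLE_JSON = json.dumps({
    "DisplayText": "Hello world.",
    "NBest": [
        {"Confidence": 0.62, "Display": "Hollow world.", "Words": [
            {"Word": "hollow", "Offset": 5000000, "Duration": 4000000},
            {"Word": "world", "Offset": 10000000, "Duration": 5000000},
        ]},
        {"Confidence": 0.93, "Display": "Hello world.", "Words": [
            {"Word": "hello", "Offset": 5000000, "Duration": 4000000},
            {"Word": "world", "Offset": 10000000, "Duration": 5000000},
        ]},
    ],
})

# Assumption: offsets and durations are expressed in 100-nanosecond ticks.
TICKS_PER_SECOND = 10_000_000


def best_word_timings(result_json):
    """Return (word, start_seconds, end_seconds) for the most confident candidate."""
    response = json.loads(result_json)
    candidates = response.get("NBest", [])
    if not candidates:
        return []
    # Pick the candidate with the highest confidence score.
    best = max(candidates, key=lambda c: c.get("Confidence", 0.0))
    return [
        (
            w["Word"],
            w["Offset"] / TICKS_PER_SECOND,
            (w["Offset"] + w["Duration"]) / TICKS_PER_SECOND,
        )
        for w in best.get("Words", [])
    ]


for word, start, end in best_word_timings(SAMPLE_JSON):
    print(f"{word}: {start:.2f}s to {end:.2f}s")
```

In a real recognizer callback, the string passed to `best_word_timings` would be the JSON text of the recognition result rather than the sample above.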
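On the point that a dictionary of known words helps recognition, the following is a rough illustration only. It is not how System.Speech or Google's recognizer works internally. It post-processes a misheard token by snapping it to the closest entry in a small vocabulary, using Python's standard `difflib`. The vocabulary `KEY_NAMES` and the function `snap_to_vocabulary` are hypothetical names chosen for this sketch.

```python
import difflib

# Hypothetical vocabulary of key names for a spoken virtual keyboard.
KEY_NAMES = ["enter", "escape", "backspace", "delete", "space", "tab"]


def snap_to_vocabulary(heard, vocabulary, cutoff=0.6):
    """Return the vocabulary entry closest to `heard`, or None if nothing is close enough."""
    matches = difflib.get_close_matches(heard.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(snap_to_vocabulary("entr", KEY_NAMES))
print(snap_to_vocabulary("qqqq", KEY_NAMES))
```

Longer, distinct words give the matcher more to work with than single letters such as 'c' and 'v', which is one reason a vocabulary of full key names may be easier to recognize than bare letters.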