Real-Time Speech Recognition Using Python
A step-by-step guide to creating a simple real-time speech recogniser
1. Environment
1.1 PyAudio
- Windows:
pip install pyaudio
- Linux:
sudo apt-get install python-pyaudio python3-pyaudio
- macOS:
brew install portaudio
(Homebrew must be installed on your Mac first), then
pip install pyaudio
1.2 SpeechRecognition package
pip install SpeechRecognition
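To confirm that both installs worked before running the example, a quick check along the following lines may help. This is only a sketch: check_packages is a hypothetical helper name, and it assumes the packages are importable as pyaudio and speech_recognition. It only tests that the modules can be found, not that a microphone is available.

```python
import importlib.util


def check_packages(modules=("pyaudio", "speech_recognition")):
    """Map each module name to True if it can be found in this environment."""
    # find_spec returns None when a top-level module is not installed
    return {name: importlib.util.find_spec(name) is not None for name in modules}


for name, found in check_packages().items():
    print(f"{name}: {'found' if found else 'missing'}")
```

If either line reports missing, repeat the matching install step above in the same Python environment that will run the recogniser.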
2. Example code
Here is an example that recognises ‘yes’ and ‘no’:
import speech_recognition as sr

# obtain audio from the microphone
r = sr.Recognizer()
t = True
while t:
    with sr.Microphone() as source:
        r.adjust_for_ambient_noise(source, duration=0.5)  # calibrate for background noise
        print('say something')
        audio = r.record(source, duration=2)  # record for 2 seconds
    # show_all=True returns the full response instead of only the best transcript
    output = r.recognize_google(audio, show_all=True)
    if len(output) < 1:
        # the recogniser did not recognise anything from the microphone, so ask the speaker to speak louder
        print("Say louder")
    else:
        # extract all the possible phrases from the returned dictionary
        possible = [word['transcript'] for word in output['alternative']]
        if "yes" in possible:
            print("yes")
            t = False
        elif "no" in possible:  # elif, so that a 'yes' does not also fall through to "Say it again"
            print("no")
            t = False
        else:
            print("Say it again")
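The matching step in the example can be tried without a microphone or a network connection by feeding it a hand-written response. The sketch below assumes the response has the shape the example indexes into (a dictionary with an 'alternative' list of 'transcript' entries, or an empty result when nothing was recognised). The helper names extract_transcripts and match_keyword and the sample dictionary are made up for illustration. Unlike the example above, it lower-cases the transcripts, so 'Yes' also counts as a match.

```python
def extract_transcripts(response):
    """Return the lower-cased transcripts found in a show_all-style response."""
    # An empty list (or anything that is not a dictionary) means nothing was recognised
    if not isinstance(response, dict):
        return []
    return [alt["transcript"].strip().lower()
            for alt in response.get("alternative", [])
            if "transcript" in alt]


def match_keyword(response, keywords=("yes", "no")):
    """Return the first keyword that exactly equals one of the transcripts, else None."""
    transcripts = extract_transcripts(response)
    for keyword in keywords:
        if keyword in transcripts:
            return keyword
    return None


# Hypothetical response, hand-written for illustration only
sample = {"alternative": [{"transcript": "Yes"}, {"transcript": "yes sir"}]}
print(match_keyword(sample))
print(match_keyword([]))
```

Keeping the matching in a small function like this makes it easy to add more keywords later, or to loosen the comparison, without touching the recording loop.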