[Xiaomu Learns Python] Speech Recognition in Python (SpeechRecognition)

1. Introduction

https://pypi.org/project/SpeechRecognition/
https://github.com/Uberi/speech_recognition

SpeechRecognition is a library for performing speech recognition. It supports several engines and APIs, both online and offline.

The following speech recognition engines/APIs are supported:

recognize_bing(): Microsoft Bing Speech
recognize_google(): Google Web Speech API
recognize_google_cloud(): Google Cloud Speech - requires installation of the google-cloud-speech package
recognize_houndify(): Houndify by SoundHound
recognize_ibm(): IBM Speech to Text
recognize_sphinx(): CMU Sphinx - requires installing PocketSphinx
recognize_wit(): Wit.ai

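Of these seven, only recognize_sphinx() works offline, using the CMU Sphinx engine; the other six need an internet connection. SpeechRecognition also ships with a default API key for the Google Web Speech API, so recognize_google() can be used right away. The other APIs require authentication with an API key or a username/password combination.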
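
The offline/online split and the credential requirements described above can be written down as a small lookup table. This is only a sketch of the paragraph's content: ENGINES, offline_engines and usable_without_credentials are illustrative names, not part of SpeechRecognition.

```python
# Summary of the seven recognizer methods listed above, as described in the text.
# ENGINES and both helpers are illustrative names, not part of the library.
ENGINES = {
    "recognize_bing": {"offline": False, "needs_credentials": True},
    "recognize_google": {"offline": False, "needs_credentials": False},  # default key bundled
    "recognize_google_cloud": {"offline": False, "needs_credentials": True},
    "recognize_houndify": {"offline": False, "needs_credentials": True},
    "recognize_ibm": {"offline": False, "needs_credentials": True},
    "recognize_sphinx": {"offline": True, "needs_credentials": False},
    "recognize_wit": {"offline": False, "needs_credentials": True},
}


def offline_engines():
    """Return the method names that work without an internet connection."""
    return sorted(name for name, info in ENGINES.items() if info["offline"])


def usable_without_credentials():
    """Return the method names that need no API key or password of your own."""
    return sorted(name for name, info in ENGINES.items()
                  if not info["needs_credentials"])


if __name__ == "__main__":
    print("offline:", offline_engines())
    print("no credentials:", usable_without_credentials())
```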

2. Installation and testing

  • Python 3.8+ (required)

  • PyAudio 0.2.11+ (required only if you need to use microphone input, Microphone)

  • PocketSphinx (required only if you need to use the Sphinx recognizer, recognizer_instance.recognize_sphinx)

  • Google API Client Library for Python (required only if you need to use the Google Cloud Speech API, recognizer_instance.recognize_google_cloud)

  • FLAC encoder (required only if the system is not x86-based Windows/Linux/OS X)

  • Vosk (required only if you need to use Vosk API speech recognition recognizer_instance.recognize_vosk)

  • Whisper (required only if you need to use Whisper recognizer_instance.recognize_whisper)

  • openai (required only if you need to use Whisper API speech recognition recognizer_instance.recognize_whisper_api)
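
To see which of the optional dependencies above are already installed, a short check with the standard library is enough. This is a minimal sketch; the module names in OPTIONAL_MODULES are the names these packages are usually imported under, which is an assumption worth verifying against your own environment.

```python
from importlib.util import find_spec

# Label -> importable module name (assumed usual import names).
OPTIONAL_MODULES = {
    "PyAudio (Microphone)": "pyaudio",
    "PocketSphinx (recognize_sphinx)": "pocketsphinx",
    "Vosk (recognize_vosk)": "vosk",
    "Whisper (recognize_whisper)": "whisper",
    "openai (recognize_whisper_api)": "openai",
}


def check_optional_modules(modules=OPTIONAL_MODULES):
    """Map each label to True if its module can be located, else False."""
    return {label: find_spec(name) is not None for label, name in modules.items()}


if __name__ == "__main__":
    for label, present in check_optional_modules().items():
        print(("ok      " if present else "missing ") + label)
```

find_spec only locates the module; it does not import it, so the check stays fast even for heavy packages.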

2.1 Install Python

https://www.python.org/downloads/

2.2 Install SpeechRecognition

Install the SpeechRecognition library:

# python -m pip install --upgrade pip
# pip install <package-name> -i https://pypi.tuna.tsinghua.edu.cn/simple/
# pip install <package-name> -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com
# pip install <package-name> -i https://pypi.org/simple
pip install SpeechRecognition

import speech_recognition as sr
print(sr.__version__)

List the hardware-specific device index of each microphone:

import speech_recognition as sr
for index, name in enumerate(sr.Microphone.list_microphone_names()):
    print("Microphone with name \"{1}\" found for `Microphone(device_index={0})`".format(index, name))
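
Once the names are listed, the index can be looked up by name instead of being hard-coded. The helper below is a sketch; find_device_index is a hypothetical name, and it only needs the list of names, so it can be tried without a microphone attached.

```python
def find_device_index(names, keyword):
    """Return the index of the first device whose name contains keyword
    (case-insensitive), or None if nothing matches."""
    keyword = keyword.lower()
    for index, name in enumerate(names):
        # Skip entries without a usable name.
        if name and keyword in name.lower():
            return index
    return None


# Usage with the library (requires PyAudio); "usb" is just an example keyword:
#   import speech_recognition as sr
#   index = find_device_index(sr.Microphone.list_microphone_names(), "usb")
#   mic = sr.Microphone(device_index=index)  # None selects the default device
```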

2.3 Install PyAudio

pip install pyaudio

2.4 Install PocketSphinx (offline)

pip install pocketsphinx

Alternatively, find a prebuilt wheel at https://www.lfd.uci.edu/~gohlke/pythonlibs/#pocketsphinx and install that.
This section uses the recognize_sphinx() recognizer, which works offline but requires the pocketsphinx library.
Chinese recognition needs two more things.
1. An audio file (SpeechRecognition is strict about the file format).
SpeechRecognition supports the following audio file types:

WAV: must be in PCM/LPCM format
AIFF
AIFF-C
FLAC: must be native FLAC format; OGG-FLAC is not supported

2. A Chinese acoustic model, language model, and dictionary file.
These are the Chinese language and acoustic models that pocketsphinx needs:

https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/Mandarin/

Download cmusphinx-zh-cn-5.2.tar.gz and unpack it.
In the Python installation directory, locate Lib\site-packages\speech_recognition.

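Open the pocketsphinx-data folder and create a new folder named zh-CN.
Copy the unpacked files into this folder, renaming them as follows: the zh_cn.cd_cont_5000 folder becomes acoustic-model, zh_cn.lm.bin becomes language-model.lm.bin, and zh_cn.dic becomes pronounciation-dictionary.dict (the extension changes from .dic to .dict, and "pronounciation" is spelled this way in the file name).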
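
The copy-and-rename step above can also be scripted. This is a minimal sketch under the assumption that the archive unpacks to a folder containing exactly the three entries named above; install_zh_cn_model and RENAMES are illustrative names, and the paths in the usage comment are placeholders.

```python
import shutil
from pathlib import Path

# Name inside the unpacked archive -> name expected inside zh-CN.
RENAMES = {
    "zh_cn.cd_cont_5000": "acoustic-model",
    "zh_cn.lm.bin": "language-model.lm.bin",
    "zh_cn.dic": "pronounciation-dictionary.dict",
}


def install_zh_cn_model(extracted_dir, pocketsphinx_data_dir):
    """Copy the unpacked Mandarin model into <pocketsphinx-data>/zh-CN."""
    src = Path(extracted_dir)
    dst = Path(pocketsphinx_data_dir) / "zh-CN"
    dst.mkdir(parents=True, exist_ok=True)
    for old_name, new_name in RENAMES.items():
        source, target = src / old_name, dst / new_name
        if source.is_dir():
            # The acoustic model is a folder; copy it recursively.
            shutil.copytree(source, target, dirs_exist_ok=True)
        else:
            shutil.copy2(source, target)
    return dst


# Usage (placeholder paths, adjust to your machine):
#   import os, speech_recognition
#   data_dir = os.path.join(os.path.dirname(speech_recognition.__file__),
#                           "pocketsphinx-data")
#   install_zh_cn_model("path/to/unpacked/archive", data_dir)
```

Copying instead of moving leaves the unpacked archive intact, so the step can be repeated after reinstalling the library.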

Write a test script:

import speech_recognition as sr

r = sr.Recognizer()    # create the recognizer
test = sr.AudioFile("chinese.flac")   # load the audio file
with test as source:
    # r.adjust_for_ambient_noise(source)
    audio = r.record(source)  # use record() to read the data from the file
# the language value must match the folder name under pocketsphinx-data
c = r.recognize_sphinx(audio, language='zh-CN')     # recognize and output
# c = r.recognize_sphinx(audio, language='en-US')
print(c)

The same test with error handling:

import speech_recognition as sr

# obtain path to "english.wav" in the same folder as this script
from os import path
AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), "english.wav")
# AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), "french.aiff")
# AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), "chinese.flac")

# use the audio file as the audio source
r = sr.Recognizer()
with sr.AudioFile(AUDIO_FILE) as source:
    audio = r.record(source)  # read the entire audio file

# recognize speech using Sphinx
try:
    print("Sphinx thinks you said " + r.recognize_sphinx(audio))
except sr.UnknownValueError:
    print("Sphinx could not understand audio")
except sr.RequestError as e:
    print("Sphinx error; {0}".format(e))

Recognizing speech from the microphone works the same way:

import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.Microphone() as source:
    # recognizer.adjust_for_ambient_noise(source)
    audio = recognizer.listen(source)
c = recognizer.recognize_sphinx(audio, language='zh-CN')     # recognize and output
# c = recognizer.recognize_sphinx(audio, language='en-US')
print(c)

The same test with error handling:

import speech_recognition as sr

# obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)

# recognize speech using Sphinx
try:
    print("Sphinx thinks you said " + r.recognize_sphinx(audio))
except sr.UnknownValueError:
    print("Sphinx could not understand audio")
except sr.RequestError as e:
    print("Sphinx error; {0}".format(e))

2.5 Install Vosk (offline)

python3 -m pip install vosk

You also need to install a Vosk model.
The models available for download are listed below. Put them in your project's model folder, for example “your-project-folder/models/your-vosk-model”.
https://alphacephei.com/vosk/models

In the folder that contains the test script, create a model subfolder and unpack the downloaded model into it.
Write the script:

import speech_recognition as sr
from vosk import Model

r = sr.Recognizer()
with sr.Microphone() as source:
    audio = r.listen(source, timeout=3, phrase_time_limit=3)

r.vosk_model = Model(model_name="vosk-model-small-cn-0.22")
text = r.recognize_vosk(audio, language='zh-cn')
print(text)
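
Section 3.2 passes the return value of recognize_vosk to json.loads, which suggests the method returns a JSON string and not plain text. Assuming the transcript sits under a "text" key (an assumption about the result layout, so check it against your own output), a small helper can pull it out. vosk_text is a hypothetical name.

```python
import json


def vosk_text(result):
    """Extract the transcript from a Vosk result string.

    Falls back to the raw value as a string when it is not a JSON
    object, so the caller always gets text back.
    """
    try:
        data = json.loads(result)
    except (TypeError, ValueError):
        return str(result)
    if isinstance(data, dict):
        # Assumed layout: {"text": "..."}; empty string if the key is absent.
        return data.get("text", "")
    return str(result)


# Usage with the script above:
#   print(vosk_text(r.recognize_vosk(audio)))
```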

2.6 Install Whisper (offline)

pip install zhconv
pip install -U openai-whisper
pip3 install wheel
pip install soundfile

Write the script:

import speech_recognition as sr

r = sr.Recognizer()
with sr.Microphone() as source:
    audio = r.listen(source, timeout=3, phrase_time_limit=5)

# recognize speech using whisper
try:
    print("Whisper thinks you said: " + r.recognize_whisper(audio, language="chinese"))
except sr.UnknownValueError:
    print("Whisper could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Whisper; {0}".format(e))

3. Tests

3.1 Command

python -m speech_recognition

3.2 FastAPI

import json
import os

import speech_recognition
import uvicorn
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

class ResponseModel(BaseModel):
    path: str


app = FastAPI()


def get_path(req: ResponseModel):
    path = req.path
    if path == "":
        raise HTTPException(status_code=400, detail="No path provided")

    if not path.endswith(".wav"):
        raise HTTPException(status_code=400, detail="Invalid file type")

    if not os.path.exists(path):
        raise HTTPException(status_code=404, detail="File does not exist")

    return path


@app.get("/")
def root():
    return {"message": "speech-recognition api"}


@app.post("/recognize-google")
def recognize_google(req: ResponseModel):
    path = get_path(req)
    r = speech_recognition.Recognizer()

    with speech_recognition.AudioFile(path) as source:
        audio = r.record(source)

    return r.recognize_google(audio, language='ja-JP', show_all=True)


@app.post("/recognize-vosk")
def recognize_vosk(req: ResponseModel):
    path = get_path(req)
    r = speech_recognition.Recognizer()

    with speech_recognition.AudioFile(path) as source:
        audio = r.record(source)

    return json.loads(r.recognize_vosk(audio, language='ja'))


@app.post("/recognize-whisper")
def recognize_whisper(req: ResponseModel):
    path = get_path(req)
    r = speech_recognition.Recognizer()

    with speech_recognition.AudioFile(path) as source:
        audio = r.record(source)

    result = r.recognize_whisper(audio, language='ja')
    try:
        return json.loads(result)
    except json.JSONDecodeError:
        return {"text": result}


if __name__ == "__main__":
    host = os.environ.get('HOST', '0.0.0.0')
    port = int(os.environ.get('PORT', 8080))

    # "main:app" assumes this file is saved as main.py
    uvicorn.run("main:app", host=host, port=port)
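
Each POST endpoint above takes a JSON body with a single path field, and the server checks that path on its own file system. A client can be sketched with the standard library alone. build_request and recognize are hypothetical helper names, and the default base URL assumes the server runs locally on the default port 8080 from the script.

```python
import json
import urllib.request


def build_request(endpoint, wav_path, base_url="http://127.0.0.1:8080"):
    """Build a POST request carrying {"path": wav_path} as JSON."""
    body = json.dumps({"path": wav_path}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def recognize(endpoint, wav_path, base_url="http://127.0.0.1:8080"):
    """Send the request and return the decoded JSON response."""
    request = build_request(endpoint, wav_path, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (the server must be running, and the .wav path must exist on the server):
#   print(recognize("/recognize-whisper", "/path/on/server/sample.wav"))
```

Because the file is read by the server and not uploaded, this layout only suits a client and server that share a file system.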

3.3 Google

import speech_recognition as sr
import webbrowser as wb
from urllib.parse import quote_plus

import speak  # text-to-speech helper module; it is not shown in this article

chrome_path = 'C:/Program Files (x86)/Google/Chrome/Application/chrome.exe %s'

r = sr.Recognizer()

with sr.Microphone() as source:
    print('Say Something!')
    audio = r.listen(source)
    print('Done!')

try:
    text = r.recognize_google(audio)
    print('Google thinks you said:\n' + text)
    lang = 'en'

    speak.tts(text, lang)

    # URL-encode the recognized text before putting it in the query string
    f_text = 'https://www.google.co.in/search?q=' + quote_plus(text)
    wb.get(chrome_path).open(f_text)

except Exception as e:
    print(e)

3.4 recognize_sphinx


import logging
import speech_recognition as sr


def audio_Sphinx(filename):
    logging.info('Recognizing the audio file...')
    # use the audio file as the audio source
    r = sr.Recognizer()
    with sr.AudioFile(filename) as source:
        audio = r.record(source)  # read the entire audio file

    # recognize speech using Sphinx
    try:
        text = r.recognize_sphinx(audio, language='zh-CN')
        print("Sphinx thinks you said: " + text)
        return text
    except sr.UnknownValueError:
        print("Sphinx could not understand audio")
    except sr.RequestError as e:
        print("Sphinx error; {0}".format(e))
    return None

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)

    wav_num = 0
    while True:  # runs until interrupted with Ctrl+C
        r = sr.Recognizer()
        # open the microphone
        mic = sr.Microphone()
        logging.info('Recording...')
        with mic as source:
            # calibrate the energy threshold to the ambient noise level
            r.adjust_for_ambient_noise(source)
            audio = r.listen(source)
        with open(f"00{wav_num}.wav", "wb") as f:
            # save what the microphone recorded as a WAV file
            f.write(audio.get_wav_data(convert_rate=16000))
        logging.info('Recording finished, recognizing...')

        target = audio_Sphinx(f"00{wav_num}.wav")
        wav_num += 1

3.5 Saving speech to an audio file

  • Method 1:

import os
import speech_recognition as sr

# Record audio with the SpeechRecognition package
def my_record(rate=16000):
    r = sr.Recognizer()
    with sr.Microphone(sample_rate=rate) as source:
        print("please say something")
        audio = r.listen(source)

    os.makedirs("voices", exist_ok=True)  # open() fails if the folder is missing
    with open("voices/myvoices.wav", "wb") as f:
        f.write(audio.get_wav_data())
    print("Recording finished!")

my_record()

  • Method 2:

import time
import wave
from pyaudio import PyAudio, paInt16

framerate = 16000  # sample rate (Hz)
num_samples = 2000  # frames read per buffer
channels = 1  # number of channels
sampwidth = 2  # sample width: 2 bytes
FILEPATH = 'voices/myvoices.wav'  # the voices folder must already exist


def save_wave_file(filepath, data):
    wf = wave.open(filepath, 'wb')
    wf.setnchannels(channels)
    wf.setsampwidth(sampwidth)
    wf.setframerate(framerate)
    wf.writeframes(b''.join(data))
    wf.close()


# record audio
def my_record():
    pa = PyAudio()
    # open a new audio input stream
    stream = pa.open(format=paInt16, channels=channels,
                     rate=framerate, input=True, frames_per_buffer=num_samples)
    my_buf = []  # holds the recorded data

    t = time.time()
    print('Recording...')

    while time.time() < t + 10:  # recording length (seconds)
        # read in a loop, 2000 frames per read
        string_audio_data = stream.read(num_samples)
        my_buf.append(string_audio_data)
    print('Recording finished.')
    save_wave_file(FILEPATH, my_buf)
    stream.stop_stream()
    stream.close()
    pa.terminate()  # release the PortAudio resources


my_record()
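
Section 2.4 notes that WAV input must be PCM/LPCM. One way to check a saved recording before handing it to a recognizer is to open it with the standard wave module, which, as far as I know, rejects non-PCM WAV data with wave.Error. describe_wav is a hypothetical helper name.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_width_bytes, frame_rate, duration_seconds).

    Raises wave.Error if the file is not a WAV file the wave module can read.
    """
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return wf.getnchannels(), wf.getsampwidth(), rate, frames / rate


# Usage with the file written above:
#   print(describe_wav("voices/myvoices.wav"))
```

For a file written by method 2 the first three values should match the channels, sampwidth and framerate constants in that script.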

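Conclusion

If you found the method or code even a little useful, give the author a like or buy them a coffee; ╮( ̄▽ ̄)╭
If you think the method or code falls short //(ㄒoㄒ)//, leave a comment and the author will keep improving it; o_O???
If you need custom development of related features, send the author a private message; (✿◡‿◡)
Thank you all for your support! ( ´ ▽´ )ノ ( ´ ▽´)っ!!!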
