如何使用OpenAI的whisper

这篇具有很好参考价值的文章主要介绍了如何使用OpenAI的whisper。希望对大家有所帮助。如果存在错误或未考虑完全的地方，请大家不吝赐教，您也可以点击"举报违法"按钮提交疑问。

一、安装ffmpeg

yum localinstall --nogpgcheck https://download1.rpmfusion.org/free/el/rpmfusion-free-release-7.noarch.rpm
yum install ffmpeg ffmpeg-devel

二、安装torch等相关组件

conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia

conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=10.2 -c pytorch

三、安装Whisper

pip install git+https://github.com/openai/whisper.git

如果上述报错，就改为下面的方法：

pip install --upgrade pip
git clone git@github.com:openai/whisper.git
cd whisper/
pip install setuptools-rust
pip install -r requirements.txt
python setup.py develop

四、下载模型

import whisper
model = whisper.load_model("large")  # 此处会下载模型

模型的默认下载路径在：~/.cache/whisper/large-v2.pt
如果网速不佳，可以先在网速好的服务器上先下载好模型，再拷贝到本机

五、测试效果

从下面cpu的结果看，tiny模型的结果不忍直视，而large_model的耗时，也无法忍受。

模型名称	cpu执行时间	结果	gpu执行时间	占显存
large_model	15.5456秒	喂王阳能听到我说话吗今天天气怎么样	超过16G	超16G
medium_model	9.1108秒	喂,王阳,想听到我说话吗?今天天气怎么样?	1.7336秒	10G
small_model	3.2420秒	喂,完了,那听到我说话吗?今天天气怎么样?	1.1716秒	3.3G
base_model	1.5984秒	喂王雅能聽到我說話嗎今天天氣怎麼樣	0.3483秒	1.6G
tiny_model	1.0238秒	喂玩呀那听到我说话吗今天听见怎么样	0.2637秒	1.3G

六、cpu与gpu解码的耗时对比

如何使用OpenAI的whisper 文章来源地址https://www.toymoban.com/news/detail-482721.html

参考文献

https://www.assemblyai.com/blog/how-to-run-openais-whisper-speech-recognition-model/
https://github.com/AppleHolic/chatgpt-streamlit
https://github.com/openai/whisper
https://github.com/Joooohan/audio-recorder-streamlit

到了这里，关于如何使用OpenAI的whisper的文章就介绍完了。如果您还想了解更多内容，请在右上角搜索TOY模板网以前的文章或继续浏览下面的相关文章，希望大家以后多多支持TOY模板网！