Building a Speech Translator with Azure Cognitive Services: A Fun Way to Learn English

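CSDN recently launched the campaign 《0元试用微软 Azure人工智能认知服务，精美礼品大放送》 ("Try Microsoft Azure AI Cognitive Services for free and win prizes"). The campaign is still running. I signed up right away, but only found time today to actually try the services.

The campaign asks participants to write a blog post about trying at least three of the following services: speech-to-text, text-to-speech, speech translation, text analytics, text translation, and language understanding.

After trying speech-to-text, text-to-speech, and speech translation, I decided to build a real-time speech translator, and it works really well.

Let's walk through the steps. First, go to https://portal.azure.cn/ and sign in.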

Getting the Key

Type 认知服务 (Cognitive Services) into the search box and confirm.

Then create a Speech service resource.

Next, enter a name, choose a location, select the free pricing tier, and create a new resource group and select it.

Then click Create. While the resource is being created, the portal shows that deployment is in progress.

Once deployment finishes, click Go to resource.

Then click Keys and Endpoint to view the keys and the location/region.

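Either of the two keys will do. Note down the location/region as well, because our program needs both the key and the location to call the service.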
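Since the program needs both the key and the region, one option is to keep them out of the source file. Below is a minimal sketch that reads them from environment variables. The variable names SPEECH_KEY and SPEECH_REGION and the helper load_speech_credentials are my own assumptions for illustration; neither the portal nor the SDK prescribes them.

```python
import os


def load_speech_credentials(env=None):
    """Return (key, region) read from environment variables.

    SPEECH_KEY and SPEECH_REGION are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get("SPEECH_KEY", "").strip()
    region = env.get("SPEECH_REGION", "").strip()
    if not key or not region:
        raise RuntimeError("Set SPEECH_KEY and SPEECH_REGION before running.")
    return key, region
```

With this in place, speech_key, service_region = load_speech_credentials() could stand in for the hardcoded values used in the examples.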

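First Look at Azure Cognitive Services

Azure Cognitive Services documentation: https://docs.azure.cn/zh-cn/cognitive-services/

Following the documentation, we first install the Azure Speech library for Python: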

pip install azure-cognitiveservices-speech
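To confirm the install worked before going further, a quick check like the one below may help. This is only a sketch; is_installed is a hypothetical helper name, and it relies only on the standard library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if module_name can be located (parent packages get imported)."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError when a parent package is missing
        return False


print("Speech SDK installed:", is_installed("azure.cognitiveservices.speech"))
```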

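Let's start with speech-to-text:

Testing Speech-to-Text

Documentation: https://docs.azure.cn/zh-cn/cognitive-services/speech-service/get-started-speech-to-text?tabs=windowsinstall&pivots=programming-language-python

I copied the official sample and modified it slightly to recognize speech from the microphone: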

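import azure.cognitiveservices.speech as speechsdk

# Key and location/region recorded from the portal
speech_key, service_region = "59392xxxxxxxxxx559de", "chinaeast2"
speech_config = speechsdk.SpeechConfig(
    subscription=speech_key, region=service_region, speech_recognition_language="zh-cn")
# No audio config is passed, so the default microphone is used
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)

# "说：" is the prompt "Say:"
print("说：", end="")
# recognize_once() listens for a single utterance and returns its result
result = speech_recognizer.recognize_once()
print(result.text)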

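speech_recognition_language sets the recognition language; here I set it to Chinese.

After running it, I said one sentence into the microphone, and the program recognized it accurately: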

说：微软人工智能服务非常好用。

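Testing Text-to-Speech

Documentation: https://docs.azure.cn/zh-cn/cognitive-services/speech-service/get-started-text-to-speech?tabs=script%2Cwindowsinstall&pivots=programming-language-python

The documentation also shows how to save the synthesized audio, but here I only demonstrate playing it directly through the speaker: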

import azure.cognitiveservices.speech as speechsdk
from azure.cognitiveservices.speech.audio import AudioOutputConfig

# Reuses speech_config from the speech-to-text example above
speech_config.speech_synthesis_language = "zh-cn"
# Play the synthesized audio through the default speaker
audio_config = AudioOutputConfig(use_default_speaker=True)
speech_synthesizer = speechsdk.SpeechSynthesizer(
    speech_config=speech_config, audio_config=audio_config)

text_words = "微软人工智能服务非常好用。"
result = speech_synthesizer.speak_text_async(text_words).get()
# Print the reason only when synthesis did not complete
if result.reason != speechsdk.ResultReason.SynthesizingAudioCompleted:
    print(result.reason)

The conversion quality felt very good.

Testing Speech Translation

Documentation: https://docs.azure.cn/zh-cn/cognitive-services/speech-service/get-started-speech-translation?tabs=script%2Cwindowsinstall&pivots=programming-language-python

In my tests, speech translation performs both speech-to-text and translation in a single call:

# Reuses speechsdk, speech_key and service_region from the examples above
from_language, to_language = 'zh-cn', 'en'
translation_config = speechsdk.translation.SpeechTranslationConfig(
    subscription=speech_key, region=service_region, speech_recognition_language=from_language)
translation_config.add_target_language(to_language)
recognizer = speechsdk.translation.TranslationRecognizer(
    translation_config=translation_config)


def speakAndTranslation():
    """Listen once and return (recognized text, translation); either may be None."""
    result = recognizer.recognize_once()
    if result.reason == speechsdk.ResultReason.TranslatedSpeech:
        return result.text, result.translations[to_language]
    elif result.reason == speechsdk.ResultReason.RecognizedSpeech:
        return result.text, None
    elif result.reason == speechsdk.ResultReason.NoMatch:
        print(result.no_match_details)
    elif result.reason == speechsdk.ResultReason.Canceled:
        print(result.cancellation_details)
    # Nothing usable was recognized; still return a pair so callers can unpack it
    return None, None


print(speakAndTranslation())

Run this and say a sentence. The result:

('大家好才是真的好。', 'Everyone is really good.')

Both the source text and the translation come back together, so the speech translator below uses this interface as well.
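result.translations is keyed by target language code, and add_target_language can be called more than once, so the same result can carry several translations. Here is a small sketch of printing them all. format_translations is a hypothetical helper, and it only assumes that translations behaves like a mapping from language code to text.

```python
def format_translations(source_text, translations):
    """Build one printable line per target language.

    `translations` is expected to behave like result.translations, i.e. a
    mapping from target language code to translated text.
    """
    lines = ["source: {}".format(source_text)]
    for language in sorted(translations.keys()):
        lines.append("{}: {}".format(language, translations[language]))
    return lines


# Using the pair returned in the example above
for line in format_translations("大家好才是真的好。", {"en": "Everyone is really good."}):
    print(line)
```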

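Building the Speech Translator

The rough logic of the program: listen on the microphone, recognize and translate what was said, print the source text and the translation, stop if the recognized text contains "退出" (quit), and otherwise read the translation aloud and listen again.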
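The control flow of the program can be sketched as a plain loop that is independent of the SDK. In the sketch below, run_translator and its listen and speak parameters are hypothetical names. The two callables are injected so the flow can be tried without a microphone or an Azure key.

```python
def run_translator(listen, speak, exit_word="退出"):
    """Run the listen -> translate -> speak loop until exit_word is heard.

    listen() must return a (text, translation) pair, with None for whatever
    is missing; speak(text) reads a string aloud. Returns the list of
    (text, translation) pairs that were handled.
    """
    history = []
    while True:
        text, translation = listen()
        if not text:
            continue  # nothing recognized, listen again
        history.append((text, translation))
        if exit_word in text:
            break  # leave without reading the translation aloud
        if translation:
            speak(translation)
    return history
```

In the real program, listen would wrap the translation recognizer and speak would wrap the speech synthesizer.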

Full code:

"""
Code by 小小明
CSDN home page: https://blog.csdn.net/as604049322
"""
__author__ = '小小明'
__time__ = '2021/10/30'

import azure.cognitiveservices.speech as speechsdk

from azure.cognitiveservices.speech.audio import AudioOutputConfig

# Key and location/region recorded from the portal
speech_key, service_region = "59xxxxde", "chinaeast2"
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region,
                                       speech_recognition_language="zh-cn")
# The text read aloud is the English translation, so synthesize in English
speech_config.speech_synthesis_language = "en-US"
audio_config = AudioOutputConfig(use_default_speaker=True)
speech_synthesizer = speechsdk.SpeechSynthesizer(
    speech_config=speech_config, audio_config=audio_config)

from_language, to_language = 'zh-cn', 'en'
translation_config = speechsdk.translation.SpeechTranslationConfig(
    subscription=speech_key, region=service_region, speech_recognition_language=from_language)
translation_config.add_target_language(to_language)
recognizer = speechsdk.translation.TranslationRecognizer(
    translation_config=translation_config)


def speakAndTranslation():
    """Listen once and return (recognized text, translation); either may be None."""
    result = recognizer.recognize_once()
    if result.reason == speechsdk.ResultReason.TranslatedSpeech:
        return result.text, result.translations[to_language]
    elif result.reason == speechsdk.ResultReason.RecognizedSpeech:
        return result.text, None
    elif result.reason == speechsdk.ResultReason.NoMatch:
        print(result.no_match_details)
    elif result.reason == speechsdk.ResultReason.Canceled:
        print(result.cancellation_details)
    # Nothing usable was recognized; still return a pair so the caller can unpack it
    return None, None


def speak(text_words):
    """Read text_words aloud and report the details if synthesis is canceled."""
    result = speech_synthesizer.speak_text_async(text_words).get()
    if result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = result.cancellation_details
        # "合成取消" means "synthesis canceled"
        print("合成取消:", cancellation_details.reason)
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            if cancellation_details.error_details:
                # "错误详情" means "error details"
                print("错误详情：", cancellation_details.error_details)


while True:
    # "说：" is the prompt "Say:", "译文：" labels the translation
    print("说：", end=" ")
    text, translation_text = speakAndTranslation()
    if not text:
        # Nothing was recognized; end the prompt line and listen again
        print()
        continue
    print(text)
    print("译文：", translation_text)
    # Saying "退出" (quit) ends the program
    if "退出" in text:
        break
    if translation_text:
        speak(translation_text)

A quick run printed the following along the way:

说： 我只想进转过山和大海。
译文： I just want to go in and out of the mountains and the sea.
说： 也穿越，人山人海。
译文： Also through, the sea of people and mountains.
说： 我曾经目睹这一切全部都随风飘然。
译文： I've seen it all blow in the wind.
说： 转眼成空。
译文： It's empty.
说： 问，世间能有几多愁？
译文： Q, how much worry can there be in the world?
说： 退出。
译文： quit.

As for the spoken output, you will have to try it yourselves.
