Calling the Microsoft Azure TTS API

Calling the Microsoft Azure TTS Batch Synthesis API

My internship has taught me a lot, orz.

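To compare our company's product with Microsoft's, I wrote a Python script that calls the Microsoft Azure speech synthesis API to run batch synthesis.
Batch synthesis requires an Azure account on the Standard pricing tier. The Free tier cannot use the batch API: the service rejects the request with a Forbidden (HTTP 403) response.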
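Since a Free-tier resource is turned away with Forbidden, it can help to translate the status code of the submit request into a hint. The helper below is only a sketch: the name explain_submit_status and the wording of the messages are made up for illustration, and only the 403 case comes from the behavior described above.

```python
def explain_submit_status(status_code: int) -> str:
    """Return a short hint for the status code of a batch synthesis submit request."""
    if 200 <= status_code < 300:
        return "The request was accepted."
    if status_code == 403:
        # The case described above: Free-tier resources cannot use batch synthesis.
        return "403 Forbidden: check that the Speech resource is on the Standard tier."
    # Anything else is outside what this sketch knows about.
    return f"Unexpected status {status_code}: inspect the response body for details."
```

Calling explain_submit_status(response.status_code) right after the POST gives a readable message before the script goes on to parse the response.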
First, the imports.

import requests
from time import sleep

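Next comes the service key, which you get from the Microsoft Azure portal. Assign your key to SPEECH_KEY below, and set SPEECH_REGION to the region of your Speech resource; mine, for example, is eastus.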

SPEECH_KEY = ""
SPEECH_REGION = "eastus"
text_path=""
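Hard-coding the key works for a quick test, but it is easy to leak it when sharing the script. One option is to read both values from environment variables. This is a minimal sketch: the function name load_speech_config is hypothetical, and using environment variables called SPEECH_KEY and SPEECH_REGION is an assumption of this example, not something Azure requires.

```python
import os

def load_speech_config(env=os.environ):
    """Return (key, region) read from a mapping of environment variables.

    The variable names are an assumption of this sketch; the region falls
    back to "eastus", the region used in this article.
    """
    key = env.get("SPEECH_KEY", "")
    if not key:
        raise RuntimeError("SPEECH_KEY is not set")
    region = env.get("SPEECH_REGION", "eastus")
    return key, region
```

With this in place, SPEECH_KEY, SPEECH_REGION = load_speech_config() could replace the two hard-coded assignments.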

get_data reads the text to synthesize from a local file; text_path is the path of the input file, typed in by the user.

def get_data():
    global text_path
    text_path=input() # path of the text file to synthesize
    with open(text_path,'rb') as f:
        data=f.read().decode('utf-8') # read the whole file and decode it as UTF-8
    return data
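get_data mixes prompting for the path with reading the file, which makes it awkward to reuse or test. A variant that takes the path as a parameter might look like the sketch below; the name read_text is hypothetical, and unlike the binary read above, Path.read_text normalizes line endings to "\n".

```python
from pathlib import Path

def read_text(path):
    """Return the full contents of a UTF-8 text file as one string."""
    return Path(path).read_text(encoding="utf-8")
```

The caller can then obtain the path however it likes, for example text = read_text(input()).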

send_data sends the POST request that submits the synthesis job to Azure.

def send_data():
    url="https://"+SPEECH_REGION+".customvoice.api.speech.microsoft.com/api/texttospeech/3.1-preview1/batchsynthesis"
    text=get_data() # read the text to synthesize
    headers = {
        "Content-Type": "application/json", # the request body is sent as JSON
        "Ocp-Apim-Subscription-Key": SPEECH_KEY,
        "Connection":"Keep-Alive"
    }
    payload = {
        "displayName": "batch synthesis sample",
        "description": "my test",
        "textType": "PlainText", # the input is plain text; SSML can be used as well
        "inputs": [
            {
                "text": text # the text to synthesize
            }
        ],
        "properties": {
            "outputFormat": "riff-24khz-16bit-mono-pcm",
            "wordBoundaryEnabled": False,
            "sentenceBoundaryEnabled": False,
            "concatenateResult": True,
            "decompressOutputFiles": False
        },
        "synthesisConfig":{
            "voice": "zh-CN-XiaoxiaoNeural" # the voice used for synthesis
        }
    }
    response = requests.post(url, json=payload, headers=headers)
    print(response.text) # response body
    print(response.status_code) # HTTP status code
    response.raise_for_status() # stop on an error such as 403 instead of failing on a missing 'id'
    synthesis_id=response.json()['id'] # ID of this job, used later to query its status and fetch the audio
    print("synthesis_id",synthesis_id)
    return synthesis_id
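The comment on textType mentions that SSML can be sent instead of plain text. A request body for that case might be built as in the sketch below. The function name build_ssml_payload is hypothetical, and the sketch assumes that with "textType": "SSML" the voice is taken from the SSML document itself, so synthesisConfig is left out; check the batch synthesis documentation for the exact schema of the API version you use.

```python
from xml.sax.saxutils import escape

def build_ssml_payload(text, voice="zh-CN-XiaoxiaoNeural", lang="zh-CN"):
    """Build a batch synthesis request body that sends SSML instead of plain text."""
    # Escape &, < and > so the user's text cannot break the XML document.
    ssml = (
        f'<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="{lang}">'
        f'<voice name="{voice}">{escape(text)}</voice>'
        "</speak>"
    )
    return {
        "displayName": "batch synthesis sample",
        "description": "my test",
        "textType": "SSML",  # assumption: the voice comes from the SSML, not synthesisConfig
        "inputs": [{"text": ssml}],
        "properties": {
            "outputFormat": "riff-24khz-16bit-mono-pcm",
            "concatenateResult": True,
        },
    }
```

The returned dictionary can be passed to requests.post through the json= argument, in the same way as the plain-text body in send_data.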

request_data polls the status of the synthesis job until it has finished.


def request_data(synthesis_id):
    url="https://"+SPEECH_REGION+".customvoice.api.speech.microsoft.com/api/texttospeech/3.1-preview1/batchsynthesis/"+synthesis_id
    # synthesis_id is the job ID returned when the request was submitted
    headers={
        "Ocp-Apim-Subscription-Key": SPEECH_KEY,
        "Connection":"Keep-Alive"
    }
    response = requests.get(url,headers=headers)
    print(response.text)
    print(response.status_code)
    status=response.json()['status'] # current status of the job
    while status!="Succeeded": # not started or still running: the job is not finished yet
        if status=="Failed": # a failed job never reaches Succeeded, so stop instead of looping forever
            raise RuntimeError("batch synthesis failed: "+response.text)
        sleep(1) # wait one second before querying again
        response = requests.get(url,headers=headers) # query again
        status=response.json()['status'] # read the new status
        print(response.text)
        print(response.status_code)
    #print(response.json()['outputs'])
    download_url=response.json()['outputs']['result'] # URL for downloading the audio
    print(download_url)
    return download_url
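The polling loop in request_data has no upper bound on how long it waits. A helper with a timeout, which also treats a Failed status as terminal, could be sketched as follows. Everything here is illustrative: wait_for_job and its parameters are hypothetical names, and fetch_status stands for any callable that returns the job's current status string, for example a small wrapper around the GET request above.

```python
import time

def wait_for_job(fetch_status, interval=1.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until it returns "Succeeded".

    Raises RuntimeError if the job reports "Failed" and TimeoutError if it
    is still unfinished once `timeout` seconds have passed.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status == "Succeeded":
            return status
        if status == "Failed":
            raise RuntimeError("batch synthesis job failed")
        if clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} seconds")
        sleep(interval)  # pause before the next query
```

Passing sleep and clock as parameters keeps the helper testable without real waiting; in the script itself the defaults are enough.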

download_data downloads the synthesized audio.

def download_data(download_url):
    url=download_url # the URL obtained earlier
    # print(text_path)
    filename=text_path.rsplit(".",1) # split only at the last dot, which is normally the extension
    # print(filename)
    filename=filename[0]+"_result.zip" # name of the output file
    # print(filename)
    response=requests.get(url)
    if response.status_code == 200: # the download succeeded
        with open(filename,'wb') as file:
            file.write(response.content) # save the data to disk
        print("success")
    else:
        print(response.status_code)
Finally, the main function.

def main():
    synthesis_id=send_data() # call each function in turn: submit the request, poll the status, download the result
    download_url=request_data(synthesis_id)
    download_data(download_url)

if __name__ == '__main__':
    main()