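java - How to enable Neural Text-to-Speech (NTTS) with Amazon Polly in Java
Question
I am trying to convert text to speech with Amazon Polly through its Java API. According to Amazon's documentation, several US English voices support the neural engine: https://docs.aws.amazon.com/polly/latest/dg/voicelist.html
The code I am running in my Java application is as follows: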
package com.amazonaws.demos.polly;
import java.io.IOException;
import java.io.InputStream;
import com.amazonaws.ClientConfiguration;
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
import com.amazonaws.regions.Region;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.polly.AmazonPollyClient;
import com.amazonaws.services.polly.model.DescribeVoicesRequest;
import com.amazonaws.services.polly.model.DescribeVoicesResult;
import com.amazonaws.services.polly.model.OutputFormat;
import com.amazonaws.services.polly.model.SynthesizeSpeechRequest;
import com.amazonaws.services.polly.model.SynthesizeSpeechResult;
import com.amazonaws.services.polly.model.Voice;
import javazoom.jl.player.advanced.AdvancedPlayer;
import javazoom.jl.player.advanced.PlaybackEvent;
import javazoom.jl.player.advanced.PlaybackListener;
public class PollyDemo {
    private final AmazonPollyClient polly;
    private final Voice voice;
    private static final String JOANNA = "Joanna";
    private static final String KENDRA = "Kendra";
    private static final String MATTHEW = "Matthew";
    private static final String SAMPLE = "Congratulations. You have successfully built this working demo of Amazon Polly in Java. Have fun building voice enabled apps with Amazon Polly (that's me!), and always look at the AWS website for tips and tricks on using Amazon Polly and other great services from AWS";

    public PollyDemo(Region region) {
        // create an Amazon Polly client in a specific region
        polly = new AmazonPollyClient(new DefaultAWSCredentialsProviderChain(),
                new ClientConfiguration());
        polly.setRegion(region);
        // Create describe voices request.
        DescribeVoicesRequest describeVoicesRequest = new DescribeVoicesRequest();
        // Synchronously ask Amazon Polly to describe available TTS voices.
        DescribeVoicesResult describeVoicesResult = polly.describeVoices(describeVoicesRequest);
        //voice = describeVoicesResult.getVoices().get(0);
        voice = describeVoicesResult.getVoices().stream().filter(p -> p.getName().equals(MATTHEW)).findFirst().get();
    }

    public InputStream synthesize(String text, OutputFormat format) throws IOException {
        SynthesizeSpeechRequest synthReq =
                new SynthesizeSpeechRequest().withText(text).withVoiceId(voice.getId())
                        .withOutputFormat(format);
        SynthesizeSpeechResult synthRes = polly.synthesizeSpeech(synthReq);
        return synthRes.getAudioStream();
    }

    public static void main(String[] args) throws Exception {
        //create the test class
        PollyDemo helloWorld = new PollyDemo(Region.getRegion(Regions.US_WEST_1));
        //get the audio stream
        InputStream speechStream = helloWorld.synthesize(SAMPLE, OutputFormat.Mp3);
        //create an MP3 player
        AdvancedPlayer player = new AdvancedPlayer(speechStream,
                javazoom.jl.player.FactoryRegistry.systemRegistry().createAudioDevice());
        player.setPlayBackListener(new PlaybackListener() {
            @Override
            public void playbackStarted(PlaybackEvent evt) {
                System.out.println("Playback started");
                System.out.println(SAMPLE);
            }

            @Override
            public void playbackFinished(PlaybackEvent evt) {
                System.out.println("Playback finished");
            }
        });
        // play it!
        player.play();
    }
}
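By default this uses the standard engine for the Matthew voice. Please suggest what needs to change so that the Matthew voice is synthesized with the neural engine.
Thanks
Solution
Thanks to @ASR for the feedback.
I was able to find the engine parameter as you suggested.
Here is what I had to do to fix it:
- Update the aws-java-sdk-polly version in pom.xml from 1.11.77 (the version used in their documentation) to 1.11.762, the latest at the time, and rebuild the Maven project. This pulls in the current definition of the SynthesizeSpeechRequest class. In 1.11.77, I could not see the withEngine method in its definition.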
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-polly</artifactId>
    <version>1.11.762</version>
</dependency>
- Add withEngine("neural") to the request as follows:
SynthesizeSpeechRequest synthReq =
new SynthesizeSpeechRequest().withText(text).withVoiceId(voice.getId())
.withOutputFormat(format).withEngine("neural");
- As described at https://docs.aws.amazon.com/polly/latest/dg/NTTS-main.html, neural voices are only available in specific regions, so I had to choose a supported region as follows:
PollyDemo helloWorld = new PollyDemo(Region.getRegion(Regions.US_WEST_2));
After these changes, the neural voice worked perfectly.
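Since not every voice supports the neural engine, it may help to pick the engine name defensively instead of hard-coding "neural". The following is only a minimal sketch in plain Java with no SDK dependency: EngineChooser and chooseEngine are hypothetical names, and the list of supported engine names is assumed to come from the voice description returned by describeVoices (I believe newer SDK versions expose it as Voice.getSupportedEngines(), but check that against your SDK version). It does not check region availability.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the AWS SDK.
public class EngineChooser {
    // Engine names in the string form passed to withEngine(...) in the answer above.
    public static final String NEURAL = "neural";
    public static final String STANDARD = "standard";

    /**
     * Returns "neural" when the given list of supported engine names contains it,
     * otherwise falls back to "standard". A null list is treated as "no information".
     */
    public static String chooseEngine(List<String> supportedEngines) {
        if (supportedEngines != null && supportedEngines.contains(NEURAL)) {
            return NEURAL;
        }
        return STANDARD;
    }

    public static void main(String[] args) {
        // Example inputs are made up for illustration; real values would come
        // from the voice description returned by the service.
        System.out.println(chooseEngine(Arrays.asList("standard", "neural")));
        System.out.println(chooseEngine(Arrays.asList("standard")));
    }
}
```

The result could then be passed to the request, for example .withEngine(EngineChooser.chooseEngine(engines)), where engines is the list obtained for the selected voice.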