Improving speech recognition accuracy with Vosk (Kaldi) running on Android



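I am developing an application that uses speech recognition to collect field data on Android devices. There are five "target words", plus a few numbers (zero, one, ten, one hundred, etc.) that need to be recognized.

I have improved accuracy on the target words by adding homophones and vernacular synonyms. The target words are Chinook, sockeye, coho, pink, and chum. Here is the relevant code: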

public void parseWords() {
    // Check for null before touching the string; calling toUpperCase() on null would throw.
    if (szVoskOutput == null || szVoskOutput.isEmpty()) {
        return; // nothing was recognized
    }
    szVoskOutput = szVoskOutput.toUpperCase();

    List<String> szlNumbers = Arrays.asList("ONE", "TEN", "ONE HUNDRED", "ONE THOUSAND", "TEN THOUSAND");
    // species with homophones and vernacular names
    List<String> szlChinook = Arrays.asList("CHINOOK", "CHINOOK SALMON", "KING", "KINGS", "KING SALMON", "KING SALMAN");
    List<String> szlSockeye = Arrays.asList("SOCKEYE", "SOCCER", "SOCKEYE SALMON", "SOCK ICE", "SOCCER ICE", "SOCK I SAID", "SOCCER IS", "OKAY SALMON", "RED SALMON", "READ SALMON", "RED", "REDS");
    List<String> szlCoho = Arrays.asList("COHO", "COHO SALMON", "COVER SALMON", "SILVER SALMON", "SILVER", "SILVERS", "CO", "KOBO", "GO HOME", "COMO", "COVER", "GO");
    List<String> szlPink = Arrays.asList("PINK", "A PINK", "PINKS", "PINK SALMON", "HANK SALMON", "EXAMINE", "HUMPY", "HOBBY", "HUMPIES", "HUM BE", "HUM P", "BE", "HUMPTY", "HOBBIES", "HUMVEE", "THE HUMVEES", "POMPEY");
    List<String> szlChum = Arrays.asList("CHUM", "JOHN", "JUMP", "SHARMA", "CHARM", "COME", "CHARM SALMON", "COME SALMON", "CHUM SALMON", "JUMP SALMON", "TRUMP SALMON", "KETA SALMON", "KETA", "DOG", "DOGS", "DOG SALMON", "GATOR", "GATORS", "CALICO", "A CALICO");

    // pink
    if (szlPink.contains(szVoskOutput)) {
        szSpecies = "Pink";
        populateSpecies();
        return;
    }
    // chum
    if (szlChum.contains(szVoskOutput)) {
        szSpecies = "Chum";
        populateSpecies();
        return;
    }
    // sockeye
    if (szlSockeye.contains(szVoskOutput)) {
        szSpecies = "Sockeye";
        populateSpecies();
        return;
    }
    // coho
    if (szlCoho.contains(szVoskOutput)) {
        szSpecies = "Coho";
        populateSpecies();
        return;
    }
    // Chinook
    if (szlChinook.contains(szVoskOutput)) {
        szSpecies = "Chinook";
        populateSpecies();
        return;
    }
    if (szlNumbers.contains(szVoskOutput)) {
        // this is a number, so put it in the count text box
        tvCount.setText(szVoskOutput);
    } else {
        Toast.makeText(this, "Please repeat clearly. Captured string is: " + szVoskOutput, Toast.LENGTH_SHORT).show();
    }
} // end parseWords()
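
As a side note on the matching logic above, the chain of contains() checks could be collapsed into a single map lookup. The following is only a minimal sketch under some assumptions: the class name SpeciesLookup and the methods register() and classify() are hypothetical and not part of the app, the phrase lists are abbreviated, and trimming whitespace is an extra step the original code does not perform.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not from the original app): maps recognizer output to a species name.
class SpeciesLookup {
    private static final Map<String, String> SPECIES_BY_PHRASE = new HashMap<>();

    // The first registration of a phrase wins, which mirrors the order of the if-chain.
    private static void register(String species, List<String> phrases) {
        for (String phrase : phrases) {
            SPECIES_BY_PHRASE.putIfAbsent(phrase, species);
        }
    }

    static {
        // Abbreviated lists for illustration; the full lists from parseWords() would go here.
        register("Pink", Arrays.asList("PINK", "PINKS", "PINK SALMON", "HUMPY", "HUMPIES"));
        register("Chum", Arrays.asList("CHUM", "CHUM SALMON", "KETA", "DOG SALMON"));
        register("Sockeye", Arrays.asList("SOCKEYE", "SOCKEYE SALMON", "RED SALMON", "REDS"));
        register("Coho", Arrays.asList("COHO", "COHO SALMON", "SILVER SALMON", "SILVERS"));
        register("Chinook", Arrays.asList("CHINOOK", "CHINOOK SALMON", "KING", "KING SALMON"));
    }

    // Returns the species for a recognized phrase, or null if nothing matches.
    static String classify(String voskOutput) {
        if (voskOutput == null) {
            return null;
        }
        // Assumption: trimming is harmless here; the original code only upper-cases.
        String key = voskOutput.trim().toUpperCase(Locale.ROOT);
        if (key.isEmpty()) {
            return null;
        }
        return SPECIES_BY_PHRASE.get(key);
    }

    public static void main(String[] args) {
        System.out.println(classify("king salmon"));
    }
}
```

With something like this, parseWords() would call classify(szVoskOutput) once and fall through to the number check only when the result is null. The lookup itself is unlikely to be the bottleneck, so this is about readability more than recognition speed.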

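I have a stripped-down version of the application source on GitHub: https://github.com/portsample/salmonTalkerLite and the latest full version is on Google Play: https://play.google.com/store/apps/details?id=net.blepsias.salmontalker

Using the target words plus homophones, I can get a hit within four to five seconds. I would like to make this faster. What else can I do to tune for speed?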

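Restricting the recognizer to a fixed list of words, as shown below, helped a lot. Recognition time is now consistently about 1.5 seconds.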

private void recognizeMicrophone() {
    if (speechService != null) {
        // Already listening: stop and release the service.
        setUiState(iSTATE_DONE);
        speechService.stop();
        speechService = null;
    } else {
        setUiState(iSTATE_MIC);
        try {
            // The third argument is a JSON array that limits recognition to these words;
            // "[unk]" stands for anything outside the list. Inner quotes must be escaped.
            Recognizer rec = new Recognizer(model, 16000.f, "[\"sockeye pink coho chum chinook atlantic salmon\", \"[unk]\"]");
            speechService = new SpeechService(rec, 16000.0f);
            speechService.startListening(this);
        } catch (IOException e) {
            setErrorState(e.getMessage());
        }
    }
}

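This cleans up the aberrant Vosk output upstream, leaving only the specified target words. It removes the need for the elaborate homophone matching conditions shown in the original post. Thanks to Nickolay Shmyrev. I am still looking for other ways to speed up recognition or otherwise improve the process.

Updates and improvements will be reflected in the source code on GitHub: https://github.com/portsample/salmonTalkerLite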
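
Since the grammar string passed to the Recognizer is a JSON array written inside a Java string literal, the quoting is easy to get wrong by hand. A minimal sketch of building it from a word list instead is shown here; GrammarBuilder, quote() and build() are hypothetical names and not part of Vosk or of the app.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Vosk): builds the JSON grammar string from a word list.
class GrammarBuilder {
    // Wraps a string in double quotes, escaping backslashes and embedded quotes for JSON.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Joins the words into one space-separated phrase and appends the "[unk]" catch-all entry.
    static String build(List<String> words) {
        String phrase = String.join(" ", words);
        return "[" + quote(phrase) + ", " + quote("[unk]") + "]";
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList(
                "sockeye", "pink", "coho", "chum", "chinook", "atlantic", "salmon");
        // Prints the same grammar that is hard-coded in recognizeMicrophone()
        System.out.println(build(words));
    }
}
```

The result could be passed as the third argument, for example new Recognizer(model, 16000.f, GrammarBuilder.build(words)). The count words (one, ten, and so on) could presumably be added to the same list, on the assumption that every word in the list exists in the model's vocabulary.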
