Speech recognition with CMU Sphinx - not working properly

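I am trying to do speech recognition in Java with CMU Sphinx, but the results I get are incorrect and I don't know why.

I have a .wav file in which I recorded myself saying a sentence in English.

Here is my Java code: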

    try {
        Configuration configuration = new Configuration();
        // Set path to acoustic model.
        configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
        // Set path to dictionary.
        configuration.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
        // Set language model.
        configuration.setLanguageModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us.lm.dmp");
        StreamSpeechRecognizer recognizer = new StreamSpeechRecognizer(configuration);
        recognizer.startRecognition(new FileInputStream("assets/voice/some_wav_file.wav"));
        SpeechResult result = null;
        while ((result = recognizer.getResult()) != null) {
            System.out.println("~~ RESULTS: " + result.getHypothesis());
        }
        recognizer.stopRecognition();
    }
    catch (Exception e) {
        System.out.println("ERROR: " + e.getMessage());
    }

I also have another piece of code for Android, which does not work either:

    Assets assets = new Assets(context);
    File assetDir = assets.syncAssets();
    String prefix = assetDir.getPath();
    Config c = Decoder.defaultConfig();
    c.setString("-hmm", prefix + "/en-us-ptm");
    c.setString("-lm", prefix + "/en-us.lm");
    c.setString("-dict", prefix + "/cmudict-en-us.dict");
    Decoder d = new Decoder(c);
    InputStream stream = context.getResources().openRawResource(R.raw.some_wav_file);

    d.startUtt();
    byte[] b = new byte[4096];
    try {
        int nbytes;
        while ((nbytes = stream.read(b)) >= 0) {
            ByteBuffer bb = ByteBuffer.wrap(b, 0, nbytes);
            short[] s = new short[nbytes/2];
            bb.asShortBuffer().get(s);
            d.processRaw(s, nbytes/2, false, false);
        }
    } catch (IOException e) {
        Log.d("ERROR: ", "Error when reading file" + e.getMessage());
    }
    d.endUtt();
    Log.d("TOTAL RESULT: ", d.hyp().getHypstr());
    for (Segment seg : d.seg()) {
        Log.d("RESULT: ", seg.getWord());
    }
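One detail in the read loop above that may be worth ruling out (this is an assumption, not something confirmed in this thread): `ByteBuffer.wrap` produces a big-endian buffer by default in Java, while PCM samples in a WAV file are little-endian. The sketch below compares the two conversions on a few bytes. The class and method names (`PcmBytes`, `toShortsLittleEndian`, `toShortsDefaultOrder`) are hypothetical and exist only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of CMU Sphinx or pocketsphinx.
class PcmBytes {

    // Interprets the first nbytes of b as little-endian 16-bit samples,
    // which is how PCM data is stored in a WAV file.
    static short[] toShortsLittleEndian(byte[] b, int nbytes) {
        ByteBuffer bb = ByteBuffer.wrap(b, 0, nbytes);
        bb.order(ByteOrder.LITTLE_ENDIAN);
        short[] s = new short[nbytes / 2];
        bb.asShortBuffer().get(s);
        return s;
    }

    // Same conversion without setting the order: ByteBuffer defaults to
    // big-endian, so every sample has its two bytes swapped.
    static short[] toShortsDefaultOrder(byte[] b, int nbytes) {
        ByteBuffer bb = ByteBuffer.wrap(b, 0, nbytes);
        short[] s = new short[nbytes / 2];
        bb.asShortBuffer().get(s);
        return s;
    }
}
```

For the bytes `01 00 FF 7F`, the little-endian version yields the samples 1 and 32767, while the default order yields 256 and -129. If the decoder receives byte-swapped samples it is effectively decoding noise, so it may be worth checking whether adding `bb.order(ByteOrder.LITTLE_ENDIAN)` to the loop changes the result.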

I used this website to convert the wav file to 16-bit, 16 kHz, mono, little-endian (I tried all of its options).
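A quick way to confirm that the converted file really has that format is to read the fields of its header. The following is a minimal sketch, assuming a canonical PCM WAV layout in which the `fmt ` chunk immediately follows the RIFF header (files with extra chunks before `fmt ` would need a proper chunk walker). The class and method names (`WavHeaderCheck`, `readFormat`, `isSphinxFriendly`) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of CMU Sphinx.
class WavHeaderCheck {

    // Returns {audioFormat, channels, sampleRate, bitsPerSample} read from
    // the first bytes of a canonical WAV file.
    static int[] readFormat(byte[] header) {
        if (header.length < 36) {
            throw new IllegalArgumentException("header too short");
        }
        String riff = new String(header, 0, 4, StandardCharsets.US_ASCII);
        String wave = new String(header, 8, 4, StandardCharsets.US_ASCII);
        String fmt = new String(header, 12, 4, StandardCharsets.US_ASCII);
        if (!riff.equals("RIFF") || !wave.equals("WAVE") || !fmt.equals("fmt ")) {
            throw new IllegalArgumentException("not a canonical WAV header");
        }
        // All numeric header fields are little-endian.
        ByteBuffer bb = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        int audioFormat = bb.getShort(20);   // 1 means uncompressed PCM
        int channels = bb.getShort(22);
        int sampleRate = bb.getInt(24);
        int bitsPerSample = bb.getShort(34);
        return new int[] { audioFormat, channels, sampleRate, bitsPerSample };
    }

    // True when the header describes 16-bit, 16 kHz, mono PCM.
    static boolean isSphinxFriendly(byte[] header) {
        int[] f = readFormat(header);
        return f[0] == 1 && f[1] == 1 && f[2] == 16000 && f[3] == 16;
    }
}
```

Reading the first 44 bytes of the file into a byte array and passing it to `isSphinxFriendly` is enough for this check. It only validates the declared format, not the quality of the audio itself.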

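Any ideas why this isn't working? I use the built-in dictionary and acoustic model, and my English accent is not perfect (I don't know if that matters).

Edit:

Here is my file. I recorded myself saying "my baby is cute", and that is the output I expect. With the pure Java code I get "i have amy's youth", and with the Android code I get "it".

Here is the file containing the logs.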

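Your audio is somewhat corrupted by the conversion. You should record to wav or another lossless format in the first place. Your pronunciation is also far from US English. To convert between formats you can use sox instead of an external website. Your Android sample seems correct, but it looks as if you are decoding a different file on Android. Check whether the resource actually contains the correct file.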
