Why does Android's SpeechRecognizer automatically stop listening?



Issue #1:

The following settings are completely ignored:

putExtra(
    RecognizerIntent.EXTRA_SPEECH_INPUT_MINIMUM_LENGTH_MILLIS,
    (5000).toLong()
)
putExtra(
    RecognizerIntent.EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS,
    (5000).toLong()
)
putExtra(
    RecognizerIntent.EXTRA_SPEECH_INPUT_POSSIBLY_COMPLETE_SILENCE_LENGTH_MILLIS,
    (5000).toLong()
)

I don't know what else to say, other than that these settings do absolutely nothing, no matter how I arrange my logic (see my current implementation below).
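Since the silence-length extras appear to be ignored, one possible workaround is to enforce the timeout manually with a main-thread Handler. This is only a sketch of mine, not an Android API; the SilenceTimeout name and the 5000 ms default are placeholders:

import android.os.Handler
import android.os.Looper
import android.speech.SpeechRecognizer

// Sketch of a manual silence timeout. stopListening() must run on the main
// thread, hence the main-looper Handler.
class SilenceTimeout(
    private val recognizer: SpeechRecognizer,
    private val timeoutMs: Long = 5000L
) {
    private val handler = Handler(Looper.getMainLooper())
    private val stop = Runnable { recognizer.stopListening() }

    // Re-arm from onReadyForSpeech() and from every onPartialResults().
    fun rearm() {
        handler.removeCallbacks(stop)
        handler.postDelayed(stop, timeoutMs)
    }

    // Cancel from onResults()/onError() so a stale timer can't fire.
    fun cancel() = handler.removeCallbacks(stop)
}

Driving this from onPartialResults() approximates "silence" as "no new transcription", which is about the closest signal the public API exposes.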

Issue #2:

onBeginningOfSpeech() gets called when there is no speech. This then causes recognition to stop prematurely (within 800-1500 ms), because it is followed by onEndOfSpeech() and then onError(), where the error is ERROR_NO_MATCH. There are some caveats to this. First, it only happens when the phone has an internet connection; in other words, if you download a language pack for offline use and then force the phone into airplane mode, the problem does not occur. Second, when the phone does have an internet connection, onBeginningOfSpeech() seems to get called whenever any tiny background sound occurs. For example, in dead silence it is not called, but if you lightly tap a nearby surface with your finger, it triggers and then causes the problem. It can hear a pin drop, as they say. Note: combined with Issue #1 (see above), this makes SpeechRecognizer difficult to use.
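To see the premature stop concretely, here is a small diagnostic sketch (hypothetical, not part of my app) that reuses the SimpleRecognitionListener from my implementation below to log how quickly ERROR_NO_MATCH follows onBeginningOfSpeech():

import android.os.SystemClock
import android.speech.SpeechRecognizer
import android.util.Log

// Diagnostic only: timestamp onBeginningOfSpeech() and measure the gap to
// ERROR_NO_MATCH. Per the observation above, this lands around 800-1500 ms.
var speechBeganAt = 0L
val timingListener = SimpleRecognitionListener(
    onBeginningOfSpeech_ = { speechBeganAt = SystemClock.elapsedRealtime() },
    onError_ = { code ->
        if (code == SpeechRecognizer.ERROR_NO_MATCH && speechBeganAt != 0L) {
            val elapsed = SystemClock.elapsedRealtime() - speechBeganAt
            Log.d("SpeechTiming", "ERROR_NO_MATCH ${elapsed}ms after onBeginningOfSpeech()")
        }
    }
)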

Testing the issues above

All of the issues described above were confirmed on a fully functional, non-rooted Pixel 4a running Android 11.

Here is my speech-recognition implementation:
object SpeechRecognitionUtil {

    const val logTag = "SpeechRecognitionUtil"

    val speechRecognizerIntent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
        putExtra(
            RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM
        )
        putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, true)
        putExtra(RecognizerIntent.EXTRA_LANGUAGE, "es-US")
        putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 20)
        putExtra(
            RecognizerIntent.EXTRA_SPEECH_INPUT_MINIMUM_LENGTH_MILLIS,
            (5000).toLong()
        )
        putExtra(
            RecognizerIntent.EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS,
            (5000).toLong()
        )
        putExtra(
            RecognizerIntent.EXTRA_SPEECH_INPUT_POSSIBLY_COMPLETE_SILENCE_LENGTH_MILLIS,
            (5000).toLong()
        )
    }
    fun setupSpeechRecognizer(
        context: Context,
        onError: (Int) -> Unit = {},
        onReadyForSpeech: (Bundle) -> Unit = {},
        onEndOfSpeech: () -> Unit = {},
        onResults: (Bundle) -> Unit = {},
        onEvent: (Bundle) -> Unit = {},
        onPartialResults: (Bundle) -> Unit = {},
        onBufferReceived_: (ByteArray) -> Unit = {},
        onEvent_: (Int, Bundle) -> Unit = { _, _ -> }
    ): SpeechRecognizer? {
        return when {
            SpeechRecognizer.isRecognitionAvailable(context) -> {
                val speechRecognizer = SpeechRecognizer.createSpeechRecognizer(context)
                speechRecognizer.setRecognitionListener(SimpleRecognitionListener(
                    onError_ = { code: Int -> onError(code) },
                    onBufferReceived_ = { onBufferReceived_(it) },
                    onReadyForSpeech_ = { onReadyForSpeech(it) },
                    onBeginningOfSpeech_ = {},
                    onEvent_ = { integer: Int, bundle: Bundle -> onEvent_(integer, bundle) },
                    onEndOfSpeech_ = { onEndOfSpeech() },
                    onPartialResults_ = { onPartialResults(it) },
                    onResults_ = { onResults(it) }
                ))
                speechRecognizer
            }
            else -> {
                Log.i("SpeechRecognizer", "Recognition is NOT available!")
                null
            }
        }
    }
}
fun Bundle.recognizedWords() = getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
fun Bundle.confidenceScores() = getFloatArray(SpeechRecognizer.CONFIDENCE_SCORES)
class SimpleRecognitionListener(
    val onReadyForSpeech_: (Bundle) -> Unit = {},
    val onBeginningOfSpeech_: () -> Unit = {},
    val onRmsChanged_: (Float) -> Unit = {},
    val onBufferReceived_: (ByteArray) -> Unit = {},
    val onEndOfSpeech_: () -> Unit = {},
    val onError_: (Int) -> Unit = {},
    val onResults_: (Bundle) -> Unit = {},
    val onPartialResults_: (Bundle) -> Unit = {},
    val onEvent_: (Int, Bundle) -> Unit = { _, _ -> }
) : RecognitionListener {

    private var performingSpeechSetup = true

    override fun onReadyForSpeech(bundle: Bundle) {
        Log.d(SpeechRecognitionUtil.logTag, "onReadyForSpeech()::Bundle=$bundle")
        performingSpeechSetup = false
        onReadyForSpeech_(bundle)
    }

    override fun onBeginningOfSpeech() {
        Log.d(SpeechRecognitionUtil.logTag, "onBeginningOfSpeech()")
        onBeginningOfSpeech_()
    }

    override fun onRmsChanged(rms: Float) {
        // Log.i(SpeechRecognitionUtil.logTag, "onRmsChanged()::Rms=$rms")
        onRmsChanged_(rms)
    }

    override fun onBufferReceived(byteArray: ByteArray) {
        Log.d(SpeechRecognitionUtil.logTag, "onBufferReceived()::byteArray=$byteArray")
        onBufferReceived_(byteArray)
    }

    override fun onEndOfSpeech() {
        Log.d(SpeechRecognitionUtil.logTag, "onEndOfSpeech()")
        onEndOfSpeech_()
    }

    override fun onError(code: Int) {
        Log.d(SpeechRecognitionUtil.logTag, "onError()::code=$code")
        // Swallow the ERROR_NO_MATCH that can arrive before setup has finished.
        if (performingSpeechSetup && code == SpeechRecognizer.ERROR_NO_MATCH) return
        onError_(code)
    }

    override fun onResults(bundle: Bundle) {
        Log.d(
            SpeechRecognitionUtil.logTag,
            "onResults()::Bundle=$bundle,\n" +
                "Recognized words=${bundle.recognizedWords()}\n" +
                "Confidence scores=${bundle.confidenceScores()}"
        )
        onResults_(bundle)
    }

    override fun onPartialResults(bundle: Bundle) {
        Log.d(
            SpeechRecognitionUtil.logTag,
            "onPartialResults()::Bundle=$bundle,\n" +
                "Recognized words=${bundle.recognizedWords()}"
        )
        onPartialResults_(bundle)
    }

    override fun onEvent(event: Int, bundle: Bundle) {
        Log.d(SpeechRecognitionUtil.logTag, "onEvent()::event=$event,\nbundle=$bundle")
        onEvent_(event, bundle)
    }
}
class MainVM(app: Application) : AndroidViewModel(app) {

    enum class RecordingState {
        IDLE, RECORDING
    }

    private var speechRecognizer: SpeechRecognizer? = null

    private val _recording = MutableLiveData(RecordingState.IDLE)
    val recordingState: LiveData<RecordingState> = _recording

    private fun initSpeechRecoginizer() {
        // Destroy the old recognizer
        speechRecognizer?.cancel()
        speechRecognizer?.destroy()
        speechRecognizer = null
        // Create a new recognizer
        speechRecognizer = SpeechRecognitionUtil.setupSpeechRecognizer(
            getApplication(),
            onPartialResults = { _ -> },
            onEndOfSpeech = { _recording.value = RecordingState.IDLE },
            onResults = {},
            onError = {}
        )
    }

    fun startRecording() {
        initSpeechRecoginizer()
        speechRecognizer?.let {
            it.startListening(SpeechRecognitionUtil.speechRecognizerIntent)
            _recording.value = RecordingState.RECORDING
        }
    }

    fun stopRecording() {
        speechRecognizer?.stopListening()
        _recording.value = RecordingState.IDLE
    }

    override fun onCleared() {
        super.onCleared()
        speechRecognizer?.destroy()
        speechRecognizer = null
    }
}
class MainActivity : ComponentActivity() {

    private val logTag = "MainActivity"

    @ExperimentalAnimationApi
    @ExperimentalPagerApi
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContent {
            MaterialTheme {
                ScreenMain()
            }
        }
    }

    @Composable
    private fun ScreenMain() {
        val vm = getViewModel<MainVM>()
        val recordingState by vm.recordingState.observeAsState(RecordingState.IDLE)
        Box(
            modifier = Modifier
                .fillMaxSize()
                .padding(20.dp),
            contentAlignment = Alignment.BottomCenter
        ) {
            Button(
                modifier = Modifier
                    .fillMaxWidth(1f)
                    .height(50.dp),
                onClick = {
                    when (recordingState) {
                        RecordingState.IDLE -> vm.startRecording()
                        RecordingState.RECORDING -> vm.stopRecording()
                    }
                }
            ) {
                val textString = when (recordingState) {
                    RecordingState.RECORDING -> "Stop Recording"
                    RecordingState.IDLE -> "Start Recording"
                }
                Text(textString)
            }
        }
    }
}

Manifest:

<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="some.example.app">

    <queries>
        <intent>
            <action android:name="android.speech.RecognitionService" />
        </intent>
    </queries>

    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
    <uses-permission android:name="android.permission.ACCESS_WIFI_STATE" />
    <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
    <uses-permission android:name="android.permission.MANAGE_EXTERNAL_STORAGE" />
    <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
    <uses-permission android:name="android.permission.RECORD_AUDIO" />

    <application
        android:allowBackup="false"
        android:icon="@mipmap/ic_launcher"
        android:label="EXAMPLE"
        android:roundIcon="@mipmap/ic_launcher_round"
        android:supportsRtl="true"
        android:theme="@style/Theme.Example">
        <activity
            android:name=".MainActivity"
            android:exported="true">
            <intent-filter>
                <action android:name="android.intent.action.MAIN" />
                <category android:name="android.intent.category.LAUNCHER" />
            </intent-filter>
        </activity>
    </application>
</manifest>

I've filed an issue on Google's issue tracker (here: https://issuetracker.google.com/issues/197284982). If you've had any trouble with SpeechRecognizer in the past, I urge you to vote for/star it, and even leave a comment if you can. Thank you.

For the first issue, the documentation even says the following about all three constants:

Note that it is extremely rare you'd want to specify this value in an intent. If you don't have a very good reason to change these, you should leave them as they are. Note also that certain values may cause undesired or unexpected results - use judiciously! Additionally, depending on the recognizer implementation, these values may have no effect.
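Since the effect of these extras depends on the recognizer implementation, it can be worth checking which recognition services are actually installed. A small sketch (listRecognitionServices is my own helper name; this uses the pre-API-33 queryIntentServices overload):

import android.content.Context
import android.content.Intent
import android.speech.RecognitionService
import android.util.Log

// List the installed RecognitionService implementations, i.e. the candidates
// that decide whether these extras do anything at all.
fun listRecognitionServices(context: Context) {
    val intent = Intent(RecognitionService.SERVICE_INTERFACE)
    context.packageManager.queryIntentServices(intent, 0).forEach { info ->
        Log.d("SpeechServices", "${info.serviceInfo.packageName}/${info.serviceInfo.name}")
    }
}

If more than one service shows up, the two-argument SpeechRecognizer.createSpeechRecognizer(context, componentName) overload lets you bind to a specific one instead of the device default.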

The second issue is indeed annoying. As for the internet-connection part, I've never noticed it myself (though it may be OS-version and/or device specific). I suspect there is no way around it other than handling all the events and re-enabling the recognizer on error events.
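A minimal sketch of that restart-on-error idea, slotted into the question's initSpeechRecoginizer(). The guard on _recording is an assumption about when the app still wants to listen, and onEndOfSpeech deliberately no longer flips to IDLE, since the restart path would immediately contradict it:

// Inside MainVM.initSpeechRecoginizer(), replacing the setup call:
speechRecognizer = SpeechRecognitionUtil.setupSpeechRecognizer(
    getApplication(),
    onEndOfSpeech = { /* leave state alone; onResults/onError decide */ },
    onResults = { _recording.value = RecordingState.IDLE },
    onError = { code ->
        if (code == SpeechRecognizer.ERROR_NO_MATCH &&
            _recording.value == RecordingState.RECORDING
        ) {
            // Possibly spurious: rebuild the recognizer and keep listening.
            initSpeechRecoginizer()
            speechRecognizer?.startListening(SpeechRecognitionUtil.speechRecognizerIntent)
        } else {
            _recording.value = RecordingState.IDLE
        }
    }
)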

Latest update