TensorFlow: How to use a speech recognition model trained in Python from Java



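I have a TensorFlow model that I trained in Python by following this article. After training, I generated a frozen graph. Now I need to use this graph to run recognition in a Java-based application. For that I have been looking at the following example, but I don't understand how to collect my output. I know that I need to provide 3 inputs to the graph.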

I have read through the Python code from the example given in the official tutorial:

def run_graph(wav_data, labels, input_layer_name, output_layer_name,
              num_top_predictions):
    """Runs the audio data through the graph and prints predictions."""
    with tf.Session() as sess:
        # Feed the audio data as input to the graph.
        #   predictions  will contain a two-dimensional array, where one
        #   dimension represents the input image count, and the other has
        #   predictions per class
        softmax_tensor = sess.graph.get_tensor_by_name(output_layer_name)
        predictions, = sess.run(softmax_tensor, {input_layer_name: wav_data})

        # Sort to show labels in order of confidence
        top_k = predictions.argsort()[-num_top_predictions:][::-1]
        for node_id in top_k:
            human_string = labels[node_id]
            score = predictions[node_id]
            print('%s (score = %.5f)' % (human_string, score))

        return 0

Can someone help me understand the TensorFlow Java API?

A literal translation of the Python code listed above would look something like this:

import org.tensorflow.Session;
import org.tensorflow.Tensor;
import org.tensorflow.Tensors;

public static float[][] getPredictions(Session sess, byte[] wavData, String inputLayerName, String outputLayerName) {
    // Both tensors are closed automatically when the try block exits.
    try (Tensor<String> wavDataTensor = Tensors.create(wavData);
         Tensor<Float> predictionsTensor = sess.runner()
             .feed(inputLayerName, wavDataTensor)
             .fetch(outputLayerName)
             .run()
             .get(0)
             .expect(Float.class)) {
        // Tensor.shape() returns a long[], so index it to size the output array.
        long[] shape = predictionsTensor.shape();
        float[][] predictions = new float[(int) shape[0]][(int) shape[1]];
        predictionsTensor.copyTo(predictions);
        return predictions;
    }
}

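The returned predictions array will hold the "confidence" value of each prediction, and you will have to run your own logic to compute the "top K" over it, similar to how the Python code uses numpy (.argsort()) to operate on what sess.run() returned.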
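As a rough illustration of that "top K" step, here is a minimal sketch in plain Java with no TensorFlow dependency. The class name TopKPredictions, the topK helper, and the labels and scores in main are all hypothetical and only there to make the example self-contained; in practice you would pass in one row of the array returned by getPredictions together with your own labels list.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of the TensorFlow Java API.
final class TopKPredictions {

    /** Returns the indices of the k largest scores, highest score first. */
    static int[] topK(float[] scores, int k) {
        Integer[] indices = new Integer[scores.length];
        for (int i = 0; i < indices.length; i++) {
            indices[i] = i;
        }
        // Sort the indices by descending score.
        Arrays.sort(indices, (a, b) -> Float.compare(scores[b], scores[a]));

        // Keep at most k indices, in case k exceeds the number of classes.
        int n = Math.min(k, indices.length);
        int[] top = new int[n];
        for (int i = 0; i < n; i++) {
            top[i] = indices[i];
        }
        return top;
    }

    public static void main(String[] args) {
        // Made-up labels and scores, standing in for the real model output.
        String[] labels = {"_silence_", "_unknown_", "yes", "no"};
        float[][] predictions = {{0.05f, 0.10f, 0.70f, 0.15f}};

        // predictions[0] is the single row produced for one audio clip.
        for (int nodeId : topK(predictions[0], 3)) {
            System.out.printf("%s (score = %.5f)%n", labels[nodeId], predictions[0][nodeId]);
        }
    }
}
```

Sorting an index array rather than the scores themselves keeps each score tied to its label position, which is the same idea as argsort() in the Python version.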

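From a cursory reading of the tutorial page and code, it seems that predictions will have 1 row and 12 columns (one for each hot word). I got this from the following Python code: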

import tensorflow as tf

graph_def = tf.GraphDef()
with open('/tmp/my_frozen_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

output_layer_name = 'labels_softmax:0'
tf.import_graph_def(graph_def, name='')
print(tf.get_default_graph().get_tensor_by_name(output_layer_name).shape)

Hope that helps.
