Very high error after full integer quantization of a regression network



I have trained a fully connected neural network with one hidden layer of 64 nodes. I am testing it on the Medical Cost dataset. With the original-precision model, the mean absolute error is 0.22063259780406952. For the model quantized to float16, and for integer quantization with float fallback, the difference between the original error and the lower-precision error is never more than 0.1. However, if I perform full integer quantization, the error increases to an unreasonable amount; in this particular case it jumps to almost 60. I don't know whether this is a bug in TensorFlow, whether I am using the API incorrectly, or whether this is reasonable behavior after quantization. Any help is appreciated. The code for conversion and inference is shown below:

  • Preprocessing
import math
import pathlib
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
import pandas as pd
from sklearn import preprocessing as pr
from sklearn.metrics import mean_absolute_error
url = 'insurance.csv'
column_names = ["age", "sex", "bmi", "children", "smoker", "region", "charges"]
dataset = pd.read_csv(url, names=column_names, header=0, na_values='?')
dataset = dataset.dropna()  # Drop rows with missing values
dataset['sex'] = dataset['sex'].map({'female': 2, 'male': 1})
dataset['smoker'] = dataset['smoker'].map({'yes': 1, 'no': 0})
dataset = pd.get_dummies(dataset, prefix='', prefix_sep='', columns=['region'])
# this is a trick to convert a dataframe to 2d array, scale it and
# convert back to dataframe
scaled_np = pr.StandardScaler().fit_transform(dataset.values)
dataset = pd.DataFrame(scaled_np, index=dataset.index, columns=dataset.columns)
  • Train/test split
train_dataset = dataset.sample(frac=0.8, random_state=0)
test_dataset = dataset.drop(train_dataset.index)
train_features = train_dataset.copy()
test_features = test_dataset.copy()
train_labels = train_features.pop('charges')
test_labels = test_features.pop('charges')
  • Training the original model
def build_and_compile_model():
    model = keras.Sequential([
        layers.Dense(64,
                     activation='relu',
                     input_shape=(len(dataset.columns) - 1, )),
        layers.Dense(1)
    ])
    model.compile(loss='mean_absolute_error',
                  optimizer=tf.keras.optimizers.Adam(0.001))
    return model

dnn_model = build_and_compile_model()
dnn_model.summary()
dnn_model.fit(train_features,
              train_labels,
              validation_split=0.2,
              verbose=0,
              epochs=100)
print("Original error = {}".format(
    dnn_model.evaluate(test_features, test_labels, verbose=0)))
  • Converting to a lower-precision model
converter = tf.lite.TFLiteConverter.from_keras_model(dnn_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_data_gen():
    for input_value in tf.data.Dataset.from_tensor_slices(
            train_features.astype('float32')).batch(1).take(100):
        yield [input_value]

converter.representative_dataset = representative_data_gen
# Full integer quantization
# Ensure that if any ops can't be quantized, the converter throws an error
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Set the input and output tensors to uint8 (APIs added in r2.3)
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model_quant = converter.convert()

dir_save = pathlib.Path(".")
file_save = dir_save / "model_16.tflite"
file_save.write_bytes(tflite_model_quant)
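For reference, the float16 and integer-with-float-fallback conversions mentioned above are not shown; as far as the standard TFLiteConverter API goes, they would be sketched like this (a sketch, not copied from my actual runs):

# float16 quantization: weights stored as float16, float ops kept
converter = tf.lite.TFLiteConverter.from_keras_model(dnn_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_fp16 = converter.convert()

# Integer quantization with float fallback: int8 where possible,
# float kernels where no int8 kernel exists; I/O stays float32
converter = tf.lite.TFLiteConverter.from_keras_model(dnn_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
tflite_int8_fallback = converter.convert()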
  • Instantiating the TFLite model
interpreter = tf.lite.Interpreter(model_path=str(file_save))
interpreter.allocate_tensors()
  • Evaluating the lower-precision model
def evaluate_model(interpreter, test_images, test_labels):
    input_details = interpreter.get_input_details()[0]
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]
    # Run predictions on every image in the "test" dataset.
    prediction_digits = []
    for test_image in test_images:
        if input_details['dtype'] == np.uint8:
            input_scale, input_zero_point = input_details['quantization']
            test_image = test_image / input_scale + input_zero_point
        test_image = np.expand_dims(test_image,
                                    axis=0).astype(input_details['dtype'])
        interpreter.set_tensor(input_index, test_image)
        # Run inference.
        interpreter.invoke()
        output = interpreter.get_tensor(output_index)
        prediction_digits.append(output[0])

    filtered_labels, correct_digits = map(
        list,
        zip(*[(x, y) for x, y in zip(test_labels, prediction_digits)
              if not math.isnan(y)]))
    return mean_absolute_error(filtered_labels, correct_digits)

print(evaluate_model(interpreter, test_features[:].values, test_labels))

When doing quantization (and machine learning in general), you need to be careful about what your data looks like. Does it make sense to apply a given level of quantization to the data you have?

In the case of a regression problem like yours, where the ground truth lies in the range [1121.8739; 63770.42801] and some of the input data is also floating point, training the model on that data and then quantizing it to integers is unlikely to produce good results.

You trained the model to output values in the range [1121.8739; 63770.42801]; after quantization to int8, it can only output values in the range [-128; 127], and no decimals. Obviously, when you compare the quantized model's results with the ground truth, the error shoots up dramatically.
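To make the loss of resolution concrete, here is a small sketch (the quantization parameters below are illustrative, not the ones the converter actually chose) showing that an int8 tensor can take at most 256 distinct real values, so spreading them over the label range leaves gaps of roughly 245 between neighboring representable outputs:

import numpy as np

# Label range from the ground truth above
min_val, max_val = 1121.8739, 63770.42801

# An int8 tensor carries at most 2**8 = 256 distinct codes. Under TFLite's
# affine scheme, real_value = scale * (int8_code - zero_point), so the codes
# map to 256 evenly spaced real values. Spanning the whole label range:
scale = (max_val - min_val) / 255
representable = min_val + np.arange(256) * scale

print(len(representable))  # 256 representable outputs
print(scale)               # ~245.7: gap between neighboring outputs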

If you still want to apply quantization, what can you do? You need to move your data into the domain of the quantized set. In your case, that means converting the float32 data to int8 in a way that still makes sense. You will see a big drop in performance in your real use case: after all, for a regression problem you move from a domain with roughly 25 million possible output values (assuming a 23-bit mantissa and an 8-bit exponent; see "Single-precision floating-point format" and "How many floating point numbers are in the interval [0,1]?") to a domain with 256 (2^8) possible outputs.

But a really, really naive approach could be to apply the following transformation:

def scale_down_data(data):
    max_value = data.max()
    min_value = data.min()
    # normalizing between -128 and 127
    scaled_down = 255 * ((data - min_value) / (max_value - min_value)) - 128
    return scaled_down.astype(np.int8)
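To read the quantized model's predictions back on the original scale, you would also need the inverse mapping; a hypothetical helper (scale_up_data is not part of the original answer, and it assumes you kept the min and max used above) could look like:

def scale_up_data(scaled, min_value, max_value):
    # Inverse of scale_down_data: map int8 codes in [-128, 127]
    # back to the original [min_value, max_value] range
    ratio = (scaled.astype(np.float32) + 128) / 255
    return ratio * (max_value - min_value) + min_value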

In practice, it is better to look at the distribution of your data and use a transformation that gives more range where the data is dense. You also don't want to restrict the range of the regression to the range of the training set. And you need to do this analysis for every input or output that does not live in the quantized domain.
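As a concrete illustration of that last point, the quantization parameters of each tensor are exposed by the TFLite interpreter, so a sketch of mapping a raw integer output back to the model's output scale (to be run after interpreter.invoke(), assuming the conversion from the question) could be:

output_details = interpreter.get_output_details()[0]
output_scale, output_zero_point = output_details['quantization']

raw = interpreter.get_tensor(output_details['index'])
if output_details['dtype'] == np.uint8:
    # Undo the affine quantization: real = scale * (code - zero_point)
    prediction = output_scale * (raw.astype(np.float32) - output_zero_point)
else:
    prediction = raw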
