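For testing machine learning algorithms / repositories, I see three things that matter:
1. Does it crash?
2. Does it reach a minimum test accuracy?
3. Is it fast enough?
While (1) and maybe (2) are standard unit testing, I'm not quite sure how to deal with (3). Can I test this with pytest / tox?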
I found pytest-benchmark, but how would I do this for, e.g., lidtk?
In pseudocode, I would like to do the following:
    import unittest


    def classifier_predict(input_features):
        # do something smart, but maybe too time-consuming
        return result


    def input_generator():
        # Generate something random which classifier_predict
        # can work on - don't measure this time!
        return input_features


    class Agents(unittest.TestCase):
        def test_classifier_predict(self):
            # assertMaxTime is the assertion I wish existed
            self.assertMaxTime(classifier_predict,
                               input_generator,
                               max_time_in_ms=100)
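One way the wished-for `assertMaxTime` could be realized with only the standard library is a small mixin. This is a sketch, not an existing `unittest` API: `MaxTimeMixin`, `assertMaxTime`, the `nb_runs` parameter, and the toy bodies of `classifier_predict` and `input_generator` are all assumptions made for illustration.

```python
import random
import time
import unittest


class MaxTimeMixin:
    """Hypothetical helper; unittest itself has no assertMaxTime."""

    def assertMaxTime(self, func, input_generator, max_time_in_ms, nb_runs=100):
        total = 0.0  # accumulated seconds spent inside func only
        for _ in range(nb_runs):
            input_ = input_generator()  # generated outside the timed region
            t0 = time.perf_counter()
            func(input_)
            total += time.perf_counter() - t0
        mean_ms = total / nb_runs * 1000.0
        self.assertLessEqual(
            mean_ms,
            max_time_in_ms,
            f"mean runtime {mean_ms:.3f} ms exceeds {max_time_in_ms} ms",
        )
        return mean_ms


def classifier_predict(input_features):
    # Placeholder "model" standing in for the real prediction code.
    return sum(input_features) > 0


def input_generator():
    # Placeholder input: 100 random features.
    return [random.random() for _ in range(100)]


class Agents(MaxTimeMixin, unittest.TestCase):
    def test_classifier_predict(self):
        self.assertMaxTime(classifier_predict, input_generator, max_time_in_ms=100)
```

Because `Agents` is an ordinary `unittest.TestCase`, pytest should collect it without extra configuration. The check is on the mean runtime; a stricter variant could assert on the slowest single run instead.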
Handmade solution
Here is pseudocode for a rather handmade solution:
    import time
    import unittest


    def classifier_predict(input_features):
        # do something smart, but maybe too time-consuming
        return result


    def input_generator():
        # Generate something random which classifier_predict
        # can work on - don't measure this time!
        return input_features


    class Agents(unittest.TestCase):
        def test_classifier_predict(self):
            nb_tests = 1000
            total_time = 0.0  # accumulated seconds
            for _ in range(nb_tests):
                input_ = input_generator()  # not timed
                t0 = time.perf_counter()
                classifier_predict(input_)
                t1 = time.perf_counter()
                total_time += t1 - t0
            # convert the mean from seconds to milliseconds before comparing
            mean_time_in_ms = total_time / nb_tests * 1000
            self.assertLessEqual(mean_time_in_ms, 100)
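Drawbacks
- No nice graphs (like pytest-benchmark seems to generate)
- In general, hard limits are difficult to set, because hardware differs and so does the outside workload on the machine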
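The second drawback can be softened, though not removed, by timing a reference workload on the same machine in the same run and asserting on the ratio, and by using the median so that a few slow outliers caused by outside load matter less. A minimal sketch, assuming the hypothetical names `median_time`, `reference_workload`, and `relative_cost`; the reference workload here is plain-Python CPU work, so it may not track a classifier that is bound by NumPy, I/O, or a GPU.

```python
import statistics
import time


def median_time(func, make_input, nb_runs=50):
    """Median wall-clock seconds of func(make_input()), excluding input creation."""
    samples = []
    for _ in range(nb_runs):
        input_ = make_input()  # not timed
        t0 = time.perf_counter()
        func(input_)
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples)


def reference_workload(_):
    # Hypothetical fixed workload used as a machine-speed yardstick.
    return sum(i * i for i in range(10_000))


def relative_cost(func, make_input, nb_runs=50):
    """How many times slower func is than the reference workload."""
    baseline = median_time(reference_workload, lambda: None, nb_runs)
    return median_time(func, make_input, nb_runs) / baseline
```

A test could then assert something like `relative_cost(classifier_predict, input_generator) < 50`, where the limit of 50 is purely illustrative and would have to be chosen empirically.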