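I am using OpenCV 3.1 with Intel TBB enabled, built with Visual Studio 2015 Update 3 on Windows 10. The first transpose takes about 100 ms, while each subsequent transpose takes only 0.02-0.05 ms. Does anyone know why the first transpose of a 1x1 matrix takes so long?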
double ts = time_measure("start", 0);
Mat_<uchar> A = (Mat_<uchar>(1, 1) << 1);
Mat at = A.t();
cout << "transpose Times needed : " << time_measure("end", ts) * 1000 << " ms " << endl;
for (int i = 0; i < 10; i++) {
    ts = time_measure("start", 0);
    Mat_<uchar> B = (Mat_<uchar>(1, 1) << 1);
    Mat bt = B.t();
    cout << "transpose Times needed : " << time_measure("end", ts) * 1000 << " ms " << endl;
}

double time_measure(const string mode, double ts) {
    double t = 0.0;
    if (mode == "start") {
        t = (double)getTickCount();
    }
    else {
        t = ((double)getTickCount() - ts) / getTickFrequency();
    }
    return t;
}
The output
transpose A Times needed : 112.062 ms
transpose B Times needed : 0.0337221 ms
transpose B Times needed : 0.0205265 ms
transpose B Times needed : 0.0195491 ms
transpose B Times needed : 0.0283461 ms
transpose B Times needed : 0.0234589 ms
transpose B Times needed : 0.0298123 ms
transpose B Times needed : 0.0249251 ms
transpose B Times needed : 0.0283461 ms
transpose B Times needed : 0.0273687 ms
transpose B Times needed : 0.02688 ms
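I don't have TBB enabled, but the way you measure performance looks problematic:
- Don't include the time needed to create the matrix.
- Use a matrix that is large enough. Transposing a 1x1 matrix is meaningless anyway.
- Don't use strings where a boolean will do.
You can try something like this, and then please tell me your execution times: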
double time_measure(bool start, double ts) {
    double t = 0.0;
    if (start) {
        t = (double)getTickCount();
    }
    else {
        t = ((double)getTickCount() - ts) / getTickFrequency();
    }
    return t;
}

int main()
{
    for (int i = 0; i < 10; i++) {
        // 1000 x 1000 random matrix
        Mat_<uchar> B(1000, 1000);
        randu(B, 0, 256);
        double ts = time_measure(true, 0);
        Mat bt = B.t();
        cout << "transpose Times needed : " << time_measure(false, ts) * 1000 << " ms " << endl;
    }
    getchar();
    return 0;
}
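The same measurement pattern can also be sketched without OpenCV, which may help when checking whether the timing code itself is at fault. This is only a minimal sketch under stated assumptions: `time_ms`, `transpose_bytes` and `run_benchmark` are hypothetical names, `std::chrono::steady_clock` stands in for `getTickCount()`, and the naive loop is a stand-in for `Mat::t()`, not OpenCV's implementation.

```cpp
#include <chrono>
#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical helper (not an OpenCV function): run `op` once and
// return the elapsed wall-clock time in milliseconds.
template <typename Op>
double time_ms(Op&& op) {
    const auto start = std::chrono::steady_clock::now();
    op();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Naive out-of-place transpose of a rows x cols row-major byte matrix.
// Stand-in for Mat::t(); the result is cols x rows, row-major.
std::vector<unsigned char> transpose_bytes(const std::vector<unsigned char>& src,
                                           std::size_t rows, std::size_t cols) {
    std::vector<unsigned char> dst(src.size());
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            dst[c * rows + r] = src[r * cols + c];
    return dst;
}

// Builds the input outside the timed region, then times only the transpose.
void run_benchmark(std::size_t rows, std::size_t cols, int repeats) {
    std::vector<unsigned char> src(rows * cols);
    for (std::size_t i = 0; i < src.size(); ++i)
        src[i] = static_cast<unsigned char>(i % 251);  // deterministic filler

    for (int i = 0; i < repeats; ++i) {
        std::vector<unsigned char> dst;
        const double ms = time_ms([&] { dst = transpose_bytes(src, rows, cols); });
        // Printing dst.size() keeps the result in use.
        std::cout << "run " << i << ": " << ms << " ms (dst size "
                  << dst.size() << ")\n";
    }
}
```

Calling `run_benchmark(1000, 1000, 10)` from `main` mirrors the 1000 x 1000 loop above. Taking a callable instead of a start/stop flag keeps the timed region limited to exactly one operation.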
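Thanks for your comment. I tried your code: the matrix creation time is roughly constant, and the first transpose still takes much longer than the others. The test results are below. I modified the code to print both the matrix creation time and the transpose time.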
// Forward declaration so that _tmain can call time_measure, which is defined below.
double time_measure(bool start, double ts);

int _tmain(int argc, _TCHAR* argv[]) {
    for (int i = 0; i < 10; i++) {
        double ts = time_measure(true, 0);
        // 1000 x 1000 random matrix
        Mat_<uchar> B(1000, 1000);
        randu(B, 0, 256);
        cout << "create matrix Times needed : " << time_measure(false, ts) * 1000 << " ms " << endl;
        ts = time_measure(true, 0);
        Mat bt = B.t();
        cout << "transpose Times needed : " << time_measure(false, ts) * 1000 << " ms " << endl;
    }
}

double time_measure(bool start, double ts) {
    double t = 0.0;
    if (start) {
        t = (double)getTickCount();
    }
    else {
        t = ((double)getTickCount() - ts) / getTickFrequency();
    }
    return t;
}
The output is as follows:
create matrix Times needed : 49.3267 ms
transpose Times needed : 427.299 ms
create matrix Times needed : 51.8431 ms
transpose Times needed : 0.889971 ms
create matrix Times needed : 51.8084 ms
transpose Times needed : 0.718917 ms
create matrix Times needed : 52.4946 ms
transpose Times needed : 0.742376 ms
create matrix Times needed : 45.5454 ms
transpose Times needed : 0.705721 ms
create matrix Times needed : 45.218 ms
transpose Times needed : 0.70621 ms
create matrix Times needed : 44.5748 ms
transpose Times needed : 0.713541 ms
create matrix Times needed : 46.2501 ms
transpose Times needed : 0.68715 ms
create matrix Times needed : 45.153 ms
transpose Times needed : 0.663691 ms
create matrix Times needed : 44.1892 ms
transpose Times needed : 0.584028 ms