Incorrect results when applying a function to a pandas Series

When I apply two versions of the same function (pure Python vs. Cython) to a pandas Series, I get two different results:

import numpy as np
import pandas as pd
from libc.math cimport lround

# Pure-Python version: the argument is a Python object
def py_func(n):
    return round(13 * n / 37)

# Cython version: the argument is a C int, rounded with lround from libc
cdef int cy_func(int n):
    return lround(13 * n / 37)

arr = np.arange(1000, 12000, 2000)
series = pd.Series(arr)
print("Original series:")
print(series)

# Apply both versions element-wise
series1 = series.apply(py_func)
series2 = series.apply(cy_func)
print("\nApplied with python function:")
print(series1)
print("\nApplied with cython function:")
print(series2)

The result is:

Original series:
0     1000
1     3000
2     5000
3     7000
4     9000
5    11000
dtype: int32
Applied with python function:
0     351
1    1054
2    1757
3     688
4    1391
5     322
dtype: int64
Applied with cython function:
0     351
1    1054
2    1757
3    2459
4    3162
5    3865
dtype: int64

As we can see, applying the Python function gives incorrect results for the last three numbers, while the Cython function produces the correct results.

Why does the Python function produce incorrect results, and how can I fix it?

Update

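The results above were obtained on 64-bit Windows 10. However, when I tried 64-bit Ubuntu 18.04 (via WSL) on the same machine, both series had the correct results. In both cases I had Cython==0.29, numpy==1.15.2, pandas==0.23.4, and I also tested with Cython==0.28.5, numpy==1.14.5, pandas==0.23.3.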

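There is one more difference in the results: on Windows 10 the dtype of the original series is int32, whereas on Ubuntu 18.04 it is int64. The dtype of both result series is int64 on both operating systems.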

When compiling with MinGW, MS_WIN64 is not defined. That macro determines how SIZEOF_VOID_P is defined in pyconfig.h, as shown below:

#if defined(MS_WIN64)
...
#   define SIZEOF_VOID_P 8
...
#elif defined(MS_WIN32)
...
#   define SIZEOF_VOID_P 4
...
#endif

SIZEOF_VOID_P is then used in pyport.h:

#ifndef PYLONG_BITS_IN_DIGIT
#if SIZEOF_VOID_P >= 8
#define PYLONG_BITS_IN_DIGIT 30
#else
#define PYLONG_BITS_IN_DIGIT 15
#endif
#endif
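
A quick arithmetic check suggests that a digit-size mismatch is consistent with the output in the question. With PYLONG_BITS_IN_DIGIT set to 15, CPython's digit type is a 16-bit unsigned short, while a 64-bit interpreter stores 30-bit digits. The sketch below assumes (the source does not confirm the exact code path) that the MinGW-built module ends up seeing only the low 16 bits of the product 13 * n. The helper name low16_result is made up for illustration.

```python
import sys

# Digit size used by the running interpreter (30 on typical 64-bit builds).
print("bits per digit:", sys.int_info.bits_per_digit)

def low16_result(n):
    """Simulate reading only the low 16 bits of 13 * n (an assumed failure mode)."""
    truncated = (13 * n) & 0xFFFF
    return round(truncated / 37)

inputs = [1000, 3000, 5000, 7000, 9000, 11000]

# Values reported in the question for the Python function on Windows.
observed = [351, 1054, 1757, 688, 1391, 322]

# What the function should return, and what the truncation model predicts.
expected = [round(13 * n / 37) for n in inputs]
simulated = [low16_result(n) for n in inputs]

print("expected: ", expected)
print("simulated:", simulated)
print("matches observed output:", simulated == observed)
```

The first three products (13000, 39000, 65000) fit in 16 bits, which would explain why only the last three values come out wrong.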

The solution is to pass -DMS_WIN64 when compiling with MinGW.
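
One way to pass the flag from a build script is the define_macros argument of a setuptools Extension, where a (name, None) tuple becomes -Dname on the compiler command line. The following is a minimal sketch, not the answer author's own setup: the helper mingw_define_macros and the module name "mymodule" are hypothetical, and the helper is only meant for MinGW builds, since MSVC builds already get MS_WIN64 from pyconfig.h.

```python
import struct
import sys

def mingw_define_macros(platform=sys.platform,
                        pointer_bits=struct.calcsize("P") * 8):
    """Return define_macros entries for a MinGW build (hypothetical helper).

    On 64-bit Windows, add MS_WIN64 so that pyconfig.h sets SIZEOF_VOID_P
    to 8 and pyport.h selects 30-bit digits. Elsewhere, add nothing.
    """
    if platform == "win32" and pointer_bits == 64:
        return [("MS_WIN64", None)]  # equivalent to -DMS_WIN64
    return []

# Illustrative use in a setup.py (names are assumptions):
#
#   from setuptools import Extension, setup
#   from Cython.Build import cythonize
#
#   ext = Extension("mymodule", ["mymodule.pyx"],
#                   define_macros=mingw_define_macros())
#   setup(ext_modules=cythonize([ext]))

print(mingw_define_macros())
```

The platform and pointer size are parameters so the decision can be checked on any machine without actually building an extension.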

See the related Python ticket and the discussion in the cython-users group.
