Cython ValueError: Buffer dtype mismatch, expected 'int' but got 'long long'



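I have just started learning Cython to make my code faster, and I keep getting the same error: ValueError: Buffer dtype mismatch, expected 'int' but got 'long long'. I have tried changing the types, but the error persists.

I think it is related to the Volume variable, but I can't fix it. Do you know how to solve this? I have included the code and the dataset.

I'm using Python 2.7 on Windows 8.1, 64-bit.

Thanks for your help.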

Here is my main code:

import pandas as pd
import pyximport; pyximport.install()
import BinData2
df1 = pd.read_table('SOdata.txt', sep=",", header = None)
df1.columns = ['Date', 'Time', 'Price', 'Volume' ]
bd2 = BinData2.BinData(df1,1000) 

Here is the function that raises the error, in BinData2.py:

def BinData(data, binSize):
    import numpy as np
    import pandas as pd
    volume = data['Volume'].values
    binIdxVector = np.zeros(len(volume))
    cdef int i = 0
    cdef int binIdx = 1
    cdef int totalVolume = 0
    cdef int [::1] Volume = volume
    cdef int[::1] binIdxVec = binIdxVector
    for i in range(len(Volume)):
        totalVolume = totalVolume + Volume[i]
        if totalVolume <= binSize:
            binIdxVec[i] = binIdx
        else:
            binIdx = binIdx + 1
            binIdxVec[i] = binIdx
            totalVolume = Volume[i]
    binIdxVec = pd.Series(binIdxVec)
    return binIdxVec

Here is the dataset:

02/07/2014,09:30:01,3,500
02/07/2014,09:30:29,3,42
02/07/2014,09:35:56,3,100
02/07/2014,09:37:17,3,100
02/07/2014,09:37:28,3.2,900
02/07/2014,09:37:35,3.2,4900
02/07/2014,09:37:51,3.2,1000
02/07/2014,09:42:11,3.2,500
02/07/2014,10:00:31,3,2400
02/07/2014,10:00:37,3.2,500
02/07/2014,10:00:44,3.2,3347
02/07/2014,10:07:33,3.2,1000
02/07/2014,10:31:42,3.24,1000
02/07/2014,10:33:44,3.24,200
02/07/2014,10:40:28,3.25,300
02/07/2014,10:49:57,3.25,600
02/07/2014,10:53:16,3.25,100
02/07/2014,10:53:32,3.4,1000
02/07/2014,10:54:13,3.4,500
02/07/2014,11:05:37,3.35,1000
02/07/2014,11:11:29,3.25,600
02/07/2014,11:15:26,3.3,60
02/07/2014,11:19:16,3.3,23
02/07/2014,11:21:14,3.25,100
02/07/2014,11:21:22,3.25,100
02/07/2014,11:21:30,3.2,500
02/07/2014,11:21:35,3.2,500
02/07/2014,11:21:43,3.2,500
02/07/2014,11:29:58,3.1,200
02/07/2014,11:35:42,3.19,360
02/07/2014,11:39:51,3.19,1000
02/07/2014,11:52:39,3.15,200
02/07/2014,11:53:51,3.15,100
02/07/2014,11:55:11,3.2,100
02/07/2014,12:17:32,3.2,1500
02/07/2014,12:35:42,3.24,1200
02/07/2014,12:37:53,3.24,100
02/07/2014,12:38:02,3.24,3500
02/07/2014,12:53:57,3.24,400
02/07/2014,13:10:57,3.239,100
02/07/2014,13:11:35,3.24,800
02/07/2014,13:13:41,3.24,1000
02/07/2014,13:39:40,3.24,450
02/07/2014,13:56:04,3.24,500
02/07/2014,14:09:49,3.24,600
02/07/2014,14:11:25,3.24,1000
02/07/2014,14:25:53,3.24,25
02/07/2014,14:30:58,3.24,30
02/07/2014,14:31:36,3.24,30
02/07/2014,14:32:12,3.24,30
02/07/2014,14:33:00,3.24,100
02/07/2014,14:34:49,3.24,1100
02/07/2014,14:36:02,3.24,2000
02/07/2014,14:37:07,3.22,1500
02/07/2014,14:42:30,3.22,3300
02/07/2014,14:42:46,3.22,100
02/07/2014,14:42:54,3.2,1000
02/07/2014,14:53:13,3.23,240
02/07/2014,14:53:27,3.24,500
02/07/2014,14:53:59,3.24,60
02/07/2014,14:54:46,3.2,1500
02/07/2014,14:57:45,3.2,160
02/07/2014,14:57:46,3.2,125
02/07/2014,14:57:54,3.2,100
02/07/2014,15:05:56,3.19,100
02/07/2014,15:22:21,3.19,300
02/07/2014,15:22:28,3.18,150
02/07/2014,15:23:09,3.19,2000
02/07/2014,15:35:23,3.18,1500
02/07/2014,15:44:36,3.18,600
02/10/2014,09:30:02,3.25,100
02/10/2014,09:30:02,3.25,25
02/10/2014,09:30:24,3.25,150
02/10/2014,09:30:40,3.25,100
02/10/2014,09:31:11,3.25,650
02/10/2014,09:35:32,3.24,200
02/10/2014,09:37:59,3.19,100
02/10/2014,09:38:01,3.2,2000
02/10/2014,09:38:09,3.18,185
02/10/2014,09:38:36,3.18,500
02/10/2014,09:39:13,3.18,1042
02/10/2014,09:39:18,3.18,156
02/10/2014,09:39:18,3.17,20
02/10/2014,09:41:24,3.15,100
02/10/2014,09:42:28,3.15,1000
02/10/2014,09:42:28,3.15,1000
02/10/2014,09:42:41,3.15,500
02/10/2014,09:42:57,3.15,100
02/10/2014,09:43:24,3.12,500
02/10/2014,09:43:29,3.12,100
02/10/2014,09:43:32,3.1,5000
02/10/2014,09:44:02,3.1,500
02/10/2014,09:44:19,3.1,500
02/10/2014,09:44:22,3.09,100
02/10/2014,09:44:22,3.09,96
02/10/2014,09:44:55,3.05,100
02/10/2014,09:45:11,3.05,676
02/10/2014,09:45:23,3,150
02/10/2014,09:45:44,2.95,1000
02/10/2014,09:45:53,2.95,1500
02/10/2014,09:47:17,2.95,100
02/10/2014,09:47:46,2.9,100
02/10/2014,09:48:24,2.9,500
02/10/2014,09:48:50,2.9,100
02/10/2014,09:49:11,2.85,386
02/10/2014,09:49:13,2.85,100
02/10/2014,09:49:14,2.8,200
02/10/2014,09:49:15,2.7,100
02/10/2014,09:49:22,2.7,100
02/10/2014,09:49:32,2.7,100
02/10/2014,09:50:09,2.65,2500
02/10/2014,09:50:44,2.66,2500
02/10/2014,09:50:49,2.6,100
02/10/2014,09:50:53,2.7,240
02/10/2014,09:50:54,2.61,1000
02/10/2014,09:50:58,2.65,414
02/10/2014,09:55:24,2.95,100
02/10/2014,09:57:22,2.95,400
02/10/2014,10:07:21,2.95,400
02/10/2014,10:16:28,2.95,250
02/10/2014,10:21:20,2.85,300
02/10/2014,10:32:40,2.94,100
02/10/2014,10:33:18,2.95,426
02/10/2014,10:33:38,2.95,70
02/10/2014,10:33:39,2.94,1900
02/10/2014,10:43:46,2.95,4500
02/10/2014,10:44:00,2.99,200
02/10/2014,10:44:20,2.99,505
02/10/2014,10:49:30,2.96,500
02/10/2014,10:57:22,2.95,2500
02/10/2014,10:57:25,2.95,500
02/10/2014,10:57:40,2.95,500
02/10/2014,11:38:29,3,500
02/10/2014,11:38:35,3.05,500
02/10/2014,11:38:45,3.1,1000
02/10/2014,11:45:08,3.05,100
02/10/2014,11:49:55,3.01,100
02/10/2014,11:50:14,3,1900
02/10/2014,11:50:18,3,100
02/10/2014,12:07:51,3,1000
02/10/2014,12:33:26,3,400
02/10/2014,13:57:20,3.1,150
02/10/2014,13:57:34,3,42
02/10/2014,14:21:42,3.15,500
02/10/2014,14:23:35,3.15,1000
02/10/2014,14:25:40,3.05,200
02/10/2014,14:26:01,3.15,100
02/10/2014,14:50:50,3.15,100
02/10/2014,14:51:00,3.1,100
02/10/2014,14:51:09,3.1,100

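As the error message indicates, the values in the DataFrame are of type int64 (C long long), but you declared the memoryview with element type int.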
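To see why the two types cannot share a buffer, it can help to compare their sizes. The following is a minimal sketch using only the standard library's struct module; the variable names are illustrative and not part of the original code. On common platforms a C int is 4 bytes while a C long long is 8 bytes, so an int memoryview cannot be laid over int64 data.

```python
import struct

# Native sizes, in bytes, of the two C types named in the error message.
int_size = struct.calcsize("i")        # C int
long_long_size = struct.calcsize("q")  # C long long, which backs int64 here

print("int:", int_size, "bytes")
print("long long:", long_long_size, "bytes")
```

Because the element sizes differ, Cython rejects the buffer instead of reinterpreting the bytes.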

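Also, np.zeros returns a float64 array by default, and because binIdxVec is declared as a typed memoryview you can't reassign it to a pandas Series, so a working version might look like this.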

In [201]: %%cython
     ...: def BinData(data, binSize):
     ...:     import numpy as np
     ...:     import pandas as pd
     ...: 
     ...:     volume = data['Volume'].values
     ...:     binIdxVector = np.zeros(len(volume), dtype='int64')
     ...: 
     ...:     cdef int i = 0
     ...:     cdef int binIdx = 1
     ...:     cdef int totalVolume = 0
     ...:     cdef long long [::1] Volume = volume
     ...:     cdef long long [::1] binIdxVec = binIdxVector
     ...: 
     ...:     for i in range(len(Volume)):
     ...: 
     ...:         totalVolume = totalVolume + Volume[i]
     ...: 
     ...:         if totalVolume <= binSize:
     ...:             binIdxVec[i] = binIdx
     ...: 
     ...:         else:
     ...:             binIdx = binIdx + 1
     ...:             binIdxVec[i] = binIdx
     ...:             totalVolume = Volume[i]
     ...: 
     ...:     binIdxVecS = pd.Series(np.asarray(binIdxVec))
     ...:     return binIdxVecS
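As a sanity check on the result, the same binning loop can be written in plain Python and compared against the Cython output. This is only a sketch: bin_data_reference is a hypothetical helper name, not part of the original code, and it takes a plain list of volumes rather than a DataFrame.

```python
def bin_data_reference(volumes, bin_size):
    """Pure-Python version of the binning loop (hypothetical helper, for checking only)."""
    bin_idx = 1
    total_volume = 0
    bin_indices = []
    for v in volumes:
        total_volume += v
        if total_volume > bin_size:
            # Start a new bin; the current trade becomes its first entry.
            bin_idx += 1
            total_volume = v
        bin_indices.append(bin_idx)
    return bin_indices


# Volume column of the first eight rows of the dataset in the question.
first_volumes = [500, 42, 100, 100, 900, 4900, 1000, 500]
print(bin_data_reference(first_volumes, 1000))  # [1, 1, 1, 1, 2, 3, 4, 5]
```

If the Cython version is working, the first eight values of the Series returned by BinData(df1, 1000) should match this list.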
