Python / NumPy / pandas: fastest way to apply an algorithm for an expanding calculation

Suppose I have a time series from 1/1/2000 to 1/1/2011, with a floating-point value for each date. The data is in a pandas DataFrame.

I want to perform some calculations. Let N be the number of data points and i the index of the current data point. Pseudocode:

for i in range(N):
        some_calc(V[0:i]) + some_calc(V[i:N])

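I can implement this calculation easily, but I expect performance problems on large data sets. I think part of the cause is that the data container is a DataFrame: slicing creates a new Series, and more slicing happens inside some_calc.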

What is an efficient way to do something like this? Can I avoid the loop by using NumPy?

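Code of the following form can improve performance. It visits each item once and appends the result, instead of re-slicing the data at every step: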

result = []
for item in item_list:
    new_item = do_something_with(item)
    result.append(new_item)

See the following example:

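# Running maximum: the largest value seen up to and including the current item
a = [3, 4, 6, 2, 1, 9, 0, 7, 5, 8]
results = []
current_max = float("-inf")  # start below any value so negative inputs also work
for i in a:
    # Carry the previous maximum forward instead of rescanning a[0:i]
    current_max = max(i, current_max)
    results.append(current_max)
# results == [3, 4, 6, 6, 6, 9, 9, 9, 9, 9]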
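
The same running-state idea can often be pushed into NumPy so that no Python-level loop is needed. The following is a minimal sketch, not a general solution: it assumes that some_calc is max and that both windows include the current point, and the helper name two_sided_running_max is hypothetical. Whether a given some_calc has an accumulating equivalent depends on the calculation itself.

```python
import numpy as np
import pandas as pd


def two_sided_running_max(values):
    """Return max(values[:i+1]) + max(values[i:]) for every position i.

    Hypothetical helper: it assumes some_calc is max and that both
    windows include the current point.
    """
    values = np.asarray(values)
    # prefix_max[i] == max(values[0 : i + 1])
    prefix_max = np.maximum.accumulate(values)
    # suffix_max[i] == max(values[i:]); accumulate over the reversed array,
    # then reverse the result back into the original order
    suffix_max = np.maximum.accumulate(values[::-1])[::-1]
    return prefix_max + suffix_max


# Same list as in the example above
a = [3, 4, 6, 2, 1, 9, 0, 7, 5, 8]
print(two_sided_running_max(a).tolist())
# [12, 13, 15, 15, 15, 18, 17, 17, 17, 17]

# Hypothetical data shaped like the question: one float per day
dates = pd.date_range("2000-01-01", "2011-01-01", freq="D")
rng = np.random.default_rng(0)
series = pd.Series(rng.normal(size=len(dates)), index=dates)

# Extract the underlying array once, compute on it, then reattach the index
combined = pd.Series(two_sided_running_max(series.to_numpy()), index=series.index)
```

Extracting the values once with to_numpy() avoids creating a new Series for every slice. For the one-sided case, pandas also provides Series.cummax() and Series.expanding(), which give the same running result without leaving pandas.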
