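I have a situation where I want to shift multiple dataframes by one row. A for loop is the first thing that comes to mind, but it does not actually store the shifted dataframes. Other SO posts suggest using the inplace=True option, which works for some transformations but not for shift, since shift has no inplace option. Obviously, I could do it manually by writing several lines of df = df.shift(1), but in the spirit of learning the most Pythonic approach... Below is an MRE showing a standard for loop and a function-based approach: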
import pandas as pd
import numpy as np

def shifter(df):
    df = df.shift(1)
    return df

data1 = {'a': np.random.randint(0, 5, 20),
         'b': np.random.randint(0, 5, 20),
         'c': np.random.randint(0, 5, 20)}
data2 = {'foo': np.random.randint(0, 5, 20),
         'bar': np.random.randint(0, 5, 20),
         'again': np.random.randint(0, 5, 20)}
df1 = pd.DataFrame(data=data1)
df2 = pd.DataFrame(data=data2)
print(df1)

# Attempt 1: plain for loop. This only rebinds x; df1 and df2 are unchanged.
for x in [df1, df2]:
    x = x.shift(1)
print(df1)

# Attempt 2: function-based. Same result, df1 is still not shifted.
for x in [df1, df2]:
    x = shifter(x)
print(df1)
You need to assign into the contents of the existing dataframe rather than bind the loop variable to a new object:
def shifter(df):
    return df.shift(1)

for x in [df1, df2]:
    x[:] = shifter(x)
Or, inside the function:
def shifter(df):
    df[:] = df.shift(1)

for x in [df1, df2]:
    shifter(x)
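A minimal sketch of the difference between rebinding the loop variable and assigning into the frame. The name frame and the float values are illustrative assumptions, not part of the question; float columns are used here so that the NaN introduced by shift fits the existing dtype.

```python
import pandas as pd

# Hypothetical float-valued frame; not the data from the question.
frame = pd.DataFrame({"a": [1.0, 2.0, 3.0]})

# Rebinding: x is pointed at a new object, frame itself is untouched.
for x in [frame]:
    x = x.shift(1)
after_rebind = frame["a"].tolist()

# Slice assignment: the shifted values are written into the existing object.
for x in [frame]:
    x[:] = x.shift(1)
after_assign = frame["a"].tolist()

print(after_rebind)
print(after_assign)
```

After the first loop the frame still holds its original values; only after the second loop does its first row become NaN.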
Another approach is to collect the shifted dataframes in a list and concatenate them:
import pandas as pd
import numpy as np

def shifter(df):
    a = df.shift(1)
    return a

data1 = {'a': np.random.randint(0, 5, 20),
         'b': np.random.randint(0, 5, 20),
         'c': np.random.randint(0, 5, 20)}
data2 = {'foo': np.random.randint(0, 5, 20),
         'bar': np.random.randint(0, 5, 20),
         'again': np.random.randint(0, 5, 20)}
df1 = pd.DataFrame(data=data1)
df2 = pd.DataFrame(data=data2)

Y = []
for x in [df1, df2]:
    x = shifter(x)
    Y.append(x)
Ydf = pd.concat(Y, axis=1)
print(Ydf)
which returns (the values are random, so they will differ between runs):
      a    b    c  foo  bar  again
0   NaN  NaN  NaN  NaN  NaN    NaN
1   2.0  3.0  3.0  3.0  2.0    1.0
2   3.0  1.0  1.0  4.0  3.0    0.0
3   0.0  1.0  4.0  2.0  2.0    4.0
4   0.0  3.0  0.0  3.0  2.0    1.0
5   4.0  4.0  4.0  2.0  2.0    2.0
6   0.0  1.0  3.0  4.0  2.0    3.0
7   1.0  0.0  3.0  4.0  3.0    3.0
8   4.0  2.0  3.0  0.0  0.0    2.0
9   1.0  1.0  4.0  4.0  3.0    2.0
10  4.0  3.0  4.0  0.0  1.0    4.0
11  0.0  3.0  1.0  1.0  2.0    4.0
12  2.0  0.0  4.0  4.0  3.0    0.0
13  3.0  3.0  4.0  4.0  1.0    2.0
14  4.0  1.0  4.0  0.0  1.0    0.0
15  3.0  4.0  0.0  3.0  0.0    3.0
16  1.0  4.0  2.0  0.0  3.0    1.0
17  3.0  2.0  0.0  0.0  2.0    0.0
18  1.0  1.0  1.0  3.0  3.0    4.0
19  1.0  3.0  2.0  1.0  1.0    0.0
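If the shifted frames should stay separate rather than being concatenated column-wise, one possible variation (not taken from the answers above) is to keep them in a dict, or to rebind the names through tuple unpacking. The frames and shifted dicts, their keys, and the small sample data are illustrative assumptions.

```python
import pandas as pd

# Small hypothetical frames standing in for df1 and df2 from the question.
df1 = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
df2 = pd.DataFrame({"foo": [7, 8, 9]})

# Option A: keep the shifted results in a dict keyed by an illustrative name.
frames = {"df1": df1, "df2": df2}
shifted = {name: df.shift(1) for name, df in frames.items()}

# Option B: rebind both names in one statement via tuple unpacking.
df1, df2 = (df.shift(1) for df in (df1, df2))

print(shifted["df1"])
print(df2)
```

Both options create new dataframes instead of modifying the originals, so any other reference to the old objects keeps seeing the unshifted data.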