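I have a pandas DataFrame and I want to apply a simple sign-and-multiply operation to each row, pairing every row with the row two indices back (a shift of 2). For example, if we have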
row_a = np.array([0.45, -0.78, 0.92])
row_b = np.array([1.2, -0.73, -0.46])
sgn_row_a = np.sign(row_a)
sgn_row_b = np.sign(row_b)
result = sgn_row_a * sgn_row_b
result
>>> array([1., 1., -1.])
What I have tried:
import pandas as pd
import numpy as np
np.random.seed(42)
df = pd.DataFrame(np.random.normal(0, 1, (100, 5)), columns=["a", "b", "c", "d", "e"])
def kernel(row_a, row_b):
    """Take the sign of both rows and multiply them"""
    sgn_a = np.sign(row_a)
    sgn_b = np.sign(row_b)
    return sgn_a * sgn_b

def func(data):
    """Apply 'kernel' to the dataframe row-wise, axis=1"""
    out = data.apply(lambda x: kernel(x, x.shift(2)), axis=1)
    return out
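But when I run the function I get the following output, which is incorrect. It seems to be shifting the columns rather than the rows. When I try a different axis in the shift operation, I only get an error (ValueError: No axis named 1 for object type Series).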
out = func(df)
out
>>>
a b c d e
0 NaN NaN 1.0 -1.0 -1.0
1 NaN NaN -1.0 -1.0 1.0
2 NaN NaN -1.0 1.0 -1.0
3 NaN NaN -1.0 1.0 -1.0
4 NaN NaN 1.0 1.0 -1.0
.. .. .. ... ... ...
What I expect is
out = func(df)
out
>>>
a b c d e
0 -1. 1. 1. -1. 1.
1 1. -1. 1. 1. -1.
2 -1. 1. 1. 1. 1.
3 -1. 1. 1. 1. 1.
4 -1. -1. -1. 1. -1.
.. .. .. ... ... ...
How can I achieve the shifted row operation outlined above?
It seems the simplest way is
df.apply(np.sign) * df.shift(2).apply(np.sign)
>>>
a b c d e
0 NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN
2 -1.0 1.0 1.0 -1.0 1.0
3 1.0 -1.0 1.0 1.0 -1.0
4 -1.0 1.0 1.0 1.0 1.0
.. ... ... ... ... ...
To shift in the other direction, just put a negative sign on the shift value, i.e. use shift(-2).
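As a minimal sketch of the direction remark, assuming a small hand-made frame (the name `demo` and its values are illustrative and not from the question): `shift(2)` pairs each row with the row two positions earlier, while `shift(-2)` pairs it with the row two positions later, so the all-NaN rows move from the top of the result to the bottom.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, chosen so the signs are easy to check by hand.
demo = pd.DataFrame({"a": [1.0, -2.0, 3.0, -4.0], "b": [-1.0, -1.0, 1.0, 1.0]})

# Row i multiplied by row i-2: the first two rows have no partner and become NaN.
back = np.sign(demo) * np.sign(demo.shift(2))

# Row i multiplied by row i+2: the last two rows have no partner and become NaN.
fwd = np.sign(demo) * np.sign(demo.shift(-2))

print(back)
print(fwd)
```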
apply loops column by column and is not needed here; the DataFrame can be passed directly to the np.sign function:
df = np.sign(df) * np.sign(df.shift(2))
print (df)
a b c d e
0 NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN
2 -1.0 1.0 1.0 -1.0 1.0
3 1.0 -1.0 1.0 1.0 -1.0
4 -1.0 1.0 1.0 1.0 1.0
.. ... ... ... ... ...
95 1.0 1.0 1.0 -1.0 -1.0
96 1.0 1.0 1.0 1.0 -1.0
97 1.0 -1.0 -1.0 1.0 1.0
98 1.0 -1.0 -1.0 -1.0 -1.0
99 -1.0 1.0 1.0 -1.0 -1.0
[100 rows x 5 columns]
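The row-wise attempt in the question shifted columns because `apply(..., axis=1)` hands each row to the function as a Series indexed by the column labels, so `x.shift(2)` can only move values across `a` to `e`. The sketch below shows that effect and one possible way to keep the original `kernel` while shifting rows, assuming it is acceptable to call it once on whole frames instead of once per row. The names `data` and `func_fixed` are hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(42)
data = pd.DataFrame(np.random.normal(0, 1, (100, 5)), columns=["a", "b", "c", "d", "e"])

def kernel(row_a, row_b):
    """Take the sign of both inputs and multiply them."""
    return np.sign(row_a) * np.sign(row_b)

# A single row is a Series whose index is the column labels,
# so shifting it moves values along a..e and leaves NaN in a and b.
row = data.iloc[0]
shifted_row = row.shift(2)
print(shifted_row)

def func_fixed(frame):
    """Hypothetical variant of func: shift the whole frame by two rows, then call kernel once."""
    return kernel(frame, frame.shift(2))

out = func_fixed(data)

# The same computation written as the vectorized one-liner.
expected = np.sign(data) * np.sign(data.shift(2))
print(out.equals(expected))
```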
Then, if you need to remove the leading NaN rows:
#df = df.dropna()
df = df.iloc[2:]
print (df)
a b c d e
2 -1.0 1.0 1.0 -1.0 1.0
3 1.0 -1.0 1.0 1.0 -1.0
4 -1.0 1.0 1.0 1.0 1.0
5 -1.0 1.0 1.0 1.0 1.0
6 -1.0 -1.0 -1.0 1.0 -1.0
.. ... ... ... ... ...
95 1.0 1.0 1.0 -1.0 -1.0
96 1.0 1.0 1.0 1.0 -1.0
97 1.0 -1.0 -1.0 1.0 1.0
98 1.0 -1.0 -1.0 -1.0 -1.0
99 -1.0 1.0 1.0 -1.0 -1.0
[98 rows x 5 columns]
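The two options above give the same result only when the input has no missing values of its own. A small sketch, using a made-up single-column frame `raw` with one NaN in the middle (an assumption for illustration): `iloc[2:]` drops exactly the first two rows, whereas `dropna()` drops every row that contains a NaN, including rows affected by gaps in the original data.

```python
import numpy as np
import pandas as pd

# Hypothetical input with a missing value at index 2.
raw = pd.DataFrame({"a": [1.0, -1.0, np.nan, 2.0, -3.0, 4.0]})
res = np.sign(raw) * np.sign(raw.shift(2))

by_position = res.iloc[2:]   # removes only the two leading rows
by_value = res.dropna()      # removes every row that contains a NaN

print(by_position)
print(by_value)
```

Which one fits depends on whether rows made incomplete by missing input should stay visible in the result.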