我试图在python中创建一个基于条件减法的新列。我想首先按列A和D分组数据框,然后取B等于2的C的行值,并从C列的所有值中减去该值。
import pandas as pd
data = [
["R", 1, 2, "p"],
["R", 2, 4, "p"],
["R", 3, 6, "p"],
["R", 4, 8, "p"],
["R", 1, 6, "o"],
["R", 2, 3, "o"],
["R", 3, 1, "o"],
["R", 4, 2, "o"],
["S", 0, 5, "n"],
["S", 1, 4, "n"],
["S", 2, 1, "n"],
["S", 3, 3, "n"],
["S", 0, 3, "g"],
["S", 1, 2, "g"],
["S", 2, 9, "g"],
["S", 3, 7, "g"]]
df = pd.DataFrame(data=data, columns=["a", "b", "c", "d"])
df
Out[1]:
a b c d
0 R 1 2 p
1 R 2 4 p
2 R 3 6 p
3 R 4 8 p
4 R 1 6 o
5 R 2 3 o
6 R 3 1 o
7 R 4 2 o
8 S 0 5 n
9 S 1 4 n
10 S 2 1 n
11 S 3 3 n
12 S 0 3 g
13 S 1 2 g
14 S 2 9 g
15 S 3 7 g
希望生成:
的列'e'Out[2]:
a b c d e
0 R 1 2 p -2
1 R 2 4 p 0
2 R 3 6 p 2
3 R 4 8 p 4
4 R 1 6 o 3
5 R 2 3 o 0
6 R 3 1 o -2
7 R 4 2 o -1
8 S 0 5 n 4
9 S 1 4 n 3
10 S 2 1 n 0
11 S 3 3 n 2
12 S 0 3 g -6
13 S 1 2 g -7
14 S 2 9 g 0
15 S 3 7 g -2
如果您能告诉我如何使用变换或映射函数来解决这个问题,我将不胜感激。
在使用groupby.transform('first')
之前,您可以使用掩码:
df['e'] = df['c'] - (df['c'].where(df['b'].eq(2))
.groupby([df['a'], df['d']])
.transform('first')
.convert_dtypes()
)
输出:
a b c d e
0 R 1 2 p -2
1 R 2 4 p 0
2 R 3 6 p 2
3 R 4 8 p 4
4 R 1 6 o 3
5 R 2 3 o 0
6 R 3 1 o -2
7 R 4 2 o -1
8 S 0 5 n 4
9 S 1 4 n 3
10 S 2 1 n 0
11 S 3 3 n 2
12 S 0 3 g -6
13 S 1 2 g -7
14 S 2 9 g 0
15 S 3 7 g -2