How do I create a difference column based on a condition?


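I have a DataFrame and need to add a new column showing the difference, CurrentDay - PreviousDay, for each sample. For example, for R1, 15 - 13.2 = 1.8 would be the value in the new column. See the sample data and the expected answers below.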

Here is a sample dataset:

Date,color,Sample,Height
10/24/2021,red,R1,13.2
10/24/2021,red,R2,0
10/24/2021,red,R3,9
10/24/2021,red,R4,16
10/24/2021,red,R5,4
10/24/2021,red,R6,15
10/24/2021,red,R7,9
10/24/2021,red,R8,16.5
10/24/2021,orange,O1,12.5
10/24/2021,orange,O2,17.5
10/24/2021,orange,O3,16
10/24/2021,orange,O4,12.9
10/24/2021,orange,O5,.1
10/24/2021,orange,O6,3.5
10/24/2021,orange,O7,8.5
10/24/2021,orange,O8,0
10/24/2021,yellow,Y1,0
10/24/2021,yellow,Y2,8.5
10/24/2021,yellow,Y3,11
10/24/2021,yellow,Y4,16.5
10/24/2021,yellow,Y5,14.5
10/24/2021,yellow,Y6,15
10/24/2021,yellow,Y7,5.9
10/24/2021,yellow,Y8,13
10/25/2021,red,R1,15
10/25/2021,red,R2,0
10/25/2021,red,R3,15
10/25/2021,red,R4,17.5
10/25/2021,red,R5,4.5
10/25/2021,red,R6,18
10/25/2021,red,R7,9
10/25/2021,red,R8,18
10/25/2021,orange,O1,16
10/25/2021,orange,O2,19.9
10/25/2021,orange,O3,17.8
10/25/2021,orange,O4,16
10/25/2021,orange,O5,.1
10/25/2021,orange,O6,6.5
10/25/2021,orange,O7,13
10/25/2021,orange,O8,0
10/25/2021,yellow,Y1,0
10/25/2021,yellow,Y2,10.9
10/25/2021,yellow,Y3,12
10/25/2021,yellow,Y4,18
10/25/2021,yellow,Y5,16.5
10/25/2021,yellow,Y6,16
10/25/2021,yellow,Y7,8
10/25/2021,yellow,Y8,14.6

The values for the additional column should look like this:

R1  = 1.8
R2  = 0  
R3  = 6  
R4  = 1.5
R5  = .5
R6  = 3 
R7  = 0 
R8  = 1.5
O1  = 3.5
O2  = 2.4
O3  = 1.8 
O4  = 3.1 
O5  = 0  
O6  = 3  
O7  = 4.5
O8  = 0
Y1  = 0
Y2  = 2.4
Y3  = 1
Y4  = 1.5
Y5  = 2
Y6  = 1
Y7  = 2.1
Y8  = 1.6

You can use groupby together with diff:

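import pandas as pd

df = pd.read_csv('filename.csv')
# Within each Sample, subtract the previous row's Height from the current one
difference = df.groupby('Sample').Height.diff()
# The first day of each sample has no previous row, so drop those NaN entries
mask = ~difference.isnull()
print(pd.concat([df[mask].Sample, difference[mask]], axis=1))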

   Sample  Height
24     R1     1.8
25     R2     0.0
26     R3     6.0
27     R4     1.5
28     R5     0.5
29     R6     3.0
30     R7     0.0
31     R8     1.5
32     O1     3.5
33     O2     2.4
34     O3     1.8
35     O4     3.1
36     O5     0.0
37     O6     3.0
38     O7     4.5
39     O8     0.0
40     Y1     0.0
41     Y2     2.4
42     Y3     1.0
43     Y4     1.5
44     Y5     2.0
45     Y6     1.0
46     Y7     2.1
47     Y8     1.6
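
Since the question asks for a new column rather than a printed table, the same groupby/diff idea can also write its result straight back into the DataFrame. The following is only a minimal sketch under a few assumptions: it uses a three-sample subset of the data above loaded from an inline string instead of filename.csv, the column name Difference is made up for illustration, and the rows are sorted by date first so that diff really computes CurrentDay - PreviousDay even if the file is not in date order.

```python
import io
import pandas as pd

# A small subset of the sample data from the question (three samples, two days).
csv_text = """Date,color,Sample,Height
10/24/2021,red,R1,13.2
10/24/2021,red,R2,0
10/24/2021,orange,O1,12.5
10/25/2021,red,R1,15
10/25/2021,red,R2,0
10/25/2021,orange,O1,16
"""

df = pd.read_csv(io.StringIO(csv_text))
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# Order rows so that, within each sample, the earlier day comes first.
df = df.sort_values(["Sample", "Date"])

# Current day minus previous day within each sample.
# The first day of each sample has no previous value, so it stays NaN.
# "Difference" is an illustrative column name, not one from the question.
df["Difference"] = df.groupby("Sample")["Height"].diff().round(2)

# Restore the original row order of the file.
df = df.sort_index()
print(df)
```

Rows for 10/24/2021 keep NaN in the new column; whether to leave them, fill them with 0, or drop them depends on how the column will be used.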

The for loop at the end prints the output in the format you want:

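# df is the DataFrame loaded from the CSV file with pd.read_csv
df = df.assign(difference=df.groupby("Sample")["Height"].diff())
df = df[~df['difference'].isnull()]
for _, line in df.iterrows():
    # Left-align the sample name, then print the difference rounded to 2 decimals
    print("{:<4}= {:>3s}".format(line["Sample"], str(round(line["difference"] * 100) / 100)))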
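
The formatting step can also be written without iterrows. This is a hedged variation rather than the answer's own method: it again assumes a three-sample subset of the question's data loaded from an inline string, and it uses the `g` format specifier, which prints 0.0 as 0 and drops trailing zeros (it still prints 0.5 rather than .5).

```python
import io
import pandas as pd

# A small subset of the sample data from the question (three samples, two days).
csv_text = """Date,color,Sample,Height
10/24/2021,red,R1,13.2
10/24/2021,red,R2,0
10/24/2021,orange,O1,12.5
10/25/2021,red,R1,15
10/25/2021,red,R2,0
10/25/2021,orange,O1,16
"""

df = pd.read_csv(io.StringIO(csv_text))
df["difference"] = df.groupby("Sample")["Height"].diff()

# Keep only rows that have a previous day to compare against.
result = df.dropna(subset=["difference"])

# Build one "Sample = value" line per row; rounding hides float noise
# such as 1.8000000000000007.
lines = [
    "{:<4}= {:g}".format(sample, round(float(diff), 2))
    for sample, diff in zip(result["Sample"], result["difference"])
]
print("\n".join(lines))
```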
