I want to fill down integer column values with respect to another column

Below is my DataFrame.

import pandas as pd

df = pd.DataFrame({'vin':['aaa','aaa','aaa','bbb','bbb','bbb','ddd','eee','eee','fff'],
'c1':[35,'NA','NA','NA',4,'NA','NA','NA',56,406],
'c2':[35,'NA','NA','NA',43,'NA','NA','NA',67,77],
'c3':[36,'NA','NA','NA',104,'NA','NA','NA',44,306]})

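I want to fill down the integer values of "c1", "c2" and "c3" with respect to the column "vin", so that every row of a "vin" group carries that group's integer.

Below is the expected output: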

df = pd.DataFrame({'vin':['aaa','aaa','aaa','bbb','bbb','bbb','ddd','eee','eee','fff'],
'c1':[35,35,35,4,4,4,'NA',56,56,406],
'c2':[35,35,35,43,43,43,'NA',67,67,77],
'c3':[36,36,36,104,104,104,'NA',44,44,306]})

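Note: the rows where "vin" is "ddd" stay "NA", because no integer value is available for "ddd".

I tried ffill() and groupby() with ffill(), but did not get the expected output. Below are the lines of code I tried.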

#(1)
df['c1'] = df.groupby(['vin'])['c1'].ffill()

#(2)
df[["c1","c2","c3"]] = df[["c1", "c2","c3"]].fillna(df.groupby(['vin'])[["c1", "c2","c3"]].ffill())

#(3)
df["c1"] = df.groupby('vin')['c1'].transform(lambda x: x.ffill())
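
A likely reason these attempts change nothing is that 'NA' here is an ordinary string, not a missing value, so ffill() finds nothing to fill. The sketch below illustrates this on a small made-up sample in the same shape as the data above; it also shows that, even after conversion, a forward fill alone cannot reach rows that come before the group's value.

```python
import pandas as pd

# Small illustrative sample (not the full data from the question).
df = pd.DataFrame({'vin': ['aaa', 'aaa', 'bbb', 'bbb'],
                   'c1': [35, 'NA', 'NA', 4]})

# The string 'NA' counts as a regular value, so nothing is reported missing.
n_missing_raw = int(df['c1'].isna().sum())

# Convert to numbers; anything non-numeric (here 'NA') becomes NaN.
clean = df.assign(c1=pd.to_numeric(df['c1'], errors='coerce'))

# ffill only copies values forward, so the first 'bbb' row stays NaN.
forward = clean.groupby('vin')['c1'].ffill()

print(n_missing_raw)
print(forward.tolist())
```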

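If there is one non-missing value per group, replace the 'NA' strings with NaN, sort so that the non-missing value comes first within each group, and then use ffill per group: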

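import numpy as np

c = ["c1", "c2", "c3"]
# 'NA' -> NaN, sort so NaN rows come last in each group, then forward-fill per group
df[c] = df.replace('NA',np.nan).sort_values(['vin', 'c1']).groupby('vin')[c].ffill()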
print (df)
   vin     c1    c2     c3
0  aaa   35.0  35.0   36.0
1  aaa   35.0  35.0   36.0
2  aaa   35.0  35.0   36.0
3  bbb    4.0  43.0  104.0
4  bbb    4.0  43.0  104.0
5  bbb    4.0  43.0  104.0
6  ddd    NaN   NaN    NaN
7  eee   56.0  67.0   44.0
8  eee   56.0  67.0   44.0
9  fff  406.0  77.0  306.0
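
As an alternative that avoids sorting, one could fill forward and then backward inside each group. This is a swapped-in variant rather than the approach shown above, and the sample data below is a shortened, made-up version of the question's frame.

```python
import pandas as pd

# Shortened illustrative sample in the same shape as the question's data.
df = pd.DataFrame({'vin': ['aaa', 'aaa', 'bbb', 'bbb', 'bbb', 'ddd'],
                   'c1': [35, 'NA', 'NA', 4, 'NA', 'NA'],
                   'c2': [35, 'NA', 'NA', 43, 'NA', 'NA']})
c = ['c1', 'c2']

# Turn the 'NA' strings into real NaN values.
df[c] = df[c].apply(pd.to_numeric, errors='coerce')

# Fill forward, then backward, within each vin group; row order is untouched.
df[c] = df.groupby('vin')[c].transform(lambda s: s.ffill().bfill())
print(df)
```

If a group holds several different values, this keeps each value in place and fills only the gaps around it, whereas the sorted ffill spreads values according to the sort order.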

Or use GroupBy.transform with 'first' to broadcast the first non-missing value of each group to every row of that group:

c = ["c1","c2", "c3"]
df[c] = df.replace('NA',np.nan).groupby('vin')[c].transform('first')
print (df)
   vin     c1    c2     c3
0  aaa   35.0  35.0   36.0
1  aaa   35.0  35.0   36.0
2  aaa   35.0  35.0   36.0
3  bbb    4.0  43.0  104.0
4  bbb    4.0  43.0  104.0
5  bbb    4.0  43.0  104.0
6  ddd    NaN   NaN    NaN
7  eee   56.0  67.0   44.0
8  eee   56.0  67.0   44.0
9  fff  406.0  77.0  306.0
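
The printed result holds floats (35.0) because NaN forces a float column, while the expected output in the question uses integers. One possible way to keep integers is pandas' nullable 'Int64' dtype, sketched below on a small made-up sample; whether that dtype suits the rest of a given pipeline is an assumption to check.

```python
import pandas as pd

# Small illustrative sample in the same shape as the question's data.
df = pd.DataFrame({'vin': ['bbb', 'bbb', 'bbb', 'ddd', 'eee', 'eee'],
                   'c1': ['NA', 4, 'NA', 'NA', 'NA', 56],
                   'c3': ['NA', 104, 'NA', 'NA', 'NA', 44]})
c = ['c1', 'c3']

# 'NA' strings become NaN, the remaining values become numbers.
numeric = df[c].apply(pd.to_numeric, errors='coerce')

# Broadcast the first non-missing value of each vin group to all its rows.
filled = numeric.groupby(df['vin']).transform('first')

# Nullable integers: whole numbers stay integers, missing values show as <NA>.
df[c] = filled.astype('Int64')
print(df)
```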
