I'm using Python 3.6.9.
I'm stuck on a DataFrame like this:
import pandas as pd
import numpy as np
dict_ = {'col1': [3.14, 28, -0.618, 1.159], 'col2': ['a_002_u', 'a_003_u', 'a_001_u', 'a_003_u'], 'a_001_u': [np.nan] * 4, 'a_002_u': [np.nan] * 4, 'a_003_u': [np.nan] * 4}
df = pd.DataFrame(dict_)
col1 col2 a_001_u a_002_u a_003_u
0 3.140 a_002_u NaN NaN NaN
1 28.000 a_003_u NaN NaN NaN
2 -0.618 a_001_u NaN NaN NaN
3 1.159 a_003_u NaN NaN NaN
I'd like to get this result:
col1 col2 a_001_u a_002_u a_003_u
0 3.140 a_002_u NaN 3.14 NaN
1 28.000 a_003_u NaN NaN 28.000
2 -0.618 a_001_u -0.618 NaN NaN
3 1.159 a_003_u NaN NaN 1.159
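In other words, I want to fill the columns "a_001_u", "a_002_u" and "a_003_u" with the "col1" value, choosing the target column from the header named in "col2".
It's easy to explain, but I have the impression it's not so obvious to do. Would anyone be able to help?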
You can use fillna after reshaping the first two columns with set_index and unstack, like this:
df = df.fillna(df.set_index('col2', append=True)['col1'].unstack())
print (df)
col1 col2 a_001_u a_002_u a_003_u
0 3.140 a_002_u NaN 3.14 NaN
1 28.000 a_003_u NaN NaN 28.000
2 -0.618 a_001_u -0.618 NaN NaN
3 1.159 a_003_u NaN NaN 1.159
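This works because set_index followed by unstack builds a frame that has the needed columns, and fillna then fills each missing value by matching its (row, column) labels: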
print(df.set_index('col2', append=True)['col1'].unstack())
col2 a_001_u a_002_u a_003_u
0 NaN 3.14 NaN
1 NaN NaN 28.000
2 -0.618 NaN NaN
3 NaN NaN 1.159
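To make the label matching concrete, here is a minimal sketch of how fillna behaves when it is given a DataFrame. The names `target` and `filler` are made up for illustration and are not part of the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical frames, for illustration only.
target = pd.DataFrame({'x': [1.0, np.nan], 'y': [np.nan, np.nan]})

# The filler has its rows in a different order and has no column 'y'.
filler = pd.DataFrame({'x': [20.0, 10.0]}, index=[1, 0])

# Values are matched by index and column label, not by position,
# and only cells that are NaN in `target` are replaced.
filled = target.fillna(filler)
print(filled)
```

The existing value in row 0 of 'x' is kept, row 1 of 'x' takes the filler value with the same label, and 'y' stays NaN because the filler has nothing for it.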
Note: a pivot such as df.pivot(columns='col2', values='col1') gives the same result.
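As a quick check of that note, the sketch below rebuilds the question's frame and compares the two reshaped frames before filling. The variable names `unstacked`, `pivoted` and `result` are only illustrative, and the comparison assumes a NumPy version that supports `equal_nan` in `np.array_equal`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [3.14, 28, -0.618, 1.159],
    'col2': ['a_002_u', 'a_003_u', 'a_001_u', 'a_003_u'],
    'a_001_u': [np.nan] * 4,
    'a_002_u': [np.nan] * 4,
    'a_003_u': [np.nan] * 4,
})

# Two ways to spread col1 across columns named by col2.
unstacked = df.set_index('col2', append=True)['col1'].unstack()
pivoted = df.pivot(columns='col2', values='col1')

# equal_nan=True treats NaN in the same position as equal.
same_values = np.array_equal(unstacked.to_numpy(), pivoted.to_numpy(), equal_nan=True)
same_columns = list(unstacked.columns) == list(pivoted.columns)
print(same_values, same_columns)

# Either frame can be passed to fillna.
result = df.fillna(pivoted)
print(result)
```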
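You can also write it by iterating over the rows.
for index, row in df.iterrows():
    # iterrows yields a copy of each row, so write back through df.loc
    df.loc[index, row['col2']] = row['col1']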
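One caveat with row iteration is worth checking: for a frame with mixed column types like this one, iterrows hands back a copy of each row, so assigning to `row` does not change df. The sketch below, using the question's data, contrasts that with writing through df.loc. The name `untouched` is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [3.14, 28, -0.618, 1.159],
    'col2': ['a_002_u', 'a_003_u', 'a_001_u', 'a_003_u'],
    'a_001_u': [np.nan] * 4,
    'a_002_u': [np.nan] * 4,
    'a_003_u': [np.nan] * 4,
})
targets = ['a_001_u', 'a_002_u', 'a_003_u']

# Assigning to `row` only changes the temporary Series.
for index, row in df.iterrows():
    row[row['col2']] = row['col1']
untouched = bool(df[targets].isna().all().all())
print(untouched)

# Assigning through df.loc writes into the frame itself.
for index, row in df.iterrows():
    df.loc[index, row['col2']] = row['col1']
print(df)
```

This is fine for a handful of rows; for large frames the fillna approach avoids the Python-level loop.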
import pandas as pd
import numpy as np
dict_ = {'col1': [3.14, 28, -0.618, 1.159], 'col2': ['a_002_u', 'a_003_u', 'a_001_u', 'a_003_u'], 'a_001_u': [np.nan] * 4, 'a_002_u': [np.nan] * 4, 'a_003_u': [np.nan] * 4}
df = pd.DataFrame(dict_)
count = 0
for key in df['col2']:
    # use .loc rather than chained indexing (df[key][count]), which may not write to df
    df.loc[count, key] = df.loc[count, 'col1']
    count += 1
df