How to put a specific value in a DataFrame column when two elements match



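I am trying to fill the "postsyn_type" and "presyn_type" columns of df1 with "inhibitory" or "excitatory" whenever the values in the "post_pt_root_id" and "pre_pt_root_id" columns (respectively) match a value in the "pt_root_id" column of df2.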

Example of the dataframes I have:

df1 = pd.DataFrame({'pre_pt_root_id': [1,1,1,2,2], 'post_pt_root_id': [5,1,5,6,7]})
   pre_pt_root_id  post_pt_root_id
0               1                5
1               1                1
2               1                5
3               2                6
4               2                7
df2 = pd.DataFrame({'pt_root_id': [1,2,3,4,5,6,7], 'type': ['inhib','excit','inhib','inhib','excit','excit','inhib']})
   pt_root_id   type
0           1  inhib
1           2  excit
2           3  inhib
3           4  inhib
4           5  excit
5           6  excit
6           7  inhib

Expected result:

df1 = pd.DataFrame({'pre_pt_root_id': [1,1,1,2,2], 'post_pt_root_id': [5,1,5,6,7], 'presyn_type': ['inhib','inhib','inhib','excit','excit'], 'postsyn_type': ['excit','inhib','excit','excit','inhib']})
   pre_pt_root_id  post_pt_root_id presyn_type postsyn_type
0               1                5       inhib        excit
1               1                1       inhib        inhib
2               1                5       inhib        excit
3               2                6       excit        excit
4               2                7       excit        inhib

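I have already tried merge, but it doesn't seem to fit what I want to do. As you may have noticed, values in the "pre_pt_root_id" column of df1 can repeat several times, so the value I put in "presyn_type" must be the same for every repetition. Can anyone help?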

Create a mapping Series d, then use Series.map to translate the values in the pre_pt_root_id and post_pt_root_id columns into their types:

d = df2.set_index('pt_root_id')['type']
df1['presyn_type'] = df1['pre_pt_root_id'].map(d)
df1['postsyn_type'] = df1['post_pt_root_id'].map(d)
Result:

   pre_pt_root_id  post_pt_root_id presyn_type postsyn_type
0               1                5       inhib        excit
1               1                1       inhib        inhib
2               1                5       inhib        excit
3               2                6       excit        excit
4               2                7       excit        inhib
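One detail the Series.map approach leaves open is what happens when an id in df1 has no entry in df2. The sketch below is only an illustration under assumptions: the id 9, the 'unknown' placeholder, and the drop_duplicates step are not part of the original question or answer. Series.map returns NaN for unmatched ids, which can then be filled if a placeholder is preferred.

```python
import pandas as pd

# Small variant of the question's data; id 9 is deliberately missing from df2 (assumption).
df1 = pd.DataFrame({'pre_pt_root_id': [1, 1, 2, 9],
                    'post_pt_root_id': [5, 1, 6, 7]})
df2 = pd.DataFrame({'pt_root_id': [1, 2, 5, 6, 7],
                    'type': ['inhib', 'excit', 'excit', 'excit', 'inhib']})

# Keep one row per id so the lookup Series has a unique index, then map both columns.
d = df2.drop_duplicates('pt_root_id').set_index('pt_root_id')['type']
df1['presyn_type'] = df1['pre_pt_root_id'].map(d).fillna('unknown')
df1['postsyn_type'] = df1['post_pt_root_id'].map(d).fillna('unknown')

presyn = df1['presyn_type'].tolist()
postsyn = df1['postsyn_type'].tolist()
print(df1)
```

Because the lookup is done per row, repeated ids in df1 always receive the same type, which is the behaviour the question asks for.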
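Another option is to merge df1 with df2 twice, once per id column:

# Duplicate the key under both names so df2 can be merged on either id column.
df2['pre_pt_root_id'] = df2['pt_root_id']
df2['post_pt_root_id'] = df2['pt_root_id']
# First merge attaches the presynaptic type; drop the helper columns it brings along.
df3 = pd.merge(df1, df2, on='pre_pt_root_id').drop(columns=['pt_root_id','post_pt_root_id_y']).rename(columns={'type':'presyn_type','post_pt_root_id_x':'post_pt_root_id'})
# Second merge attaches the postsynaptic type and restores the pre_pt_root_id name.
df3 = pd.merge(df3, df2, on='post_pt_root_id').drop(columns=['pt_root_id','pre_pt_root_id_y']).rename(columns={'type':'postsyn_type','pre_pt_root_id_x':'pre_pt_root_id'})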
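The asker's worry that merge does not fit because ids repeat in df1 can be addressed with a many-to-one left merge. The following is a minimal sketch, not the answerer's code: the helper name add_type is hypothetical, and how='left' plus validate='many_to_one' are choices made here so that every df1 row is kept and pandas raises an error if df2 ever contains a duplicated pt_root_id.

```python
import pandas as pd

df1 = pd.DataFrame({'pre_pt_root_id': [1, 1, 1, 2, 2],
                    'post_pt_root_id': [5, 1, 5, 6, 7]})
df2 = pd.DataFrame({'pt_root_id': [1, 2, 3, 4, 5, 6, 7],
                    'type': ['inhib', 'excit', 'inhib', 'inhib', 'excit', 'excit', 'inhib']})


def add_type(df, id_col, new_col):
    """Attach df2's type for the ids in id_col as a new column (hypothetical helper)."""
    # Rename df2's columns so the merge key and the output column get the desired names.
    lookup = df2.rename(columns={'pt_root_id': id_col, 'type': new_col})
    # Left merge keeps every row of df; many_to_one checks that df2's ids are unique.
    return df.merge(lookup, on=id_col, how='left', validate='many_to_one')


result = add_type(df1, 'pre_pt_root_id', 'presyn_type')
result = add_type(result, 'post_pt_root_id', 'postsyn_type')
print(result)
```

Renaming df2 inside the helper avoids adding helper columns to df2 and the _x/_y suffixes that then have to be dropped or renamed.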
A self-contained example that looks the types up by index with .loc:

import pandas as pd
import random
###############################
# creating a reproducible MWE #
###############################
random.seed(22)
cell_type = ['inhibitory', 'excitatory']
pt_root_id = [random.getrandbits(32) for _ in range(1, 10)]
cell_types = [random.choice(cell_type) for _ in range(1, 10)]
# create a sample df2.
df2 = pd.DataFrame()
df2['pt_root_id'] = pt_root_id
df2['cell_types'] = cell_types
# assuming that pt_root_id column is unique. set it as index
df2.set_index('pt_root_id', inplace=True)
# create a sample df1
df1 = pd.DataFrame()
df1['pre_pt_root_id'] = [random.choice(pt_root_id) for _ in range(1, 10)]
df1['post_pt_root_id'] = [random.choice(pt_root_id) for _ in range(1, 10)]
#############################
# solution to your problem. #
#############################
df1['presyn_type'] = df2.loc[df1['pre_pt_root_id'], 'cell_types'].tolist()
df1['postsyn_type'] = df2.loc[df1['post_pt_root_id'], 'cell_types'].tolist()
print(df2)
print(df1)

This may not be the best approach, but I think it does what you want.
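One caveat with the .loc lookup: in recent pandas versions it raises a KeyError if any id in df1 is absent from the index of df2. A hedged sketch of a more tolerant variant uses Series.reindex, which yields NaN for missing ids instead. The ids 10, 20, 30 and 99 and the variable names are made up for illustration.

```python
import pandas as pd

# Lookup Series indexed by pt_root_id (illustrative ids, not from the question).
types = pd.Series(['inhibitory', 'excitatory', 'inhibitory'],
                  index=pd.Index([10, 20, 30], name='pt_root_id'),
                  name='cell_types')
ids = pd.Series([10, 20, 99, 10])  # 99 has no entry in the lookup

# .loc with a list of labels fails as soon as one label is missing.
try:
    types.loc[ids.tolist()]
    loc_failed = False
except KeyError:
    loc_failed = True

# reindex returns one value per requested id and NaN where the id is unknown.
looked_up = types.reindex(ids.to_numpy()).tolist()
print(loc_failed, looked_up)
```

Like the .loc version, this assumes the pt_root_id index is unique; reindex raises an error on a duplicated index.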

Good luck!!
