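I am trying to put "inhibitory" or "excitatory" into the "postsyn_type" and "presyn_type" columns of df1 whenever the values in the "post_pt_root_id" and "pre_pt_root_id" columns (respectively) match a value in the "pt_root_id" column of df2.
Example of the dataframes I have: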
df1 = pd.DataFrame({'pre_pt_root_id': [1,1,1,2,2], 'post_pt_root_id': [5,1,5,6,7]})
   pre_pt_root_id  post_pt_root_id
0               1                5
1               1                1
2               1                5
3               2                6
4               2                7
df2 = pd.DataFrame({'pt_root_id': [1,2,3,4,5,6,7], 'type': ['inhib','excit','inhib','inhib','excit','excit','inhib']})
   pt_root_id   type
0           1  inhib
1           2  excit
2           3  inhib
3           4  inhib
4           5  excit
5           6  excit
6           7  inhib
Example of the desired result:
df1 = pd.DataFrame({'pre_pt_root_id': [1,1,1,2,2], 'post_pt_root_id': [5,1,5,6,7], 'presyn_type': ['inhib','inhib','inhib','excit','excit'], 'postsyn_type': ['excit','inhib','excit','excit','inhib']})
   pre_pt_root_id  post_pt_root_id presyn_type postsyn_type
0               1                5       inhib        excit
1               1                1       inhib        inhib
2               1                5       inhib        excit
3               2                6       excit        excit
4               2                7       excit        inhib
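I have already tried merge, but it doesn't seem suited to what I want to do. As you may have noticed, values in the "pre_pt_root_id" column of df1 can repeat several times, so the value I put in "presyn_type" must be the same for every repetition. Can anyone help?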
Create a mapping Series d (indexed by pt_root_id), then use Series.map to look up the values of the pre_pt_root_id and post_pt_root_id columns:
d = df2.set_index('pt_root_id')['type']
df1['presyn_type'] = df1['pre_pt_root_id'].map(d)
df1['postsyn_type'] = df1['post_pt_root_id'].map(d)
   pre_pt_root_id  post_pt_root_id presyn_type postsyn_type
0               1                5       inhib        excit
1               1                1       inhib        inhib
2               1                5       inhib        excit
3               2                6       excit        excit
4               2                7       excit        inhib
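For completeness, here is a self-contained sketch of the same Series.map approach on the question's data, extended with one ID (99) that does not exist in df2. The extra row and the 'unknown' placeholder are assumptions added for illustration and are not part of the original question. IDs without a match come back as NaN, which can then be filled if desired.

```python
import pandas as pd

df1 = pd.DataFrame({'pre_pt_root_id': [1, 1, 1, 2, 2, 99],   # 99 is a made-up ID not present in df2
                    'post_pt_root_id': [5, 1, 5, 6, 7, 1]})
df2 = pd.DataFrame({'pt_root_id': [1, 2, 3, 4, 5, 6, 7],
                    'type': ['inhib', 'excit', 'inhib', 'inhib', 'excit', 'excit', 'inhib']})

# Lookup Series: index = pt_root_id, values = type (pt_root_id is assumed to be unique)
d = df2.set_index('pt_root_id')['type']

# map() looks up each ID in d; repeated IDs get the same value, unknown IDs become NaN
df1['presyn_type'] = df1['pre_pt_root_id'].map(d)
df1['postsyn_type'] = df1['post_pt_root_id'].map(d)

# Optional: replace NaN with a placeholder label (hypothetical choice)
df1_filled = df1.fillna({'presyn_type': 'unknown', 'postsyn_type': 'unknown'})
print(df1_filled)
```

Note that this relies on pt_root_id being unique in df2; with duplicated IDs the lookup is ambiguous and map will raise an error.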
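# Give df2 a copy of its key under each of df1's column names so both merges can use `on=`.
df2['pre_pt_root_id'] = df2['pt_root_id']
df2['post_pt_root_id'] = df2['pt_root_id']
df3 = df1
# First merge: look up the presynaptic type (the column in the question's df2 is named 'type').
df3 = pd.merge(df3, df2, on='pre_pt_root_id').drop(columns=['pt_root_id','post_pt_root_id_y']).rename(columns={'type':'presyn_type','post_pt_root_id_x':'post_pt_root_id'})
# Second merge: look up the postsynaptic type and restore the name of pre_pt_root_id.
df3 = pd.merge(df3, df2, on='post_pt_root_id').drop(columns=['pt_root_id','pre_pt_root_id_y']).rename(columns={'type':'postsyn_type','pre_pt_root_id_x':'pre_pt_root_id'})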
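If you prefer to stay with merge, a variant that avoids adding helper columns to df2 is sketched below. It renames the lookup columns on the fly and uses how='left' so that every row of df1 is kept in its original order; validate='many_to_one' makes pandas raise an error if pt_root_id turns out to be duplicated in df2. The name lookup is just an illustrative helper, not something from the question.

```python
import pandas as pd

df1 = pd.DataFrame({'pre_pt_root_id': [1, 1, 1, 2, 2], 'post_pt_root_id': [5, 1, 5, 6, 7]})
df2 = pd.DataFrame({'pt_root_id': [1, 2, 3, 4, 5, 6, 7],
                    'type': ['inhib', 'excit', 'inhib', 'inhib', 'excit', 'excit', 'inhib']})

lookup = df2[['pt_root_id', 'type']]  # illustrative helper name

df3 = (
    df1
    # presynaptic type: rename the lookup so its key matches df1's column name
    .merge(lookup.rename(columns={'pt_root_id': 'pre_pt_root_id', 'type': 'presyn_type'}),
           on='pre_pt_root_id', how='left', validate='many_to_one')
    # postsynaptic type: same idea for the other ID column
    .merge(lookup.rename(columns={'pt_root_id': 'post_pt_root_id', 'type': 'postsyn_type'}),
           on='post_pt_root_id', how='left', validate='many_to_one')
)
print(df3)
```

Because the renamed lookup shares only the key column with df1, no _x/_y suffixes appear and nothing has to be dropped afterwards.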
import pandas as pd
import random
###############################
# creating a reproducible MWE #
###############################
random.seed(22)
cell_type = ['inhibitory', 'excitatory']
pt_root_id = [random.getrandbits(32) for _ in range(1, 10)]
cell_types = [random.choice(cell_type) for _ in range(1, 10)]
# create a sample df2.
df2 = pd.DataFrame()
df2['pt_root_id'] = pt_root_id
df2['cell_types'] = cell_types
# assuming that pt_root_id column is unique. set it as index
df2.set_index('pt_root_id', inplace=True)
# create a sample df1
df1 = pd.DataFrame()
df1['pre_pt_root_id'] = [random.choice(pt_root_id) for _ in range(1, 10)]
df1['post_pt_root_id'] = [random.choice(pt_root_id) for _ in range(1, 10)]
#############################
# solution to your problem. #
#############################
# select rows by ID and the 'cell_types' column in a single .loc call
df1['presyn_type'] = df2.loc[df1['pre_pt_root_id'], 'cell_types'].tolist()
df1['postsyn_type'] = df2.loc[df1['post_pt_root_id'], 'cell_types'].tolist()
print(df2)
print(df1)
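One caveat with the .loc lookup above: it raises a KeyError as soon as df1 contains an ID that is not in df2's index. If that can happen in your data, reindex is a possible alternative, since it returns NaN for missing labels. The sketch below uses small made-up IDs (including 99, which is deliberately absent from df2) instead of the random ones above.

```python
import pandas as pd

# df2 indexed by pt_root_id, as in the example above (IDs here are made up)
df2 = pd.DataFrame({'cell_types': ['inhibitory', 'excitatory', 'inhibitory']},
                   index=pd.Index([10, 20, 30], name='pt_root_id'))
df1 = pd.DataFrame({'pre_pt_root_id': [10, 10, 99], 'post_pt_root_id': [20, 30, 10]})

# reindex aligns on the index labels; labels missing from df2 yield NaN instead of raising
df1['presyn_type'] = df2['cell_types'].reindex(df1['pre_pt_root_id']).tolist()
df1['postsyn_type'] = df2['cell_types'].reindex(df1['post_pt_root_id']).tolist()
print(df1)
```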
This may not be the best approach, but I think it does what you want.
Good luck!!