How to merge two datasets on an identifier that is not unique in either dataset

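When merging two datasets in Stata on a variable that is not unique in either dataset, merge x:x does not seem to be a useful tool. What strategy produces the desired result?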

Stylized example:

Dataset 1

AssetManager | Bankcode
A              1
B              2
B              3
C              3

Dataset 2

Bankcode | t    
1          t1          
1          t2     
2          t1    
2          t2    
3          t1    
3          t2

Desired result:

AssetManager | Bankcode | t
A              1         t1
A              1         t2
B              2         t1
B              2         t2
B              3         t1
B              3         t2
C              3         t1
C              3         t2

Intuition: some asset managers can be owned by several banks, and some banks own several asset managers.

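Using merge m:m is discouraged (read the corresponding entry in the Stata manual), and many advocate removing it altogether. Try joinby instead, which forms all pairwise combinations of observations within each Bankcode group: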

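clear
set more off

* Dataset 1: asset managers and the banks that own them
input ///
str1 AssetManager Bankcode
A              1
B              2
B              3
C              3
end
tempfile first
save "`first'"

* Dataset 2: banks and time periods
clear
input ///
Bankcode str2 t
1          t1
1          t2
2          t1
2          t2
3          t1
3          t2
end

* Form every pairwise combination within each Bankcode
joinby Bankcode using "`first'"

sort AssetManager Bankcode t
order AssetManager Bankcode
list, sepby(AssetManager)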
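
To see why merge m:m falls short here, the following sketch runs both commands on the same example data and counts the resulting observations. It assumes a reasonably recent Stata (the merge m:m syntax); the tempfile name second is introduced only for this illustration. Because merge m:m pairs observations by position within each Bankcode group instead of forming all combinations, it should not be expected to reach the 8 rows of the desired result, whereas joinby should.

```stata
* Sketch: contrast merge m:m with joinby on the example data
clear
input str1 AssetManager Bankcode
A 1
B 2
B 3
C 3
end
tempfile first
save "`first'"

clear
input Bankcode str2 t
1 t1
1 t2
2 t1
2 t2
3 t1
3 t2
end
tempfile second          // hypothetical name, used only in this sketch
save "`second'"

* merge m:m pairs rows by position within each Bankcode group
merge m:m Bankcode using "`first'"
count                    // compare with the 8 rows wanted

* joinby forms every pairwise combination within each Bankcode group
use "`second'", clear
joinby Bankcode using "`first'"
count
assert r(N) == 8         // 4 manager-bank pairs x 2 periods
```

If some Bankcode values might appear in only one of the two datasets, joinby's unmatched() option controls whether those observations are kept; by default they are dropped.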
