I have two dataframes. The first, `station_anal`, is shown below:
count Start station number
index
31623 17105 31623
31258 11432 31258
31201 10194 31201
31200 9505 31200
31247 9145 31247
The second dataframe, `vt`, is:
Start station number Start station
0 31214 17th & Corcoran St NW
1 31104 Adams Mill & Columbia Rd NW
2 31221 18th & M St NW
3 31111 10th & U St NW
4 31260 23rd & E St NW
station_anal has shape 486x2
vt has shape 8000x2
My left join command is:
lj = pd.merge(station_anal, vt, how = 'left', on = 'Start station number')
The join column has the same dtype in both frames, int64.
However, lj returns:
lj.head()
count Start station number Start station
0 17105 31623 Columbus Circle / Union Station
1 17105 31623 Columbus Circle / Union Station
2 17105 31623 Columbus Circle / Union Station
3 17105 31623 Columbus Circle / Union Station
4 17105 31623 Columbus Circle / Union Station
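Its shape is 8000x3.
This makes no sense to me, because my understanding is that a left join always returns a result with the same number of rows as the first dataframe, which is 486 in this case.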
Let's use map instead:
station_anal['Start station'] = (station_anal['Start station number']
    .map(vt.set_index('Start station number')['Start station']))
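As for why the merge grew to 8000 rows: a left join keeps every row of the left frame, but it emits one output row per matching row on the right, so repeated station numbers in `vt` fan each match out. A minimal sketch of that effect, assuming toy frames `left` and `right` (hypothetical stand-ins for `station_anal` and `vt`, with made-up station names):

```python
import pandas as pd

# Hypothetical toy data standing in for station_anal and vt.
left = pd.DataFrame({"count": [17105, 11432],
                     "Start station number": [31623, 31258]})
right = pd.DataFrame({"Start station number": [31623, 31623, 31623, 31258],
                      "Start station": ["Station A", "Station A",
                                        "Station A", "Station B"]})

# Each left row is repeated once per matching right row.
joined = pd.merge(left, right, how="left", on="Start station number")
n_left, n_joined = len(left), len(joined)

# Number of repeated keys on the right; nonzero means the join will fan out.
n_dupes = int(right["Start station number"].duplicated().sum())
```

Running the same `duplicated().sum()` check on `vt['Start station number']` should show whether this is what happened with the real data.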
Update: since `vt` repeats station numbers, drop the duplicates first and then map:
mapper = (vt.drop_duplicates('Start station number')
    .set_index('Start station number')['Start station'])
station_anal['Start station'] = (station_anal['Start station number']
    .map(mapper))
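If a merge is still preferred, the same de-duplication should work there as well. The sketch below again assumes hypothetical toy frames `left` and `right` in place of `station_anal` and `vt`, and passes `validate="many_to_one"` so that pandas raises an error if the right-hand keys are not unique:

```python
import pandas as pd

# Hypothetical toy data standing in for station_anal and vt.
left = pd.DataFrame({"count": [17105, 11432],
                     "Start station number": [31623, 31258]})
right = pd.DataFrame({"Start station number": [31623, 31623, 31623, 31258],
                      "Start station": ["Station A", "Station A",
                                        "Station A", "Station B"]})

# Keep one row per station number (the first occurrence by default).
lookup = right.drop_duplicates("Start station number")

# validate="many_to_one" checks that the keys in the right frame are unique.
merged = pd.merge(left, lookup, how="left",
                  on="Start station number", validate="many_to_one")
```

With unique keys on the right, `merged` has exactly as many rows as `left`, which is the behavior the question expected from a left join.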