在多索引数据集中按名称引用panda索引



我有一个pandas数据帧,它是从一个有三个索引的xarray数据集转换而来的。

XTIME  windspeed  direction  
Time lat        lon                                                    
0    -36.457836 146.200153 2017-10-31 01:00:00   6.299611   0.853234   
146.208487 2017-10-31 01:00:00   6.479416   0.794886   
146.216820 2017-10-31 01:00:00   6.661458   0.736131   
146.225153 2017-10-31 01:00:00   6.511693   0.702716   
146.233487 2017-10-31 01:00:00   6.218874   0.725511   
...                                        ...        ...        ...   
463  -37.599502 147.958486 2017-11-19 08:00:00   2.083699  -1.161645   
147.966819 2017-11-19 08:00:00   2.202758  -1.171001   
147.975153 2017-11-19 08:00:00   2.367172  -1.163191   
147.983486 2017-11-19 08:00:00   2.580872  -1.099003   
147.991819 2017-11-19 08:00:00   2.817507  -1.016175   

我希望能够引用lat和lon索引,并提取存储在那里的数字。然而,我不知道如何称呼他们的名字。

这里也提出了类似的问题,但公认的答案只是OP在索引列时保留一份列的副本,这对我的情况没有帮助。

我认为最简单的方法是通过DataFrame.reset_indexMultiIndex转换为列,然后选择getSeries:

df = df.reset_index()
print (df)
Time        lat         lon                XTIME  windspeed  direction
0     0 -36.457836  146.200153  2017-10-31 01:00:00   6.299611   0.853234
1     0 -36.457836  146.208487  2017-10-31 01:00:00   6.479416   0.794886
2     0 -36.457836  146.216820  2017-10-31 01:00:00   6.661458   0.736131
3     0 -36.457836  146.225153  2017-10-31 01:00:00   6.511693   0.702716
4     0 -36.457836  146.233487  2017-10-31 01:00:00   6.218874   0.725511
5   463 -37.599502  147.958486  2017-11-19 08:00:00   2.083699  -1.161645
6   463 -37.599502  147.966819  2017-11-19 08:00:00   2.202758  -1.171001
7   463 -37.599502  147.975153  2017-11-19 08:00:00   2.367172  -1.163191
8   463 -37.599502  147.983486  2017-11-19 08:00:00   2.580872  -1.099003
9   463 -37.599502  147.991819  2017-11-19 08:00:00   2.817507  -1.016175
time = df['Time']
lat = df['lat']
lon = df['lon']
print (lon)
0    146.200153
1    146.208487
2    146.216820
3    146.225153
4    146.233487
5    147.958486
6    147.966819
7    147.975153
8    147.983486
9    147.991819
Name: lon, dtype: float64

或者您可以通过Index.get_level_values选择MultiIndex中的每个级别-在选择获取索引后:

time = df.index.get_level_values('Time')
lat = df.index.get_level_values('lat')
lon = df.index.get_level_values('lon')
print (lon)
Float64Index([        146.200153, 146.20848700000002, 146.21681999999998,
146.225153, 146.23348700000003, 147.95848600000002,
147.966819,         147.975153,         147.983486,
147.991819],
dtype='float64', name='lon')

最新更新