How do I name the index of this dataset without repetition?

I have a dataset that looks like this:

NEW QUARTERLY ESTIMATES - After rebasing     GDP (seasonally adjusted)
2009                                                          1314844
Unnamed: 3                                                    1326084
Unnamed: 4                                                    1348808
Unnamed: 5                                                    1371285
2010                                                          1414539
Unnamed: 7                                                    1438482
Unnamed: 8                                                    1449582
Unnamed: 9                                                    1490081
2011                                                          1501464
Unnamed: 11                                                   1512220
Unnamed: 12                                                   1527277
Unnamed: 13                                                   1548587

I want to label the unnamed rows with their year. For example, Unnamed: 3 through Unnamed: 5 should be 2009, and so on. How can I do that?

The only thing I can think of is to build a dictionary and use the pandas rename method. Is there a more scalable way to handle this?

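Try this:

# Non-numeric labels such as "Unnamed: 3" become NaN, then take the last year seen.
# astype(int) replaces ffill(downcast='int'), which is deprecated in recent pandas;
# it assumes the first row holds a year, so no NaN is left after the fill.
df['your_col_name'] = pd.to_numeric(
    df['your_col_name'], errors='coerce').ffill().astype(int)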
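
To make the idea above concrete, here is a minimal, self-contained sketch on a few rows shaped like the question's data. The column names `period` and `gdp` are placeholders invented for the illustration, and the sketch assumes the first row holds a year.

```python
import pandas as pd

# Hypothetical column names; the values mirror the first rows of the question.
df = pd.DataFrame({
    "period": ["2009", "Unnamed: 3", "Unnamed: 4", "Unnamed: 5", "2010", "Unnamed: 7"],
    "gdp": [1314844, 1326084, 1348808, 1371285, 1414539, 1438482],
})

# "Unnamed: N" cannot be parsed as a number, so errors="coerce" turns it into NaN.
years = pd.to_numeric(df["period"], errors="coerce")

# Forward-fill each NaN with the last year seen, then convert back to integers.
df["period"] = years.ffill().astype(int)
```

If the first label were not a year, a NaN would survive the fill and `astype(int)` would raise, so that case would need separate handling.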

Use Series.str.contains to test for four-digit values:

m = df['NEW QUARTERLY ESTIMATES - After rebasing'].str.contains(r'\d{4}')

Or test for Unnamed and invert the mask with ~:

m = ~df['NEW QUARTERLY ESTIMATES - After rebasing'].str.contains('Unnamed')

Then pass the mask to Series.where and forward-fill the missing values:

df['NEW QUARTERLY ESTIMATES - After rebasing'] = df['NEW QUARTERLY ESTIMATES - After rebasing'].where(m).ffill()
print(df)
NEW QUARTERLY ESTIMATES - After rebasing  GDP (seasonally adjusted)
0                                      2009                    1314844
1                                      2009                    1326084
2                                      2009                    1348808
3                                      2009                    1371285
4                                      2010                    1414539
5                                      2010                    1438482
6                                      2010                    1449582
7                                      2010                    1490081
8                                      2011                    1501464
9                                      2011                    1512220
10                                     2011                    1527277
11                                     2011                    1548587
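
The answer above assumes the labels sit in a regular column. If they are instead the index, as the question's title suggests, the same mask-and-fill idea might be applied to the index itself. This is only a sketch: the sample Series is made up from the question's first rows, and `Series.str.fullmatch` is used so that only labels consisting of exactly four digits count as years.

```python
import pandas as pd

# Assumed layout: the year / "Unnamed: N" labels are the index of a Series.
s = pd.Series(
    [1314844, 1326084, 1348808, 1371285, 1414539],
    index=["2009", "Unnamed: 3", "Unnamed: 4", "Unnamed: 5", "2010"],
    name="GDP (seasonally adjusted)",
)

# Turn the index into a Series so the string and fill methods are available.
labels = s.index.to_series()

# True only where the whole label is four digits.
is_year = labels.str.fullmatch(r"\d{4}")

# Blank out the non-year labels, forward-fill, and assign back as the new index.
s.index = labels.where(is_year).ffill().to_numpy()
```

The resulting index contains repeated year labels, so a lookup such as `s.loc["2009"]` returns all four quarters for that year.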
