Reformatting a DataFrame after concatenating two Series so it can be sorted

I have concatenated two Series into a DataFrame. The problem I am facing is that the actual data has no column headers, which I need in order to sort it.

hist_a = pd.crosstab(category_a, category, normalize=True)
hist_b = pd.crosstab(category_b, category, normalize=True)
counts_a = pd.Series(np.diag(hist_a), index=[hist_a.index])
counts_b = pd.Series(np.diag(hist_b), index=[hist_b.index])   
df_plots = pd.concat([counts_a, counts_b], axis=1).fillna(0)

The data looks like this:

0             1
category                        
0017817703277  0.000516  5.384341e-04
0017817703284  0.000516  5.384341e-04
0017817731348  0.000216  2.856169e-04
0017817731355  0.000216  2.856169e-04

I want to sort it, but there is no suitable column header to sort by:

df_plots = df_plots.sort_values(by=['0?'])

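The DataFrame also seems to be split into two parts. How can I restructure it so that it has "proper" columns such as '0' or 'plot a', rather than columns indexed by integers, which seem hard to work with? Something like this: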

category       plot a    plot b           
0017817703277  0.000516  5.384341e-04
0017817703284  0.000516  5.384341e-04
0017817731348  0.000216  2.856169e-04
0017817731355  0.000216  2.856169e-04

Just rename the DataFrame's columns, for example:

df = pd.DataFrame({0:[1,23]})
df = df.rename(columns={0:'new name'})
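
As a minimal sketch of how this applies to sorting: the column labels produced by the concat are the integers 0 and 1, not the strings '0' and '1', so they can be passed to sort_values as integers even before renaming. The values and index labels below are stand-ins copied from the question's output, and the names 'plot a' and 'plot b' are taken from the desired layout.

```python
import pandas as pd

# Stand-in data: two rows from the question, with integer column labels 0 and 1
# as produced by concatenating unnamed Series.
df = pd.DataFrame(
    {0: [0.000516, 0.000216], 1: [5.384341e-04, 2.856169e-04]},
    index=["0017817703277", "0017817731348"],
)

# Integer column labels are passed as integers, not as strings.
sorted_by_int = df.sort_values(by=0)

# After renaming, the same sort uses the readable label.
renamed = df.rename(columns={0: "plot a", 1: "plot b"})
sorted_by_name = renamed.sort_values(by="plot a")
```

Both sorts order the rows the same way; renaming only changes how the column is addressed.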

If you have many columns, you can rename them all at once, like this:

df = pd.DataFrame({0:[1,23]})
rename_dict = {key: f'Col {key}' for key in df.keys() }
df = df.rename(columns=rename_dict)

You can also give each Series a name when you create it, which avoids having to rename the columns afterwards:

counts_a = pd.Series(np.diag(hist_a), index=[hist_a.index], name = 'counts_a')
counts_b = pd.Series(np.diag(hist_b), index=[hist_b.index], name = 'counts_b')
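
A short end-to-end sketch of the named-Series approach, under some assumptions: the values are stand-ins copied from the question's output rather than real crosstab results, the names 'plot a' and 'plot b' come from the desired layout, and the index is passed directly instead of being wrapped in a list.

```python
import pandas as pd

# Stand-in index and values; in the question these come from np.diag(hist_a)
# and np.diag(hist_b).
idx = pd.Index(["0017817703277", "0017817731348"], name="category")
counts_a = pd.Series([0.000516, 0.000216], index=idx, name="plot a")
counts_b = pd.Series([5.384341e-04, 2.856169e-04], index=idx, name="plot b")

# With axis=1, each Series name becomes the label of its column.
df_plots = pd.concat([counts_a, counts_b], axis=1).fillna(0)

# The named column can now be used for sorting.
df_plots = df_plots.sort_values(by="plot a")
```

Because the names are set up front, no rename step is needed before sort_values.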
