Sorting one column of a pandas DataFrame and rearranging rows while keeping another column's order unchanged



I have the following data set:

Name Year  Date Value
x    year1 date1 v1
x    year1 date2 v2
x    year1 date3 v3
x    year2 date1 v4
x    year2 date2 v5
x    year2 date3 v6
z    year1 date1 v7
z    year1 date2 v8
z    year1 date3 v9
z    year2 date1 v10
z    year2 date2 v11
z    year2 date3 v12
y    year1 date1 v13
y    year1 date2 v14
y    year1 date3 v15
y    year2 date1 v16
y    year2 date2 v17
y    year2 date3 v18

I want the following output:

Name Year  Date Value
x    year1 date1 v1
x    year2 date1 v4
x    year1 date2 v2
x    year2 date2 v5
x    year1 date3 v3
x    year2 date3 v6
z    year1 date1 v7
z    year2 date1 v10
z    year1 date2 v8
z    year2 date2 v11
z    year1 date3 v9
z    year2 date3 v12
y    year1 date1 v13
y    year2 date1 v16
y    year1 date2 v14
y    year2 date2 v17
y    year1 date3 v15
y    year2 date3 v18

I tried the following code, but it also sorts my 'Name' column into 'x, y, z'. I want the 'Name' column to keep the order 'x, z, y': df.sort_values(['Name', 'Date'])

Create a new categorical dtype with an explicit order, then sort on it:

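import pandas as pd

# Ordered categorical: x < z < y, so sorting follows this order instead of alphabetical order.
namedtype = pd.CategoricalDtype([*'xzy'], ordered=True)
df['Name'] = df['Name'].astype(namedtype)
df.sort_values(['Name', 'Date', 'Year'])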

Output:

   Name   Year   Date Value
0     x  year1  date1    v1
3     x  year2  date1    v4
1     x  year1  date2    v2
4     x  year2  date2    v5
2     x  year1  date3    v3
5     x  year2  date3    v6
6     z  year1  date1    v7
9     z  year2  date1   v10
7     z  year1  date2    v8
10    z  year2  date2   v11
8     z  year1  date3    v9
11    z  year2  date3   v12
12    y  year1  date1   v13
15    y  year2  date1   v16
13    y  year1  date2   v14
16    y  year2  date2   v17
14    y  year1  date3   v15
17    y  year2  date3   v18
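
The same approach can be sketched end to end. This is a minimal example, assuming the desired 'Name' order is simply the order in which each name first appears in the data, so it is read from the column with `Series.unique()` instead of being hard-coded as 'xzy'. The helper name `sort_keep_name_order` is hypothetical and only used for illustration.

```python
import pandas as pd


def sort_keep_name_order(df: pd.DataFrame) -> pd.DataFrame:
    """Sort by Date, then Year, within each Name; Names keep first-appearance order."""
    out = df.copy()  # leave the caller's frame untouched
    # unique() returns values in order of first appearance, here x, z, y.
    name_order = out["Name"].unique()
    out["Name"] = out["Name"].astype(pd.CategoricalDtype(name_order, ordered=True))
    return out.sort_values(["Name", "Date", "Year"])


# Rebuild the data set from the question (assumed layout: 3 names x 2 years x 3 dates).
rows = [
    (name, year, date)
    for name in "xzy"
    for year in ("year1", "year2")
    for date in ("date1", "date2", "date3")
]
df = pd.DataFrame(rows, columns=["Name", "Year", "Date"])
df["Value"] = [f"v{i}" for i in range(1, len(df) + 1)]

result = sort_keep_name_order(df)
print(result)
```

Note that the returned frame has a categorical 'Name' column; convert it back with `astype(str)` if a plain string column is needed afterwards.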

datar is a re-imagining of the pandas API.

With datar, this is easy to do:
>>> from datar.all import f, tribble, arrange, match
>>> df = tribble(
... f.Name, f.Year,  f.Date, f.Value,
... "x",    "year1", "date1", "v1",
... "x",    "year1", "date2", "v2",
... "x",    "year1", "date3", "v3",
... "x",    "year2", "date1", "v4",
... "x",    "year2", "date2", "v5",
... "x",    "year2", "date3", "v6",
... "z",    "year1", "date1", "v7",
... "z",    "year1", "date2", "v8",
... "z",    "year1", "date3", "v9",
... "z",    "year2", "date1", "v10",
... "z",    "year2", "date2", "v11",
... "z",    "year2", "date3", "v12",
... "y",    "year1", "date1", "v13",
... "y",    "year1", "date2", "v14",
... "y",    "year1", "date3", "v15",
... "y",    "year2", "date1", "v16",
... "y",    "year2", "date2", "v17",
... "y",    "year2", "date3", "v18",
... )
>>> df >> arrange(match(f.Name, f.Name), f.Date, f.Year)
       Name     Year     Date    Value
   <object> <object> <object> <object>
0         x    year1    date1       v1
3         x    year2    date1       v4
1         x    year1    date2       v2
4         x    year2    date2       v5
2         x    year1    date3       v3
5         x    year2    date3       v6
6         z    year1    date1       v7
9         z    year2    date1      v10
7         z    year1    date2       v8
10        z    year2    date2      v11
8         z    year1    date3       v9
11        z    year2    date3      v12
12        y    year1    date1      v13
15        y    year2    date1      v16
13        y    year1    date2      v14
16        y    year2    date2      v17
14        y    year1    date3      v15
17        y    year2    date3      v18
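
The idea behind `match(f.Name, f.Name)`, ranking each name by where it first appears, can also be approximated in plain pandas. The sketch below assumes `pd.factorize`, which numbers values in order of first appearance, is an acceptable stand-in; the temporary column name `_rank` is a hypothetical choice, not part of either answer.

```python
import pandas as pd

# Rebuild the data set from the question (assumed layout: 3 names x 2 years x 3 dates).
rows = [
    (name, year, date)
    for name in "xzy"
    for year in ("year1", "year2")
    for date in ("date1", "date2", "date3")
]
df = pd.DataFrame(rows, columns=["Name", "Year", "Date"])
df["Value"] = [f"v{i}" for i in range(1, len(df) + 1)]

# factorize() assigns integer codes in order of first appearance: x -> 0, z -> 1, y -> 2.
codes, uniques = pd.factorize(df["Name"])

result = (
    df.assign(_rank=codes)                     # temporary sort key
      .sort_values(["_rank", "Date", "Year"])  # first-seen name order, then Date, then Year
      .drop(columns="_rank")                   # remove the helper column again
)
print(result)
```

Unlike the categorical approach, this leaves the dtype of the 'Name' column unchanged.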
