Hi, I have a DataFrame like this:
import pandas as pd

data = [
    (1, "tom", 23),
    (1, "nick", 12),
    (1, "jim", 24),
    (2, "tom", 44),
    (2, "nick", 56),
    (2, "jim", 77),
    (3, "tom", 88),
    (3, "nick", 10),
    (3, "jim", 13),
]
df = pd.DataFrame(data, columns=['class', 'Name', 'Number'])
Output of the DataFrame:
   class  Name  Number
0      1   tom      23
1      1  nick      12
2      1   jim      24
3      2   tom      44
4      2  nick      56
5      2   jim      77
6      3   tom      88
7      3  nick      10
8      3   jim      13
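I want to loop over it and get a new DataFrame for each class, together with the maximum Number in that class. The output should look like this: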
   class  Name  Number
0      1   tom      23
1      1  nick      12
2      1   jim      24
max_number_class_1 = 24

   class  Name  Number
3      2   tom      44
4      2  nick      56
5      2   jim      77
max_number_class_2 = 77

   class  Name  Number
6      3   tom      88
7      3  nick      10
8      3   jim      13
max_number_class_3 = 88
Thank you very much for your help!
You can filter the frame once per unique class value:
dfs = [df[df['class'].eq(key)] for key in df['class'].unique()]
This gives you the desired result as a list of DataFrames, one per class.
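If the per-class maximum is needed as well, one possible sketch is to iterate over a groupby directly, since it yields (key, sub-frame) pairs. The name max_by_class is only illustrative and not part of the question.

```python
import pandas as pd

data = [
    (1, "tom", 23), (1, "nick", 12), (1, "jim", 24),
    (2, "tom", 44), (2, "nick", 56), (2, "jim", 77),
    (3, "tom", 88), (3, "nick", 10), (3, "jim", 13),
]
df = pd.DataFrame(data, columns=["class", "Name", "Number"])

# Iterating over a groupby yields (key, sub-frame) pairs, one per class.
# Grouping by a single column name (a string, not a list) keeps the key a scalar.
max_by_class = {}  # illustrative name: class value -> largest Number
for key, group in df.groupby("class"):
    max_by_class[int(key)] = int(group["Number"].max())
    print(group)
    print(f"max_number_class_{int(key)} = {max_by_class[int(key)]}")
    print()
```

Each group keeps the original row index, so the printed sub-frames line up with the rows of the full frame.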
It's not very tidy, but hopefully it does the job.
df_list = []
grouped = df.groupby('class')  # build the groups once, outside the loop
for i in df['class'].unique():
    df_class = grouped.get_group(i)
    df_list.append(df_class)

# Print each per-class frame followed by its max and min
for df_class in df_list:
    print(df_class)
    print('The Maximum number in this class is', df_class['Number'].max())
    print('The minimum number in this class is', df_class['Number'].min())
    print()
The output looks like this:
   class  Name  Number
0      1   tom      23
1      1  nick      12
2      1   jim      24
The Maximum number in this class is 24
The minimum number in this class is 12

   class  Name  Number
3      2   tom      44
4      2  nick      56
5      2   jim      77
The Maximum number in this class is 77
The minimum number in this class is 44

   class  Name  Number
6      3   tom      88
7      3  nick      10
8      3   jim      13
The Maximum number in this class is 88
The minimum number in this class is 10
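If only the max and min per class are needed, and not the sub-frames themselves, a shorter route might be a single aggregation. This is a sketch on the question's data; the name summary is illustrative.

```python
import pandas as pd

data = [
    (1, "tom", 23), (1, "nick", 12), (1, "jim", 24),
    (2, "tom", 44), (2, "nick", 56), (2, "jim", 77),
    (3, "tom", 88), (3, "nick", 10), (3, "jim", 13),
]
df = pd.DataFrame(data, columns=["class", "Name", "Number"])

# One row per class, one column per aggregate of the Number column.
summary = df.groupby("class")["Number"].agg(["max", "min"])
print(summary)

# Look up a single value by class label and aggregate name.
print(summary.loc[2, "max"])
```

The result is indexed by class, so individual values can be read with .loc instead of looping.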
Here are a few approaches:
1 - List comprehension
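# Note: the frame in this answer names the column "Class" (capital C);
# use "class" for the frame from the question.
dfs = [df[df["Class"].eq(x)] for x in df["Class"].unique()]

for part in dfs:  # a separate loop name avoids overwriting df
    max_number = part.Number.max()
    class_number = part["Class"].head(1).squeeze()
    print(f"{part}\n")
    print(f"max_number_class_{class_number} = {max_number}\n")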
   Class  Name  Number
0      1   tom      23
1      1  nick      12
2      1   jim      13

max_number_class_1 = 23

   Class  Name  Number
3      2   tom      44
4      2  nick      56
5      2   jim      77

max_number_class_2 = 77

   Class  Name  Number
6      3   tom      88
7      3  nick      10
8      3   jim      13

max_number_class_3 = 88
2 - Dictionary comprehension
# Map each class value to its sub-frame
df_mapping = {x: df[df["Class"].eq(x)] for x in df["Class"].unique()}

for key, part in df_mapping.items():
    max_number = part.Number.max()
    class_number = part["Class"].head(1).squeeze()
    print(f"{part}\n")
    print(f"max_number_class_{class_number} = {max_number}\n")
   Class  Name  Number
0      1   tom      23
1      1  nick      12
2      1   jim      13

max_number_class_1 = 23

   Class  Name  Number
3      2   tom      44
4      2  nick      56
5      2   jim      77

max_number_class_2 = 77

   Class  Name  Number
6      3   tom      88
7      3  nick      10
8      3   jim      13

max_number_class_3 = 88
3 - NumPy split
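import numpy as np

# np.split cuts the sorted frame into equal-sized pieces, so this only works
# when every class has the same number of rows. The number of pieces is the
# number of distinct classes.
dfs = np.split(df.sort_values("Class", ascending=True), df["Class"].nunique())

for part in dfs:  # a separate loop name avoids overwriting df
    max_number = part.Number.max()
    class_number = part["Class"].head(1).squeeze()
    print(f"{part}\n")
    print(f"max_number_class_{class_number} = {max_number}\n")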
   Class  Name  Number
0      1   tom      23
1      1  nick      12
2      1   jim      13

max_number_class_1 = 23

   Class  Name  Number
3      2   tom      44
4      2  nick      56
5      2   jim      77

max_number_class_2 = 77

   Class  Name  Number
6      3   tom      88
7      3  nick      10
8      3   jim      13

max_number_class_3 = 88
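The split approach assumes every class has the same number of rows. When that does not hold, one possible variation is to sort by class and slice at the positions where the class value changes. The sketch below uses hypothetical data with uneven class sizes; the names ordered, starts and stops are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical data with uneven class sizes (2, 3 and 1 rows).
data = [
    (1, "tom", 23), (1, "nick", 12),
    (2, "tom", 44), (2, "nick", 56), (2, "jim", 77),
    (3, "tom", 88),
]
df = pd.DataFrame(data, columns=["Class", "Name", "Number"])

# Sort so rows of the same class are contiguous; a stable sort keeps
# the original row order within each class.
ordered = df.sort_values("Class", kind="stable")

# Positions where the class value changes mark the start of a new piece.
classes = ordered["Class"].to_numpy()
starts = np.concatenate(([0], np.flatnonzero(np.diff(classes)) + 1))
stops = np.concatenate((starts[1:], [len(ordered)]))

dfs = [ordered.iloc[start:stop] for start, stop in zip(starts, stops)]

for part in dfs:
    class_number = part["Class"].iloc[0]
    print(part)
    print(f"max_number_class_{class_number} = {part['Number'].max()}\n")
```

Slicing with iloc keeps each piece a DataFrame and does not depend on the pieces being the same length.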
Here is one way to do it:
# Group by class, then bind each group to its own module-level name.
# Grouping by the column name as a string keeps the key k a scalar.
dflist = []
for k, v in df.groupby('class'):
    var = 'class' + str(k)
    dflist.append(var)
    globals()[var] = v  # creates the variables class1, class2, class3

dflist
# the created DataFrame names are stored in a list
['class1', 'class2', 'class3']
Accessing them individually gives you the DataFrames:
class1
   class  Name  Number
0      1   tom      23
1      1  nick      12
2      1   jim      24

class2
   class  Name  Number
3      2   tom      44
4      2  nick      56
5      2   jim      77

class3
   class  Name  Number
6      3   tom      88
7      3  nick      10
8      3   jim      13
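Writing into globals() works, but the names only exist at run time, which can make the code harder to follow. A sketch of an alternative, assuming a plain dictionary is acceptable, keeps the same names as keys. The name frames is illustrative.

```python
import pandas as pd

data = [
    (1, "tom", 23), (1, "nick", 12), (1, "jim", 24),
    (2, "tom", 44), (2, "nick", 56), (2, "jim", 77),
    (3, "tom", 88), (3, "nick", 10), (3, "jim", 13),
]
df = pd.DataFrame(data, columns=["class", "Name", "Number"])

# Same names as before ("class1", "class2", ...), stored as dictionary keys
# instead of module-level variables.
frames = {f"class{key}": group for key, group in df.groupby("class")}

print(list(frames))      # the available names
print(frames["class2"])  # look up one sub-frame by name
```

The dictionary can be looped over with frames.items(), so the list of names does not need to be tracked separately.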