id MaxS MaxA MaxD
43290 9.511364 2.70 0.27
43290 7.547727 2.56 0.34
43290 7.465909 2.66 0.48
43290 7.404545 3.90 0.60
43290 7.772727 2.38 0.11
43290 7.936364 2.62 0.97
43290 7.650000 4.20 1.64
43290 3.088636 1.79 0.06
43290 4.377273 2.19 0.05
43290 6.750000 4.65 1.90
43290 5.461364 2.82 0.19
43290 7.363636 4.13 1.48
43290 11.270455 3.72 0.41
43290 10.186364 3.88 1.17
43290 3.109091 2.05 0.02
43290 7.834091 3.38 0.01
43290 3.252273 2.31 0.03
43290 7.854545 3.00 0.70
43290 9.756818 3.26 0.54
43290 6.954545 2.93 0.24
43291 4.070455 1.21 0.21
43291 6.034091 3.42 0.42
43291 8.018182 2.41 0.66
43291 7.956818 3.55 0.62
43291 8.161364 2.74 0.64
43291 8.263636 4.11 0.13
43291 2.618182 1.80 0.08
43291 2.168182 2.12 0.04
43291 6.095455 3.04 0.11
43291 9.061364 2.91 0.33
45880 5.236364 2.43 0.15
45880 14.972727 4.86 0.23
45880 9.593182 4.48 1.36
45880 4.459091 3.67 0.14
45880 17.325000 4.21 0.44
45880 11.086364 3.30 1.00
45880 5.277273 2.25 0.12
45880 7.547727 2.92 0.34
45880 11.270455 3.33 0.03
45880 13.990909 3.21 0.50
45880 9.122727 3.86 1.14
45880 6.790909 4.24 1.30
45880 8.100000 4.31 0.80
45880 5.809091 3.22 0.94
45881 6.565909 3.50 0.86
45881 10.452273 4.64 0.85
45881 7.281818 3.47 0.71
45881 9.347727 3.67 0.02
45881 14.318182 3.97 0.51
45881 5.481818 3.99 0.21
45881 7.425000 3.93 1.65
45881 8.836364 3.50 0.26
45881 5.277273 2.21 0.57
45881 12.865909 4.38 0.94
45881 7.200000 2.86 0.45
45881 7.138636 4.39 1.18
45881 8.815909 4.34 0.34
45881 9.490909 4.53 0.28
45881 17.652273 4.59 0.05
45881 11.106818 2.64 0.31
45881 9.511364 3.83 1.14
45881 8.284091 3.90 0.20
45881 9.306818 3.54 0.22
45881 5.195455 2.66 0.14
45881 3.477273 2.50 0.16
45881 7.179545 3.70 0.08
45881 8.447727 3.19 0.32
45881 4.990909 2.32 0.86
45881 16.465909 4.28 0.25
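Hi everyone. As you can see, I have the table above in a pandas DataFrame. For each id, I want to take the third-largest through seventh-largest values of MaxS, MaxA and MaxD and compute their mean. I know you can use head or nlargest to get the largest values in a column, but I don't know how to get only the third through seventh largest for each id. Also, pandas raises an error if you try nlargest on multiple columns at once, so I'm not sure how to approach this.
I would really appreciate it if someone could help me find, for each id, the mean of the third-largest through seventh-largest values in the three columns (MaxS, MaxA and MaxD).
Thanks!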
Converting the DataFrame columns to NumPy arrays works:
import pandas as pd
import numpy as np

def _get_average(column):
    """ sort descending and average the 3rd to 7th largest values """
    # indices 2..6 of the descending array are the 3rd to 7th largest
    return np.mean(np.sort(column.to_numpy())[::-1][2:7])

def average_csv():
    """ read whitespace-separated data and average the desired fields (whole column, not per id) """
    my_df = pd.read_csv("my_csv.csv", delimiter=r"\s+")
    return _get_average(my_df["MaxS"]), _get_average(my_df["MaxA"]), _get_average(my_df["MaxD"])

if __name__ == "__main__":
    print("MaxS: {}, MaxA: {}, MaxD: {}".format(*average_csv()))
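The question asks for the result per id, which the whole-column approach does not give on its own. One way to get it, as a minimal sketch, is to group by `id` and apply `nlargest` to each column inside each group. The helper names `mean_of_ranked` and `per_id_means` are made up for illustration, and the sketch assumes the DataFrame has the columns `id`, `MaxS`, `MaxA` and `MaxD` shown in the question.

```python
import pandas as pd


def mean_of_ranked(column, first=3, last=7):
    """Mean of the first-th to last-th largest values of a Series (1-based, inclusive)."""
    # nlargest returns up to `last` values sorted in descending order;
    # dropping the first `first - 1` of them leaves ranks first..last.
    return column.nlargest(last).iloc[first - 1:].mean()


def per_id_means(df, cols=("MaxS", "MaxA", "MaxD")):
    """Apply mean_of_ranked to each column separately within each id group."""
    return df.groupby("id")[list(cols)].agg(mean_of_ranked)
```

Calling `per_id_means(my_df)` should give one row per id and one column per measure. Because `nlargest` is applied to one column at a time inside `agg`, this sidesteps the multi-column `nlargest` problem from the question. A group with fewer than seven rows is averaged over whatever ranks from the third onward exist, and a group with fewer than three rows gives NaN.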