Slicing a pandas DataFrame by the closest value



I have a pandas DataFrame that looks like this:

   age   score
5   72  99.424
6   70  99.441
7   69  99.442
8   67  99.443
9   71  99.448
mean score: 99.4396

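The mean is the mean of the entire score column. How can I slice the DataFrame to get the age values whose score is closest to the mean, for example within +/- 0.001 of it?

In this example: 67 and 69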

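# Mean of the whole score column
mean = df['score'].mean()
# Ages whose score lies within +/- 0.001 of the mean (bounds inclusive)
df.loc[df['score'].between(mean - .001, mean + .001), 'age']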
import pandas as pd
# Alternative: store each row's absolute distance from the mean in a new column
df = pd.DataFrame({"age": [72, 70, 69, 67, 71], "score": [99.424, 99.441, 99.442, 99.443, 99.448]})
df["diff"] = (df["score"] - df["score"].mean()).abs()

This gives:

   age   score    diff
0   72  99.424  0.0156
1   70  99.441  0.0014
2   69  99.442  0.0024
3   67  99.443  0.0034
4   71  99.448  0.0084

Then filter on that difference:

x = 0.002  # maximum allowed distance from the mean
ages = df.loc[df["diff"] < x, "age"].tolist()
[Out]: [70]

Here x is the parameter that sets the maximum allowed difference from the mean.
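The tolerance filter can also be wrapped in a small helper so the threshold is passed in explicitly. This is only a minimal sketch that assumes the column names from the example; `ages_within` is a hypothetical name for illustration, not a pandas function.

```python
import pandas as pd


def ages_within(df, tol, value_col="score", label_col="age"):
    """Return label_col values whose value_col is strictly within tol of the column mean.

    Hypothetical helper for illustration; the column names are assumptions.
    """
    # Absolute distance of each row's value from the column mean
    diff = (df[value_col] - df[value_col].mean()).abs()
    # Boolean mask selects the rows, the column label selects the values
    return df.loc[diff < tol, label_col].tolist()


df = pd.DataFrame({"age": [72, 70, 69, 67, 71],
                   "score": [99.424, 99.441, 99.442, 99.443, 99.448]})
result = ages_within(df, 0.002)
print(result)
```

On the five sample rows a tolerance of 0.002 keeps only age 70, while widening it to 0.003 also keeps age 69.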

Edit: By the way, we can't reproduce your exact result because we don't have your entire score column.
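Because a suitable tolerance depends on the data, one alternative is to ask for the n rows closest to the mean instead of fixing a threshold. The sketch below uses `Series.nsmallest` on the absolute differences; picking n = 2 is an assumption taken from the two ages the question expects, and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"age": [72, 70, 69, 67, 71],
                   "score": [99.424, 99.441, 99.442, 99.443, 99.448]})

# Absolute distance of each score from the column mean
dist = (df["score"] - df["score"].mean()).abs()

# Index labels of the 2 smallest distances, nearest first (n = 2 is an assumption)
closest_idx = dist.nsmallest(2).index

# Look up the matching ages by label
closest_ages = df.loc[closest_idx, "age"].tolist()
print(closest_ages)
```

With only the five sample rows this selects ages 70 and 69, which again differs from the question's 67 and 69.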
