I have a pandas DataFrame that looks like this:
age score
5 72 99.424
6 70 99.441
7 69 99.442
8 67 99.443
9 71 99.448
mean score: 99.4396
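The mean is the mean of the entire score column. How can I slice out/get the age values whose score is closest to the mean score, say within +/- 0.001 of it?
In this example: 67 and 69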
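# Mean of the whole score column
mean = df['score'].mean()
# Keep the ages whose score lies within +/- 0.001 of the mean (bounds inclusive)
df.loc[df['score'].between(mean - .001, mean + .001), 'age']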
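For anyone who wants to run the `between` filter above end to end, here is a minimal sketch. It assumes only the five rows shown in the question (not the full score column), and `ages_within` is a hypothetical helper name introduced just for this illustration.

```python
import pandas as pd

# Only the five rows shown in the question; the asker's full column is not available.
df = pd.DataFrame(
    {"age": [72, 70, 69, 67, 71],
     "score": [99.424, 99.441, 99.442, 99.443, 99.448]},
    index=[5, 6, 7, 8, 9],
)

def ages_within(frame, tol):
    """Hypothetical helper: ages whose score is within +/- tol of the mean score."""
    mean = frame["score"].mean()
    # Series.between is inclusive on both ends by default.
    mask = frame["score"].between(mean - tol, mean + tol)
    return frame.loc[mask, "age"].tolist()

ages_001 = ages_within(df, 0.001)
ages_002 = ages_within(df, 0.002)
print(ages_001)  # [] for these five rows
print(ages_002)  # [70]
```

With just these five rows nothing falls within 0.001 of the mean, so the tolerance has to be widened before any age comes back.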
import pandas as pd
df = pd.DataFrame({"age": [72, 70, 69, 67, 71], "score": [99.424, 99.441, 99.442, 99.443, 99.448]})
# Absolute distance of each score from the column mean
df["diff"] = (df["score"] - df["score"].mean()).abs()
This gives:
age score diff
0 72 99.424 0.0156
1 70 99.441 0.0014
2 69 99.442 0.0024
3 67 99.443 0.0034
4 71 99.448 0.0084
Then:
x = 0.002
ages = list(df.loc[df["diff"] < x]["age"])
[Out]: [70]
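x acts as the parameter for the maximum allowed difference from the mean.
Edit: By the way, we can't reproduce the same result as you, because we don't have your entire score column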
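If "closest to the mean" is meant as "the n nearest rows" rather than a fixed tolerance, one possible variation is to rank the rows by distance and take the smallest few. This is only a sketch on the five sample rows; `closest_ages` is a hypothetical name, and this is a different technique from the threshold filter above.

```python
import pandas as pd

# Same five sample rows as above; not the asker's full score column.
df = pd.DataFrame({"age": [72, 70, 69, 67, 71],
                   "score": [99.424, 99.441, 99.442, 99.443, 99.448]})

def closest_ages(frame, n):
    """Hypothetical helper: ages of the n rows whose score is nearest the mean."""
    diff = (frame["score"] - frame["score"].mean()).abs()
    # nsmallest returns the n smallest distances, nearest first.
    nearest = diff.nsmallest(n).index
    return frame.loc[nearest, "age"].tolist()

closest_two = closest_ages(df, 2)
print(closest_two)  # [70, 69]
```

This always returns n ages, however far they are from the mean, so it suits the case where a fixed cutoff such as 0.001 would return nothing.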