Why does the "score" column stay 0?



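I have some very simple code that I can't get to work. It is meant to join two columns, remove duplicates, look each country up in a dictionary, and sum the scores. In the debugger I can see that the line below

row["score"] += localscore

computes the correct value, so the problem seems to be one of visibility, but I don't know how to fix it. Can anyone help?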

import pandas as pd
my_df = pd.DataFrame({'countries': ["UK,DE", "DE", "DE"],
                      "other_countries": ["DE", "PL", "PL"]})
scores = {
    "UK": 10.0,
    "DE": 20.0,
    "PL": 30.0
}
my_df["joined_countries_without_duplicates"] = my_df["countries"] + "," + my_df["other_countries"]
my_df["joined_countries_without_duplicates"] = my_df["joined_countries_without_duplicates"].str.split(",")
my_df["score"] = 0
for index, row in my_df.iterrows():
    row["joined_countries_without_duplicates"] = list(set(row["joined_countries_without_duplicates"]))
    localscore = 0
    for country in row["joined_countries_without_duplicates"]:
        localscore += scores[country]
    row["score"] += localscore

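Modifying row does not modify my_df: iterrows() yields each row as a separate Series, so writes to it never reach the DataFrame here. You would need to do something like append each row's score to a list and assign that list to my_df once the loop is done.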
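
A minimal sketch of that list-based approach, assuming the same my_df and scores as in the question. The names joined and row_scores are introduced here for illustration only:

```python
import pandas as pd

my_df = pd.DataFrame({"countries": ["UK,DE", "DE", "DE"],
                      "other_countries": ["DE", "PL", "PL"]})
scores = {"UK": 10.0, "DE": 20.0, "PL": 30.0}

# Join the two columns and split into a list of country codes per row.
joined = (my_df["countries"] + "," + my_df["other_countries"]).str.split(",")

row_scores = []  # hypothetical helper list: one total per row
for countries in joined:
    unique_countries = set(countries)  # drop duplicates within the row
    row_scores.append(sum(scores[c] for c in unique_countries))

# Assign once, after the loop, so the values actually land in my_df.
my_df["score"] = row_scores
```

The loop only reads from the DataFrame; the single assignment at the end is what writes the result back.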

However, looping over rows like this is usually not a good approach in pandas.

You can use explode, groupby and map to achieve this instead:

import pandas as pd
my_df = pd.DataFrame({'countries': ["UK,DE", "DE", "DE"],
                      "other_countries": ["DE", "PL", "PL"]})
scores = {
    "UK": 10.0,
    "DE": 20.0,
    "PL": 30.0
}
my_df["joined_countries_without_duplicates"] = my_df["countries"] + "," + my_df["other_countries"]
my_df["joined_countries_without_duplicates"] = my_df["joined_countries_without_duplicates"].str.split(",").apply(set)
my_df['score'] = (my_df['joined_countries_without_duplicates'].explode()
                  .map(scores)
                  .groupby(level=0)
                  .sum())
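
To see why the chain above works, each step can be inspected on its own. This is only a sketch on the same data; the intermediate names unique, exploded, mapped and total are introduced here for illustration:

```python
import pandas as pd

my_df = pd.DataFrame({"countries": ["UK,DE", "DE", "DE"],
                      "other_countries": ["DE", "PL", "PL"]})
scores = {"UK": 10.0, "DE": 20.0, "PL": 30.0}

# One set of unique country codes per row.
unique = (my_df["countries"] + "," + my_df["other_countries"]).str.split(",").apply(set)

# explode: one row per country; the original index label is repeated.
exploded = unique.explode()

# map: replace each country code with its score from the dict.
mapped = exploded.map(scores)

# groupby(level=0): add up the scores that share an original index label.
total = mapped.groupby(level=0).sum()
print(total)
```

Because total keeps the original index, assigning it to my_df['score'] lines each sum up with its row.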
