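I want to replace certain values in a CSV file with randomly generated values, drawn separately for each row.
For example, change "Not Available" to either "Male" or "Female".
Sample: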
Number Sex
0 Female
1 Male
2 Not Available
3 Male
4 Not Available
After the random replacement:
Number Sex
0 Female
1 Male
2 Female
3 Male
4 Male
import pandas as pd
import random

def RandomSex():
    return random.choice(['Male', 'Female'])

df = pd.read_csv(r'data.csv')
df2 = df.loc[:, 'Sex']
print(df2)
df.loc[(df.Sex == 'Not Available'), 'Sex'] = RandomSex()
print(df2)
But this changes every "Not Available" to the same value: all "Male" or all "Female".
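RandomSex() is called only once, so its single result is assigned to every matching row. You can instead first count the "Not Available" entries and then draw that many values from the list with random.choices, rather than picking just one (which is what random.choice does):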
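# Boolean mask of the rows that need a replacement
not_availables = df.Sex.eq("Not Available")
# Number of rows to fill
num_not_availables = not_availables.sum()
choice_list = ["Male", "Female"]
# Draw one independent value per row (sampling with replacement)
new_values = random.choices(choice_list, k=num_not_availables)
df.loc[not_availables, "Sex"] = new_values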
to get (for example):
Number Sex
0 Female
1 Male
2 Male
3 Male
4 Female
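
For anyone who wants to try this without a data.csv on disk, here is a minimal self-contained sketch of the same idea. The helper name fill_not_available, its parameters, and the seed argument are my own additions for illustration and are not part of the original code; the DataFrame is built in memory from the sample in the question.

```python
import random

import pandas as pd


def fill_not_available(df, column="Sex", placeholder="Not Available",
                       options=("Male", "Female"), seed=None):
    """Return a copy of df where each placeholder gets its own random draw.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    rng = random.Random(seed)  # local generator so a seed makes runs repeatable
    out = df.copy()
    mask = out[column].eq(placeholder)
    count = int(mask.sum())
    if count:
        # One independent draw per masked row (sampling with replacement)
        out.loc[mask, column] = rng.choices(options, k=count)
    return out


# In-memory stand-in for data.csv, taken from the sample in the question
df = pd.DataFrame({
    "Number": [0, 1, 2, 3, 4],
    "Sex": ["Female", "Male", "Not Available", "Male", "Not Available"],
})

filled = fill_not_available(df, seed=0)
print(filled)
```

Because each row gets its own draw, two "Not Available" rows can still end up with the same value by chance; they are just no longer forced to match. Passing a seed only makes the result repeatable between runs.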