How to create a scatter plot with a linear trend from multiple datasets in an array



I want to create a scatter plot with 'age' and 'income' on the x- and y-axes, with the points also separated by gender (m or f):

vals = [[39, 50907.00500830538, 'm'], [71, 58137.09607273632, 'm'], [27, 44311.25956375814, 'f'], [50, 53194.40398297405, 'f'], [41, 48227.6226667045, 'f'], [38, 51081.77610221989, 'f'], [25, 49202.743772155154, 'f'], [45, 46958.227355122865, 'm'], [46, 54815.07514726054, 'm'], [25, 46734.0863416376, 'f'], [44, 52252.36769285552, 'm'], [70, 58453.80544624214, 'f']]

This is the code I have so far:

import matplotlib.pyplot as plt
import numpy as np

ages = [x[0] for x in vals]
incomes = [x[1] for x in vals]
fig, ax = plt.subplots()
male_data = [(a,i) for a,i,g in vals if g == 'male']
male_ages = [a for a,i in male_data]
male_incomes = [i for a,i in male_data]
ax.scatter(male_ages, male_incomes, color='blue', label='male')
female_data = [(a,i) for a,i,g in vals if g == 'female']
female_ages = [a for a,i in female_data]
female_incomes = [i for a,i in female_data]
ax.scatter(female_ages, female_incomes, color='red', label='female')
z = np.polyfit(x, y, 1)
ax.legend()
ax.set_xlabel('age')
ax.set_ylabel('income')

I also tried to add a linear trend with this code, but without success:

p = np.poly1d(z)

You are checking for the wrong strings. The data labels gender as 'f' or 'm', but you are testing for the full words.

female_data = [(a,i) for a,i,g in vals if g == 'f']
male_data = [(a,i) for a,i,g in vals if g == 'm']
# Use ages and incomes in place of x and y.
# np.polyfit returns the coefficients from the highest power down to 0 (here: slope, then intercept).
z = np.polyfit(ages, incomes, 1)
# Compose the trend function.
f = lambda x: (x * z[0]) + z[1]
# Create the range of x values for the line.
x = np.linspace(20, 80, 10)
# Use the function to calculate y for each x in the range.
y = [f(a) for a in x]
# Plot the line.
ax.plot(x, y)
ax.set_xlabel('age')
ax.set_ylabel('income')
plt.show()
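
If what you want is one trend line per gender rather than a single line through all points, the same idea can be applied to each group separately. The following is only a sketch under a few assumptions: the helper names `fit_trends` and `plot_by_gender` are made up for illustration, the colors follow the question's choice, and the `Agg` backend is selected only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

vals = [[39, 50907.00500830538, 'm'], [71, 58137.09607273632, 'm'],
        [27, 44311.25956375814, 'f'], [50, 53194.40398297405, 'f'],
        [41, 48227.6226667045, 'f'], [38, 51081.77610221989, 'f'],
        [25, 49202.743772155154, 'f'], [45, 46958.227355122865, 'm'],
        [46, 54815.07514726054, 'm'], [25, 46734.0863416376, 'f'],
        [44, 52252.36769285552, 'm'], [70, 58453.80544624214, 'f']]


def fit_trends(rows):
    """Return {gender: (slope, intercept)} from a degree-1 least-squares fit."""
    fits = {}
    for gender in sorted({g for _, _, g in rows}):
        ages = [a for a, _, g in rows if g == gender]
        incomes = [i for _, i, g in rows if g == gender]
        # polyfit returns the highest power first: slope, then intercept
        slope, intercept = np.polyfit(ages, incomes, 1)
        fits[gender] = (slope, intercept)
    return fits


def plot_by_gender(rows, colors=None):
    """Scatter each gender in its own color and draw its own trend line."""
    colors = colors or {'m': 'blue', 'f': 'red'}
    fits = fit_trends(rows)
    fig, ax = plt.subplots()
    for gender, (slope, intercept) in fits.items():
        ages = np.array([a for a, _, g in rows if g == gender])
        incomes = [i for _, i, g in rows if g == gender]
        ax.scatter(ages, incomes, color=colors.get(gender), label=gender)
        # draw the line only across the age range observed for this group
        xs = np.linspace(ages.min(), ages.max(), 50)
        ax.plot(xs, slope * xs + intercept, color=colors.get(gender))
    ax.set_xlabel('age')
    ax.set_ylabel('income')
    ax.legend()
    return fig, ax, fits


fig, ax, fits = plot_by_gender(vals)
```

To view the result, call `fig.savefig(...)` with a file name of your choice, or drop the `matplotlib.use("Agg")` line and finish with `plt.show()`.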