Merging data from separate .csv files with Pandas



I want to create two new columns in job_transitions_sample.csv and fill them with the wage data from wage_data_sample.csv for Title 1 and Title 2:

job_transitions_sample.csv:

                     Title 1                    Title 2  Count
0   administrative assistant             office manager     20
1                 accountant                    cashier      1
2                 accountant          financial analyst     22
4                 accountant          senior accountant     23
6           accounting clerk                 bookkeeper     11
7     accounts payable clerk  accounts receivable clerk      8
8   administrative assistant           accounting clerk      8
9   administrative assistant       administrative clerk     12
...

wage_data_sample.csv:

                      title   wage
0                   cashier  17.00
1           sandwich artist  18.50
2                dishwasher  20.00
3                babysitter  20.00
4                   barista  21.50
5               housekeeper  21.50
6    retail sales associate  23.00
7                 bartender  23.50
8                   cleaner  23.50
9                 line cook  23.50
10               pizza cook  23.50
...

I would like the final result to look like this:

                      Title 1             Title 2  Count  Wage of Title 1  Wage of Title 2
0    administrative assistant      office manager     20              NaN              NaN
1                  accountant             cashier      1              NaN              NaN
2                  accountant   financial analyst     22              NaN              NaN
...

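I am thinking of building a dictionary and then iterating over each column, but is there a more elegant built-in solution? This is my code so far: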

wage_data = pd.read_csv('wage_data_sample.csv')
dict = dict(zip(wage_data.title, wage_data.wage))

Use Series.map with a dictionary d. Do not name the variable dict, because that shadows the Python builtin of the same name:

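import pandas as pd

df = pd.read_csv('job_transitions_sample.csv')
wage_data = pd.read_csv('wage_data_sample.csv')
# Build a title -> wage lookup; named d so the builtin dict is not shadowed
d = dict(zip(wage_data.title, wage_data.wage))
# Titles that are missing from d are mapped to NaN
df['Wage of Title 1'] = df['Title 1'].map(d)
df['Wage of Title 2'] = df['Title 2'].map(d)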
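
To see what the map approach produces without the real files, here is a minimal self-contained sketch. It assumes in-memory stand-ins for the two CSV files (built with io.StringIO from a few of the sample rows shown in the question); the variable names jobs_csv and wages_csv are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-ins for the two CSV files,
# using a few rows copied from the samples in the question.
jobs_csv = io.StringIO(
    "Title 1,Title 2,Count\n"
    "administrative assistant,office manager,20\n"
    "accountant,cashier,1\n"
    "accountant,financial analyst,22\n"
)
wages_csv = io.StringIO(
    "title,wage\n"
    "cashier,17.00\n"
    "sandwich artist,18.50\n"
    "dishwasher,20.00\n"
)

df = pd.read_csv(jobs_csv)
wage_data = pd.read_csv(wages_csv)

# title -> wage lookup table
d = dict(zip(wage_data.title, wage_data.wage))

# Series.map looks up each title in d; titles not in d become NaN
df["Wage of Title 1"] = df["Title 1"].map(d)
df["Wage of Title 2"] = df["Title 2"].map(d)

print(df)
```

With these sample rows only "cashier" appears in the wage data, so the second row gets 17.0 in "Wage of Title 2" and every other wage cell is NaN.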

Alternatively, you can use two successive merges, one for each of the two title columns.

For example, let

  • df1: job_transitions_sample.csv

  • df2: wage_data_sample.csv

    df1.merge
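
The snippet above is cut off in the source, so the answerer's exact code is not known. One way the two-merge idea might look is sketched below. It is an assumption, not the original answer: df2 is renamed once per title column so that each left merge adds exactly one wage column, and the two DataFrames are built from in-memory sample rows instead of the real files. The helper names wages1 and wages2 are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the two files (rows copied from the question).
df1 = pd.read_csv(io.StringIO(
    "Title 1,Title 2,Count\n"
    "administrative assistant,office manager,20\n"
    "accountant,cashier,1\n"
    "accountant,financial analyst,22\n"
))
df2 = pd.read_csv(io.StringIO(
    "title,wage\n"
    "cashier,17.00\n"
    "sandwich artist,18.50\n"
    "dishwasher,20.00\n"
))

# Rename the wage table once per title column so the join key and the
# resulting wage column already have the desired names.
wages1 = df2.rename(columns={"title": "Title 1", "wage": "Wage of Title 1"})
wages2 = df2.rename(columns={"title": "Title 2", "wage": "Wage of Title 2"})

# how="left" keeps every row of df1; unmatched titles get NaN wages.
result = (
    df1.merge(wages1, on="Title 1", how="left")
       .merge(wages2, on="Title 2", how="left")
)

print(result)
```

Note that if a title appeared more than once in df2, a merge would duplicate the matching rows of df1, so this sketch assumes the titles in the wage file are unique.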
