Pandas DataFrame for mapping



I have two pandas DataFrames:

DF1 =
   Max Score                           Sub parameters Grading Score
0          3                                 Greeting     Yes     3
1          7                                Listening     Yes     7
2          7            Comprehension and Application     Yes     7
3          5                     Appropriate Response     Yes     5
4          4                             Paraphrasing     Yes     4
5          7  Educating customer/Setting Expectations     Yes     7
6          4                          Professionalism     Yes     4
7          4                                  Empathy     Yes     4
8          4                      Ownership/Assurance     Yes     4
9          4                  Followed Hold Procedure     Yes     4
10         3                                  Closure     Yes     3
11         8         Sentence construction/Word order     Yes     8
12         8                   Pronunciation/Chunking     Yes     8
13         8               Fluency & Lexical resource     Yes     8
14         8                        Tone & Intonation     Yes     8
15         8                           Rate of Speech     Yes     8
16         8                                  Diction     Yes     8
DF2=
0            Grade  3  7.0  4.0  5.0  2.0  8.0
0              Yes  3  7.0  4.0  5.0  2.0  8.0
1  Yes/Improvement  1  4.0  2.0  3.0  1.0  4.0
2               No  0  0.0  0.0  0.0  0.0  0.0

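I want to map these two DataFrames. For each row in df1, I want to look up its Grading and Score in df2 and, wherever they match, fetch the corresponding values. For example, in DF1 the Grading value is "Yes" and the Score is "3"; I want to search for this in df2, fetch "1" for "Yes/Improvement" and "0" for "No", and store them as separate columns: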
Sub parameters  Grading_Yes  Score_Yes  Grading_Y/I  Score_Y/I  Grading_No  Score_No
Greeting        Yes          3          Y/I          1          No          0

To make lookups easier, I reshaped df2 as follows:

import numpy as np
import pandas as pd
txt = '''0            Grade  3  7.0  4.0  5.0  2.0  8.0
0              Yes  3  7.0  4.0  5.0  2.0  8.0
1  Yes/Improvement  1  4.0  2.0  3.0  1.0  4.0
2               No  0  0.0  0.0  0.0  0.0  0.0'''
arr = txt.split()
arr = np.array(arr).reshape(4,-1)
df2 = pd.DataFrame(arr[1:,2:].T, columns=arr[1:,1]).astype(np.float32)
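
To make the result of that reshaping concrete, here is a minimal sketch that builds what should be the same table explicitly: one column per grade and one row per score tier, with the values copied from DF2 in the question. The helper name lookup_tier is hypothetical and only illustrates how a single lookup against this layout works.

```python
import pandas as pd

# Explicit construction of the reshaped table: one column per grade,
# one row per score tier (values copied from DF2 in the question).
df2 = pd.DataFrame({
    'Yes': [3.0, 7.0, 4.0, 5.0, 2.0, 8.0],
    'Yes/Improvement': [1.0, 4.0, 2.0, 3.0, 1.0, 4.0],
    'No': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
}).astype('float32')

def lookup_tier(grade, score):
    """Hypothetical helper: return the first row whose `grade` column equals `score`."""
    matches = df2.loc[df2[grade] == score]
    return None if matches.empty else matches.iloc[0].to_dict()

tier = lookup_tier('Yes', 7)     # the row for the 7-point tier
missing = lookup_tier('Yes', 6)  # no such tier, so None
print(tier)
```

With this layout, finding the "Yes/Improvement" and "No" scores for a given "Yes" score is a single boolean filter on one column.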

Then do the mapping with the apply method:

def func(row):
    grade = row.Grading
    score = row.Score
    vals = df2.loc[df2[grade]==score].values[0]
    return ('Yes', vals[0], 'Y/I', vals[1], 'No', vals[2])

df1[['Grading_Yes', 'Score_Yes', 'Grading_Y/I', 'Score_Y/I', 'Grading_No', 'Score_No']] = df1.apply(func,axis=1,result_type='expand')
# drop unnecessary columns
df1.drop(['Grading','Score'],inplace=True,axis=1)
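
One thing to watch with the function above: .values[0] raises an IndexError when no row of df2 matches. The sketch below is a possible variant, not part of the original answer, that fills NaN instead. It uses a three-row stand-in for df1 in which the last row and its score of 6 are made up purely to trigger the no-match case, and the name safe_func is hypothetical.

```python
import numpy as np
import pandas as pd

# Stand-in for df1; the third row is invented to show the no-match case.
df1 = pd.DataFrame({
    'Sub parameters': ['Greeting', 'Listening', 'Made-up row'],
    'Grading': ['Yes', 'Yes', 'Yes'],
    'Score': [3, 7, 6],
})

# Reshaped df2: one column per grade, one row per score tier.
df2 = pd.DataFrame({
    'Yes': [3.0, 7.0, 4.0, 5.0, 2.0, 8.0],
    'Yes/Improvement': [1.0, 4.0, 2.0, 3.0, 1.0, 4.0],
    'No': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
})

new_cols = ['Grading_Yes', 'Score_Yes', 'Grading_Y/I', 'Score_Y/I',
            'Grading_No', 'Score_No']

def safe_func(row):
    """Like func, but returns NaN scores when df2 has no matching tier."""
    matches = df2.loc[df2[row['Grading']] == row['Score']]
    if matches.empty:
        return ('Yes', np.nan, 'Y/I', np.nan, 'No', np.nan)
    vals = matches.values[0]
    return ('Yes', vals[0], 'Y/I', vals[1], 'No', vals[2])

expanded = df1.apply(safe_func, axis=1, result_type='expand')
expanded.columns = new_cols
result = pd.concat([df1[['Sub parameters']], expanded], axis=1)
print(result)
```

Building the new columns in a separate frame and concatenating avoids mutating df1 in place, which makes the step easier to rerun.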

Use DataFrame.join with df2 transposed and Grade converted to the index (this uses the original df2, which still has a Grade column):

df = df1.join(df2.set_index('Grade').T.rename(float).add_prefix('Score_'), on='Score')

print (df)
    Max Score                           Sub parameters Grading  Score
0           3                                 Greeting     Yes      3   
1           7                                Listening     Yes      7   
2           7            Comprehension and Application     Yes      7   
3           5                     Appropriate Response     Yes      5   
4           4                             Paraphrasing     Yes      4   
5           7  Educating customer/Setting Expectations     Yes      7   
6           4                          Professionalism     Yes      4   
7           4                                  Empathy     Yes      4   
8           4                      Ownership/Assurance     Yes      4   
9           4                  Followed Hold Procedure     Yes      4   
10          3                                  Closure     Yes      3   
11          8         Sentence construction/Word order     Yes      8   
12          8                   Pronunciation/Chunking     Yes      8   
13          8               Fluency & Lexical resource     Yes      8   
14          8                        Tone & Intonation     Yes      8   
15          8                           Rate of Speech     Yes      8   
16          8                                  Diction     Yes      8   
    Score_Yes  Score_Yes/Improvement  Score_No
0         3.0                    1.0       0.0  
1         7.0                    4.0       0.0  
2         7.0                    4.0       0.0  
3         5.0                    3.0       0.0  
4         4.0                    2.0       0.0  
5         7.0                    4.0       0.0  
6         4.0                    2.0       0.0  
7         4.0                    2.0       0.0  
8         4.0                    2.0       0.0  
9         4.0                    2.0       0.0  
10        3.0                    1.0       0.0  
11        8.0                    4.0       0.0  
12        8.0                    4.0       0.0  
13        8.0                    4.0       0.0  
14        8.0                    4.0       0.0  
15        8.0                    4.0       0.0  
16        8.0                    4.0       0.0  
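
For anyone who wants to run the join approach without the full data, here is a self-contained sketch on a three-row subset of df1. It assumes the column labels of df2 are strings such as '3' and '7.0' (the question does not show their type), which is consistent with the rename(float) step that turns them into numbers matching Score.

```python
import pandas as pd

# Three-row subset of df1 from the question.
df1 = pd.DataFrame({
    'Sub parameters': ['Greeting', 'Listening', 'Appropriate Response'],
    'Grading': ['Yes', 'Yes', 'Yes'],
    'Score': [3, 7, 5],
})

# Original layout of df2; string column labels are an assumption.
df2 = pd.DataFrame(
    [['Yes', 3, 7.0, 4.0, 5.0, 2.0, 8.0],
     ['Yes/Improvement', 1, 4.0, 2.0, 3.0, 1.0, 4.0],
     ['No', 0, 0.0, 0.0, 0.0, 0.0, 0.0]],
    columns=['Grade', '3', '7.0', '4.0', '5.0', '2.0', '8.0'],
)

# Rows become score tiers and columns become grades, then the
# index labels are converted to floats so they can match Score.
lookup = df2.set_index('Grade').T.rename(float).add_prefix('Score_')

joined = df1.join(lookup, on='Score')
print(joined)
```

Note that this join is keyed on Score alone, so it implicitly treats every Score as a "Yes"-tier value. That holds for the data in the question, where Grading is always "Yes", but rows with another Grading would need the per-row lookup instead.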
