I have this dataset:
+------------------+------+-----------------------+
| A                | B    | C                     |
+------------------+------+-----------------------+
| Joseph M. Acaba  | 2004 | Geology               |
| Loren W. Act     |      | Solar Physics         |
| James C. Adamson | 1984 | Aerospace Engineering |
+------------------+------+-----------------------+
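For each row of column C, I want to check whether the phrase contains the word "Engineering" or "Geology". I would like the result to go in a new column ("D"), as in the example below: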
+------------------+------+-----------------------+-------+
| A                | B    | C                     | D     |
+------------------+------+-----------------------+-------+
| Joseph M. Acaba  | 2004 | Geology               | True  |
| Loren W. Act     |      | Solar Physics         | False |
| James C. Adamson | 1984 | Aerospace Engineering | True  |
+------------------+------+-----------------------+-------+
I tried:
check = pd['Undergraduate Major'].str.contains('Engineering|Geology')
print(check)
and got this result:
0 False
1 True
2 True
3 False
4 True
...
352 True
353 False
354 False
355 True
356 False
But I want the result to be a new column that contains only "True" and "False".
If the data frame is named df, do the following:
df['NewColumnName'] = df['Undergraduate Major'].str.contains('Engineering|Geology')
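As a minimal sketch of the same assignment, here it is applied to the three sample rows from the question. The column names A, B, C and D come from the example tables, and the third major is assumed to be spelled "Aerospace Engineering"; the real data uses the column 'Undergraduate Major' instead.

```python
import pandas as pd

# Sample rows taken from the question's example table (B is missing for one row).
df = pd.DataFrame({
    "A": ["Joseph M. Acaba", "Loren W. Act", "James C. Adamson"],
    "B": [2004, None, 1984],
    "C": ["Geology", "Solar Physics", "Aerospace Engineering"],
})

# "|" inside the pattern means "or", because str.contains treats it as a regex by default.
df["D"] = df["C"].str.contains("Engineering|Geology")

print(df)
# Column D holds True, False, True for the three rows.
```

Assigning the result to df["D"] is what turns the printed Series into a column of the data frame.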
I recommend not using pd as the name of the data frame, because it is the conventional alias for pandas, as in import pandas as pd.
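One caveat, in case the column has missing values: str.contains returns a missing value rather than False for those rows, so the new column would not be strictly True/False. The sketch below assumes that may apply to the real data (the question does not say) and uses the na and case parameters of str.contains; the Series majors is a made-up stand-in for df['Undergraduate Major'].

```python
import pandas as pd

# Hypothetical stand-in for df['Undergraduate Major'], with one missing entry
# and one lowercase entry.
majors = pd.Series(["Geology", None, "aerospace engineering"])

# na=False maps missing entries to False; case=False ignores letter case.
mask = majors.str.contains("Engineering|Geology", case=False, na=False)

print(mask.tolist())
# [True, False, True]
```

Whether to ignore case is a judgment call; drop case=False if only the exact capitalization should match.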