I need to replace ' \n, *, ' with '\n *'.
I tried df['Course_content']=df['Course_content'].replace(' \n, *, ','\n *',regex=True)
but it does not work for me:
>>> df['Course_content'][0]
'The syllabus for this course will cover the following:, \n, *, The nature and purpose of cost and management accounting, \n, *, Source documents and coding, \n, *, Cost classification and measuring, \n, *, Recording costs, \n, *, Spreadsheets'
>>> df['Course_content']=df['Course_content'].replace(' \n, *, ','\n *',regex=True)
>>> df['Course_content'][0]
'The syllabus for this course will cover the following:, \n, *, The nature and purpose of cost and management accounting, \n, *, Source documents and coding, \n, *, Cost classification and measuring, \n, *, Recording costs, \n, *, Spreadsheets'
>>>
I also tried the following code, but it does not work for me either:
d = {
'Not Mentioned':'',
"\r\n": "\n",
"\r": "\n",
'\u00a0':' ',
' \n, *,': "\n * ",
' \n,':'\n',
}
df=df.replace(d.keys(),d.values(),regex=True)
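You can put both arguments in r-strings (raw strings) and add a \ before the * in the first argument. This is necessary because \ and * are special metacharacters in regular expressions, so you have to "escape" them to their literal values with an extra \ and/or an r-string. Unescaped, * is a quantifier meaning "zero or more of the preceding character", so the pattern looks for a comma, optional spaces, and another comma, and never matches the literal asterisk in your data.
You can use: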
df['Course_content'] = df['Course_content'].replace(r' \n, \*, ', r'\n *', regex=True)
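To see why the escape matters, here is a minimal sketch that uses Python's re module directly instead of pandas. The shortened sample text and the variable names are illustrative assumptions, not taken from the question.

```python
import re

# Shortened sample in the same shape as the question's data (illustrative).
text = "following:, \n, *, The nature"

# Unescaped: "*" quantifies the preceding space, so the pattern means
# "comma, zero or more spaces, comma, space" and the literal "*" never matches.
unescaped = re.sub(' \n, *, ', '\n *', text)

# Escaped: r"\*" matches a literal asterisk, and r"\n" matches a newline.
escaped = re.sub(r' \n, \*, ', r'\n *', text)

print(repr(unescaped))  # the text comes back unchanged
print(repr(escaped))
```

The first call returns the text unchanged, which is the behaviour described in the question; the second performs the intended replacement.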
Demo:
import pandas as pd

data = {'Course_content': ['The syllabus for this course will cover the following:, \n, *, The nature and purpose of cost and management accounting, \n, *, Source documents and coding, \n, *, Cost classification and measuring, \n, *, Recording costs, \n, *, Spreadsheets']}
df = pd.DataFrame(data)
df['Course_content'] = df['Course_content'].replace(r' \n, \*, ', r'\n *', regex=True)
Result:
print(repr(df['Course_content'][0]))
'The syllabus for this course will cover the following:,\n *The nature and purpose of cost and management accounting,\n *Source documents and coding,\n *Cost classification and measuring,\n *Recording costs,\n *Spreadsheets'
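If no regular-expression features are needed at all, an alternative is to do a literal substring replacement, which avoids escaping entirely. The following is a sketch under that assumption, using Series.str.replace with regex=False on a shortened, made-up sample row:

```python
import pandas as pd

# Shortened sample row in the same shape as the question's data (illustrative).
df = pd.DataFrame({'Course_content': ['Intro:, \n, *, Recording costs, \n, *, Spreadsheets']})

# regex=False treats the search string literally, so "*" needs no escaping
# and '\n' in both arguments is an ordinary newline character.
df['Course_content'] = df['Course_content'].str.replace(' \n, *, ', '\n *', regex=False)

print(repr(df['Course_content'][0]))
```

Note that this uses the .str accessor: Series.replace with regex=False only replaces values that match the whole string, not substrings.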