I am running this code in Python, and I don't know why it always fails:
import string
import nltk
from sklearn.pipeline import Pipeline
import pandas as pd
import numpy as np
import re
data = pd.read_csv(r'C:UsersPrihantoro Tri NOneDriveDocumentsfile toroMSIBMagangHukumonlineProjectyoutube commentsdataset_komentar_instagram_cyberbullying.csv', sep=',', encoding='utf-8')
def casefolding(comment):
    comment = comment.lower()
    comment = comment.strip(" ")
    comment = re.sub(r'[?|$|.|!_:")(-+,]', '', comment)
    return comment
data['comment'] = data['comment'].apply(casefolding)
data.head(100)
The result is the following error:
NameError Traceback (most recent call last)
Input In [3], in <cell line: 8>()
6 comment = re.sub(r'[?|$|.|!_:")(-+,]', '', comment)
7 return comment
----> 8 data['comment'] = data['comment'].apply(casefolding)
9 data.head(100)
NameError: name 'data' is not defined
Or sometimes the result is this instead: KeyError: 'comment'
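I think your DataFrame does not have a 'comment' column, so check all the columns in the DataFrame first. Try running data.columns and compare the names it prints with the one you are indexing.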
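As a minimal sketch of that check: the in-memory CSV and its " Comment " header below are invented for illustration and are not taken from your file. They only show how a header that differs in case or whitespace produces the KeyError, and one way to normalize it.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for the real file. The header
# " Comment " is an assumed example of a name that is not exactly 'comment'.
csv_text = "id, Comment \n1,Hello World!\n2,Nice POST...\n"
data = pd.read_csv(io.StringIO(csv_text))

# Inspect the real column names before indexing by name.
print(data.columns.tolist())

# Normalize the headers: drop surrounding whitespace and lowercase them.
data.columns = data.columns.str.strip().str.lower()

# Now the lookup by 'comment' is safe if this is True.
has_comment = "comment" in data.columns
print(has_comment)
```

If the column in your file has a different name altogether, use that name instead of 'comment' rather than renaming blindly.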
# Let's assume you are reading from a CSV file
import pandas as pd
import re
df = pd.read_csv('your_csv_file.csv')
data = pd.DataFrame(df)  # read_csv already returns a DataFrame, so this copy is optional
def casefolding(comment):
    comment = comment.lower()
    comment = comment.strip(" ")
    comment = re.sub(r'[?|$|.|!_:")(-+,]', '', comment)
    return comment
data['comment'] = data['comment'].apply(casefolding)
data.head(100)
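A slightly more defensive variant might look like the sketch below. The sample rows are made up for illustration, and the check for the column and the handling of missing values are my own additions, not something the original code does. Note that inside the character class, (-+ is a range covering ( ) * +, so the pattern here spells out the same set of characters explicitly.

```python
import re
import pandas as pd

# Same characters the original pattern removes, written without the
# accidental "(-+" range: ? | $ . ! _ : " ( ) * + ,
PUNCT = re.compile(r'[?|$.!_:"()*+,]')

def casefolding(comment):
    # Leave missing or non-string values untouched instead of raising
    # AttributeError on .lower() (an assumption about the desired behavior).
    if not isinstance(comment, str):
        return comment
    comment = comment.lower()
    comment = comment.strip(" ")
    return PUNCT.sub('', comment)

# Hypothetical sample data standing in for the real CSV.
data = pd.DataFrame({"comment": ["  Hello, World!  ", None, "Nice (POST)..."]})

# Fail early with a readable message if the column is missing.
if "comment" not in data.columns:
    raise KeyError(f"'comment' not found; available columns: {list(data.columns)}")

data["comment"] = data["comment"].apply(casefolding)
print(data.head())
```

Also make sure the cell that defines data ran without errors before this one. The NameError in the question means data was never created in the current session.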