So I'm trying to search and see whether each line in file2.txt contains any of the words from file1.txt. For example:
file1:
love,10
like,5
best,10
hate,1
lol,10
better,10
worst,1
file2: a bunch of sentences that I want to check for any of the file1 words (over 200 lines)
I have a method in my program that works on my own files, but it adds the totals into one big list (so if the whole file says love 43 times, it gives love: 43), whereas I'm looking for a separate tally for each line. So if one line contains love 4 times and another contains it 5 times, the program should show that. Specifically, what I'm trying to do is count the total number of keywords in each line of the file (so if a line contains 4 keywords, that line's keyword count is 4), along with the values associated with those keywords (see how each keyword in my example file1 has a value associated with it?). If a line in the file is:
Hi I love my boyfriend but I like my bestfriend lol
then that line would be {love: 1, like: 1, lol: 1} (keywords = 3, total = 25)
(the total comes from the values associated with the keywords in the list)
and if the second line is just
I hate my life. It is the worst day ever!
then it would be {hate: 1, worst: 1} (keywords = 2, total = 2)
I have this and it works, but is there a way to modify it so that instead of printing one big line like:
{'please': 24, 'worst': 40, 'regrets': 1, 'hate': 70, ... etc.} it simply adds up the total number of keywords per line and the values associated with them?
wordcount = {}
with open('mainWords.txt', 'r') as f1, open('sentences.txt', 'r') as f2:
    words = f1.read().split()
    wordcount = {word.split(',')[0]: 0 for word in words}
    for line in f2:
        line_split = line.split()
        for word in line_split:
            if word in wordcount:
                wordcount[word] += 1
print(wordcount)
As usual, collections saves the day:
from collections import Counter

with open('mainWords.txt') as f:
    sentiments = {word: int(value)
                  for word, value in
                  (line.split(",") for line in f)}

with open('sentences.txt') as f:
    for line in f:
        values = Counter(word for word in line.split() if word in sentiments)
        print(values)
        print(sum(values[word] * sentiments[word] for word in values))  # total
        print(len(values))  # keywords
And you have the sentiment polarities in the sentiments dictionary for later use.
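If it helps, the per-line logic can be pulled into a small helper so each line yields its own counts, keyword count, and total in one call. This is just a sketch of the same approach; the function name score_line and the inline sentiments dict are made up for illustration:

```python
from collections import Counter

def score_line(line, sentiments):
    """Count sentiment keywords in one line and sum their values."""
    counts = Counter(w for w in line.split() if w in sentiments)
    total = sum(counts[w] * sentiments[w] for w in counts)
    return dict(counts), len(counts), total

# Sample sentiments, matching the file1 from the question
sentiments = {"love": 10, "like": 5, "best": 10, "hate": 1,
              "lol": 10, "better": 10, "worst": 1}

line = "Hi I love my boyfriend but I like my bestfriend lol"
counts, keywords, total = score_line(line, sentiments)
# counts == {'love': 1, 'like': 1, 'lol': 1}, keywords == 3, total == 25
```

Note that this matches whole whitespace-separated tokens only, so "bestfriend" does not count as "best"; punctuation stuck to a word (e.g. "worst!") would also prevent a match unless you strip it first.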