Match stock tickers in text against a list of tickers, but without matching stop words

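I have a Python list of about 28,000 stock tickers.

I am parsing text to match words against those tickers, incrementing a count whenever I get a match.

The problem I'm running into is that ordinary stop words match tickers I don't want. V is a legitimate ticker, and it matches individual tokenized words because this is free-flowing social media text. I only want TSLA.

Can you suggest some logic I could apply to match intelligently around these stop words?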

counts = dict()
symbol_list = ['TSLA', 'V', 'T', 'AAPL', ...]
example_sentence = 'V want TSLA but not. T + 5 times'

Here is what I have tried so far:

import string

# Strip punctuation, then split on whitespace
sen = example_sentence.translate(str.maketrans('', '', string.punctuation))
sentence_words = sen.split()
for word in sentence_words:
    if word in symbol_list:
        counts[word] = counts.get(word, 0) + 1

I want {'TSLA': 1} rather than {'TSLA': 1, 'V': 1, 'T': 1}. In some cases I may need to add T and V to the dictionary, but that depends on context.

import string

symbol_list = ['TSLA', 'V', 'T', 'AAPL', ...]
example_sentence = 'V want TSLA but not. T + 5 times'

# Strip punctuation, then split on whitespace
sen = example_sentence.translate(str.maketrans('', '', string.punctuation))
sentence_words = sen.split()

counts = dict()
counts_list = []
for word in sentence_words:
    if word in symbol_list:
        counts[word] = counts.get(word, 0) + 1
        # Record the running count for this match as its own dictionary
        counts_list.append({word: counts[word]})

Now your output is a list of dictionaries: counts_list[1] is {'TSLA': 1}
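The code above still counts V and T, so it does not yet give the {'TSLA': 1} result the question asks for. One possible way to make the match depend on context is sketched below. It rests on two assumptions that are not in the original post: that the ambiguous tickers can be listed by hand (the hypothetical AMBIGUOUS set), and that the text follows the common social media convention of writing a ticker as a cashtag such as $V when the stock is really meant. Ambiguous tickers are counted only when they carry the $ prefix. The names SYMBOLS, AMBIGUOUS, TOKEN_RE and count_tickers are illustrative. Storing the symbols in a set also keeps each lookup cheap with roughly 28,000 tickers.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny inputs; the real ticker list is far larger.
SYMBOLS = {"TSLA", "V", "T", "AAPL"}
# Assumption: tickers that collide with ordinary words or single letters
# are maintained by hand (or derived from a stop-word list).
AMBIGUOUS = {"V", "T"}

# An optional leading "$" (cashtag) followed by a whole run of letters.
TOKEN_RE = re.compile(r"(\$?)\b([A-Za-z]+)\b")


def count_tickers(text, symbols=SYMBOLS, ambiguous=AMBIGUOUS):
    """Count tickers, accepting ambiguous ones only when written as $TICKER."""
    counts = Counter()
    for cashtag, word in TOKEN_RE.findall(text):
        if word not in symbols:
            continue  # not a ticker at all (matching is case-sensitive)
        if word in ambiguous and not cashtag:
            continue  # bare "V" or "T": treat it as an ordinary word
        counts[word] += 1
    return dict(counts)


print(count_tickers("V want TSLA but not. T + 5 times"))  # {'TSLA': 1}
print(count_tickers("Long $V and $TSLA, V good"))         # {'V': 1, 'TSLA': 1}
```

Unlike the punctuation-stripping approach, the regular expression keeps the $ sign available as a signal. If the text does not use cashtags, the same structure could accept other context cues instead (for example neighbouring words such as "shares" or "stock"), but that would be a different heuristic from the one shown here.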
