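I get a TypeError when checking whether a string is present in a list.
I looked at similar questions, such as this thread and this one, but could not find a solution.
I expect the code to compute (one iteration of) the PageRank of every document in "pagerank.txt" (contents below), but it runs into an error instead.
My full code: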
def calc_pageranks(pagerank_file = "pagerank.txt", damping=0.9):
    pagerankScores = {}
    pagerankData = {} #dict of what files a file (the dict key) points to
    with open(pagerank_file, "r") as f:
        for row in f:
            row = row.split()
            pagerankScores.update({row[0]:1}) #set starting pagerank for documents
            if len(row) == 1:
                pagerankData.update({row[0]:None})
            else:
                pagerankData.update({row[0]:row[1:]}) #pagerank graph
    for docName in pagerankData:
        pagerank = pagerankScores[docName]
        temp = 0
        for val in pagerankData.values():
            if val == None:
                pass
            else:
                if docName in val:
                    temp += pagerank / len(val)
        pagerank = (1 - damping) + damping * temp
        pagerankData.update({docName:pagerank})
    return pagerankScores, pagerankData
On the first pass, docName and val (with their types) look like this: doc1.txt <class 'str'> ['doc2.txt', 'doc8.txt'] <class 'list'>
Full error message:
/Library/Frameworks/Python.framework/Versions/3.9/bin/python3 /Users/MacSuperior/Desktop/Coding/IR_system/search_engine.py
Traceback (most recent call last):
  File "/Users/MacSuperior/Desktop/Coding/IR_system/search_engine.py", line 99, in <module>
    print(calc_pageranks())
  File "/Users/MacSuperior/Desktop/Coding/IR_system/search_engine.py", line 92, in calc_pageranks
    if docName in val:
TypeError: argument of type 'float' is not iterable
Pagerank.txt:
doc1.txt doc2.txt doc8.txt
doc2.txt doc1.txt doc2.txt doc9.txt
doc3.txt doc4.txt
doc4.txt doc1.txt doc10.txt
doc5.txt doc6.txt
doc6.txt
doc7.txt doc1.txt
doc8.txt doc9.txt doc10.txt
doc9.txt doc10.txt
doc10.txt doc9.txt
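The failure can be reproduced without the file. What follows is a minimal sketch, assuming a hypothetical two-document graph and a hypothetical helper named one_pass (neither is in the original code). It mimics the question's loop, which writes each computed score back into the same dict that holds the link lists, so a later membership test meets a float where it expects a list.

```python
# Hypothetical two-document graph standing in for pagerank.txt.
graph = {"doc1.txt": ["doc2.txt"], "doc2.txt": ["doc1.txt"]}

def one_pass(graph, damping=0.9):
    """Mimic the question's loop: store each score back into the graph dict."""
    for doc_name in graph:
        temp = 0
        for val in graph.values():
            # Once an earlier iteration has stored a float, `in` fails here.
            if val is not None and doc_name in val:
                temp += 1 / len(val)
        # The list of links for doc_name is replaced by a float.
        graph[doc_name] = (1 - damping) + damping * temp

try:
    one_pass(graph)
    error = None
except TypeError as exc:
    error = str(exc)

print(error)
```

The first iteration succeeds and turns the entry for doc1.txt into a float; the second iteration then evaluates `doc_name in val` against that float and raises the TypeError.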
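The TypeError occurs because pagerankData.update({docName:pagerank}) overwrites a document's list of links with a float, so on the next pass "docName in val" is evaluated against that float. Here are some refactoring ideas that help eliminate the error by keeping the scores in their own dict.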
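def calc_pageranks(pagerank_file = "pagerank.txt", damping=0.9):
    pagerankScores = {}
    pagerankData = {}  # graph: document -> list of documents it links to
    with open(pagerank_file, "r") as f:
        for row in f:
            key, *values = row.split()  # first token is the document, the rest are its links
            pagerankData[key] = values
    for docName in pagerankData:
        # every document starts with a score of 1, so each linking document contributes 1/len(value)
        pagerank = (1-damping) + damping * sum((docName in value)/len(value) for value in pagerankData.values() if value)
        pagerankScores[docName] = pagerank  # scores go in their own dict; the graph keeps its lists
    return pagerankScores, pagerankData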
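The same idea can be tried without a file on disk. The sketch below is not the code above but a variant under stated assumptions: the function name pagerank_step, the three-line sample, and the choice to weight each contribution by the linking document's current score are all illustrative. With every score starting at 1, that weighting gives the same numbers as the formula above.

```python
def pagerank_step(lines, damping=0.9):
    """Run one PageRank iteration over an iterable of 'source target ...' lines."""
    links = {}
    for row in lines:
        parts = row.split()
        if not parts:
            continue  # skip blank lines
        source, *targets = parts
        links[source] = targets  # an empty list means no outgoing links

    scores = {doc: 1.0 for doc in links}  # starting score for every document
    new_scores = {}
    for doc in links:
        # Sum the share passed on by each document that links to `doc`.
        incoming = sum(
            scores[source] / len(targets)
            for source, targets in links.items()
            if doc in targets
        )
        new_scores[doc] = (1 - damping) + damping * incoming
    return new_scores, links

# Hypothetical sample in the same layout as pagerank.txt.
sample = [
    "doc1.txt doc2.txt doc8.txt",
    "doc2.txt doc1.txt",
    "doc8.txt",
]
scores, links = pagerank_step(sample)
print(scores)
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example inside a `with open("pagerank.txt") as f:` block. The graph dict is never written to after parsing, so every value stays a list and the membership test cannot hit a float.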