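I'm stuck and would appreciate some help. My goal is to compute the Levenshtein distance between each description in the database and the description in the following row. I'm close, but I believe the following line...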
x = editdistance.eval(item, t[a:b])
...isn't working because item and t[a:b] are not being treated as strings.
How do I convert them to strings so that this works?
Code:
import csv
import sqlite3
from array import *
import editdistance
conn = sqlite3.connect('transactions.db')
c = conn.cursor()
c.execute('select distinct description from transactions order by description')
t=[]
for row in c:
    row = c.fetchone()
    t.append(row)
for item in t:
    if t.index(item)<10: #just to limit output for testing
        print item
        a = t.index(item)+1
        b = a + 1
        print t[a:b]
        x = editdistance.eval(item, t[a:b])
        print x
        print "\n"
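A side note on the fetch loop above: iterating over the cursor and calling c.fetchone() inside the loop both advance the same cursor, so every other row is skipped and a None can end up at the end of the list. Here is a minimal sketch of that behaviour, written in Python 3 syntax and using a throwaway in-memory database whose table contents are made up purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database, illustrative data only
conn.execute("create table transactions (description text)")
conn.executemany(
    "insert into transactions values (?)",
    [("alpha",), ("bravo",), ("charlie",), ("delta",), ("echo",)],
)

c = conn.cursor()
c.execute("select distinct description from transactions order by description")
skipping = []
for row in c:           # the for loop fetches one row...
    row = c.fetchone()  # ...and this fetches (and keeps) the next one instead
    skipping.append(row)

c.execute("select distinct description from transactions order by description")
complete = c.fetchall()  # every row, each one still a 1-tuple

print(skipping)
print(complete)
```

With five rows, the first list keeps only the second and fourth descriptions plus a trailing None, while fetchall() returns all five.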
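Output:
(u'starbucks#02472 Louis Louisville KY debit card withdrawal: M/C debit card',)
[(u'starbucks#21137 Louis Louisville KY debit card withdrawal: M/C debit card',)]
1
(u'starbucks#21137 Louis Louisville KY debit card withdrawal: M/C debit card',)
[(u'starbucks store 02561 Louisville KY debit card withdrawal: M/C debit card',)]
1
(u'starbucks store 02561 Louisville KY debit card withdrawal: M/C debit card',)
[(u'steak-n-shake#0701 Louisville KY debit card withdrawal: M/C debit card',)]
1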
Never mind, I'm an idiot. For some reason it didn't work before, but now it works when I simply do x = editdistance.eval(str(item), str(t[a:b]))
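For anyone landing here later: sqlite3 returns each row as a tuple even when only one column is selected, so item is a 1-tuple and t[a:b] is a list holding a 1-tuple. If the distance function compares sequences element by element, that would explain the constant result of 1. Wrapping both in str() compares their printed representations, which also include the parentheses, brackets and quotes. Here is a sketch of an alternative that unwraps the string with row[0] and pairs each description with the next one using zip. It is written in Python 3 syntax; the levenshtein helper is a hypothetical pure-Python stand-in for editdistance.eval so the example runs without third-party packages, and the table contents are invented for illustration.

```python
import sqlite3


def levenshtein(a, b):
    """Plain dynamic-programming edit distance (stand-in for editdistance.eval)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


conn = sqlite3.connect(":memory:")  # illustrative data, not the real transactions.db
conn.execute("create table transactions (description text)")
conn.executemany(
    "insert into transactions values (?)",
    [("kitten",), ("sitting",), ("sitten",)],
)

c = conn.cursor()
c.execute("select distinct description from transactions order by description")
descriptions = [row[0] for row in c]  # unwrap each 1-tuple into its string

# Pair every description with the one on the following row.
distances = [
    (current, following, levenshtein(current, following))
    for current, following in zip(descriptions, descriptions[1:])
]
for current, following, distance in distances:
    print(current, "|", following, "|", distance)
```

Pairing with zip also avoids the repeated t.index(item) lookups, which scan the list each time and return the first match only.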