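I don't know if this is possible, but I'm trying to access the words (rather than the individual characters) of a split string by index and store them in a dictionary. If that can't work, is there another way to get the same result? Here is my code so far: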
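from csv import DictReader
from sys import argv, exit

from cs50 import SQL  # assumed: SQL comes from the CS50 library


def main():
    # argv[0] is the script name, so exactly one extra argument is expected
    if len(argv) != 2:
        print("usage: python import.py (csv file)")
        exit(1)
    db = SQL("sqlite:///students.db")
    with open(argv[1], 'r') as file:  # the CSV path is argv[1], not argv[2]
        csv_file = DictReader(file)
        for row in csv_file:
            names = row['name']
            for i in names:  # names is a string, so i is a single character
                word = i.split()


main()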
# what the csv looks like
name,house,birth
Adelaide Murton,Slytherin,1982
Adrian Pucey,Slytherin,1977
Anthony Goldstein,Ravenclaw,1980
# what i want it to look like
first name,middle name,last name,house,birth
Adelaide,None,Murton,Slytherin,1982
Adrian,None,Pucey,Slytherin,1977
Anthony,None,Goldstein,Ravenclaw,1980
If the words are separated by commas, you can use words = i.split(','), or pass whatever other delimiter you need as the argument to split().
sentence = 'This is an example' # string: 'This is an example'
words = sentence.split() # list of strings: ['This', 'is', 'an', 'example']
At this point you can get a specific word by its index, or loop over the words with for word in words:.
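A short sketch of both options, using the example sentence above. The variable names first_word, last_word and by_position are only for illustration:

```python
sentence = 'This is an example'
words = sentence.split()

# Access individual words by index
first_word = words[0]    # 'This'
last_word = words[-1]    # 'example'

# Loop over the words and store each one in a dictionary keyed by position
by_position = {}
for position, word in enumerate(words):
    by_position[position] = word

# A custom delimiter works the same way
fields = 'Adelaide Murton,Slytherin,1982'.split(',')
```

Here by_position maps 0 to 'This', 1 to 'is', and so on, and fields[0] is the full name from the first CSV row.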
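I'm not sure about the SQL part of your code, but note that names is a string, so for i in names: loops over its characters rather than its words. Call names.split() first, then loop over or index into the resulting list.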
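Since row['name'] is a single string, one way to get the columns from the question is to split it once and pick the parts by index. This is a minimal sketch: the helper split_name and its dictionary keys are hypothetical, and it assumes that anything between the first and last word counts as the middle name.

```python
def split_name(full_name):
    """Split a full name into first, middle and last parts (None if absent)."""
    parts = full_name.split()
    if not parts:
        return {"first": None, "middle": None, "last": None}
    first = parts[0]
    last = parts[-1] if len(parts) > 1 else None
    # Everything between the first and last word; an empty string becomes None
    middle = " ".join(parts[1:-1]) or None
    return {"first": first, "middle": middle, "last": last}


name_parts = split_name("Adelaide Murton")
```

For "Adelaide Murton" this gives first "Adelaide", middle None and last "Murton", which matches the desired output rows. Inside the for row in csv_file: loop you would call split_name(row['name']) instead of looping over the string.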
You can try this code:
def create_word_index(filenames, rare_word_flag=False):
    word_index, word_count, single_words = {}, 0, []  # original : word_count = 0
    ## Handle word index and word count
    for idx, filename in enumerate(filenames):
        with open(filename) as f:
            for sentence in f.readlines():
                words = sentence.strip().split()
                for word in words:
                    # word = process(word)  # Do more preprocessing stuff here if you need
                    if rare_word_flag and (word in single_words):
                        word_index[word] = 1
                        continue
                    if word in word_index:
                        continue
                    word_index[word] = word_count
                    word_count += 1
    return word_index, word_count
filenames = ["haha.txt"]
word_idx, word_count = create_word_index(filenames, False)
print(word_idx)
print(word_count)
# haha.txt file:
name=huhu, check=ok
name=haha, check=not good
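Note that split() with no argument only separates on whitespace, so tokens such as name=huhu, keep their = and trailing comma. The sketch below repeats the same indexing idea on the two sample lines held in memory (no file needed). It is a simplified stand-in for create_word_index, not a replacement for it:

```python
lines = ["name=huhu, check=ok", "name=haha, check=not good"]

word_index = {}
for line in lines:
    for word in line.strip().split():
        # Assign the next free index the first time a token is seen
        if word not in word_index:
            word_index[word] = len(word_index)

print(word_index)
# {'name=huhu,': 0, 'check=ok': 1, 'name=haha,': 2, 'check=not': 3, 'good': 4}
```

If you want huhu and ok as separate values, split each token again on '=' and strip the comma.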