Map the spaces in a string onto a list of words



I have a string:

flagged_line = "V. Divakar Botcha1,2, Mengdie Zhang1, Kuilong Li1,2, Hong Gu1,2, Zhonghui Huang1, Jianhui Cai3, Youming Lu1, Wenjie Yu3, and Xinke Liu1*  "

and a list of words:

words = ['V.', 'Divakar', 'Botcha', '1', ',', '2', ',', 'Mengdie', 'Zhang', '1', ',', 'Kuilong', 'Li', '1', ',', '2', ',', 'Hong', 'Gu', '1', ',', '2', ',', 'Zhonghui', 'Huang', '1', ',', 'Jianhui', 'Cai', '3', ',', 'Youming', 'Lu', '1', ',', 'Wenjie', 'Yu', '3', ',', 'and', 'Xinke', 'Liu', '1', '*']

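They come from two different programs, and I now need to map the spaces in the string onto the words in the list, like this (note the trailing space on every word that is followed by a space in the string):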

['V. ', 'Divakar ', 'Botcha', '1', ',', '2', ', ', 'Mengdie ', 'Zhang', '1', ', ', 'Kuilong ', 'Li', '1', ',', '2', ', ', 'Hong ', 'Gu', '1', ',', '2', ', ', 'Zhonghui ', 'Huang', '1', ', ', 'Jianhui ', 'Cai', '3', ', ', 'Youming ', 'Lu', '1', ', ', 'Wenjie ', 'Yu', '3', ', ', 'and ', 'Xinke ', 'Liu', '1', '*  ']

What I am trying is to compare them character by character and then assign the spaces:

index_str = 0
for elem in words:
    for e in elem:
        if e == flagged_line[index_str]:
            index_str+=1
            pass
        else:
            index_str+=1
            elem = elem+' '  # issue not generalized for spaces
            print('"',elem,'"')

Update:

The list elements should carry the spaces so that the mapping is preserved. For example,

if the string is

"V. Divakar  "

and the list is

['V.','Divakar']

then the final list should be

['V. ','Divakar  ']

Later I will iterate over the list and pass the elements on to my next function.

There can also be multiple spaces at the end.

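A one-liner that uses str.find() to check whether the word followed by a space occurs anywhere in the string and, if it does, appends a space: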

flagged_line = "V. Divakar Botcha1,2, Mengdie Zhang1, Kuilong Li1,2, Hong Gu1,2, Zhonghui Huang1, Jianhui Cai3, Youming Lu1, Wenjie Yu3, and Xinke Liu1*  "    
words = ['V.', 'Divakar', 'Botcha', '1', ',', '2', ',', 'Mengdie', 'Zhang', '1', ',', 'Kuilong', 'Li', '1', ',', '2', ',', 'Hong', 'Gu', '1', ',', '2', ',', 'Zhonghui', 'Huang', '1', ',', 'Jianhui', 'Cai', '3', ',', 'Youming', 'Lu', '1', ',', 'Wenjie', 'Yu', '3', ',', 'and', 'Xinke', 'Liu', '1', '*']
print(['{0} '.format(x) if flagged_line.find(x + " ") != -1 else x for x in words ])

Output:

['V. ', 'Divakar ', 'Botcha', '1', ', ', '2', ', ', 'Mengdie ', 'Zhang', '1', ', ', 'Kuilong ', 'Li', '1', ', ', '2', ', ', 'Hong ', 'Gu', '1', ', ', '2', ', ', 'Zhonghui ', 'Huang', '1', ', ', 'Jianhui ', 'Cai', '3', ', ', 'Youming ', 'Lu', '1', ', ', 'Wenjie ', 'Yu', '3', ', ', 'and ', 'Xinke ', 'Liu', '1', '* ']
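
Note that this output is not quite the one requested in the question: every ',' receives a space and the final '*' receives only one. The reason is that the check looks at the whole string rather than at the position of each individual word. The sketch below illustrates this on a shortened, made-up input in the style of the question's data; the function name pad_by_global_find is hypothetical and only wraps the one-liner above.

```python
def pad_by_global_find(line, tokens):
    """Append one space to a token if 'token + space' occurs anywhere in line."""
    return [t + ' ' if line.find(t + ' ') != -1 else t for t in tokens]


# Shortened sample input (an assumption, modelled on the question's data).
line = "Li1,2, Hong  "
tokens = ['Li', '1', ',', '2', ',', 'Hong']

# What the question asks for: spaces attached according to each token's position.
wanted = ['Li', '1', ',', '2', ', ', 'Hong  ']

got = pad_by_global_find(line, tokens)
print(got)            # ['Li', '1', ', ', '2', ', ', 'Hong ']
print(got == wanted)  # False
```

The first ',' is padded although no space follows it in the string, and 'Hong' gets one space instead of two, so the approach only fits inputs where each token is always (or never) followed by exactly one space.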

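I am assuming that flagged_line and words match exactly. You can do this in a single pass: keep an index into flagged_line, skip ahead by len(word), check whether any spaces follow the word, and if so, add them to your result: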

flagged_line = "V. Divakar Botcha1,2, Mengdie Zhang1, Kuilong Li1,2, Hong Gu1,2, Zhonghui Huang1, Jianhui Cai3, Youming Lu1, Wenjie Yu3, and Xinke Liu1*  "
words = ['V.', 'Divakar', 'Botcha', '1', ',', '2', ',', 'Mengdie', 'Zhang', '1', ',', 'Kuilong', 'Li', '1', ',',
         '2', ',', 'Hong', 'Gu', '1', ',', '2', ',', 'Zhonghui', 'Huang', '1', ',', 'Jianhui', 'Cai', '3', ',',
         'Youming', 'Lu', '1', ',', 'Wenjie', 'Yu', '3', ',', 'and', 'Xinke', 'Liu', '1', '*']
words_with_spaces = []
idx = 0  # current position in flagged_line
for word in words:
    idx += len(word)  # skip past the word itself
    cur_word = word
    # attach every space that immediately follows the word
    while idx < len(flagged_line) and flagged_line[idx] == ' ':
        cur_word += ' '
        idx += 1
    words_with_spaces.append(cur_word)
print(words_with_spaces)

Output:

['V. ', 'Divakar ', 'Botcha', '1', ',', '2', ', ', 'Mengdie ', 'Zhang', '1', ', ', 'Kuilong ', 'Li', '1', ',', '2', ', ', 'Hong ', 'Gu', '1', ',', '2', ', ', 'Zhonghui ', 'Huang', '1', ', ', 'Jianhui ', 'Cai', '3', ', ', 'Youming ', 'Lu', '1', ', ', 'Wenjie ', 'Yu', '3', ', ', 'and ', 'Xinke ', 'Liu', '1', '*  ']
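
If the exact-match assumption might not hold (the two inputs come from different programs), the same single pass can verify each word before skipping over it. The following is only a sketch under that assumption: the function name attach_spaces, the decision to skip leading spaces, and the choice to raise ValueError on a mismatch are illustrative and not part of the original answer.

```python
def attach_spaces(line, tokens):
    """Return tokens, each followed by the spaces that follow it in line."""
    result = []
    pos = 0
    # Leading spaces have no preceding token to attach to, so skip them.
    while pos < len(line) and line[pos] == ' ':
        pos += 1
    for token in tokens:
        # Verify that the token really sits at the current position.
        if not line.startswith(token, pos):
            raise ValueError(f"token {token!r} not found at position {pos}")
        pos += len(token)
        start = pos
        # Consume the run of spaces directly after the token.
        while pos < len(line) and line[pos] == ' ':
            pos += 1
        result.append(token + line[start:pos])
    return result


print(attach_spaces("V. Divakar  ", ['V.', 'Divakar']))
# ['V. ', 'Divakar  ']
```

Failing loudly on the first mismatch keeps the index from silently drifting out of step with the string, which would otherwise attach spaces to the wrong words.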

Hope this helps. If you have further questions, please leave a comment.
