Python substring search



I have a small request: I need help with the following code:

import os
import re

def grepi(dico, fichier):
    line_number = 0
    nameFile = os.path.basename(fichier)
    # Load the dictionary of names
    with open(dico, encoding="utf-8") as dic:
        dicolist = dic.read().splitlines()

    # Search in the file
    with open(fichier, encoding="utf-8") as fic:
        ficlist = fic.read().splitlines()
    for line in ficlist:
        line_number += 1
        for patt in dicolist:
            line = line.lower()
            if re.search(r'\b' + line + r'\b', patt):
                print(line.rstrip() + ', ' + patt + ', ' + nameFile + ', '
                      + str(line_number))

I'm having trouble with this line: if re.search(r'\b' + line + r'\b', patt):

dico is a dictionary of first names, such as:

benoît
Nicolas
Stéphane
Sébastien
Alexandre

fichier is a file containing a lot of information, such as:

Is the first name of Nicolas
Is Benoît is here
Hey 1234Alexandre1234
Stéphane found something
dfqklnflSébastiendsqjfldsjfldksj

And so on.

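From the file, I want to return every exact string that is a first name. But some names appear in a form like 1234Alexandre5678, and I can't find a way to return only Alexandre. The same goes for dfqklnflSébastiendsqjfldsjfldksj, from which I want to return Sébastien.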

Can someone help me? Thanks.

Here is how I corrected my code using the answer:

#!/usr/bin/env python3
import os
import re

def grepi(dico, fichier):
    line_number = 0
    nameFile = os.path.basename(fichier)
    result_final = []
    # Load the dictionary of names, one per line
    with open(dico, encoding="utf-8") as dic:
        dicolist = dic.read().splitlines()
    print(dicolist)
    with open(fichier, encoding="utf-8") as ficlist:
        ficstring = ficlist.read().splitlines()
        for line in ficstring:
            # Match any name from the dictionary, even inside a longer token
            ptrn = re.compile(r"\w*(" + "|".join(dicolist) + r")\w*",
                              flags=re.I)
            ptrn_result = ptrn.findall(line)
            if ptrn_result:
                result_final = (nameFile, line_number, str(ptrn_result))
                print(result_final)
            line_number += 1

The output:

('prénom.xml', 4, "['Benoit']")
('prénom.xml', 6, "['Stéphane']")
('prénom.xml', 9, "['Alexandre']")
('prénom.xml', 10, "['Nicolas']")
('prénom.xml', 14, "['Sébastien']")
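A possible tidier variant of the same idea, offered only as a sketch: it compiles the pattern once instead of on every line, escapes each name with re.escape, and numbers lines from 1 with enumerate. The function name grep_names and the choice to accept an iterable of lines rather than file paths are assumptions made for illustration; they are not part of the original code.

```python
import re


def grep_names(names, lines):
    """Yield (line_number, matched_name) for every name found in lines.

    Hypothetical helper: `names` is a list of strings and `lines` is any
    iterable of text lines (for example an open file object).
    """
    # Build the alternation once; re.escape guards against names that
    # contain regex metacharacters such as "." or "+".
    pattern = re.compile("|".join(re.escape(n) for n in names), flags=re.I)
    # enumerate(..., start=1) gives 1-based line numbers.
    for line_number, line in enumerate(lines, start=1):
        for match in pattern.finditer(line):
            yield line_number, match.group(0)


sample = [
    "Is the first name of Nicolas",
    "Hey 1234Alexandre1234",
    "nothing to see here",
]
results = list(grep_names(["Nicolas", "Alexandre"], sample))
print(results)  # [(1, 'Nicolas'), (2, 'Alexandre')]
```

Because the pattern has no anchors or word boundaries, a plain alternation already finds a name embedded in a longer token, so the surrounding \w* is optional in this variant.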

Try using the pattern '\w*(benoît|Nicolas|Stéphane|Sébastien|Alexandre)\w*'

For example:

import re
dicolist = ['benoît', 'Nicolas', 'Stéphane', 'Sébastien', 'Alexandre']
s = """Is the first name of Nicolas
Is Benoît is here
Hey 1234Alexandre1234
Stéphane found something
dfqklnflSébastiendsqjfldsjfldksj"""
ptrn = re.compile(r"\w*(" + "|".join(dicolist) + r")\w*", flags=re.I)
print(ptrn.findall(s))

Output:

['Nicolas', 'Benoît', 'Alexandre', 'Stéphane', 'Sébastien']
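To see why this pattern helps, here is a small comparison, assuming the pattern in the question was built with \b word boundaries. Digits and letters both count as word characters, so there is no boundary between "1234" and "Alexandre", and a \b-delimited pattern misses the embedded name.

```python
import re

text = "Hey 1234Alexandre1234"

# With word boundaries the name must stand alone as a word, so the
# embedded occurrence is missed (digits and letters are both \w).
bounded = re.findall(r"\bAlexandre\b", text, flags=re.I)

# Allowing optional word characters on both sides lets the name match
# inside a longer token; the capture group keeps only the name itself.
embedded = re.findall(r"\w*(Alexandre)\w*", text, flags=re.I)

print(bounded)   # []
print(embedded)  # ['Alexandre']
```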

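Oh, buddy, the first thing to fix is that your function grepi() needs some indentation. The rest of the question is rather complicated for me too.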
