Python: detect a word in a string and find its position



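I'm new to Python and want to write a simple program that prints your name, including its preposition, James Bond style.

So if the name contains any preposition, such as 'Van', 'Von', 'De' or 'Di', I want the program to print it as: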

{Preposition} {LastName}, {FirstName} {Preposition} {LastName} *edited

For this, I understand that we need the user's name as a list of words and a list of prepositions.

a = [user input separated with the .split function]
b = [list of prepositions]

To find an instance of a preposition in the name, I found that the following code can be used:

if any(x in a for x in b):

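However, I ran into a problem when trying to print the name, because the preposition could be any one of those above (list b). I can't find a way to print it without knowing which one it is and where it sits in the string. At first I thought I could use the .index function, but it seems to accept only one word to search for, not several as needed here. The closest I could get is: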

name_split.index('preposition1') # works
name_split.index('preposition1', 'preposition2', etc.) # does not work

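So what I'm asking is whether there is a way to check if any word from list (b) is used in the input text, and to get the position of that word.

I hope I've explained this clearly and that someone can give me some help. Thanks in advance.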

I can't think of a better way than using a for loop:

pattern = "{1} {2}, {0} {1} {2}"
prepositions = ['van', 'von', 'de', 'di']
# (optional) 'lower' so that we don't have to consider cases like 'vAn'
name = "Vincent van Gogh".lower()
index = -1  # by default, we believe that we did not find anything
for preposition in prepositions:
    # 'find' is the same as 'index', but returns -1 if the substring is not found
    index = name.find(preposition)
    if index != -1:
        break  # found an entry
if index == -1:
    print("Not found")
else:
    print("The index is", index,
          "and the preposition is", preposition)
    print(pattern.format(*name.split()))

Output:

The index is 8 and the preposition is van
van gogh, vincent van gogh

If you want to iterate over a list of names, you can do it like this:

pattern = ...
prepositions = ...
names = ...
for name in names:
    name = name.lower()
    ...  # the rest is the same

A new version that handles a second class of "prepositions" ("Jr.", "Sr."):

def check_prepositions(name, prepositions):
    index = -1
    for preposition in prepositions:
        index = name.find(preposition)
        if index != -1:
            break  # found an entry
    return index, preposition

patterns = [
    "{1} {2}, {0} {1} {2}",
    "{1}, {0} {1} {2}"
]
all_prepositions = [
    ['van', 'von', 'de', 'di'],
    ["Jr.", "Sr."]
]
names = ["Vincent van Gogh", "Robert Downey Jr.", "Steve"]
for name in names:
    for pattern, prepositions in zip(patterns, all_prepositions):
        index, preposition = check_prepositions(name, prepositions)
        if index != -1:
            print("The index is", index,
                  "and the preposition is", preposition)
            print(pattern.format(*name.split()))
            break
    if index == -1:
        print("Not found, name:", name)

Output:

The index is 8 and the preposition is van
van Gogh, Vincent van Gogh
The index is 14 and the preposition is Jr.
Downey, Robert Downey Jr.
Not found, name: Steve
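
One thing to keep in mind with the loops above: str.find matches substrings, so 'de' would also be found inside a name such as "Dean". The following is only a minimal sketch of a whole-word variant, assuming the name is split on whitespace and compared case-insensitively. The names PREPOSITIONS, find_preposition and bond_style are hypothetical helpers introduced here for illustration, not part of the answer above.

```python
PREPOSITIONS = {"van", "von", "de", "di"}


def find_preposition(words, prepositions=PREPOSITIONS):
    """Return (position, word) of the first whole-word match, or None."""
    for position, word in enumerate(words):
        if word.lower() in prepositions:
            return position, word
    return None


def bond_style(name):
    """Format as '<last>, <first> <last>', keeping the preposition with the last name."""
    words = name.split()
    match = find_preposition(words)
    # Without a preposition, treat only the final word as the last name.
    split_at = match[0] if match else len(words) - 1
    first = " ".join(words[:split_at])
    last = " ".join(words[split_at:])
    return f"{last}, {first} {last}" if first else last


print(find_preposition("Vincent van Gogh".split()))  # (1, 'van')
print(bond_style("Vincent van Gogh"))  # van Gogh, Vincent van Gogh
print(bond_style("Dean Devlin"))       # Devlin, Dean Devlin
```

Here the position is an index into the list of words rather than a character offset, which is what slicing the name into first and last parts needs.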

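Why is it important to find the preposition in the name? You never print it on its own; all you really care about is the last name and the rest of the name. Instead of looking for prepositions, you can simply use rsplit() to split from the right and ask for a maxsplit of 1. For example: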

>>> "Vincent van Gogh".rsplit(" ", 1)
['Vincent van', 'Gogh']
>>> "James Bond".rsplit(" ", 1)
['James', 'Bond']

You can then simply print the values however you see fit.

fname, lname = input_name.rsplit(" ", 1)
print(f"{lname}, {fname} {lname}")

For input_name = "Vincent van Gogh", this prints Gogh, Vincent van Gogh. For input_name = "James Bond", you get Bond, James Bond.

This has the added benefit that it also works when people enter a middle name or initial.

>>> fname, lname = "Samuel L. Jackson".rsplit(" ", 1)
>>> print(f"{lname}, {fname} {lname}")
Jackson, Samuel L. Jackson
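
One caveat with the unpacking above: if the input contains no space (for example "Steve"), rsplit returns a single element and `fname, lname = ...` raises ValueError. Below is a rough sketch of one way to guard against that; last_name_first is a hypothetical function name, and returning a one-word name unchanged is an assumption about the desired behavior.

```python
def last_name_first(input_name):
    """Return 'Last, First Last'; a one-word name is returned unchanged."""
    parts = input_name.strip().rsplit(" ", 1)
    if len(parts) < 2:
        # No space to split on, so there is no separate last name.
        return parts[0]
    fname, lname = parts
    return f"{lname}, {fname} {lname}"


print(last_name_first("Samuel L. Jackson"))  # Jackson, Samuel L. Jackson
print(last_name_first("Steve"))              # Steve
```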

Note that there are many oddities in the way people write their names, so it is worth reading "Falsehoods Programmers Believe About Names".

A different approach, using regular expressions (I know).

import re

def process_input(string: str) -> str:
    string = string.strip()
    # Preset some values.
    ln, fn, prep = "", "", ""
    # If the string is blank, return it.
    # Otherwise, continue.
    if len(string) > 0:
        # Search for a possible delimiter (anything that is not a letter,
        # digit, hyphen, apostrophe, period or space), e.g. a comma.
        res = re.search(r"([^a-z0-9-'. ]+)", string, flags=re.I)
        # If delimiter found...
        if res:
            delim = res.group(0)
            # Split names by delimiter and strip whitespace.
            # re.escape keeps a delimiter such as "|" from acting as regex syntax.
            ln, fn, *err = [s.strip() for s in re.split(re.escape(delim), string)]

        else:
            # Split on whitespace
            names = [s.strip() for s in re.split(r"\s+", string)]
            # If first, preposition, last exist or first and last exist,
            # update variables.
            # Otherwise, raise ValueError.
            if len(names) == 3:
                fn, prep, ln = names
            elif len(names) == 2:
                fn, ln = names
            else:
                raise ValueError("First and last name required.")
        # Check for whitespace in last name variable.
        ws_res = re.search(r"\s+", ln)
        if ws_res:
            # Split last name if found.
            prep, ln, *err = re.split(r"\s+", ln)

        # Create array of known names.
        output = [f"{ln},", fn, ln]
        # Insert prep if it contains a value.
        # This is simply a formatting thing.
        if len(prep) > 0:
            output.insert(2, prep)
        # Output formatted string.
        return " ".join(output)
    return string

if __name__ == "__main__":
    # Loop until q is entered or the maximum number of runs is reached.
    re_run = True
    max_runs = 10
    while re_run and max_runs > 0:
        print("Please enter your full name\nor press [q] to exit:")
        user_input = input()
        if user_input:
            if user_input.lower().strip() == "q":
                re_run = False
                break
            result = process_input(user_input)
            print("\n" + result + "\n\n")
        max_runs -= 1
