Searching a main string with punctuation-free strings and then taking slices that keep the punctuation, without libraries: is it possible?



I have this assignment to do (no libraries allowed), and I underestimated the problem:

Say we have a list of strings: str_list = ["my head's", "free", "at last", "into alarm", "in another moment", "neck"]

What we know for sure is that every string that is contained in master_string appears in order and without punctuation. (This is thanks to checks I did earlier.)

Then there is the string: master_string = "'Come, my head's free at last!' said Alice in a tone of delight, which changed into alarm in another moment, when she found that her shoulders were nowhere to be found: all she could see, when she looked down, was an immense length of neck, which seemed to rise like a stalk out of a sea of green leaves that lay far below her."

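What I basically have to do here is check for sequences of at least k (here k = 2) strings from str_list that are contained in master_string. However, I underestimated the fact that each string in str_list can contain more than one word, so master_string.split() won't get me anywhere: it would mean asking things like if "my head's" == "my", which is of course False.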
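A minimal sketch of the mismatch described above, using a shortened master_string for illustration: a multi-word entry can never equal a single token produced by split(), while a plain substring search (the built-in str.find) still locates it and reports where it starts.

```python
# Shortened version of master_string, for illustration only.
master_string = "'Come, my head's free at last!' said Alice"

words = master_string.split()

# A two-word phrase is never equal to a single whitespace-separated token.
print("my head's" in words)              # False

# A substring search works on the raw text and returns the start index.
print(master_string.find("my head's"))   # 7
```

Because str.find returns a position, the match can be turned into a slice of master_string directly.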

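I was thinking of somehow concatenating the strings one at a time and searching in master_string.strip(".,:;!?"), but if I find a matching sequence I absolutely need to take it directly from master_string, because I need the punctuation in the result variable. That basically means taking slices directly from master_string, but how can that be done? Is it even possible, or do I have to change my approach? This is driving me crazy, especially since no libraries are allowed.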
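One way this idea could work is sketched below. Note that str.strip only removes characters from the two ends of a string, so the sketch builds the punctuation-free copy by hand and records, for every kept character, its index in the original text. The helper names strip_with_index_map and slice_original are hypothetical, and the shortened master_string is for illustration only.

```python
PUNCTUATION = ".,:;!?"

def strip_with_index_map(text, punctuation=PUNCTUATION):
    """Return text without punctuation and, per kept character, its index in text."""
    kept_chars = []
    index_map = []
    for i, ch in enumerate(text):
        if ch not in punctuation:
            kept_chars.append(ch)
            index_map.append(i)
    return "".join(kept_chars), index_map

def slice_original(text, index_map, start, end, punctuation=PUNCTUATION):
    """Map a match [start, end) in the stripped text back to a slice of text.

    Punctuation that directly follows the match is included in the slice.
    """
    orig_start = index_map[start]
    orig_end = index_map[end - 1] + 1
    while orig_end < len(text) and text[orig_end] in punctuation:
        orig_end += 1
    return text[orig_start:orig_end]

master_string = "'Come, my head's free at last!' said Alice"
stripped, index_map = strip_with_index_map(master_string)

phrase = "my head's free at last"
start = stripped.find(phrase)
if start != -1:
    print(slice_original(master_string, index_map, start, start + len(phrase)))
    # my head's free at last!
```

The index map is what makes the slice possible: the search runs on the cleaned text, while the result is always cut out of the untouched master_string.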

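In case you are wondering what the expected result is here:

["my head's free at last!", "into alarm in another moment,"] (because both satisfy the condition of containing at least k strings from str_list), while "neck" is saved in discard_list because it does not meet that condition (it can't simply be thrown away with .pop(), because I need to do other things with the discarded values).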

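Here is my solution:

  1. Try to extend each string of the initial list using the other strings found in master_string and a limited set of punctuation characters (e.g. my head’s -> my head’s free at last!; free -> free at last!).
  2. Keep only the substrings that were extended at least k times.
  3. Remove redundant substrings (e.g. free at last! is already covered by my head’s free at last!).

Here is the code: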

str_list = ["my head’s", "free", "at last", "into alarm", "in another moment", "neck"]
master_string = "‘Come, my head’s free at last!’ said Alice in a tone of delight, which changed into alarm in another moment, when she found that her shoulders were nowhere to be found: all she could see, when she looked down, was an immense length of neck, which seemed to rise like a stalk out of a sea of green leaves that lay far below her."
punctuation_characters = ".,:;!?"  # punctuation characters that may follow a string
k = 1  # minimum number of successors, i.e. a sequence of at least k + 1 strings


def extend_string(current_str, successors_num=0):
    # check if the next token is a punctuation mark
    for punctuation_mark in punctuation_characters:
        if current_str + punctuation_mark in master_string:
            return extend_string(current_str + punctuation_mark, successors_num)

    # check if the next token is a proper successor
    for successor in str_list:
        if current_str + " " + successor in master_string:
            return extend_string(current_str + " " + successor, successors_num + 1)

    # cannot extend the string anymore
    return current_str, successors_num


# extend every string and keep those with at least k successors
extended_strings = []
for s in str_list:
    extended_string, successors_num = extend_string(s)
    if successors_num >= k:
        extended_strings.append(extended_string)

extended_strings.sort(key=len)  # sorting by ascending length

# drop any shorter string that is already contained in a longer one
result_list = []
for es in extended_strings:
    result_list = [s2 for s2 in result_list if s2 not in es]
    result_list.append(es)

print(result_list)  # result: ['my head’s free at last!', 'into alarm in another moment,']
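The code above does not collect the discard_list that the question asks for. As a rough sketch of an alternative (not the answer's method), the built-in str.find can be used with a moving start position: strings separated by exactly one space are grouped into a run, and runs shorter than k go to discard_list. The function name group_sequences is hypothetical, the straight apostrophes follow the question's data, and k here counts strings per sequence as in the question. It assumes the strings occur in order, and since it matches raw substrings, an entry could also match inside a longer word.

```python
def group_sequences(str_list, master_string, k=2, punctuation=".,:;!?"):
    """Group strings that appear consecutively in master_string.

    Returns (result_list, discard_list). Slices are taken from master_string
    itself, so punctuation directly after a run is preserved.
    """
    runs = []            # each run is [start, end, members]
    discard_list = []
    pos = 0
    for s in str_list:
        start = master_string.find(s, pos)
        if start == -1:
            discard_list.append(s)      # not present at all
            continue
        end = start + len(s)
        # Extend the previous run when only a single space separates them.
        if runs and master_string[runs[-1][1]:start] == " ":
            runs[-1][1] = end
            runs[-1][2].append(s)
        else:
            runs.append([start, end, [s]])
        pos = end

    result_list = []
    for start, end, members in runs:
        if len(members) >= k:
            # Include punctuation that directly follows the run.
            while end < len(master_string) and master_string[end] in punctuation:
                end += 1
            result_list.append(master_string[start:end])
        else:
            discard_list.extend(members)
    return result_list, discard_list


str_list = ["my head's", "free", "at last", "into alarm", "in another moment", "neck"]
master_string = (
    "'Come, my head's free at last!' said Alice in a tone of delight, "
    "which changed into alarm in another moment, when she found that her "
    "shoulders were nowhere to be found: all she could see, when she looked "
    "down, was an immense length of neck, which seemed to rise like a stalk "
    "out of a sea of green leaves that lay far below her."
)

result_list, discard_list = group_sequences(str_list, master_string, k=2)
print(result_list)   # ["my head's free at last!", 'into alarm in another moment,']
print(discard_list)  # ['neck']
```

Tracking start and end positions avoids rebuilding candidate strings and keeps every result an exact slice of master_string.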

I have two different versions: the first one also gives you "neck", while the second one returns fewer strings. Here is the first:

# "head's" uses a straight apostrophe here so that it matches the entry in str_list
master_string = "Come, my head's free at last!’ said Alice in a tone of delight, which changed into alarm in another moment, when she found that her shoulders were nowhere to be found: all she could see, when she looked down, was an immense length of neck, which seemed to rise like a stalk out of a sea of green leaves that lay far below her."
str_list = ["my head's", "free", "at last", "into alarm", "in another moment", "neck"]
new_str = ''
for word in str_list:
    # keep every entry that occurs somewhere in master_string
    if word in master_string:
        new_str += word + ' '

print(new_str)

Here is the second:

# "head's" uses a straight apostrophe here so that it matches the entry in str_list
master_string = "Come, my head's free at last!’ said Alice in a tone of delight, which changed into alarm in another moment, when she found that her shoulders were nowhere to be found: all she could see, when she looked down, was an immense length of neck, which seemed to rise like a stalk out of a sea of green leaves that lay far below her."
str_list = ["my head's", "free", "at last", "into alarm", "in another moment", "neck"]
new_str = ''
for word in str_list:
    if word in master_string:
        new_word = word.split(' ')
        # keep only the entries made of exactly two words
        if len(new_word) == 2:
            new_str += word + ' '

print(new_str)
