Stopping a loop when the items run out, even if it never reaches the right if clause



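I want to write some code that takes a list of items and concatenates them (separated by commas) into long strings, where each long string is no longer than a predefined length. For example, for this list: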

colors = ['blue','pink','yellow']

and a max length of 10 chars, the output of the code would be:

Long String 0: blue,pink

Long String 1: yellow

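I wrote the code below, but it fails in two cases: when the total length of all the concatenated items is shorter than the max length allowed, and when it creates one or more long strings but the concatenation of the items left over in the list is shorter than the max length.

What I'm asking is this: in the code below, how do I "stop" the loop when the items run out but the concatenation is still too short to reach the "else" clause?

Many thanks :)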

import pyperclip

# Theoretical bug: when a single item is longer than max_length. Will never happen for the intended use of this code.


raw_list = pyperclip.paste()
split_list = raw_list.split()
unique_items_list = list(set(split_list))                                       # notice that set are unordered collections, and the original order is not maintained. Not crucial for the purpose of this code the way it is now, but good remembering. See more: http://stackoverflow.com/a/7961390/2594546

print "There are %d items in the list." % len(split_list)
print "There are %d unique items in the list." % len(unique_items_list)

max_length = 10                                                               # salesforce's filters allow up to 1000 chars, but didn't want to hard code it in the rest of the code, just in case.

list_of_long_strs = []
short_list = []                                                                 # will hold the items that the max_length chars long str.
total_len = 0
items_processed = []        # will be used for sanity checking
for i in unique_items_list:
    if total_len + len(i) + 1 <= max_length:                                    # +1 is for the length of the comma
        short_list.append(i)
        total_len += len(i) + 1
        items_processed.append(i)
    elif total_len + len(i) <= max_length:                                      # if there's no place for another item+comma, it means we're nearing the end of the max_length chars mark. Maybe we can fit just the item without the unneeded comma.
        short_list.append(i)
        total_len += len(i)                                                     # should I end the loop here somehow?
        items_processed.append(i)
    else:
        long_str = ",".join(short_list)
        if long_str[-1] == ",":                                                 # appending the long_str to the list of long strings, while making sure the item can't end with a "," which can affect Salesforce filters.
            list_of_long_strs.append(long_str[:-1])
        else:
            list_of_long_strs.append(long_str)
        del short_list[:]                                                       # in order to empty the list.
        total_len = 0
unique_items_proccessed = list(set(items_processed))
print "Number of items concatenated:", len(unique_items_proccessed)

def sanity_check():
    if len(unique_items_list) == len(unique_items_proccessed):
        print "All items concatenated"
    else:           # the only other option is that len(unique_items_list) > len(unique_items_proccessed)
        print "The following items weren't concatenated:"
        print ",".join(list(set(unique_items_list)-set(unique_items_proccessed)))

sanity_check()

print ",".join(short_list)         # for when the loop doesn't end the way it should since < max_length. NEED TO FIND A BETTER WAY TO HANDLE THAT

for item in list_of_long_strs:
    print "Long String %d:" % list_of_long_strs.index(item)
    print item
    print

At the moment you don't do anything with i in the else case, so that item is dropped, and you never handle short_list if the last item in the loop doesn't fill it.

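The simplest solution is to restart short_list with i in it, and restart the running length to match:

short_list = [i]
total_len = len(i) + 1     # i is already in short_list, so count it (plus its comma) rather than resetting to 0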

and, after the for loop, to check whether anything is left in short_list and handle it if so:

if short_list:
    list_of_long_strs.append(",".join(short_list))
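
Put together, the two fixes above might look like the sketch below. The function name chunk_items is a hypothetical wrapper added only so the loop can be run on its own; the loop body follows the question's code, minus the clipboard handling and the sanity-check bookkeeping.

```python
def chunk_items(items, max_length):
    """Join items with commas into strings of at most max_length characters.

    chunk_items is a hypothetical name used for illustration only.
    """
    list_of_long_strs = []
    short_list = []
    total_len = 0  # running length of the items collected so far, commas included
    for i in items:
        if total_len + len(i) + 1 <= max_length:    # item plus a comma still fits
            short_list.append(i)
            total_len += len(i) + 1
        elif total_len + len(i) <= max_length:      # item fits exactly, without a comma
            short_list.append(i)
            total_len += len(i)
        else:
            if short_list:                          # flush what has been collected
                list_of_long_strs.append(",".join(short_list))
            short_list = [i]                        # restart with i so it is not lost
            total_len = len(i) + 1
    if short_list:                                  # handle whatever is left after the loop
        list_of_long_strs.append(",".join(short_list))
    return list_of_long_strs


print(chunk_items(['blue', 'pink', 'yellow'], 10))
```

For the list in the question this prints ['blue,pink', 'yellow']. Like the original code, the sketch does not guard against a single item that is longer than max_length.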

You can also simplify the if checks:

new_len = total_len + len(i)
if new_len < max_length:
    ...
elif new_len == max_length:
    ...
else:
    ...

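get rid of the if/else block starting with:

if long_str[-1] == ",":   

(",".join(...) only puts commas between items and never adds a trailing one, so that check is redundant)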

and tidy up the last part of the code using enumerate (I'd also switch to str.format):

for index, item in enumerate(list_of_long_strs):
    print "Long string {0}:".format(index)
    print item

More broadly, here is what I would do:

def process(unique_items_list, max_length=10):
    """Process the list into comma-separated strings with maximum length."""
    output = []
    working = []
    for item in unique_items_list:
        # Length if item were added: existing items + joining commas + new item
        new_len = sum(map(len, working)) + len(working) + len(item)
        if new_len <= max_length:
            working.append(item)
        else:
            output.append(working)
            working = [item]
    output.append(working)
    return [",".join(sublist) for sublist in output if sublist]


def print_out(str_list):
    """Print out a list of strings with their indices."""
    for index, item in enumerate(str_list):
        print("Long string {0}:".format(index))
        print(item)

Demo:

>>> print_out(process(["ab", "cd", "ef", "gh", "ij", "kl", "mn"]))
Long string 0:
ab,cd,ef
Long string 1:
gh,ij,kl
Long string 2:
mn
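
The question's code comments mention a theoretical bug when a single item is longer than max_length. The variant below is one way that case could be surfaced instead of silently producing an over-long string. It is a sketch, not part of the answer above: the name process_checked and the choice to raise ValueError are assumptions, and it also keeps a running length rather than recomputing the sum on every iteration.

```python
def process_checked(items, max_length=10):
    """Variant of process() that rejects items longer than max_length.

    process_checked is a hypothetical name; raising ValueError is an assumed
    policy for oversized items, not something the original code does.
    """
    output = []
    working = []
    working_len = 0  # always equal to len(",".join(working))
    for item in items:
        if len(item) > max_length:
            raise ValueError(
                "item {0!r} is longer than {1} characters".format(item, max_length)
            )
        # A comma is only needed when working already holds at least one item.
        new_len = working_len + len(item) + (1 if working else 0)
        if new_len <= max_length:
            working.append(item)
            working_len = new_len
        else:
            # working cannot be empty here, because a lone item always fits.
            output.append(",".join(working))
            working = [item]
            working_len = len(item)
    if working:
        output.append(",".join(working))
    return output


print(process_checked(["ab", "cd", "ef", "gh", "ij", "kl", "mn"]))
```

With the demo input this prints ['ab,cd,ef', 'gh,ij,kl', 'mn'], the same grouping as the demo above.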

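OK, the solution to the problem described in my OP turned out to be quite simple, and consists of two modifications:

The first one, in the else clause: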

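    else:
        long_str = ",".join(short_list)
        list_of_long_strs.append(long_str)
        items_processed.extend(short_list)          # for sanity checking
        del short_list[:]                           # in order to empty the list.
        short_list.append(i)                        # so we won't lose this particular item
        total_len = len(i) + 1                      # +1 for the comma, matching how the loop counts every other item

The main point here is appending i after emptying short_list, so the item that sent the loop to the else clause isn't lost. Similarly, total_len is set to that item's length plus one for its comma, rather than to 0 as before, so the counter stays consistent with the rest of the loop.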

As the kind commenters above suggested, the if under the else was redundant, so I removed it.

The second part:

residual_items_concatenated = ",".join(short_list)
list_of_long_strs.append(residual_items_concatenated)

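This part makes sure that when short_list never "makes it" to the else clause because total_len < max_length, its items are still concatenated and added to the list of long strings as another entry, just like the ones before it.

I feel these two small modifications are the best solution to my problem, since they keep most of the code and change only a few lines rather than rewriting it from scratch.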
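
One way to confirm that nothing is lost and no string is too long is to check the finished list directly, in the spirit of sanity_check() from the question. This is only a sketch: check_chunks is a hypothetical helper, and it assumes no item contains a comma itself, since it recovers the items by splitting on commas.

```python
def check_chunks(chunks, items, max_length):
    """Return (too_long, missing) for a list of comma-joined strings.

    check_chunks is a hypothetical helper; it assumes items contain no commas.
    """
    # Strings that exceed the allowed length.
    too_long = [chunk for chunk in chunks if len(chunk) > max_length]
    # Items recovered from the strings, compared against the input.
    rejoined = [item for chunk in chunks for item in chunk.split(",")]
    missing = sorted(set(items) - set(rejoined))
    return too_long, missing


colors = ['blue', 'pink', 'yellow']
print(check_chunks(['blue,pink', 'yellow'], colors, 10))   # nothing wrong
print(check_chunks(['blue,pink'], colors, 10))             # leftover item was dropped
```

The first call prints ([], []); the second prints ([], ['yellow']), which is the symptom of the leftover items never being appended after the loop.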
