Limit the occurrences of strings in a list based on a prefix

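The code I'm writing is for an IRC bot, and I want to implement a way of limiting channels based on the CHANLIMIT server option.

The CHANLIMIT option is a list of prefix groups and limits, each pair separated by a `:`. If nothing follows the `:`, that group has no limit.

The solution below works, but I'm looking for any improvements.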

result = ['#+:2', '&:']
channels = ['#test1', '#test2', '+test3', '&test4']
# One (prefix entry, empty list) pair per CHANLIMIT entry.
prefix_groups = [(prefix, []) for prefix in result]
channel_groups = {k: v for (k, v) in prefix_groups}
# Put each channel into the first group whose entry contains its first character.
for channel in channels:
    for group in prefix_groups:
        if channel[0] in group[0]:
            channel_groups[group[0]].append(channel)
            break
# Truncate every group that has a limit after the ':'.
for prefix, channels in channel_groups.items():
    limit = prefix.split(':')[1]
    if limit:
        if len(channels) > int(limit):
            channel_groups[prefix] = channels[:int(limit)]
# Flatten the groups back into a single list.
channels = [
    channel for chanlist in channel_groups.values() for channel in chanlist]
print(channels)

We can go a step further:

Solution 2

import itertools

results = ['#+:2', '&:']
channels_to_test = ['#test1', '#test2', '+test3', '&test4',
                    '#test5', '!test5', '&test6', '&test7',
                    '+test8', '#test9']
# Group the channels by the CHANLIMIT entry that contains their first character.
channel_groups = {group: [channel for channel in channels_to_test
                          if channel[0] in group]
                  for group in results}
# The text after ':' is the limit; an empty string means "no limit".
limit = lambda prefix: prefix.split(':')[1]
# Only groups that have a limit are truncated.
modified_channel_groups = {prefix: channels[:int(limit(prefix))]
                           for (prefix, channels) in channel_groups.items()
                           if limit(prefix)}
channel_groups.update(modified_channel_groups)
result_channels = list(itertools.chain.from_iterable(channel_groups.values()))
print(result_channels)

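But I have to make an assumption here: a channel can match at most one element of results. In other words, no two elements of results will match the same channel. If that is not the case for you, please let me know.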
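
The assumption above can be checked before any grouping is done. This is only a minimal sketch: `overlapping_prefixes` is a hypothetical helper, not part of the original code, and it assumes the prefix characters are everything before the `:` in each entry.

```python
from itertools import combinations


def overlapping_prefixes(entries):
    """Return the prefix characters that appear in more than one entry."""
    # Keep only the part before ':' (assumed to hold the prefix characters).
    prefix_sets = [set(entry.split(':')[0]) for entry in entries]
    shared = set()
    for first, second in combinations(prefix_sets, 2):
        shared |= first & second
    return shared


print(overlapping_prefixes(['#+:2', '&:']))   # nothing shared
print(overlapping_prefixes(['#+:2', '#&:']))  # '#' is in both entries
```

An empty result means each channel can match at most one entry, so the solutions here apply as written.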

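Here are the changes I made:

I created channel_groups with a dict comprehension in which each value is a list comprehension
I created modified_channel_groups, which holds the elements of channel_groups that get shortened
I updated channel_groups with the elements of modified_channel_groups
I created a lambda expression so that it could be used inside the definition of modified_channel_groups
I extracted result_channels with itertools.chain.from_iterable()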

There are many ways to approach this. With a few minimal simplifications, you can get something like this:

Solution 1

results = ['#+:2', '&:']
channels_to_test = ['#test1', '#test2', '+test3', '&test4',
                    '#test5', '!test5', '&test6', '&test7',
                    '+test8', '#test9']
channel_groups = {k: [] for k in results}
# Put each channel into the first group whose entry contains its first character.
for channel in channels_to_test:
    for group in results:
        if channel[0] in group:
            channel_groups[group].append(channel)
            break
# Truncate every group that has a limit after the ':'.
for prefix, channels in channel_groups.items():
    limit = prefix.split(':')[1]
    if limit:
        limit = int(limit)
        channel_groups[prefix] = channels[:limit]
# Flatten the groups back into a single list.
result_channels = [
    channel for chanlist in channel_groups.values() for channel in chanlist]
print(result_channels)

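Here are the changes I made:

I created channel_groups directly, instead of building a list of tuples (prefix_groups) and then using it to create channel_groups
I iterated group over results instead of over prefix_groups
I did not check len(channels) > int(limit), because channels[:limit] returns all of channels even when its length is less than or equal to limit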
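
The last point relies on how Python slicing treats a stop index beyond the end of a list. A quick illustration with made-up channel names:

```python
channels = ['#test1', '#test2', '#test5']

# A stop index past the end is clamped, so the whole list comes back.
over_limit = channels[:5]
# A smaller stop index truncates the list.
under_limit = channels[:2]

print(over_limit)
print(under_limit)
```

Because the slice never raises for an oversized limit, the explicit length check adds nothing.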

You can go even further and build the final channel_groups directly, but it becomes harder to read, so I don't recommend it:

Solution 2a

import itertools

results = ['#+:2', '&:']
channels_to_test = ['#test1', '#test2', '+test3', '&test4',
                    '#test5', '!test5', '&test6', '&test7',
                    '+test8', '#test9']
limit = lambda prefix: prefix.split(':')[1]
# Each value is the matching channels, sliced by the limit (or by None for "no limit").
channel_groups = {group: [channel for channel in channels_to_test
                          if channel[0] in group][:int(limit(group)) if limit(group) else None]
                  for group in results}
result_channels = list(itertools.chain.from_iterable(channel_groups.values()))
print(result_channels)

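A few things to note:

channel_groups is created much as in Solution 2, but each value of the dict is a list (obtained from a comprehension) sliced with either the integer limit of the current group or None, which means taking all the values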
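
The `None` trick works because `some_list[:None]` is the same as `some_list[:]`. A small sketch, where `slice_stop` is a hypothetical helper introduced only for this illustration:

```python
def slice_stop(entry):
    """Return the limit of a 'prefixes:limit' entry as an int, or None."""
    limit = entry.split(':')[1]
    return int(limit) if limit else None


channels = ['&test4', '&test6', '&test7']

unlimited = channels[:slice_stop('&:')]   # stop is None: every element is kept
limited = channels[:slice_stop('&:2')]    # stop is 2: only the first two are kept

print(unlimited)
print(limited)
```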

When I have to extract information from a string, I tend to use regular expressions. So, extending Solution 2, we get:

Solution 3

import re
import itertools

results = ['#+:2', '&:']
channels_to_test = ['#test1', '#test2', '+test3', '&test4',
                    '#test5', '!test5', '&test6', '&test7',
                    '+test8', '#test9']
# Group 1 is the prefix characters, group 2 the optional numeric limit.
prefix_pattern = re.compile(r'^(.*):(\d+)?$')
prefix_matches = (prefix_pattern.match(x) for x in results)
prefix_split = (x.groups() for x in prefix_matches)
# The keys are (prefixes, limit) tuples; limit is None when absent.
channel_groups = {group: [channel for channel in channels_to_test
                          if channel[0] in group[0]]
                  for group in prefix_split}
prefix_existing_limit = ((x, int(x[1])) for x in channel_groups
                         if x[1] is not None)
modified_channel_groups = {prefix_group: channel_groups[prefix_group][:limit]
                           for (prefix_group, limit) in prefix_existing_limit}
channel_groups.update(modified_channel_groups)
result_channels = list(itertools.chain.from_iterable(channel_groups.values()))
print(result_channels)
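
To see what the regular expression captures, here is a small check. It assumes the digit class is written as `\d+`, so that the second group matches the numeric limit:

```python
import re

# Group 1: prefix characters; group 2: optional limit made of digits.
prefix_pattern = re.compile(r'^(.*):(\d+)?$')

entries = ['#+:2', '&:']
parsed = [prefix_pattern.match(entry).groups() for entry in entries]

for prefixes, limit in parsed:
    print(repr(prefixes), repr(limit))
```

The limit comes back as a string, or as `None` when nothing follows the `:`, which is why the solution tests `is not None` before calling `int()`.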

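But let's back up a little. If I understand correctly, what you want in the end is a list of the elements of channels_to_test that match a prefix and do not exceed that prefix's limit (if there is one). You can implement this filtering behaviour in a generator:

Solution 4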

import re

results = ['#+:2', '&:']
channels_to_test = ['#test1', '#test2', '+test3', '&test4',
                    '#test5', '!test5', '&test6', '&test7',
                    '+test8', '#test9']


def filter_channel_list(prefixes_to_match, input_channel_list):
    prefix_pattern = re.compile(r'^(.*):(\d+)?$')
    prefix_matches = (prefix_pattern.match(x) for x in prefixes_to_match)
    prefix_split = (x.groups() for x in prefix_matches)
    # Remaining quota per prefix group; None means "no limit".
    prefixes_remaining = {x: (int(y) if y is not None else None)
                          for (x, y) in prefix_split}
    for current_channel in input_channel_list:
        for (prefix, nb_left) in prefixes_remaining.items():
            if current_channel[0] in prefix:
                if nb_left is None:
                    yield current_channel
                    break
                else:
                    if nb_left > 0:
                        prefixes_remaining[prefix] -= 1
                        yield current_channel
                        break
                    else:
                        continue


result_channels = list(filter_channel_list(results, channels_to_test))
print(result_channels)

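Here are some comments:

In this solution, I enforce the requirement that an element of channels_to_test matches only one element of results. That is what the break statements in the generator are for
What we do is set up a dictionary holding the initial limit for each element of results, and decrement it each time we find a match with an element of channels_to_test. Once the value reaches 0, the generator skips to the next value. That is the role of the continue statement (which is optional in this case)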
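
The countdown idea described above can also be shown without the regular expression. This is a sketch only: `take_with_limits` is a hypothetical name, and it assumes the limits have already been parsed into a dict that maps prefix characters to an int or to `None`.

```python
def take_with_limits(limits, channels):
    """Yield the channels whose prefix group still has quota left."""
    remaining = dict(limits)  # copy, so the caller's dict is not modified
    for channel in channels:
        for prefixes, left in remaining.items():
            if channel[0] not in prefixes:
                continue
            if left is None:
                yield channel  # no limit for this group
            elif left > 0:
                remaining[prefixes] = left - 1
                yield channel
            break  # a channel is assumed to match at most one group


sample = ['#test1', '#test2', '+test3', '&test4', '#test5', '!test5', '&test6']
kept = list(take_with_limits({'#+': 2, '&': None}, sample))
print(kept)
```

Since the function is a generator, it consumes the channel list lazily and never builds the intermediate groups.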
