计数器对象/列表中任何项目的最大出现次数



我有一个用不同颜色填充的网格(由不同的整数表示)。我想快速(在一行和最少的处理时间内)计算所有颜色的出现次数并返回最高的出现次数。另外,我想忽略零。

这是我的解决方案(你能想到更好的解决方案吗?

grid = [[0, 0, 5, 0, 0, 0, 0, 10, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 10, 10, 10, 0, 0, 0], [0, 30, 30, 0, 33, 0, 0, 0, 0, 0, 0, 0, 0], [0, 30, 0, 0, 33, 33, 0, 0, 50, 0, 50, 0, 0], [0, 30, 0, 0, 33, 33, 0, 0, 50, 50, 50, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 50, 0, 0], [0, 0, 0, 0, 0, 0, 0, 88, 88, 88, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 88, 88, 0, 0, 0, 0]]
rows,cols=len(grid),len(grid[0])
ctrSum=Counter()
for row in grid:
ctrSum += Counter(row)
ctrSum -= Counter({0:(rows*cols)}) # subtract out a ridiculous amount of zeroes to eliminate them all from the counter
return max(ctrSum.values())

由于 Pranav 的打字速度比我快,这是您问题的另一个潜在答案:

ans = max(Counter(x for row in grid for x in row if x!=0).values())

我想到了不规则的嵌套列表,这就是可以做到的:

unfurl=lambda x: sum(map(unfurl,x),[]) if isinstance(x,list) or isinstance(x, tuple) else [x]
ans = max(Counter(unfurl(grid)).values())

您可以在创建计数器时平展列表,而不是将每行的计数添加到循环中的计数器!

ctr = Counter(x for row in grid for x in row)
# Counter({0: 79, 5: 1, 10: 4, 30: 4, 33: 5, 50: 6, 88: 5})

然后,删除0键,并找到最大值:

del ctr[0]
max_key, max_count = max(ctr.items(), key=lambda item: item[1]) # 50, 6

计算零并稍后删除键并不比根本不计算它们快或慢得多(在我的计算机上,timeit两种方法返回相似的时间)

grid = [[0, 0, 5, 0, 0, 0, 0, 10, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 10, 10, 10, 0, 0, 0], [0, 30, 30, 0, 33, 0, 0, 0, 0, 0, 0, 0, 0], [0, 30, 0, 0, 33, 33, 0, 0, 50, 0, 50, 0, 0], [0, 30, 0, 0, 33, 33, 0, 0, 50, 50, 50, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 50, 0, 0], [0, 0, 0, 0, 0, 0, 0, 88, 88, 88, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 88, 88, 0, 0, 0, 0]]
def m1(grid):
rows,cols=len(grid),len(grid[0])
ctrSum=Counter()
for row in grid:
ctrSum += Counter(row)
ctrSum -= Counter({0:(rows*cols)}) # subtract out a ridiculous amount of zeroes to eliminate them all from the counter
return max(ctrSum.items(), key=lambda x: x[1])
def m2(grid):
ctr = Counter(x for row in grid for x in row)
del ctr[0]
return max(ctr.items(), key=lambda item: item[1])
def m3(grid):
ctr = Counter(x for row in grid for x in row if x)
return max(ctr.items(), key=lambda item: item[1])
def m3_oneline(grid):
return max(Counter(x for row in grid for x in row if x).items(), key=lambda x: x[1])

t1 = timeit.timeit('func(grid)', setup='from __main__ import grid, m1 as func', number=10000)
t2 = timeit.timeit('func(grid)', setup='from __main__ import grid, m2 as func', number=10000)
t3 = timeit.timeit('func(grid)', setup='from __main__ import grid, m3 as func', number=10000)
t3_oneline = timeit.timeit('func(grid)', setup='from __main__ import grid, m3_oneline as func', number=10000)
print(t2/t1, t3/t1, t3_oneline/t1)
0.3436699657289107 0.3483052483876959 0.3614440352763763
  • m1(可预见)花费的时间最长,因为创建所有这些计数器只是将它们的值相加。
  • m2m3花费 ~34% 的时间是m1所花费的时间。计数零和不计算零之间的运行时间没有显着差异,因为您需要检查它是否为零才能决定不计算它。
  • m3_onelinem2m3稍微慢一点(即单行不一定更有效率)。

您可以在一行中执行此操作,使用reduce()构建一个包含所有行中元素频率的计数器,然后将一个key传递给max(),在查找最常见的元素时消除 0:

from collections import Counter
from functools import reduce
import math
result = max(reduce(lambda x, y: x + Counter(y), grid, Counter()).items(),
key=lambda x: x[1] if x[0] else -math.inf)
print(result)

您还可以使用sum()将所有计数器添加在一起:

result = max(sum(map(Counter, grid), Counter()).items(),
key=lambda x: x[1] if x[0] else -math.inf)
print(result)

使用上面的网格,这两个都打印:

(50, 6)

最新更新