How do I count the occurrences of a specific type in a list?



Is there a simple way to count how many elements of a specific type a list contains?

Here is what I had in mind and what I tried:

ex_list = ["string", "data", "item", 1, 3, {3: "im dict"}, "im_item"]
print(ex_list.count(int)) # -> 0, should return 2
print(ex_list.count(type(str))) # -> 0, should return 4

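I know there are workarounds (using a loop, etc.), but I would like to know whether there is a simple way that uses a single function (such as count).

You can use the Counter class (from collections) together with a mapping to types: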

from collections import Counter
ex_list = ["string", "data", "item", 1, 3, {3: "im dict"}, "im_item"]
typeCounts = Counter(map(type,ex_list))
>>> typeCounts
Counter({<class 'str'>: 4, <class 'int'>: 2, <class 'dict'>: 1})
>>> typeCounts[int]
2

This performs the initial count in O(n) time; after that, each typeCounts[type] lookup is O(1).
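
To make the "count once, look up many times" point concrete, here is a minimal sketch built on the same example list. The variable names are only illustrative, and the counts cover exact types only (no subclasses).

```python
from collections import Counter

ex_list = ["string", "data", "item", 1, 3, {3: "im dict"}, "im_item"]

# One O(n) pass over the list builds the table of exact types.
type_counts = Counter(map(type, ex_list))

# Each lookup afterwards is a plain dictionary access.
n_int = type_counts[int]      # 2
n_str = type_counts[str]      # 4
n_float = type_counts[float]  # 0: a Counter returns 0 for a missing key instead of raising KeyError

print(n_int, n_str, n_float)
```

Looking up a type that never occurred does not add it to the Counter, so repeated queries leave the table unchanged.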

On the other hand, if you only look up a single type once, you can use sum():

sum(isinstance(e, int) for e in ex_list)  # 2

Another option is operator.countOf:

from operator import countOf
countOf(map(type, ex_list), int)  # 2

Benchmark with the example list:

644 ns   646 ns   671 ns  countOf(map(type, ex_list), int)
2102 ns  2149 ns  2154 ns  Counter(map(type, ex_list))[int]
1125 ns  1150 ns  1183 ns  sum(type(x) is int for x in ex_list)

Benchmark with a list 1000 times longer:

392 μs   396 μs   397 μs  countOf(map(type, ex_list), int)
684 μs   696 μs   699 μs  Counter(map(type, ex_list))[int]
854 μs   887 μs   896 μs  sum(type(x) is int for x in ex_list)

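So for one "specific type", as your question asks, countOf is the fastest of these methods. If you want the counts of several types, though, Counter may be faster, because once it has been built, looking up a type in it takes only O(1) time.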
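
As a quick sanity check, the three benchmarked expressions can be confirmed to agree on the example list. This is just a sketch; the variable names are made up for illustration, and all three variants count exact types only.

```python
from collections import Counter
from operator import countOf

ex_list = ["string", "data", "item", 1, 3, {3: "im dict"}, "im_item"]

# countOf(a, b) counts the items of the iterable a that are (or equal) b.
by_countof = countOf(map(type, ex_list), int)

# Counter tallies every type, then we read off a single entry.
by_counter = Counter(map(type, ex_list))[int]

# Generator of booleans; True counts as 1 when summed.
by_sum = sum(type(x) is int for x in ex_list)

print(by_countof, by_counter, by_sum)
```

The Counter variant does more work per call because it tallies every type, which only pays off when more than one type is queried from the same Counter.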

Benchmark code:

from timeit import repeat

setup = '''
from operator import countOf
from collections import Counter
ex_list = ["string", "data", "item", 1, 3, {3: "im dict"}, "im_item"]
'''
E = [
    'countOf(map(type, ex_list), int)',
    'Counter(map(type, ex_list))[int]',
    'sum(type(x) is int for x in ex_list)',
]

print('example list:')
for _ in range(3):
    for e in E:
        number = 100000
        # Keep the three fastest of the repeated runs, reported per call.
        times = sorted(repeat(e, setup, number=number))[:3]
        print(*('%4d ns ' % (t / number * 1e9) for t in times), e)
    print()

print('1000 times longer list:')
setup += 'ex_list *= 1000'
for _ in range(3):
    for e in E:
        number = 100
        times = sorted(repeat(e, setup, number=number))[:3]
        print(*('%4d μs ' % (t / number * 1e6) for t in times), e)
    print()

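I wouldn't call a loop a workaround; any solution to this problem will use a loop, whether it is written in Python or under the hood. So here is a loop solution that uses isinstance(), which allows subclasses.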

def type_count(iterable, type_):
    return sum(isinstance(x, type_) for x in iterable)
>>> type_count(ex_list, int)
2
>>> type_count(ex_list, str)
4
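
Whether subclasses should count matters most for bool, which is a subclass of int. The sketch below contrasts the isinstance() version with an exact-type variant; exact_type_count and the sample list are hypothetical, introduced only for illustration.

```python
def type_count(iterable, type_):
    """Count items that are instances of type_, subclasses included."""
    return sum(isinstance(x, type_) for x in iterable)


def exact_type_count(iterable, type_):
    """Count items whose type is exactly type_ (hypothetical helper)."""
    return sum(type(x) is type_ for x in iterable)


sample = [1, 3, True, False, "text"]  # made-up data for illustration

print(type_count(sample, int))         # 4, because bool is a subclass of int
print(exact_type_count(sample, int))   # 2, only the plain ints
print(type_count(sample, (int, str)))  # 5, isinstance() also accepts a tuple of types
```

Which of the two is "right" depends on whether True and False should be treated as integers in your data.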

If you want to use Alain's Counter solution but allow subclasses, you can walk each type's method resolution order (MRO), like this:

from collections import Counter
typeCounts = Counter(map(type, ex_list))
for t, n in list(typeCounts.items()):  # list() since the dict will change size
    for parent in t.__mro__[1:]:
        typeCounts[parent] += n
>>> typeCounts[int]
2
>>> typeCounts[object] == len(ex_list)  # This should always be true AFAIK
True

Docs: class.__mro__

Note, however, that this does not support abstract base classes. For example:

>>> from numbers import Number
>>> type_count(ex_list, Number)  # 2 are `Number`s
2
>>> typeCounts[Number]  # but 0 inherit from `Number`
0
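
If abstract base classes matter and the types of interest are known up front, one possible sketch is to run isinstance() once per requested type. The helper name count_instances is hypothetical. It costs O(n·k) for k types rather than a single pass, so it trades speed for ABC support.

```python
from collections.abc import Mapping
from numbers import Number


def count_instances(iterable, types):
    """Return {type: count} via isinstance(), so ABCs are honored (hypothetical helper)."""
    items = list(iterable)  # materialize once so we can scan it for every type
    return {t: sum(isinstance(x, t) for x in items) for t in types}


ex_list = ["string", "data", "item", 1, 3, {3: "im dict"}, "im_item"]
counts = count_instances(ex_list, [int, str, Number, Mapping])

print(counts[int])      # 2
print(counts[str])      # 4
print(counts[Number])   # 2, int is registered under the Number ABC
print(counts[Mapping])  # 1, dict is registered under the Mapping ABC
```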