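I need to count repeated characters by consecutive group. For example, given the string s = "hggdsaajhjhajadj", I need the counts
h - 1, g - 2, d - 1, s - 1, a - 2, j - 1, h - 1, and so on
rather than {'h': 3, 'g': 2, 'd': 2, 's': 1, 'a': 4, 'j': 4}.
The code below gives the total count per letter: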
s = "hggdsaajhjhajadj"

def find_repeated(string):
    # Counts every occurrence of each character, ignoring where it appears
    table = {}
    for char in string.lower():
        if char in table:
            table[char] += 1
        elif char != " ":
            table[char] = 1
        else:
            table[char] = 0
    return table

print(find_repeated(s))
{'h': 3, 'g': 2, 'd': 2, 's': 1, 'a': 4, 'j': 4}
If I try the following instead:
for c in sorted(set(s)):
    i = 1
    while c * i in s:
        i += 1
    print(c, "-", i - 1)
then I get the following:
a - 2
d - 1
g - 2
h - 1
j - 1
s - 1
Can you give me some suggestions?
Python's tool for working with consecutive groups is itertools.groupby:
>>> from itertools import groupby
>>> s = "hggdsaajhjhajadj"
>>> [(k, len(list(g))) for k,g in groupby(s)]
[('h', 1), ('g', 2), ('d', 1), ('s', 1), ('a', 2), ('j', 1), ('h', 1), ('j', 1), ('h', 1), ('a', 1), ('j', 1), ('a', 1), ('d', 1), ('j', 1)]
groupby returns an object that, when iterated over, yields each key together with an iterator over that group's elements:
>>> grouped = groupby(s)
>>> for key, group in grouped:
...     print(key, list(group))
...
h ['h']
g ['g', 'g']
d ['d']
s ['s']
a ['a', 'a']
j ['j']
h ['h']
j ['j']
h ['h']
a ['a']
j ['j']
a ['a']
d ['d']
j ['j']
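To get the "h - 1, g - 2, ..." output the question asks for, the groupby result can be wrapped in a small helper. This is only a sketch: run_lengths is a hypothetical name, and counting with sum(1 for _ in group) is one option among several (len(list(group)) works just as well for short strings).

```python
from itertools import groupby

def run_lengths(text):
    """Return (character, run length) pairs for each consecutive run in text."""
    # sum(1 for _ in group) counts the run without building a temporary list
    return [(key, sum(1 for _ in group)) for key, group in groupby(text)]

s = "hggdsaajhjhajadj"
# Format each pair the way the question writes it, e.g. "g - 2"
print(", ".join(f"{char} - {count}" for char, count in run_lengths(s)))
# h - 1, g - 2, d - 1, s - 1, a - 2, j - 1, h - 1, j - 1, h - 1, a - 1, j - 1, a - 1, d - 1, j - 1
```

Note that groupby only merges adjacent equal items, which is why 'h' shows up three times in the result instead of once with a count of 3.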
The following function does what you specified:
def mycount(s):
    i = 0
    res = []
    while i < len(s):
        # j scans forward to the end of the run that starts at i
        j = i + 1
        while j < len(s) and s[i] == s[j]:
            j += 1
        res.append((s[i], j - i))
        i = j
    return res
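If you want some reassurance that a hand-written scan like this agrees with itertools.groupby, something along these lines should do. It is a sketch of the same two-index scan written as a generator; iter_runs is a hypothetical name, not part of any library.

```python
from itertools import groupby

def iter_runs(text):
    """Yield (character, run length) pairs, scanning text once."""
    start = 0
    while start < len(text):
        end = start + 1
        # Advance end to the first position that breaks the current run
        while end < len(text) and text[end] == text[start]:
            end += 1
        yield text[start], end - start
        start = end

s = "hggdsaajhjhajadj"
expected = [(key, len(list(group))) for key, group in groupby(s)]
assert list(iter_runs(s)) == expected  # both approaches agree on this input
print(list(iter_runs(s))[:5])
# [('h', 1), ('g', 2), ('d', 1), ('s', 1), ('a', 2)]
```

The generator form avoids building the whole result list when you only need to loop over the runs once.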