How do I count the frequency of consecutive characters in a string?



input = 'XXYXYYYXYXXYYY'

output = [2,1,1,3,1,1,2,3]

How can I count the X's and Y's in the string in the order they appear, and then put those counts into a list?

import itertools
numbers = []
input = 'XXYXYYYXYXXYYY'
# groupby yields one group per run of identical consecutive characters
split_string = [''.join(g) for k, g in itertools.groupby(input)]
for i in split_string:
    numbers.append(len(i))
print(numbers)

Output:

[2, 1, 1, 3, 1, 1, 2, 3]

You can do this with a while loop that walks through the whole string.

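s = 'XXYXYYYXYXXYYY'  # renamed from str to avoid shadowing the built-in
i = 0
output = []
k = 1  # length of the current run
while i < len(s) - 1:
    if s[i] == s[i+1]:
        k = k + 1
    else:
        output.append(k)
        k = 1
    i = i + 1
output.append(k)  # the last run is still pending when the loop ends
print(output)

Output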

[2, 1, 1, 3, 1, 1, 2, 3]

Try using itertools.groupby:

from itertools import groupby
s = 'XXYXYYYXYXXYYY'
print([len(list(i)) for _, i in groupby(s)])
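
If the characters are needed alongside the counts, the same groupby idea extends naturally. The following is only a minimal sketch; the function name `run_lengths` is made up for illustration and does not come from the answer above. It counts each group with `sum` instead of building a temporary list.

```python
from itertools import groupby

def run_lengths(text):
    # One (character, count) pair per run of identical consecutive characters.
    # sum(1 for _ in group) counts the run without materializing it as a list.
    return [(char, sum(1 for _ in group)) for char, group in groupby(text)]

pairs = run_lengths('XXYXYYYXYXXYYY')
print(pairs)
print([count for _, count in pairs])
```

Dropping the first element of each pair gives the same list of counts as the one-liner above.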

A short solution using a regular expression:

import re
s = 'XXYXYYYXYXXYYY'
# (.) captures one character; \1* matches any further repeats of that same character
l = [len(m.group()) for m in re.finditer(r'(.)\1*', s)]
print(l)

Based on this answer.
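
The regex approach depends on a backreference: a group captures one character, and `\1` then matches repeats of exactly that character. As a hedged sketch (the wrapper name `run_lengths_regex` is hypothetical, and adding `re.DOTALL` is my assumption so that newline characters are counted too), it could be packaged like this:

```python
import re

def run_lengths_regex(text):
    # (.) captures a single character; \1* matches zero or more repeats of it.
    # re.DOTALL makes "." match newlines as well, so no character is skipped.
    return [len(m.group()) for m in re.finditer(r'(.)\1*', text, flags=re.DOTALL)]

print(run_lengths_regex('XXYXYYYXYXXYYY'))
```

An empty string yields no matches, so the function returns an empty list in that case.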

Here is something you can try:

test = 'XXYXYYYXYXXYYY'
count = 1
result_list = list()
prev_char = test[0]
for char in test[1:]:
    if char == prev_char:
        count += 1
        prev_char = char
    else:
        # a new character starts: store the finished run and reset the counter
        result_list.append(count)
        count = 1
        prev_char = char
result_list.append(count)  # store the final run
print(result_list)

Output

[2, 1, 1, 3, 1, 1, 2, 3]
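
The loop above reads `test[0]` before iterating, so it assumes the string is not empty. A minimal sketch of the same comparison idea wrapped in a function that returns an empty list for empty input follows; the name `count_runs` is hypothetical and not part of the answer.

```python
def count_runs(text):
    # Guard the empty case explicitly, since there is no first character to start from.
    if not text:
        return []
    counts = [1]  # the first character opens the first run
    # Pair each character with its successor and compare them.
    for prev, cur in zip(text, text[1:]):
        if cur == prev:
            counts[-1] += 1   # same character: extend the current run
        else:
            counts.append(1)  # different character: start a new run
    return counts

print(count_runs('XXYXYYYXYXXYYY'))
```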

Without any libraries, it would look like this:

string = 'XXYXYYYXYXXYYY'
res = []
current = ''
for char in string:
    if current == char:
        res[-1] += 1
    else:
        res.append(1)
        current = char
print('res', res) # [2,1,1,3,1,1,2,3]

Try this.

input1 = 'XXYXYYYXYXXYYY'
output_list = []
count = 1
for index in range(len(input1)-1):
    if input1[index] == input1[index+1]:
        count += 1
    else:
        output_list.append(count)
        count = 1
# The last run is still pending when the loop ends, so append its count.
# (Adding 1 to the previous entry instead would merge the final run into the wrong one.)
output_list.append(count)
print(output_list)

The basic approach is to count occurrences and stop when a new character appears. The code is as follows.

def consec_occur(strr):
    i = 0
    cc = []
    while i < len(strr):
        count = 1
        # extend the current run while the next character exists and matches
        while i + 1 < len(strr) and strr[i] == strr[i+1]:
            i += 1
            count += 1
        cc.append(count)
        i += 1
    return cc
if __name__ == "__main__":
    print(consec_occur('XXYXYYYXYXXYYY'))

You can change the code as needed. If you want the list itself, make cc global, remove the return statement, and use cc in the print statement.
