Dynamically joining list elements based on a key

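My data starts out as a pandas DataFrame. I then convert the column of interest to a Python list and try to join certain elements of that list together according to a few specific rules.

My example list might look like this: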

list_1 = ['country: US', 'firstname: John', 'displayName: A', 'JohnS123', 'address: 123 main st', 'baltimore', 'MD', '12345', 'email:jsmith@email.com']

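I am looking for every element that contains ":". list_1[0] contains ":", so it is fine as is. list_1[1] contains it too, so it is also fine. list_1[3] does not contain ":", so I want to join it onto list_1[2] with ", " between them. In the new list, new_list[2] would then be 'displayName: A, JohnS123'.

More generally, I want to append every element that does not contain ":" to the element before it, until I reach the next item in the list that does contain ":".

Here is the new list after processing: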

new_list = ['country: US', 'firstname: John', 'displayName: A, JohnS123', 'address: 123 main st, baltimore, MD, 12345', 'email:jsmith@email.com']
len(list_1) -> 9
len(new_list) -> 5

I have tried several list-based algorithms to achieve this, but if it can also be done with pandas, I am open to either option.
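
For the plain-list route, the rule described above can be sketched as a single pass that either starts a new entry or extends the last one. This is a minimal sketch, not a definitive implementation: the helper name `merge_on_key` and its `marker` and `sep` parameters are hypothetical, and it assumes that an element without ":" at the very start of the list should simply be kept as its own entry.

```python
def merge_on_key(items, marker=":", sep=", "):
    """Fold every element lacking `marker` into the closest preceding element."""
    merged = []
    for item in items:
        if marker in item or not merged:
            # Element carries a key (or nothing precedes it): start a new entry.
            merged.append(item)
        else:
            # No key: append it to the most recent entry.
            merged[-1] = merged[-1] + sep + item
    return merged


list_1 = ['country: US', 'firstname: John', 'displayName: A', 'JohnS123',
          'address: 123 main st', 'baltimore', 'MD', '12345',
          'email:jsmith@email.com']
new_list = merge_on_key(list_1)
print(new_list)
print(len(list_1), len(new_list))
```

Building a new list rather than deleting from list_1 while iterating avoids index shifting and leaves the original list untouched.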

You could do this:

# setup
import pandas as pd

list_1 = ['country: US', 'firstname: John', 'displayName: A', 'JohnS123', 'address: 123 main st', 'baltimore', 'MD',
          '12345', 'email:jsmith@email.com']
s = pd.Series(list_1, dtype='string')
# group contiguous chunks where the first element contains ':'
res = s.groupby(s.str.contains(':').astype(int).cumsum()).agg(', '.join)
print(res)

Output

1                                   country: US
2                               firstname: John
3                      displayName: A, JohnS123
4    address: 123 main st, baltimore, MD, 12345
5                        email:jsmith@email.com
dtype: string
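
The grouping works because `s.str.contains(':').astype(int).cumsum()` produces a running count that only increases at elements containing ":", so each keyed element and the unkeyed elements that follow it share one label. To make those labels visible, the sketch below reproduces the same running count with the standard library (`itertools.accumulate` standing in for `cumsum`, `itertools.groupby` for the pandas group-by). The names `flags`, `group_ids`, and `joined` are illustrative only.

```python
from itertools import accumulate, groupby

list_1 = ['country: US', 'firstname: John', 'displayName: A', 'JohnS123',
          'address: 123 main st', 'baltimore', 'MD', '12345',
          'email:jsmith@email.com']

# 1 where the element contains ':', 0 otherwise.
flags = [int(':' in item) for item in list_1]

# Running total: the label increases only when a new keyed element starts.
group_ids = list(accumulate(flags))
print(group_ids)  # [1, 2, 3, 3, 4, 4, 4, 4, 5]

# Join consecutive elements that share a label.
joined = [
    ', '.join(item for _, item in chunk)
    for _, chunk in groupby(zip(group_ids, list_1), key=lambda pair: pair[0])
]
print(joined)
```

The labels 1 to 5 match the index of the pandas result shown above.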
