Create as many rows as there are ID fields in a specific pandas column



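I have a details column in my df that can contain information for several customers, as shown in the table below.

If the details column holds information for more than one customer, I need to create a new row for each customer, as shown in the second table.

How can I do this in Python?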

ID:123,Qty:1,Name:abcdef, ID: 324,Qty:2,Name:giefg, ID:123,Qty:1,Name:abcdef,ID:123,Qty:1,Name:abcdef, ID:123,Qty:1,Name:abcdef,

Split the string with a regex that uses a lookahead, then explode the result:

df['details'] = df['details'].str.split(r',\s*(?=ID:)')
df.explode('details')

Output:

    date   time consumer                       details
0  12/11  12:00      abc   ID:123, Qty: 1,Name: abcdef
0  12/11  12:00      abc       ID: 324,Qty:2,Name: ghi
1  13/11  13:00      def  ID:123, Qty: 1,Name: abcdef,
2  14/11  11:00      ghj   ID:123, Qty: 1,Name: abcdef
2  14/11  11:00      ghj      ID:123, Qty: 1,Name: abc
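
For reference, the lookahead approach above can be sketched as a self-contained script. The sample data here is hypothetical, modelled on the question's table, and the final `str.strip` call is an extra tidy-up step that is not part of the original answer. The split happens only at commas that are followed by `ID:`, so the commas inside a single customer's record are left alone.

```python
import pandas as pd

# Hypothetical sample data modelled on the question's table.
df = pd.DataFrame(
    {
        "date": ["12/11", "13/11", "14/11"],
        "time": ["12:00", "13:00", "11:00"],
        "consumer": ["abc", "def", "ghj"],
        "details": [
            "ID:123, Qty: 1,Name: abcdef, ID: 324,Qty:2,Name: ghi",
            "ID:123, Qty: 1,Name: abcdef,",
            "ID:123, Qty: 1,Name: abcdef,ID:123, Qty: 1,Name: abc",
        ],
    }
)

# Split only at commas followed (after optional whitespace) by "ID:".
# The lookahead (?=ID:) does not consume "ID:", so it stays at the
# start of the next piece. A multi-character pattern is treated as a regex.
parts = df["details"].str.split(r",\s*(?=ID:)")

# One row per list element; the other columns are repeated.
result = df.assign(details=parts).explode("details").reset_index(drop=True)

# Assumed tidy-up: drop stray spaces and trailing commas from the raw text.
result["details"] = result["details"].str.strip(" ,")

print(result)
```

Because the split keys on `ID:` rather than on a field count, this variant should still work if some customers have more or fewer fields, as long as every record starts with `ID:`.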

Assuming every customer has exactly 3 fields (ID, Qty, Name), you can try this.


import pandas as pd

# Sample data: each "details" string holds one or more customer records.
df = pd.DataFrame(
    {
        "date": ["12/11", "13/11", "14/11"],
        "time": ["12:00", "13:00", "11:00"],
        "consumer": ["abc", "def", "ghj"],
        "details": [
            "ID:123, Qty: 1,Name: abcdef,ID: 324,Qty:2,Name: ghi",
            "ID:123, Qty: 1,Name: abcdef",
            "ID:123, Qty: 1,Name: abcdef,ID:123, Qty: 1,Name: abc",
        ],
    }
)

# Split on commas, then regroup the pieces into chunks of 3 fields.
df["details"] = df["details"].apply(
    lambda txt: [
        ",".join(txt.split(",")[idx : idx + 3])
        for idx in range(0, len(txt.split(",")), 3)
    ]
)

# One row per chunk; reset the index so rows are numbered from 0.
df = df.explode("details").reset_index(drop=True)
print(df)

Sample output:

    date   time consumer                      details
0  12/11  12:00      abc  ID:123, Qty: 1,Name: abcdef
1  12/11  12:00      abc      ID: 324,Qty:2,Name: ghi
2  13/11  13:00      def  ID:123, Qty: 1,Name: abcdef
3  14/11  11:00      ghj  ID:123, Qty: 1,Name: abcdef
4  14/11  11:00      ghj     ID:123, Qty: 1,Name: abc
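
The fixed-size chunking above could also be pulled into a small helper. This is only a sketch: `chunk_fields` and its `size` parameter are hypothetical names, not part of the answer, and the function additionally skips empty pieces so that a trailing comma (as in the question's data) does not produce an empty record. It also normalizes the spacing around commas, which the original lambda does not do.

```python
import pandas as pd


def chunk_fields(text, size=3):
    """Group comma-separated fields into records of `size` fields each.

    Hypothetical helper: assumes every record has exactly `size` fields
    and that no field value itself contains a comma.
    """
    # Strip whitespace and drop empty pieces (e.g. from a trailing comma).
    fields = [f.strip() for f in text.split(",") if f.strip()]
    return [", ".join(fields[i : i + size]) for i in range(0, len(fields), size)]


df = pd.DataFrame(
    {
        "consumer": ["abc", "def"],
        "details": [
            "ID:123, Qty: 1,Name: abcdef,ID: 324,Qty:2,Name: ghi",
            "ID:123, Qty: 1,Name: abcdef,",
        ],
    }
)

# Turn each string into a list of records, then emit one row per record.
df["details"] = df["details"].apply(chunk_fields)
df = df.explode("details").reset_index(drop=True)
print(df)
```

If the number of fields per customer can vary, splitting on the `ID:` marker is likely the safer choice, since the chunking approach depends entirely on the field count being constant.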
