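I have the following problem: I want to convert a DataFrame into a list of tuples, with one tuple per category. Here is a simple example: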
import pandas as pd

data = {'product_id': ['5', '7', '8', '5', '30'], 'id_customer': ['1', '1', '1', '3', '3']}
df = pd.DataFrame.from_dict(data)
# desired output:
result = [('5', '7', '8'), ('5', '30')]
How can I do this?
Convert a pandas DataFrame to a list of unique tuples
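Use GroupBy.agg (or GroupBy.apply) with tuple, passing sort=False so the groups stay in their order of first appearance. Any of the following works: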
print(df.groupby('id_customer', sort=False)['product_id'].agg(tuple).tolist())
print(df.groupby('id_customer', sort=False)['product_id'].apply(tuple).tolist())
print(list(df.groupby('id_customer', sort=False)['product_id'].agg(tuple)))
print(list(df.groupby('id_customer', sort=False)['product_id'].apply(tuple)))
[('5', '7', '8'), ('5', '30')]
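To see what sort=False changes, here is a minimal sketch. The frame df2 is hypothetical data (not from the question) in which customer '3' appears before customer '1'. With the default sort=True the groups come back ordered by key; with sort=False they keep the order in which each key first appears.

```python
import pandas as pd

# Hypothetical data (an assumption for illustration): customer '3' is listed first
df2 = pd.DataFrame({
    'product_id': ['5', '30', '5', '7', '8'],
    'id_customer': ['3', '3', '1', '1', '1'],
})

# sort=False keeps groups in order of first appearance
unsorted_result = df2.groupby('id_customer', sort=False)['product_id'].agg(tuple).tolist()

# The default (sort=True) orders the groups by key
sorted_result = df2.groupby('id_customer')['product_id'].agg(tuple).tolist()

print(unsorted_result)  # [('5', '30'), ('5', '7', '8')]
print(sorted_result)    # [('5', '7', '8'), ('5', '30')]
```

For the data in the question both variants give the same list, because the keys already appear in sorted order.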
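Alternatively, iterate over the groupby object and build a tuple from each group (without sort=False the groups are returned sorted by key):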
>>> [tuple(v) for _, v in df.groupby('id_customer')['product_id']]
[('5', '7', '8'), ('5', '30')]
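The comprehension works because iterating over a grouped column yields (group key, Series) pairs. As a rough sketch of the same idea, the variant below keeps the key alongside each tuple; the name by_customer is only illustrative and not part of the original answer.

```python
import pandas as pd

data = {'product_id': ['5', '7', '8', '5', '30'],
        'id_customer': ['1', '1', '1', '3', '3']}
df = pd.DataFrame.from_dict(data)

# Each iteration step gives the group key and the Series of that group's values
by_customer = {key: tuple(values)
               for key, values in df.groupby('id_customer')['product_id']}

print(by_customer)  # {'1': ('5', '7', '8'), '3': ('5', '30')}

# Dropping the keys recovers the list of tuples from the question
result = list(by_customer.values())
print(result)  # [('5', '7', '8'), ('5', '30')]
```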