我已经提取了user_id对shop_ids
user_id shop_ids
0 022221205 541
1 023093087 5088,4460,4460,4460,4460,4460,4460,4460,5090
2 023096023 2053,2053,2053,2053,2053,2053,2053,2053,2053,1...
3 023096446 4339,4339,3966,4339,4339
4 023098684 5004,3604,5004,5749,5004
我正试图将此数据框写入csv使用:df.to_csv('users_ordered_shops.csv')
最后,csv将商店id合并为一个数字,如下所示:
user_id shop_ids
0 22221205 541
1 23093087 508,844,604,460,446,000,000,000,000,000,000,000
2 23096023 2,053,205,320,532,050,000,000,000,000,000,000,000,000,000,000,000,000
3 23096446 43,394,339,396,643,300,000
4 23098684 50,043,604,500,457,400,000
索引2
的值为:
print(df.iloc[2].shop_ids)
2053,2053,2053,2053,2053,2053,2053,2053,2053,1294,1294,2053,1922
期望的输出是一个csv文件,所有的shop_id在一列或不同的列中完整,如:
user_id shop_ids
0 022221205 541
1 023093087 5088,4460,4460,4460,4460,4460,4460,4460,5090
2 023096023 2053,2053,2053,2053,2053,2053,2053,2053,2053,1294,1294,2053,1922
3 023096446 4339,4339,3966,4339,4339
4 023098684 5004,3604,5004,5749,5004
关于如何在写入csv文件时不合并商店id的任何提示?我已经尝试将shop_ids列使用astype()
转换为int
和str
,这导致了相同的输出。
更新
要获得每个列一个商店(并删除重复的),您可以使用:
pd.concat([df['user_id'],
df['shop_ids'].apply(lambda x: sorted(set(x.split(','))))
.apply(pd.Series)],
axis=1).to_csv('users_ordered_shops.csv', index=False)
更改分隔符。试一试:
df.to_csv('users_ordered_shops.csv', sep=';')
或者改变引用策略:
import csv
df.to_csv('users_ordered_shops.csv', quoting=csv.QUOTE_NONNUMERIC)