I have a DataFrame containing the following SAMPLE records (not the original data):
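import pandas as pd
# dikt holds the sample rows shown in the table below
dikt = {'id': [34, 34, 56, 56, 56], 'price': [12, 6, 23, 21, 67], 'day': [3, 5, 8, 9, 22]}
df = pd.DataFrame(dikt, columns=['id', 'price', 'day'])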
df:
+-------+-----+-------+-----+
| index | id | price | day |
+-------+-----+-------+-----+
| 0 | 34 | 12 | 3 |
+-------+-----+-------+-----+
| 1 | 34 | 6 | 5 |
+-------+-----+-------+-----+
| 2 | 56 | 23 | 8 |
+-------+-----+-------+-----+
| 3 | 56 | 21 | 9 |
+-------+-----+-------+-----+
| 4 | 56 | 67 | 22 |
+-------+-----+-------+-----+
| ... | ... | ... | ... |
+-------+-----+-------+-----+
I want to group the prices by week for each id, like this:
+-------+-----+---------------------+
| index | id | price |
+-------+-----+---------------------+
| 0 | 34 | [12, 6] |
+-------+-----+---------------------+
| 1 | 56 | [23, 21], [67] |
+-------+-----+---------------------+
| ... | ... | ... |
+-------+-----+---------------------+
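In the table above, prices are grouped by the week their day falls in. For example, prices 12 and 6 fall on days 3 and 5, both in the first week, so they end up in the same list, and so on.
Integer-divide the day by 7 to add a week-number column, then group by id and that week number. Grouping the resulting frame again by id alone combines the per-week lists and drops the week number.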
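# week number: days 0-6 -> 0, days 7-13 -> 1, ...
df['weeknum'] = df['day'] // 7
# one list of prices per (id, week)
df2 = df.groupby(['id','weeknum'])['price'].agg(list).to_frame()
# convert each list to its string form so the weekly lists can be joined per id
df2['price'] = df2['price'].astype(str)
df2.groupby('id')['price'].agg(','.join).to_frame()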
            price
id
34        [12, 6]
56  [23, 21],[67]
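
The answer above ends up with strings in the price column. If real nested lists are preferred, a minimal sketch along the same lines is shown below. It is a variation, not the answer's own code: it assumes only the five sample rows from the question (the elided rows are left out), and the names `weekly` and `nested` are introduced here for illustration.

```python
import pandas as pd

# Sample rows copied from the question's table (the "..." rows are omitted).
df = pd.DataFrame({
    'id': [34, 34, 56, 56, 56],
    'price': [12, 6, 23, 21, 67],
    'day': [3, 5, 8, 9, 22],
})

# Week number by integer division, as in the answer above.
# Assumption: days are treated as 0-based here. If days are 1-based and
# day 7 should still count as the first week, (df['day'] - 1) // 7 may fit better.
df['weeknum'] = df['day'] // 7

# Step 1: one list of prices per (id, week).
weekly = df.groupby(['id', 'weeknum'])['price'].agg(list)

# Step 2: collect the weekly lists per id, keeping them as Python lists.
nested = weekly.groupby(level='id').agg(list).to_frame()

print(nested)
```

For the sample rows, id 34 maps to `[[12, 6]]` and id 56 maps to `[[23, 21], [67]]`, so the values can be indexed or iterated later without parsing strings.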