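I have JSON data that I had to convert into columns. I was able to do that successfully, and it gave me something like the following:
timestamp: 2021-09-01, 2021-09-02, 2021-09-01, 2021-09-02, 2021-09-03, 2021-09-01, 2021-09-02, 2021-09-03, 2021-09-04, 2021-09-05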
You can filter by status, group by id, and then take the first/last row of each group, like this:
import pandas as pd

# copy the data to the clipboard, then load and clean it (run once)
# df = pd.read_clipboard(sep=r"\s\s+")
# df["status"] = df.status.str.lower()
# df["timestamp"] = pd.to_datetime(df.timestamp)
# df = df.sort_values("timestamp")

# this stores the original index as a column named "index" and creates a new RangeIndex
df = df.reset_index()
# first "open" row and last "closed" row per id
df_open = df[df.status == "open"].groupby(["id"], as_index=False).first()
df_closed = df[df.status == "closed"].groupby(["id"], as_index=False).last()
# discard the temp index, set back to original index
sol = pd.concat([df_open, df_closed]).set_index("index", drop=True).sort_index()
Output:
        id  phone  country  country_code  status   timestamp
index
0      abc    123    India            91    open  2021-09-01
1      abc    123    India            91  closed  2021-09-02
2      xyz    456    India            91    open  2021-09-01
3      xyz    456    India            91  closed  2021-09-02
5      ijk    789    India            91    open  2021-09-01
8      ijk    789    India            91  closed  2021-09-04
9      suv      0       US             1    open  2021-09-05
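
For anyone who wants to try this without the clipboard step, here is a minimal self-contained sketch of the same filter/group/first-last idea. The sample rows are an assumption: the ids and timestamps follow the output above, but the statuses of the dropped rows (4, 6 and 7) are guesses, and the phone/country columns are left out. The helper name `first_open_last_closed` is also made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: statuses for rows 4, 6 and 7 are assumed.
df = pd.DataFrame(
    {
        "id": ["abc", "abc", "xyz", "xyz", "xyz",
               "ijk", "ijk", "ijk", "ijk", "suv"],
        "status": ["open", "closed", "open", "closed", "open",
                   "open", "closed", "open", "closed", "open"],
        "timestamp": pd.to_datetime(
            ["2021-09-01", "2021-09-02", "2021-09-01", "2021-09-02",
             "2021-09-03", "2021-09-01", "2021-09-02", "2021-09-03",
             "2021-09-04", "2021-09-05"]
        ),
    }
)


def first_open_last_closed(frame):
    """Keep the earliest "open" row and the latest "closed" row per id."""
    # Sort by time, then move the original row labels into an "index" column
    # so they can be restored after the groupby.
    ordered = frame.sort_values("timestamp", kind="stable").reset_index()
    first_open = (
        ordered[ordered["status"] == "open"]
        .groupby("id", as_index=False)
        .first()
    )
    last_closed = (
        ordered[ordered["status"] == "closed"]
        .groupby("id", as_index=False)
        .last()
    )
    # Put the original row labels back and restore the original order.
    return pd.concat([first_open, last_closed]).set_index("index").sort_index()


sol = first_open_last_closed(df)
print(sol)
```

With these assumed rows, the rows kept are 0, 1, 2, 3, 5, 8 and 9, the same row labels as in the output above. An id with no "closed" row (here `suv`) just keeps its first "open" row.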