I have this DataFrame.
+-----+--------+--------------------------------+
|ID   |Date    |Text                            |
+-----+--------+--------------------------------+
|1    |1 Jan   |This is a text                  |
|2    |2 Jan   |Text can be of variant length   |
+-----+--------+--------------------------------+
How can I split the Text column and pivot it on ID and Date, so that each word gets its own row?
+-----+--------+-------+
|ID   |Date    |Text   |
+-----+--------+-------+
|1    |1 Jan   |This   |
|1    |1 Jan   |is     |
|1    |1 Jan   |a      |
|1    |1 Jan   |text   |
|2    |2 Jan   |Text   |
|2    |2 Jan   |can    |
|2    |2 Jan   |be     |
|2    |2 Jan   |of     |
|2    |2 Jan   |variant|
|2    |2 Jan   |length |
+-----+--------+-------+
I know that for the pivot I can use df.stack(), but because each text has a different length, I am having trouble splitting it.
Any help would be appreciated.
Try this code, and see the documentation for DataFrame.explode: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.explode.html
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': ['1 Jan', '2 Jan'], 'col3': ['This is a text', 'Text can be of variant length']})
# Split each string on whitespace into a list of words
df['col3'] = df['col3'].str.split()
# explode() gives each list element its own row and repeats the other columns
a = df.explode('col3')
print(a)
Output:
   col1   col2     col3
0     1  1 Jan     This
0     1  1 Jan       is
0     1  1 Jan        a
0     1  1 Jan     text
1     2  2 Jan     Text
1     2  2 Jan      can
1     2  2 Jan       be
1     2  2 Jan       of
1     2  2 Jan  variant
1     2  2 Jan   length
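
Note that the index labels in the output above repeat (0, 0, 0, 0, 1, ...), because explode() keeps the original row label for every word. If a fresh 0..n-1 index is preferred, the same idea can be sketched with the question's own column names (ID, Date, Text). This is only one possible variant; it assumes pandas 1.1 or later, where explode() accepts ignore_index, and the name `result` is just illustrative.

```python
import pandas as pd

# Sample data using the column names from the question
df = pd.DataFrame({
    "ID": [1, 2],
    "Date": ["1 Jan", "2 Jan"],
    "Text": ["This is a text", "Text can be of variant length"],
})

result = (
    # Replace each string with a list of its whitespace-separated words
    df.assign(Text=df["Text"].str.split())
      # One row per word; ignore_index=True renumbers the rows 0..n-1
      .explode("Text", ignore_index=True)
)
print(result)
```

Using assign() leaves the original df untouched, so the unsplit text is still available if it is needed later.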