Split rows in a pandas DataFrame based on whitespace in cell entries



I have created the following DataFrame in Python using pandas.

import numpy as np
import pandas as pd

First we create a list:

A=["THIS IS A NEW WORLD   WE NEED A NEW PARADIGM: FOR THE NATION FOR THE PEOPLE",
"THIS IS A NEW WORLD ORDER;.  WE NEED A NEW PARADIGM-: FOR THE NATION FOR THE PEOPLE%",
"THIS IS A NEW WORLD?  WE NEED A NEW PARADIGM  FOR THE NATION FOR THE PEOPLE PRESENT."] 

Next we create a DataFrame:

df1=pd.DataFrame()
df1["A"]=A
df1["B"]=["A1", "A2", "A3"]

The DataFrame looks like this:

A                                                                                B
0  THIS IS A NEW WORLD   WE NEED A NEW PARADIGM: FOR THE NATION FOR THE PEOPLE             A1
1  THIS IS A NEW WORLD ORDER;.  WE NEED A NEW PARADIGM-: FOR THE NATION FOR THE PEOPLE%    A2
2  THIS IS A NEW WORLD?  WE NEED A NEW PARADIGM  FOR THE NATION FOR THE PEOPLE PRESENT.    A3

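In the DataFrame above, column A holds strings whose segments are separated by runs of spaces. How can I transform the DataFrame to produce the following DataFrame?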

A                        B
0   THIS IS A NEW WORLD                     A1
1   WE NEED A NEW PARADIGM:                 A1
2   FOR THE NATION FOR THE PEOPLE           A1
3   THIS IS A NEW WORLD ORDER;.             A2
4   WE NEED A NEW PARADIGM-:                A2
5   FOR THE NATION FOR THE PEOPLE%          A2
6   THIS IS A NEW WORLD?                    A3
7   WE NEED A NEW PARADIGM                  A3
8   FOR THE NATION FOR THE PEOPLE PRESENT.  A3

Could someone please take a look?

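If you need to split on 2 or more whitespace characters, pass the regular expression \s{2,} to Series.str.split, then use DataFrame.explode:

# raw string so that \s reaches the regex engine unchanged
df1['A'] = df1['A'].str.split(r'\s{2,}')
# one row per list element; the original index labels are repeated
df = df1.explode('A')
print(df)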
A   B
0                                THIS IS A NEW WORLD  A1
0  WE NEED A NEW PARADIGM: FOR THE NATION FOR THE...  A1
1                        THIS IS A NEW WORLD ORDER;.  A2
1  WE NEED A NEW PARADIGM-: FOR THE NATION FOR TH...  A2
2                               THIS IS A NEW WORLD?  A3
2                             WE NEED A NEW PARADIGM  A3
2             FOR THE NATION FOR THE PEOPLE PRESENT.  A3
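Note that rows 0 and 1 produce only two pieces each, because in the sample data a single space follows "PARADIGM:" and "PARADIGM-:". The sketch below reruns the approach above end to end and adds reset_index(drop=True) to get a consecutive 0..n-1 index like the one in the desired output. The second half is an assumption on my part and not something the question states: it treats a colon followed by whitespace as a segment boundary by widening that gap to two spaces before splitting. The names out, normalized and out_colon are only illustrative.

```python
import pandas as pd

A = ["THIS IS A NEW WORLD   WE NEED A NEW PARADIGM: FOR THE NATION FOR THE PEOPLE",
     "THIS IS A NEW WORLD ORDER;.  WE NEED A NEW PARADIGM-: FOR THE NATION FOR THE PEOPLE%",
     "THIS IS A NEW WORLD?  WE NEED A NEW PARADIGM  FOR THE NATION FOR THE PEOPLE PRESENT."]
df1 = pd.DataFrame({"A": A, "B": ["A1", "A2", "A3"]})

# Split on runs of 2+ whitespace characters, emit one row per piece,
# then replace the repeated index labels with a fresh 0..n-1 index.
out = (
    df1.assign(A=df1["A"].str.split(r"\s{2,}"))
       .explode("A")
       .reset_index(drop=True)
)
print(out)

# Assumption (not stated in the question): a colon followed by whitespace
# also ends a segment. Widen that gap to two spaces so the same split applies.
normalized = df1["A"].str.replace(r":\s+", ":  ", regex=True)
out_colon = (
    df1.assign(A=normalized.str.split(r"\s{2,}"))
       .explode("A")
       .reset_index(drop=True)
)
print(out_colon)
```

If the real data contains colons in the middle of a segment, the second variant would split there as well, so it is only worth using when the colon rule actually holds for your text.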
