Split a pandas column on a number followed by %



I have a DataFrame in which one column looks like this:

**Share**
We are safe 25%
We are always safe 12.50% (India Aus, West)
We are ok (USA, EU)
We are not OK
What is this
Always wise 25.66%

I want to split this column so that, where present, the % value is moved out of it into a new column. The output would be:

Share                  Percent    LOCATION
We are safe            25%  
We are always safe     12.50%     India Aus, West
We are ok                         USA, EU
We are not OK
What is this
Always wise            25.66%

I thought the following would split it from the right, but it doesn't work:

df['Percent'] = df['Share'].str.rsplit(r' \d',1).str[0]

You can extract these values by splitting on the whitespace that precedes a trailing percentage:

df[['Share','Percent']] = df['Share'].str.split(r'\s+(?=\d+(?:\.\d+)?%\s*$)',expand=True).fillna("")

Pandas test:

import pandas as pd
df = pd.DataFrame({'Share':['We are safe 25%','We are ok', 'We are always safe 12.50%']})
df[['Share','Percent']] = df['Share'].str.split(r'\s+(?=\d+(?:\.\d+)?%\s*$)',expand=True).fillna("")
>>> df
                Share Percent
0         We are safe     25%
1           We are ok        
2  We are always safe  12.50%

See the regex demo. Details:

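  • \s+ - one or more whitespace characters
  • (?=\d+(?:\.\d+)?%\s*$) - a positive lookahead that matches a position immediately followed by:
    • \d+ - one or more digits
    • (?:\.\d+)? - an optional sequence of . and one or more digits
    • % - a % sign
    • \s* - zero or more trailing whitespace characters, and
    • $ - end of string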
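
Because the lookahead ends with $, the split above only fires when the percentage is the last thing in the string. A row such as "We are always safe 12.50% (India Aus, West)" is therefore left unsplit. If the LOCATION column from the question is also needed, one possible approach is str.extract with named groups. This is only a sketch: the pattern and its group names are assumptions made for illustration, and it assumes the location always sits in a single pair of parentheses at the end of the string.

```python
import pandas as pd

df = pd.DataFrame({'Share': [
    'We are safe 25%',
    'We are always safe 12.50% (India Aus, West)',
    'We are ok (USA, EU)',
    'We are not OK',
    'What is this',
    'Always wise 25.66%',
]})

# Hypothetical pattern, not part of the answer above:
#   Share    - lazily matched leading text
#   Percent  - optional number followed by %, preceded by whitespace
#   LOCATION - optional text inside one trailing pair of parentheses
pattern = (
    r'^(?P<Share>.*?)'
    r'(?:\s+(?P<Percent>\d+(?:\.\d+)?%))?'
    r'(?:\s*\((?P<LOCATION>[^()]*)\))?'
    r'\s*$'
)

# Named groups become column names; groups that did not match are NaN,
# so fill them with empty strings to mirror the desired output.
result = df['Share'].str.extract(pattern).fillna('')
print(result)
```

The lazy .*? in the first group is what lets the optional groups claim the percentage and the location when they are present; when neither is present, the first group simply grows to cover the whole string.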
