I have a DataFrame in which one column looks like this:
**Share**
We are safe 25%
We are always safe 12.50% (India Aus, West)
We are ok (USA, EU)
We are not OK
What is this
Always wise 25.66%
I want to split this column so that, where applicable, the % value is moved out of it into a new column. The output would be:
| Share | Percent | LOCATION |
|---|---|---|
| We are safe | 25% | |
| We are always safe | 12.50% | India Aus, West |
| We are ok | | USA, EU |
| We are not OK | | |
| What is this | | |
| Always wise | 25.66% | |
I thought the following would split it from the right, but it does not work:
df['Percent'] = df['Share'].str.rsplit(r' \d',1).str[0]
You can split the percentage values off with a lookahead-based pattern:
df[['Share','Percent']] = df['Share'].str.split(r'\s+(?=\d+(?:\.\d+)?%\s*$)',expand=True).fillna("")
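As a side note on why the `rsplit` attempt from the question has no effect: as far as I know, `Series.str.rsplit` treats its pattern as a literal string rather than a regular expression, so it searches for the text ` \d` verbatim, and `.str[0]` would select the left-hand piece anyway. A minimal sketch, using a made-up two-row Series and passing `n` by keyword (recent pandas versions require that):

```python
import pandas as pd

# Hypothetical two-row sample, just for illustration.
s = pd.Series(['We are safe 25%', 'We are ok'])

# rsplit looks for the literal text " \d", finds none,
# and so returns every string whole as a one-element list.
unsplit = s.str.rsplit(r' \d', n=1).str[0]

print(unsplit.tolist())
```

Since nothing is split, `unsplit` holds the original strings unchanged, which is why a regex-aware method such as `str.split` is needed here.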
Pandas test:
import pandas as pd
df = pd.DataFrame({'Share':['We are safe 25%','We are ok', 'We are always safe 12.50%']})
df[['Share','Percent']] = df['Share'].str.split(r'\s+(?=\d+(?:\.\d+)?%\s*$)',expand=True).fillna("")
>>> df
                Share Percent
0         We are safe     25%
1           We are ok
2  We are always safe  12.50%
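To see how the same pattern behaves on the full sample column from the question, here is a small sketch. The names `pattern` and `parts` are only illustrative, and `n=1` is my addition to cap the result at two columns. Note that the lookahead requires the percentage to sit at the very end of the string, so the row with a trailing `(India Aus, West)` is left unsplit.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({'Share': [
    'We are safe 25%',
    'We are always safe 12.50% (India Aus, West)',
    'We are ok (USA, EU)',
    'We are not OK',
    'What is this',
    'Always wise 25.66%',
]})

# Split on whitespace that is followed by a percentage at the end of the string.
pattern = r'\s+(?=\d+(?:\.\d+)?%\s*$)'

# n=1 limits the split to one cut; rows without a match get NaN, replaced by ''.
parts = df['Share'].str.split(pattern, n=1, expand=True).fillna('')
parts.columns = ['Share', 'Percent']

print(parts)
```

Only the first and last rows end in a percentage, so only those two get a non-empty `Percent` value.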
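See the regex demo. Details:
- `\s+` - one or more whitespace characters
- `(?=\d+(?:\.\d+)?%\s*$)` - a positive lookahead that matches a location immediately followed by:
  - `\d+` - one or more digits
  - `(?:\.\d+)?` - an optional sequence of `.` and one or more digits
  - `%` - a `%` symbol
  - `\s*` - zero or more trailing whitespace characters
  - `$` - end of string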
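The desired output in the question also has a LOCATION column taken from the trailing parentheses, which the split above does not produce. One possible way to get all three columns is `Series.str.extract` with named groups. This is a different technique from the split shown above, and the pattern below is my own sketch, assuming the location always appears as a single parenthesised group at the end of the string.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({'Share': [
    'We are safe 25%',
    'We are always safe 12.50% (India Aus, West)',
    'We are ok (USA, EU)',
    'We are not OK',
    'What is this',
    'Always wise 25.66%',
]})

pattern = (
    r'^(?P<Share>.*?)'                      # shortest leading text
    r'(?:\s+(?P<Percent>\d+(?:\.\d+)?%))?'  # optional percentage
    r'(?:\s*\((?P<LOCATION>[^()]*)\))?'     # optional parenthesised location
    r'\s*$'                                 # trailing whitespace, end of string
)

# Named groups become column names; groups that did not take part are NaN.
out = df['Share'].str.extract(pattern).fillna('')

print(out)
```

The lazy `.*?` in the first group lets the two optional groups claim the percentage and the location whenever they are present; rows with neither keep their full text in `Share`.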