Accessing a substring in a DataFrame column to create a new column



I have a DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0,10,size=(5, 1)), columns=list('A'))
df.insert(0, 'n', ['this-text in presence 20-30%, and another string','id XDTV/HGF, publication',
'this-text, 37$degree','this-text K0.5, coefficient 0.007',' '])
>>> df
    n                                                 A
0   this-text in presence 20-30%, and another string  2
1   id XDTV/HGF, publication                          1
2   this-text, 37$degree                              4
3   this-text K0.5, coefficient 0.007                 1
4                                                     2

I want to create a new column:

>>> df
    new       A
0   this-text 2
1             1
2   this-text 4
3   this-text 1
4             2

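I could store column n in a list and check whether each item contains the substring this-text, but I'm wondering whether there is a better way. Any suggestions would be helpful.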

Try str.findall or str.extract:

df['new']=df.n.str.findall('this-text').str[0]
# alternative: df['new']=df.n.str.extract('(this-text)')[0]
df
Out[373]: 
                                                  n  A        new
0  this-text in presence 20-30%, and another string  7  this-text
1                          id XDTV/HGF, publication  4        NaN
2                              this-text, 37$degree  6  this-text
3                 this-text K0.5, coefficient 0.007  0  this-text
4                                                    7        NaN
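
Note that the output above has NaN where the question asked for an empty cell. A minimal sketch of two ways to get empty strings instead, assuming the A values shown in the question (they are random there, so they are hard-coded here for reproducibility) and a hypothetical second column name new_alt used only for comparison:

```python
import numpy as np
import pandas as pd

# Same strings as in the question; A is hard-coded instead of random.
df = pd.DataFrame({
    "n": [
        "this-text in presence 20-30%, and another string",
        "id XDTV/HGF, publication",
        "this-text, 37$degree",
        "this-text K0.5, coefficient 0.007",
        " ",
    ],
    "A": [2, 1, 4, 1, 2],
})

# Option 1: extract the capture group; expand=False returns a Series.
# Rows without a match come back as NaN, so replace them with "".
df["new"] = df["n"].str.extract(r"(this-text)", expand=False).fillna("")

# Option 2 (new_alt is an illustrative name): test for the literal
# substring and pick the value with np.where.
has_text = df["n"].str.contains("this-text", regex=False)
df["new_alt"] = np.where(has_text, "this-text", "")

print(df[["new", "new_alt", "A"]])
```

Both options should give the same column here. The second one avoids regular expressions entirely, which may be preferable when the substring contains regex metacharacters.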
