I'm working with a fairly messy DataFrame. It looks like this, but with 30 columns:
a                                                  b
some text (other text): 56.3% (text again: 40%)    again text (not same text): 33% (text text: 60.1%)
text (always text): 26.6% (aaand text: 80%)        still text (too much text): 86% (last text: 10%)
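You can try apply with a custom function that extracts the first percentage from each cell and sorts the cells of each row by it: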
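def concat(row):
    # Sort key: the first percentage found in each cell of the row, as a float
    keys = row.str.extract(r'(\d+\.?\d*)%')[0].astype(float).tolist()
    # Reorder the cells by that key, ascending
    row = [x for _, x in sorted(zip(keys, row.tolist()))]
    return ' '.join(row)

df['c'] = df.apply(concat, axis=1)
print(df)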
a b
0 some text (other text) : 56.3% (text again: 40%) again text (not same text) : 33% (text text: 6...
1 text (always text) : 26.6% (aaand text: 80%) still text (too much text) : 86% (last text: 10%)
a
0 some text (other text) : 56.3% (text again: 40%)
1 text (always text) : 26.6% (aaand text: 80%)
b
0 again text (not same text) : 33% (text text: 60.1%)
1 still text (too much text) : 86% (last text: 10%)
c
0 again text (not same text) : 33% (text text: 60.1%) some text (other text) : 56.3% (text again: 40%)
1 text (always text) : 26.6% (aaand text: 80%) still text (too much text) : 86% (last text: 10%)
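
For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same idea. It is a variant, not the exact code above: it rebuilds only the two sample columns shown in the question, and it sorts with a key function instead of zipping keys and cells. The names `PERCENT`, `first_percent` and `concat_sorted` are made up for illustration, and it assumes every cell is a string.

```python
import re

import pandas as pd

# Sample data: only the two columns visible in the question (the real frame has 30).
df = pd.DataFrame({
    "a": ["some text (other text): 56.3% (text again: 40%)",
          "text (always text): 26.6% (aaand text: 80%)"],
    "b": ["again text (not same text): 33% (text text: 60.1%)",
          "still text (too much text): 86% (last text: 10%)"],
})

# A number (optionally with a decimal part) immediately followed by '%'.
PERCENT = re.compile(r"(\d+(?:\.\d+)?)%")


def first_percent(cell: str) -> float:
    """Return the first percentage in a cell, or +inf if the cell has none."""
    match = PERCENT.search(cell)
    return float(match.group(1)) if match else float("inf")


def concat_sorted(row: pd.Series) -> str:
    # sorted() is stable, so cells with equal keys keep their column order;
    # cells without a percentage sort last because their key is +inf.
    return " ".join(sorted(row.tolist(), key=first_percent))


# Restrict to the source columns so a re-run does not feed 'c' back in.
df["c"] = df[["a", "b"]].apply(concat_sorted, axis=1)
print(df["c"].tolist())
```

Using a key function avoids comparing the cell text itself when two cells share the same percentage, and it gives cells with no percentage a defined position instead of a NaN key.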