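I have a column whose data looks like the rows below. From each row I need to extract the one word that follows "approved by", "approver is", "approval from", and so on, i.e. the first word/name after the keyword "approv". The match on "approv" should be case-insensitive.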
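row 1- incident 12345, issue is so and so, solution is so and so. Ticket was approved by admin
row 2-incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie
row 3-incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing
row 4-incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah
I tried \bApprov*\b([\w][A-Za-z]{4-7}) but it does not work.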
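A minimal sketch of two likely reasons that attempt fails, assuming it was run with Python's standard `re` module (the question does not say which engine was used). The variable names and the "approved by" anchor in the last pattern are illustrative only.

```python
import re

sample = "Ticket was approved by thors"

# 1) "v*" repeats only the letter "v", and the trailing \b demands a word
#    boundary right after it, so "approved" can never match.
attempt_prefix = re.search(r"\bApprov*\b", sample, re.IGNORECASE)

# 2) "{4-7}" is not a quantifier; Python's re treats it as the literal
#    text "{4-7}", which does not occur in the sample.
literal_braces = re.search(r"[A-Za-z]{4-7}", sample)

# 3) A comma gives the intended "4 to 7 letters" quantifier.
#    Anchoring on "approved by" is only for this demonstration.
fixed = re.search(r"approv\w*\s+by\s+([A-Za-z]{4,7})", sample, re.IGNORECASE)

print(attempt_prefix, literal_braces, fixed.group(1))
```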
This is very similar to your own attempt, so I hope it works for you. At least for this particular example, it returns the output you need:
# The third-party "regex" module is needed here: the standard "re" module
# rejects variable-length lookbehinds such as (?<=approv[A-Z\s:-]+).
import regex as re

string = """row 1- incident 12345, issue is so and so, solution is so and so.Ticket was approved by thors
row 2-incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie
row 3-incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing
row 4-incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah"""

for row in string.split("\n"):
    if row.startswith("row"):
        # Lookbehind: "approv" plus any run of letters, whitespace, ':' or '-';
        # match: the first run of 5 or more letters that follows it.
        m = re.search(r"(?i)(?<=approv[A-Z\s:-]+)[A-Z]{5,}", row)
        if m:
            print(m.group(0))
Output:
thors
Wanda
spiderman
ironman
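If installing the third-party `regex` module is not an option, a capture group can stand in for the lookbehind. The following is only a sketch with the standard `re` module: it assumes the connector word after "approv..." is one of by/is/from, as in the four sample rows, and the names `APPROVER_RE` and `first_word_after_approval` are made up for illustration.

```python
import re

# Assumption: only "by", "is" and "from" occur between the "approv..." word
# and the name; extend the alternation for other phrasings.
APPROVER_RE = re.compile(r"approv\w*\W+(?:by|is|from)\W+(\w+)", re.IGNORECASE)


def first_word_after_approval(row):
    """Return the first word after the approval phrase, or None if absent."""
    m = APPROVER_RE.search(row)
    return m.group(1) if m else None


rows = [
    "row 1- incident 12345, issue is so and so, solution is so and so.Ticket was approved by thors",
    "row 2-incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie",
    "row 3-incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing",
    "row 4-incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah",
]

for row in rows:
    print(first_word_after_approval(row))
```

Unlike the `{5,}` pattern above, this version does not depend on the name being at least five letters long, but it does depend on the connector list being complete.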
Do you want to do this in Python? If so, the code below may help.
Code:
rows = ['incident 12345, issue is so and so, solution is so and so.Ticket was approved by Thors'
        , 'incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie'
        , 'incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing'
        , 'incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah']

for row in rows:
    # Drop punctuation so that "by-" and "is :" become plain words.
    clean_row = row.translate({ord(x): None for x in ',.;:[]()-'})
    # Keep the text after the last "approv"; its tokens are then
    # [suffix, connector, name, ...], so index 2 is the name.
    split_row = clean_row.lower().split('approv')[-1].split()[2]
    print(split_row)
Output:
thors
wanda
spiderman
ironman
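The split-based approach raises an IndexError when fewer than three tokens follow "approv", and it silently returns an unrelated word when the keyword is missing. A hedged sketch of a guarded variant follows; the helper name `third_token_after_approv` and the choice to return None are assumptions for illustration, not part of the answer above.

```python
# Same punctuation set as in the answer above.
PUNCT = {ord(ch): None for ch in ",.;:[]()-"}


def third_token_after_approv(row):
    """Return the third token after the last 'approv', or None if unavailable."""
    cleaned = row.translate(PUNCT).lower()
    if "approv" not in cleaned:
        return None  # keyword missing: nothing to extract
    tokens = cleaned.split("approv")[-1].split()
    # tokens look like ["ed", "by", "ironman", ...]; index 2 is the name
    return tokens[2] if len(tokens) > 2 else None


print(third_token_after_approv("Ticket was approved by- ironman, blah blah"))
print(third_token_after_approv("Ticket was closed without review"))
```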
You can solve this with the following one-liner, which uses a replace callback.
txt.replace(/(?<=[a-z0-9]+)\s+[:-]/gi, x => x.trim()).match(/(?<=(approv)[a-z]+\s)[a-z\s:-]+/gi).join().split(' ')[1]
Explanation:
- I use
txt.replace(/(?<=[a-z0-9]+)\s+[:-]/gi, x => x.trim())
because the second input string (row 2-incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie) has an input inconsistency: there is an extra space between "approver is" and ":", and this call removes it.
- The regex
txt.match(/(?<=(approv)[a-z]+\s)[a-z\s:-]+/gi).join().split(' ')[1]
then captures the remaining words after the "approv*" keyword and picks the second one.
Code:
var ar = [`row 1- incident 12345, issue is so and so, solution is so and so.Ticket was approved by thors`,
`row 2-incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie`,
`row 3-incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing`,
`row 4-incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah`]
ar.forEach(txt => {
  console.log(txt.replace(/(?<=[a-z0-9]+)\s+[:-]/gi, x => x.trim()).match(/(?<=(approv)[a-z]+\s)[a-z\s:-]+/gi).join().split(' ')[1]);
})
Output:
thors
Wanda
spiderman
ironman
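For comparison with the Python answers, here is a rough port of the same two-step idea (normalize the stray space, then take the second word after the "approv..." word) using the standard `re` module. It is a sketch, not a line-for-line translation: the function name `second_word_after_approv` is hypothetical, and a capture group replaces the variable-length lookbehind, which Python's `re` does not accept.

```python
import re


def second_word_after_approv(txt):
    """Normalize 'is :' to 'is:', then return the second word after 'approv...'."""
    # Step 1: remove whitespace that sits between a letter/digit and ':' or '-'.
    normalized = re.sub(r"(?<=[a-z0-9])\s+(?=[:-])", "", txt, flags=re.IGNORECASE)
    # Step 2: capture the run of letters, spaces, ':' and '-' after "approv... ".
    m = re.search(r"approv[a-z]+\s([a-z\s:-]+)", normalized, flags=re.IGNORECASE)
    if not m:
        return None
    words = m.group(1).split()
    return words[1] if len(words) > 1 else None


print(second_word_after_approv(
    "row 2-incident 12900, issue is so and so, solution is so and so. "
    "approver is : Wanda Advocate worked julie"
))
```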