I am trying to extract specific parts of a file that looks like this:
name = "Account - UU ",
source = "1-account",
destination = "account-manager12",
other = 111111
name = "Account - PP,
source = "2-account",
destination = "account-manager1234",
other = 1212
name = "Account - GG ",
source = "3-account",
destination = "account-manager12345",
other = 44444
name = "Account - QQ,
source = "4-account",
destination = "account-manager123456"
other = 23232323
My expected output is:
name = "Account - UU" | source = "1-account" | destination = "account-manager12"
name = "Account - PP" | source = "2-account" | destination = "account-manager1234"
name = "Account - GG" | source = "3-account" | destination = "account-manager12345"
name = "Account - QQ" | source = "4-account" | destination = "account-manager123456"
Is there any way to achieve this with grep/awk commands? I would really appreciate any suggestions. Thanks.
See this running at https://ideone.com/0O8t3U:
#!/usr/bin/env bash
shopt -s extglob # enable extended globbing, of which @(one|two|three) is an example
output=""
while IFS= read -r line; do
  case $line in
    @(name|source|destination)" = "*) # "name = " or "source = " or "destination = "
      output+="${line%,} | " ;; # strip trailing comma before appending to output
    "") # matches only an empty line
      printf '%s\n' "${output%' | '}" # print our output, without the last " | "
      output="" # ...then reset that output to empty
      ;;
  esac
done
# finally, print anything that didn't have a blank line after it (last block of input)
[[ $output ]] && printf '%s\n' "${output%' | '}"
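The loop above needs bash with extglob and expects a blank line between records. As a rough sketch only, the same idea can be written in plain POSIX sh; the function name `join_blocks` and the inline sample (with a blank line added between the two records) are assumptions made for illustration, not part of the original answer.

```shell
#!/bin/sh
# join_blocks is a hypothetical helper name; it reads records from stdin.
# Assumption: records are separated by one blank line.
join_blocks() {
  output=""
  while IFS= read -r line; do
    case $line in
      "name = "*|"source = "*|"destination = "*)
        # drop a trailing comma, then append the column separator
        output="$output${line%,} | " ;;
      "")
        # a blank line ends the record: print it without the last " | "
        if [ -n "$output" ]; then printf '%s\n' "${output% | }"; fi
        output="" ;;
    esac
  done
  # last record, in case the input did not end with a blank line
  if [ -n "$output" ]; then printf '%s\n' "${output% | }"; fi
}

join_blocks <<'EOF'
name = "Account - UU ",
source = "1-account",
destination = "account-manager12",
other = 111111

name = "Account - PP,
source = "2-account",
destination = "account-manager1234",
other = 1212
EOF
```

Like the bash version, this sketch does not repair the missing closing quote or the trailing space inside the name value.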
If we really don't have to handle the missing quotes and trailing whitespace in $1:
$ awk -v RS= -F',?\n' -v OFS=' | ' '{print $1, $2, $3}' file
name = "Account - UU " | source = "1-account" | destination = "account-manager12"
name = "Account - PP | source = "2-account" | destination = "account-manager1234"
name = "Account - GG " | source = "3-account" | destination = "account-manager12345"
name = "Account - QQ | source = "4-account" | destination = "account-manager123456"
Or if we do:
$ awk -v RS= -F',?\n' -v OFS=' | ' '{sub(/ *"?$/,"\"",$1); print $1, $2, $3}' file
name = "Account - UU" | source = "1-account" | destination = "account-manager12"
name = "Account - PP" | source = "2-account" | destination = "account-manager1234"
name = "Account - GG" | source = "3-account" | destination = "account-manager12345"
name = "Account - QQ" | source = "4-account" | destination = "account-manager123456"
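Both awk commands above depend on two settings: an empty RS puts awk in paragraph mode, so each blank-line-separated block becomes one record, and -F',?\n' ends a field at a newline that may be preceded by a comma. The sketch below only prints the fields awk ends up with; the helper name `show_fields` and the inline sample (with a blank line between records) are assumptions for illustration.

```shell
#!/bin/sh
# show_fields is a hypothetical helper: it prints every field awk sees on stdin.
# Assumption: records in the input are separated by a blank line.
show_fields() {
  awk -v RS= -F',?\n' '{
    for (i = 1; i <= NF; i++)
      printf "record %d, field %d: [%s]\n", NR, i, $i
  }'
}

show_fields <<'EOF'
name = "Account - UU ",
source = "1-account",
destination = "account-manager12",
other = 111111

name = "Account - PP,
source = "2-account",
destination = "account-manager1234",
other = 1212
EOF
```

The brackets make the trailing space in the first name value and the missing closing quote in the second one easy to spot.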
Could you please try the following, written and tested with the shown samples in GNU awk.
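awk '
BEGIN{
  OFS=" | "
}
/^ *name/{                 # a new record starts: flush the previous one
  if(val){
    print val
    val=""
  }
  found=1
}
/^ *other/{                # stop collecting before the "other" line is appended
  found=""
}
found{
  gsub(/^ +|,$/,"")        # drop leading blanks and the trailing comma
  val=(val?val OFS:"")$0
}
END{
  if(val){
    print val
  }
}' Input_file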
Using a combination of two Perl one-liners and paste:
perl -lne 'print for /^\s*((?:name|source|destination)\s*=\s*[^,]*)/' input_file | paste - - - | perl -pe 's/\t/ | /g'
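The Perl one-liners use these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-n : Loop over the input one line at a time, assigning it to $_ by default.
-p : Loop over the input one line at a time, assigning it to $_ by default. Add print $_ after each loop iteration.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.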
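The first Perl one-liner prints all the groups captured in parentheses in this regex, which become the cells of the output table, 1 cell per line:
/^\s*((?:name|source|destination)\s*=\s*[^,]*)/ : Beginning of the line, followed by 0 or more whitespace characters, followed by a key = value pair, where the key is specified inside non-capturing parentheses (?:PATTERN). The value is a stretch of 0 or more non-comma characters ([^,]*).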
The second Perl one-liner uses the /g (multiple matches) regex modifier to replace every tab with " | ".
paste - - - : Joins every 3 lines of input with TABs and prints them as a single line.
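To see the row-building step on its own, here is a small sketch with placeholder cells. The helper name `cells_to_rows` is an assumption, and awk is swapped in for the second Perl one-liner purely for illustration; it performs the same tab-to-" | " replacement.

```shell
#!/bin/sh
# cells_to_rows is a hypothetical helper name; it reads one cell per line on stdin.
cells_to_rows() {
  paste - - - |                                   # join every 3 lines with TABs
    awk -F'\t' -v OFS=' | ' '{ $1 = $1; print }'  # rebuild each row with " | "
}

printf '%s\n' a b c d e f | cells_to_rows
# a | b | c
# d | e | f
```

Assigning $1 to itself forces awk to rebuild the record with the new output separator.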
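See also:
perldoc perlrun : How to execute the Perl interpreter: command line switches
perldoc perlre : Perl regular expressions (regexes)
perldoc perlre : Perl regular expressions (regexes): Quantifiers; Character Classes and other Special Escapes; Assertions; Capture groups
perldoc perlrequick : Perl regular expressions quick start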