How can I use the split function in awk to split a camelCase string into an array?
Input:
STRING="camelCasedExample"
Desired result:
WORDS[1]="camel"
WORDS[2]="Cased"
WORDS[3]="Example"
Failed attempt:
split(STRING, WORDS, /([a-z])([A-Z])/);
Wrong result:
WORDS[1]="came"
WORDS[2]="ase"
WORDS[3]="xample"
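You can't do this with split() alone, because split() discards whatever text the separator matches. That is why GNU awk has patsplit(), which matches the fields rather than the separators: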
$ awk 'BEGIN {
n = patsplit("camelCasedExample",words,/(^|[[:upper:]])[[:lower:]]+/)
# loop by index: the order of "for (i in words)" is not guaranteed
for ( i=1; i<=n; i++ ) print words[i]
}'
camel
Cased
Example
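To see why split() alone falls short, here is a minimal sketch that reruns the idea from the question's attempt (without the unneeded capture groups). The wrapper name show_split_loss is hypothetical and only exists so the snippet can be run as a unit. Every match of the separator regex covers one lowercase and one uppercase letter, and both are thrown away.

```shell
# Hypothetical wrapper: reruns the split() idea from the question.
show_split_loss() {
  awk 'BEGIN {
    s = "camelCasedExample"
    # Each separator match ("lC", "dE") is dropped from the result.
    n = split(s, words, /[a-z][A-Z]/)
    for (i = 1; i <= n; i++) print words[i]
  }'
}
show_split_loss
```

This prints came, ase and xample, the same wrong result shown in the question.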
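With your shown samples, please try the following. It was written and tested in GNU awk and should work in any awk. It creates an array named words whose values can be accessed from index 1, 2, 3 and so on. I am printing the array as output here; you can also use it later however you like.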
awk -F'=|"' -v s1="\"" '
{
gsub(/[A-Z]/,"\n&",$3)
val=(val?val ORS:"")$3
}
END{
num=split(val,words,ORS)
for(i=1;i<=num;i++){
if(words[i]!=""){
print "WORDS[" ++count "]=" s1 words[i] s1
}
}
}
' Input_file
Explanation: a detailed, commented version of the awk code above.
awk -F'=|"' -v s1="\"" '  ##Start the awk program; set the field separator to = or " and set s1 to a double quote.
{
gsub(/[A-Z]/,"\n&",$3)  ##Globally insert a newline before every capital letter in the 3rd field.
val=(val?val ORS:"") $3  ##Append $3 to val, joining successive values with ORS.
}
END{  ##Start the END block of the program.
num=split(val,words,ORS)  ##Split val into the array words, using ORS as the delimiter.
for(i=1;i<=num;i++){  ##Loop from 1 up to num.
if(words[i]!=""){  ##Only handle elements of words that are not empty.
print "WORDS[" ++count "]=" s1 words[i] s1  ##Print WORDS[, the incremented count, ]=, then words[i] wrapped in s1.
}
}
}
' Input_file  ##Input_file is the name of the input file.
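The same insert-a-separator-then-split idea can be sketched more compactly when the string is already in a shell variable rather than in a file. This is only an illustration, not the answer's own code: the function name camel_to_words is hypothetical, and SUBSEP is swapped in as the separator on the assumption that the control character it holds never occurs in the input.

```shell
# Hypothetical helper: split one camelCase string passed as $1.
camel_to_words() {
  awk -v s="$1" 'BEGIN {
    # Put SUBSEP (assumed absent from the input) in front of each capital.
    gsub(/[A-Z]/, SUBSEP "&", s)
    n = split(s, words, SUBSEP)
    for (i = 1; i <= n; i++)
      if (words[i] != "")   # skip the empty piece left by a leading capital
        printf "WORDS[%d]=\"%s\"\n", ++count, words[i]
  }'
}
camel_to_words camelCasedExample
```

For the sample string this prints WORDS[1]="camel", WORDS[2]="Cased" and WORDS[3]="Example", one per line.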
Here is an awk solution that works in any version of awk:
s='camelCasedExample'
awk '{
while (match($0, /(^|[[:upper:]])[[:lower:]]+/)) {
wrd = substr($0,RSTART,RLENGTH)
print wrd
# you can also store it in array
arr[++n] = wrd
$0 = substr($0,RSTART+RLENGTH)
}
}' <<< "$s"
camel
Cased
Example
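If the array is needed in more than one place, the match() loop above could be wrapped in a user-defined function. The sketch below is one possible shape: the names camel_words and camel_split are hypothetical, and plain ASCII ranges stand in for the POSIX character classes, which assumes ASCII input.

```shell
# Hypothetical wrapper around a reusable awk function.
camel_words() {
  awk '
    # Hypothetical helper: fills arr[1..n] with the words of str, returns n.
    function camel_split(str, arr,    n) {
      split("", arr)                       # portable way to empty arr
      while (match(str, /(^|[A-Z])[a-z]+/)) {
        arr[++n] = substr(str, RSTART, RLENGTH)
        str = substr(str, RSTART + RLENGTH)
      }
      return n + 0
    }
    {
      n = camel_split($0, w)
      for (i = 1; i <= n; i++) print i ": " w[i]
    }'
}
echo camelCasedExample | camel_words
```

Because str is a local copy inside the function, $0 is left untouched, unlike in the loop above.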
echo 'camelCasedExample' | mawk '{ for (_=(____=split($((_=_<_) * gsub("[>-[]", (___)"&")), __, ___) )^_; _<=____; _++) { print "","__["(_)"]",__[_] } }' OFS=' :: ' FS='^$' ___='2022'
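The one-liner above is deliberately terse. As far as I can tell it prefixes every capital letter with the marker string 2022 and then splits on that marker. Below is a more readable sketch of that reading, not a line-for-line translation: readable_split, sep and parts are hypothetical names, and the bracket range is narrowed to [A-Z], which assumes the input contains none of the other characters the original range covers.

```shell
# Hypothetical readable take on the one-liner above.
readable_split() {
  awk -v sep='2022' -v OFS=' :: ' '{
    gsub(/[A-Z]/, sep "&")          # mark each capital with the separator
    n = split($0, parts, sep)       # then split on that separator
    for (i = 1; i <= n; i++) print "", "parts[" i "]", parts[i]
  }'
}
echo 'camelCasedExample' | readable_split
```

Like the original, this relies on the marker string not appearing in the input itself.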