sed/awk complex line replacement



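I want to replace thousands of lines like the one below, but I am struggling to get it to work. I also have two variables, $time and $date, that should act as a condition so the replacement is not applied globally:

Example: <!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>

Replacement: <!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>NaN</v></row>

I tried sed: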

sed -i '<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>.*/<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>NaN</v></row>/' dump_teste.xml

sed: -e expression #1, char 1: unknown command: `<'

And awk:

awk '{gsub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1' tmp.txt
awk: cmd. line:1: {gsub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1:                                                     ^ syntax error
awk: cmd. line:1: {gsub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1:                                                                               ^ syntax error
awk: cmd. line:1: {gsub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1:                                                                                                                                         ^ syntax error
awk: cmd. line:1: {gsub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1:                                                                                                                                                      ^ syntax error
awk: cmd. line:1: {gsub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1:                                                                                                                                                                ^ unterminated string
awk: cmd. line:1: {gsub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1:                                                                                                                                                                ^ syntax error

awk '{sub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1' tmp.txt
awk: cmd. line:1: {sub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1:                                                    ^ syntax error
awk: cmd. line:1: {sub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1:                                                                              ^ syntax error
awk: cmd. line:1: {sub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1:                                                                                                                                        ^ syntax error
awk: cmd. line:1: {sub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1:                                                                                                                                                     ^ syntax error
awk: cmd. line:1: {sub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1:                                                                                                                                                               ^ unterminated string
awk: cmd. line:1: {sub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1:                                                                                                                                                               ^ syntax error

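The sed command you tried is missing the s (substitute) command, which is why it fails. The awk attempts fail for a related reason: the / characters inside the regex literal end it early.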

sed -i 's|<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>.*|<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>NaN</v></row>|g' dumpteste.xml

sed -i 's|<v>.*</v>|<v>NAN</v>|g' dumpteste.xml
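Both the pattern and the replacement contain / (in `</v>` and in `WEST / 1594050300`), so with / as the s delimiter sed would see too many fields. One way around this is a different delimiter such as |. A minimal sketch, assuming a throwaway temp file in place of dumpteste.xml (the file and the `result` variable are made up for illustration):

```shell
# Hypothetical sample file; the real target would be dumpteste.xml.
tmp=$(mktemp)
printf '%s\n' '<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>' > "$tmp"

# "|" is the s-command delimiter, so the "/" characters in the data need no escaping.
# [^<]* stops at the next "<", so only the number inside <v>...</v> is replaced.
result=$(sed 's|<v>[^<]*</v>|<v>NAN</v>|' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Leaving out -i while testing keeps the file untouched and prints the result instead.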

You have two variables, $date and $time, and want to match only the lines that contain them before applying sed. Do the following:

sed "\|$date $time .*</row>| s|<v>.*</v>|<v>NAN</v>|g" dumpteste.xml

With the command above, if the line is

<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
and the date and time variables are
date='2020-07-06' time='16:45:00'
then only the line containing that date and time will be edited by sed.

Did that solve your problem?
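To make the behaviour concrete, here is a small sketch of the variable-based match, assuming the line layout shown above. It uses `\|...|` as the address delimiter and | as the s delimiter so the / in `</row>` and `</v>` needs no escaping. The temp file and its second line are made up for illustration.

```shell
# Hypothetical two-line sample; only the first line has the wanted date and time.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-07-06 16:50:00 WEST / 1594050600 --> <row><v>4.0000000000e+00</v></row>
EOF

date='2020-07-06'
time='16:45:00'

# Double quotes let the shell expand $date and $time inside the sed script.
# \|regex| selects lines matching the regex; the s command runs only on those lines.
result=$(sed "\|$date $time .*</row>| s|<v>[^<]*</v>|<v>NAN</v>|" "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```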

Based on what you need, below is a command that replaces the numbers in the file with NAN for all lines within a time range, regardless of the order in which the lines appear.

Set the date, from and till variables, then run the command below:
while IFS= read -r in; do out="$(echo "$in" | awk '{print $2}')" && outtime="$(echo "$in" | awk '{print $3}')" && sed -i "/$out $outtime/ s|<v>.*</v>|<v>NAN</v>|" dumpteste.xml; done <<< "$(sort -k3 -k4 -k5 dumpteste.xml | awk -v date="$date" -v from="$from" -v till="$till" '$2 == date && $3 >= from && $3 <= till' | tac)"

Example for the command above

cat dumpteste.xml         #original file
<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-07-06 16:47:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-07-06 17:47:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-07-06 16:48:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-07-06 17:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-08-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>

date=2020-07-06
from=16:45:00
till=17:45:00
Output  
cat dumpteste.xml      #after change
<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>NAN</v></row>
<!-- 2020-07-06 16:47:00 WEST / 1594050300 --> <row><v>NAN</v></row>
<!-- 2020-07-06 17:47:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>NAN</v></row>
<!-- 2020-07-06 16:48:00 WEST / 1594050300 --> <row><v>NAN</v></row>
<!-- 2020-07-06 17:45:00 WEST / 1594050300 --> <row><v>NAN</v></row>
<!-- 2020-08-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>

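Note that for date 2020-07-06 and the time range 16:45:00-17:45:00, the lines with times 16:45, 16:47, 16:48 and 17:45 were changed. The line with time 16:45 but date 2020-08-06 was not changed, because the date does not match.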
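The range test works because awk compares `$3` with `from` and `till` as strings, and zero-padded HH:MM:SS values sort lexicographically in chronological order. A quick sketch of that check on its own; the helper name `in_range` is made up for illustration:

```shell
# in_range TIME FROM TILL: succeeds when FROM <= TIME <= TILL, compared as strings.
in_range() {
  awk -v t="$1" -v from="$2" -v till="$3" 'BEGIN { exit !(t >= from && t <= till) }'
}

for t in 16:45:00 16:47:00 17:45:00 17:47:00; do
  if in_range "$t" 16:45:00 17:45:00; then
    echo "$t in range"
  else
    echo "$t out of range"
  fi
done
```

Times that are not zero-padded (for example 9:05:00) would not compare correctly this way.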

Additionally, if you need a date range as well, define four variables (date, enddate, from, till) and run the following command:

date=2020-07-06
enddate=2020-08-06
from=16:45:00
till=17:45:00
while IFS= read -r in; do out="$(echo "$in" | awk '{print $2}')" && outtime="$(echo "$in" | awk '{print $3}')" && sed -i "/$out $outtime/ s|<v>.*</v>|<v>NAN</v>|" du*; done <<< "$(sort -k3 -k4 -k5 du* | awk -v date="$date" -v from="$from" -v till="$till" -v enddate="$enddate" '$2 >= date && $2 <= enddate && $3 >= from && $3 <= till' | tac)"

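The command above changes the values for the given date and time ranges. Hope this is enough?

Shorter version:

1) With a time range (the -i inplace option requires GNU awk 4.1 or later)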

date=2020-07-06 && from=16:45:00 && till=17:45:00 && gawk -i inplace -v date="$date" -v from="$from" -v till="$till" '$2 == date && $3 >= from && $3 <= till {gsub(/<v>[^<]*/, "<v>NAN")}1' dumpteste.xml

2) With a date and time range (this one prints the result to standard output instead of editing the file)

date=2020-07-06 && from=16:45:00 && till=17:45:00 && enddate=2020-08-06 && awk -v date="$date" -v from="$from" -v till="$till" -v enddate="$enddate" '$2 >= date && $2 <= enddate && $3 >= from && $3 <= till {gsub(/<v>[^<]*/, "<v>NAN")}1' dumpteste.xml
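If gawk's inplace extension is not available, one portable way to write the result back is to send awk's output to a temp file and move it over the original. A minimal sketch with any POSIX awk; the sample file here is a stand-in for dumpteste.xml and the variable names `file` and `out` are placeholders:

```shell
file=$(mktemp)   # stand-in for dumpteste.xml
cat > "$file" <<'EOF'
<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-07-06 17:47:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-08-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
EOF

date=2020-07-06 enddate=2020-08-06 from=16:45:00 till=17:45:00

out=$(mktemp)
awk -v date="$date" -v enddate="$enddate" -v from="$from" -v till="$till" '
  # Field 2 is the date and field 3 is the time; both are compared as strings.
  $2 >= date && $2 <= enddate && $3 >= from && $3 <= till {
    sub(/<v>[^<]*/, "<v>NAN")   # replace only the text between <v> and the next "<"
  }
  1                              # print every line, changed or not
' "$file" > "$out" && mv "$out" "$file"

cat "$file"
```

The mv only runs if awk succeeded, so a failed run leaves the original file as it was.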
