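Is there a way to delete everything before code=" and after "> in my file, so that I am left with only clearsky_night, cloudy, sun, etc.?
I have already tried grep -o -P '(?<=>).*(?=>)', but I get an error message: unknown option to `s'.
I also tried grep -o -P '(?<=code=").*(?=")', but that did not work either. This is the content of my file: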
<symbol id="Sun" number="1" code="clearsky_night"></symbol>
<symbol id="Sun" number="1" code="clearsky_night"></symbol>
<symbol id="Sun" number="1" code="clearsky_night"></symbol>
<symbol id="Cloud" number="4" code="cloudy"></symbol>
<symbol id="PartlyCloud" number="3" code="partlycloudy_night"></symbol>
<symbol id="Cloud" number="4" code="cloudy"></symbol>
<symbol id="PartlyCloud" number="3" code="partlycloudy_night"></symbol>
<symbol id="PartlyCloud" number="3" code="partlycloudy_night"></symbol>
<symbol id="PartlyCloud" number="3" code="partlycloudy_night"></symbol>
<symbol id="PartlyCloud" number="3" code="partlycloudy_night"></symbol>
<symbol id="Cloud" number="4" code="cloudy"></symbol>
<symbol id="PartlyCloud" number="3" code="partlycloudy_night"></symbol>
<symbol id="Cloud" number="4" code="cloudy"></symbol>
<symbol id="PartlyCloud" number="3" code="partlycloudy_night"></symbol>
<symbol id="Cloud" number="4" code="cloudy"></symbol>
<symbol id="Cloud" number="4" code="cloudy"></symbol>
<symbol id="PartlyCloud" number="3" code="partlycloudy_night"></symbol>
<symbol id="Cloud" number="4" code="cloudy"></symbol>
<symbol id="PartlyCloud" number="3" code="partlycloudy_night"></symbol>
<symbol id="Cloud" number="4" code="cloudy"></symbol>
<symbol id="Cloud" number="4" code="cloudy"></symbol>
<symbol id="Cloud" number="4" code="cloudy"></symbol>
<symbol id="PartlyCloud" number="3" code="partlycloudy_night"></symbol>
<symbol id="PartlyCloud" number="3" code="partlycloudy_night"></symbol>
<symbol id="LightCloud" number="2" code="fair_night"></symbol>
<symbol id="PartlyCloud" number="3" code="partlycloudy_night"></symbol>
<symbol id="Sun" number="1" code="clearsky_night"></symbol>
<symbol id="PartlyCloud" number="3" code="partlycloudy_night"></symbol>
<symbol id="LightCloud" number="2" code="fair_night"></symbol>
<symbol id="LightCloud" number="2" code="fair_night"></symbol>
<symbol id="LightCloud" number="2" code="fair_night"></symbol>
<symbol id="PartlyCloud" number="3" code="partlycloudy_night"></symbol>
<symbol id="LightCloud" number="2" code="fair_night"></symbol>
<symbol id="Cloud" number="4" code="cloudy"></symbol>
<symbol id="Cloud" number="4" code="cloudy"></symbol>
<symbol id="Cloud" number="4" code="cloudy"></symbol>
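How about this:
grep -o -P '(?<=code=").+?(?=")' input_file.xml
It uses the lookarounds (?<=...) (lookbehind) and (?=...) (lookahead), so with -o only the text between code=" and the next " is printed.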
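To see what the lookarounds and the non-greedy quantifier contribute, here is a minimal Python sketch using the same pattern with the standard re module. The first sample line comes from the question; the second line, where another attribute follows code=, is a hypothetical variant added only to show where greedy and non-greedy matching diverge.

```python
import re

# First line is taken from the question; the second is a hypothetical
# variant (not in the original file) with an attribute after code=.
line = '<symbol id="Sun" number="1" code="clearsky_night"></symbol>'
variant = '<symbol code="cloudy" id="Cloud"></symbol>'

# Same pattern as the grep -P command: lookbehind, lazy match, lookahead.
lazy = re.compile(r'(?<=code=").+?(?=")')
# Greedy version for comparison: runs up to the LAST double quote on the line.
greedy = re.compile(r'(?<=code=").+(?=")')

lazy_hits = [lazy.search(s).group(0) for s in (line, variant)]
greedy_hits = [greedy.search(s).group(0) for s in (line, variant)]

print(lazy_hits)    # ['clearsky_night', 'cloudy']
print(greedy_hits)  # ['clearsky_night', 'cloudy" id="Cloud']
```

On the file shown in the question, code= is always the last attribute, so both versions happen to give the same result; the non-greedy form is simply the safer choice if the attribute order ever changes.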
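Or use perl, my friend:
$ perl -pe 's:^.+code="(.+?)".+$:$1:' <input_file.xml
Explanation:
perl -pe : runs perl on each input line with the command given in the following string argument, then prints the line.
s:...:...: : substitution, with : as the delimiter.
"(.+?)" : the text inside the quotes, captured non-greedily (?) and referenced as $1 in the replacement.
^.+code=" : from the start of the line up to code="
".+$ : everything from the closing " to the end of the line.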
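The same substitution idea can be sketched in Python, with re.sub standing in for perl's s:...:...: operator. The function name extract_code is hypothetical, chosen only for this illustration. As with the perl one-liner, a line that does not match the pattern is passed through unchanged.

```python
import re

# Same regex as the perl one-liner; \1 plays the role of perl's $1.
PATTERN = re.compile(r'^.+code="(.+?)".+$')

def extract_code(line):
    """Replace the whole line with the captured code value (hypothetical helper)."""
    return PATTERN.sub(r'\1', line)

matched = extract_code('<symbol id="Cloud" number="4" code="cloudy"></symbol>')
unmatched = extract_code('no attribute here')

print(matched)    # cloudy
print(unmatched)  # no attribute here
```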
Of course, this is a quick and dirty solution. An XML parser would be better.
(Sorry for my bad English)
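As a rough illustration of the XML parser route, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It assumes the file is a bare list of <symbol> elements, as shown in the question, and wraps them in a made-up <root> element so the parser sees a single well-formed document. The function name codes_from_fragment and the shortened sample are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# A shortened sample in the same shape as the file from the question.
sample = '''<symbol id="Sun" number="1" code="clearsky_night"></symbol>
<symbol id="Cloud" number="4" code="cloudy"></symbol>
<symbol id="LightCloud" number="2" code="fair_night"></symbol>'''

def codes_from_fragment(fragment):
    """Return the code attribute of every <symbol> element (hypothetical helper)."""
    # The wrapper element is an assumption: the fragment has no single root.
    root = ET.fromstring('<root>' + fragment + '</root>')
    return [el.get('code') for el in root.iter('symbol')
            if el.get('code') is not None]

codes = codes_from_fragment(sample)
print('\n'.join(codes))
```

Unlike the regex approaches, this does not depend on attribute order or on each element sitting on its own line.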
Assuming you have valid XML, as per @cyrus's comment, you can use an XSLT transformation via xsltproc:
src.xslt
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />
<xsl:strip-space elements="*"/>
<xsl:template match="symbol">
<xsl:for-each select="@code">
<xsl:value-of select="concat(., '&#xa;')"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Transform your XML with xsltproc:
xsltproc src.xslt src.xml
Output:
clearsky_night
clearsky_night
clearsky_night
cloudy
partlycloudy_night
cloudy
partlycloudy_night
partlycloudy_night
partlycloudy_night
partlycloudy_night
cloudy
partlycloudy_night
cloudy
partlycloudy_night
cloudy
cloudy
partlycloudy_night
cloudy
partlycloudy_night
cloudy
cloudy
cloudy
partlycloudy_night
partlycloudy_night
fair_night
partlycloudy_night
clearsky_night
partlycloudy_night
fair_night
fair_night
fair_night
partlycloudy_night
fair_night
cloudy