grep: special characters and wildcard characters in a text file



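I am trying to get grep to honor the wildcards (.{64} and .{65}), the start-of-line character (^) and the end-of-line character ($), while treating everything in between in the text file literally.

Contents of foo.txt: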

^.{64}  /Users/1337/Test Hash Folder/^.$[E-frLOL[MAY[]{}()?<NUL>$
^.{64}  /Users/1337/Test Hash Folder/C$
^.{64}  /Users/1337/Test Hash Folder/C [remain]$
^.{64}  /Users/1337/Test Hash Folder/D 日本$
^.{65}  /Users/1337/Test Hash Folder/\\F::$

Contents of bar.txt:

\456f0958a5fd779fd12a0b383cd6384a9916782655f9298865e087630b7dffc1  /Users/1337/Test Hash Folder/\\\\F::
e7d616682023bf43930eb2c07590f259167b2b937097639975bf0838260be3f5  /Users/1337/Test Hash Folder/^.$[E-frLOL[MAY[]{}()?<NUL>
f978dda2d3be7e976ec25eee3a17f24a02af7386d163ae95c1fa48cdf75586a5  /Users/1337/Test Hash Folder/A
9f913e331f16e9bc5493a7c4c9480753351fd0098398e32c9b8d4870a63b65ea  /Users/1337/Test Hash Folder/B [LOL].dmg
14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C
14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C [remain]
791c3644922e627d46307b901017131ab06575bdcde708298e3a80f47af09d1f  /Users/1337/Test Hash Folder/D 日本

This is the command I ran:

grep -Ef foo.txt bar.txt

I expected it to output the following:

e7d616682023bf43930eb2c07590f259167b2b937097639975bf0838260be3f5  /Users/1337/Test Hash Folder/^.$[E-frLOL[MAY[]{}()?<NUL>
14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C
14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C [remain]
791c3644922e627d46307b901017131ab06575bdcde708298e3a80f47af09d1f  /Users/1337/Test Hash Folder/D 日本
\456f0958a5fd779fd12a0b383cd6384a9916782655f9298865e087630b7dffc1  /Users/1337/Test Hash Folder/\\\\F::

But it outputs this instead:

14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C
791c3644922e627d46307b901017131ab06575bdcde708298e3a80f47af09d1f  /Users/1337/Test Hash Folder/D 日本

Here is the list of my files, exactly as they are named:

\\F//
^.$[E-frLOL[MAY[]{}()?<NUL>
A
B [LOL].dmg
C
C [remain]
D 日本

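Is there a way to get grep to output what I need? If not, is there another tool (BBEdit, Notepad++, Text Mechanic, etc.) I could use to achieve the same result?


Edit:

The ...LOL[MAY... line has been changed to: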

#;."'&,\:`!*?$(){}[]<>|-=+% ~^^.$[E-frLOL[MAY[]{}()?<NUL>

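What I am going to do is use sed to escape all the offending characters, add the wildcards and anchors, feed foo.txt to grep, and then remove the escapes, wildcards, ^s and $s again.

So, here are the new contents of foo.txt: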

/Users/1337/Test Hash Folder/#;."'&,\:`!*?$(){}[]<>|-=+% ~^^.$[E-frLOL[MAY[]{}()?<NUL>
/Users/1337/Test Hash Folder/C
/Users/1337/Test Hash Folder/C [remain]
/Users/1337/Test Hash Folder/D 日本
/Users/1337/Test Hash Folder/\\F::

I will run these to escape the problematic characters:

sed 's/\\/\\\\/g' foo.txt > baz.txt
sed -i '' 's/\$/\\$/g' baz.txt
sed -i '' 's/\^/\\^/g' baz.txt

Which other characters do I need to escape? FYI, these only need to work for grep.

Next, I will use the following:

cat baz.txt | grep '\\\\' > backslashes.txt
cat baz.txt | grep -v '\\\\' > no_backslashes.txt
sed 's/^/^.{64}  /; s/$/$/' no_backslashes.txt > eggs.txt
sed 's/^/^.{65}  /; s/$/$/' backslashes.txt >> eggs.txt

Then I will run:

grep -Ef eggs.txt bar.txt

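After that, I will remove the ^.{64}, ^.{65} and $ (the $ only at the end of the line, so the filename records are not altered), and then the escaping backslashes from baz.txt.

If any of this is confusing, feel free to ask me to clarify.


Mac OS X Yosemite, bash 3.2.57(1)-release, grep (BSD grep) 2.5.1-FreeBSD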

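Your patterns have a couple of problems:

  1. Some characters that have a special meaning in regex are not escaped
  2. The repetition counts are incorrect

Proposed solution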

^.{64}  /Users/1337/Test Hash Folder/\^\.\$\[E-frLOL\[MAY\[\]\{\}\(\)\?<NUL>$
^.{64}  /Users/1337/Test Hash Folder/C$
^.{64}  /Users/1337/Test Hash Folder/C \[remain\]$
^.{64}  /Users/1337/Test Hash Folder/D 日本$
^[\].{64}  /Users/1337/Test Hash Folder/\\{4}F::$

Escape the character-class characters and quantifiers (^$[]?) and set the repetition counts correctly.
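The escaping can also be done in a single pass. A minimal sketch, assuming a POSIX sed; `escape_ere` is a hypothetical helper name, not something from the posts above. The bracket expression lists the ERE metacharacters (`]` has to come first, and a backslash is literal inside brackets), and `\\&` in the replacement writes a backslash followed by whatever character matched.

```shell
#!/bin/sh
# escape_ere: hypothetical helper; reads lines on stdin and prefixes every
# ERE metacharacter with a backslash so grep -E treats it literally.
escape_ere() {
    sed 's/[][\.*^$(){}?+|]/\\&/g'
}

printf '%s\n' 'C [remain]' | escape_ere
# prints: C \[remain\]
```

This only handles metacharacter escaping. Names that contain a backslash still need their own repetition count, because the checksum list writes each of those backslashes twice.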

This is huge, but here is what got it working for me:

Contents of foo.txt:

/Users/1337/Test Hash Folder/#;."'&,\:`!*?$(){}[]<>|-=+% ~^^.$[E-frLOL[MAY[]{}()?<NUL>
/Users/1337/Test Hash Folder/\\\\\\<-6 F 2->::
/Users/1337/Test Hash Folder/^.$[E-frLOL[MAY[]{}()?<NUL>
/Users/1337/Test Hash Folder/C
/Users/1337/Test Hash Folder/C [remain]
/Users/1337/Test Hash Folder/D 日本

Contents of bar.txt:

\d4c88a749dcb8d8c09fd8e08e044f66d8d1d9cf9d191c62989697813d26a6b55  /Users/1337/Test Hash Folder/#;."'&,\\:`!*?$(){}[]<>|-=+% ~^^.$[E-frLOL[MAY[]{}()?<NUL>
1ef8a2fb20e09e1866768d824289ffdbda12e565839fa7dda1fe12c4206e5759  /Users/1337/Test Hash Folder/@ss
\456f0958a5fd779fd12a0b383cd6384a9916782655f9298865e087630b7dffc1  /Users/1337/Test Hash Folder/\\\\\\\\\\\\<-6 F 2->::
e7d616682023bf43930eb2c07590f259167b2b937097639975bf0838260be3f5  /Users/1337/Test Hash Folder/^.$[E-frLOL[MAY[]{}()?<NUL>
d96b11b464adc478e458943d080b5efac2e887a18ef4ae785d21506de5594ddc  /Users/1337/Test Hash Folder/A
9f913e331f16e9bc5493a7c4c9480753351fd0098398e32c9b8d4870a63b65ea  /Users/1337/Test Hash Folder/B [LOL].dmg
14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C
14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C [remain]
791c3644922e627d46307b901017131ab06575bdcde708298e3a80f47af09d1f  /Users/1337/Test Hash Folder/D 日本

Commands run:

$ sed 's/\\/\\\\/g' foo.txt > baz.txt
$ sed -i '' 's/\$/\\$/g' baz.txt
$ sed -i '' 's/\^/\\^/g' baz.txt
$ sed -i '' 's/\*/\\*/g' baz.txt
$ sed -i '' 's/?/\\?/g' baz.txt
$ sed -i '' 's/\[/\\[/g' baz.txt
$ sed -i '' 's/]/\\]/g' baz.txt
$ sed -i '' 's/(/\\(/g' baz.txt
$ sed -i '' 's/)/\\)/g' baz.txt
$ sed -i '' 's/}/\\}/g' baz.txt
$ sed -i '' 's/{/\\{/g' baz.txt
$ sed -i '' 's/\\\\/\\\\{2}/g' baz.txt
$ cat baz.txt | grep '\\\\' > backslashes.txt
$ cat baz.txt | grep -v '\\\\' > no_backslashes.txt
$ sed 's/^/^.{64}  /; s/$/$/' no_backslashes.txt > qux.txt
$ sed 's/^/^.{65}  /; s/$/$/' backslashes.txt >> qux.txt
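The chain above can be collapsed into one pass. A rough sketch using awk instead of the sed calls, not the commands actually run; `build_patterns` is a hypothetical name. It assumes what the listings here rely on: the checksum is 64 characters followed by two spaces, and a name containing a backslash gets one extra leading character on its line and has each backslash written twice.

```shell
#!/bin/sh
# build_patterns: hypothetical helper; reads raw paths on stdin and prints
# one anchored ERE per path, ready for grep -E -f.
build_patterns() {
    awk '{
        out = ""; has_bs = 0
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            if (c == "\\") {
                # each backslash appears doubled in the checksum list
                out = out "\\\\{2}"; has_bs = 1
            } else if (index("][.*^$(){}?+|", c) > 0) {
                # escape ERE metacharacters
                out = out "\\" c
            } else {
                out = out c
            }
        }
        # 64 characters of hash, plus one leading backslash
        # when the name contains a backslash
        print "^.{" (has_bs ? 65 : 64) "}  " out "$"
    }'
}

printf '%s\n' 'C [remain]' | build_patterns
# prints: ^.{64}  C \[remain\]$
```

With the real files this would be `build_patterns < foo.txt > qux.txt`. Unlike the sed chain it also escapes `.`, `+` and `|`.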

Contents of baz.txt:

/Users/1337/Test Hash Folder/#;."'&,\\{2}:`!\*\?\$\(\)\{\}\[\]<>|-=+% ~\^\^.\$\[E-frLOL\[MAY\[\]\{\}\(\)\?<NUL>
/Users/1337/Test Hash Folder/\\{2}\\{2}\\{2}\\{2}\\{2}\\{2}<-6 F 2->::
/Users/1337/Test Hash Folder/\^.\$\[E-frLOL\[MAY\[\]\{\}\(\)\?<NUL>
/Users/1337/Test Hash Folder/C
/Users/1337/Test Hash Folder/C \[remain\]
/Users/1337/Test Hash Folder/D 日本

Contents of backslashes.txt:

/Users/1337/Test Hash Folder/#;."'&,\\{2}:`!\*\?\$\(\)\{\}\[\]<>|-=+% ~\^\^.\$\[E-frLOL\[MAY\[\]\{\}\(\)\?<NUL>
/Users/1337/Test Hash Folder/\\{2}\\{2}\\{2}\\{2}\\{2}\\{2}<-6 F 2->::

Contents of no_backslashes.txt:

/Users/1337/Test Hash Folder/\^.\$\[E-frLOL\[MAY\[\]\{\}\(\)\?<NUL>
/Users/1337/Test Hash Folder/C
/Users/1337/Test Hash Folder/C \[remain\]
/Users/1337/Test Hash Folder/D 日本

Contents of qux.txt:

^.{64}  /Users/1337/Test Hash Folder/\^.\$\[E-frLOL\[MAY\[\]\{\}\(\)\?<NUL>$
^.{64}  /Users/1337/Test Hash Folder/C$
^.{64}  /Users/1337/Test Hash Folder/C \[remain\]$
^.{64}  /Users/1337/Test Hash Folder/D 日本$
^.{65}  /Users/1337/Test Hash Folder/#;."'&,\\{2}:`!\*\?\$\(\)\{\}\[\]<>|-=+% ~\^\^.\$\[E-frLOL\[MAY\[\]\{\}\(\)\?<NUL>$
^.{65}  /Users/1337/Test Hash Folder/\\{2}\\{2}\\{2}\\{2}\\{2}\\{2}<-6 F 2->::$

And now, finally, the command we have all been waiting for:

grep -Ef qux.txt bar.txt

which outputs these lovely strings:

\d4c88a749dcb8d8c09fd8e08e044f66d8d1d9cf9d191c62989697813d26a6b55  /Users/1337/Test Hash Folder/#;."'&,\\:`!*?$(){}[]<>|-=+% ~^^.$[E-frLOL[MAY[]{}()?<NUL>
\456f0958a5fd779fd12a0b383cd6384a9916782655f9298865e087630b7dffc1  /Users/1337/Test Hash Folder/\\\\\\\\\\\\<-6 F 2->::
e7d616682023bf43930eb2c07590f259167b2b937097639975bf0838260be3f5  /Users/1337/Test Hash Folder/^.$[E-frLOL[MAY[]{}()?<NUL>
14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C
14e024fd9762abda9958b57ff95c7e515deccbd162eda2e338993ca32d6f0474  /Users/1337/Test Hash Folder/C [remain]
791c3644922e627d46307b901017131ab06575bdcde708298e3a80f47af09d1f  /Users/1337/Test Hash Folder/D 日本

Thanks, everyone, for the help!
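Since the question also asks about other approaches: the regex escaping can be sidestepped entirely by comparing the path literally. This is a different technique from the grep -E pipeline above, sketched here with awk under two assumptions: every checksum is 64 hex digits followed by two spaces, and a line whose name contains a backslash starts with one extra `\` and writes each backslash in the name twice (which is what the .{65} patterns compensate for). `match_literal` is a hypothetical name.

```shell
#!/bin/sh
# match_literal: hypothetical helper.
#   $1 = file with one raw path per line
#   $2 = checksum list ("<hash>  <path>" per line)
match_literal() {
    awk '
        FILENAME == ARGV[1] {
            # Store each wanted path with its backslashes doubled,
            # the way they appear in the checksum list.
            key = ""
            for (i = 1; i <= length($0); i++) {
                c = substr($0, i, 1)
                key = key (c == "\\" ? "\\\\" : c)
            }
            want[key] = 1
            next
        }
        {
            line = $0
            # Drop the marker backslash that precedes the hash, if any.
            if (substr(line, 1, 1) == "\\") line = substr(line, 2)
            # Skip 64 hex digits and two spaces; the rest is the path.
            path = substr(line, 67)
            if (path in want) print
        }
    ' "$1" "$2"
}

# Usage with the files from this thread:
#   match_literal foo.txt bar.txt
```

Because the comparison is a plain string lookup, none of the characters in the file names need escaping, and a name such as C cannot accidentally match C [remain].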
