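I have a BibTeX file that was produced by merging several other .bib files. During the merge, all but one copy of each duplicated entry was commented out, so every duplicate looks like the example below. Some entries have 20 to 30 commented-out copies, which makes a file with 100 references about 30k lines long.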
@Article{goodnight2005,
author = {Goodnight, N. and Wang, R. and Humphreys, G.},
journal = {{IEEE Computer Graphics and Applications}},
title = {{Computation on programmable graphics hardware}},
year = {2005},
volume = {25},
number = {5},
pages = {12-15}
}
###Article{goodnight2005,
author = {Goodnight, N. and Wang, R. and Humphreys, G.},
journal = {{IEEE Computer Graphics and Applications}},
title = {{Computation on programmable graphics hardware}},
year = {2005},
volume = {25},
number = {5},
pages = {12-15}
}
@INPROCEEDINGS{Llosa-pact96,
author = {Josep Llosa and Antonio González and Eduard Ayguadé and Mateo Valero},
title = {Swing Modulo Scheduling: A Lifetime-Sensitive Approach},
booktitle = {In IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques (PACT'96},
year = {1996},
pages = {80--86}
}
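How can I delete everything from a line starting with ### up to, but not including, the next line starting with @? In other words, the resulting file should be: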
@Article{goodnight2005,
author = {Goodnight, N. and Wang, R. and Humphreys, G.},
journal = {{IEEE Computer Graphics and Applications}},
title = {{Computation on programmable graphics hardware}},
year = {2005},
volume = {25},
number = {5},
pages = {12-15}
}
@INPROCEEDINGS{Llosa-pact96,
author = {Josep Llosa and Antonio González and Eduard Ayguadé and Mateo Valero},
title = {Swing Modulo Scheduling: A Lifetime-Sensitive Approach},
booktitle = {In IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques (PACT'96},
year = {1996},
pages = {80--86}
}
For example, sed '/###/,/@/{//!d}' bibliography.bib keeps the lines starting with ###, while sed '/###/,/@/d' bibliography.bib also removes the lines starting with @.
Thanks a lot for your help.
A simple solution using a $skip flag:
use strict;
use warnings;
my $skip = 0;
while ( <> ) {
    $skip = 1 if /^###/;   # a commented-out entry starts: begin skipping
    $skip = 0 if /^@/;     # a live entry starts: stop skipping
    next if $skip;
    print;
}
Output:
[hmcmillen]$ perl test.pl < test.txt
@Article{goodnight2005,
author = {Goodnight, N. and Wang, R. and Humphreys, G.},
journal = {{IEEE Computer Graphics and Applications}},
title = {{Computation on programmable graphics hardware}},
year = {2005},
volume = {25},
number = {5},
pages = {12-15}
}
@INPROCEEDINGS{Llosa-pact96,
author = {Josep Llosa and Antonio González and Eduard Ayguadé and Mateo Valero},
title = {Swing Modulo Scheduling: A Lifetime-Sensitive Approach},
booktitle = {In IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques (PACT'96},
year = {1996},
pages = {80--86}
}
If you really want it as a single command:
perl -ne 'BEGIN { $SKIP = 0 } $SKIP = 1 if /^###/; $SKIP = 0 if /^@/; print unless $SKIP;' < test.txt
Assuming your input files are all *.bib files in or below the current directory, let me be your find and perl magician for today:
find . -name '*.bib' -exec perl -i -ne '$o=1if/^@/;$o=0if/^###/;print if$o' {} \;
If you can't read it, don't use it. For example, it drops everything before the first @ line, and it does not handle indented @ or ### lines.
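The two caveats above (losing everything before the first @ line, and missing indented @ or ### lines) can be avoided with a small change to the flag logic. The following is only a sketch, written with awk inside a shell function rather than Perl; the function name strip_commented_entries is made up for illustration, and it assumes that indentation consists of spaces or tabs only.

```shell
#!/bin/sh
# Hypothetical helper, not part of the answer above: print a .bib file
# without its commented-out entries.
#   -v keep=1  starts in "printing" mode, so text before the first entry survives
#   [ \t]*     allows the @ and ### markers to be indented
strip_commented_entries() {
    awk -v keep=1 '
        /^[ \t]*###/ { keep = 0 }   # a commented-out entry starts: stop printing
        /^[ \t]*@/   { keep = 1 }   # a live entry starts: resume printing
        keep                        # print the line while keep is non-zero
    ' "$@"
}
```

It reads the files given as arguments, or standard input if there are none, and writes to standard output, for example: strip_commented_entries bibliography.bib > cleaned.bib (cleaned.bib is an arbitrary output name). Unlike perl -i, it does not modify the input file in place.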
There is also a nice module called File::Find; see perldoc File::Find. Personally, I don't think that would let it stay a one-liner.
With awk:
$ awk '/^###/{p=0} /^@/{p=1} p' bib.text
@Article{goodnight2005,
author = {Goodnight, N. and Wang, R. and Humphreys, G.},
journal = {{IEEE Computer Graphics and Applications}},
title = {{Computation on programmable graphics hardware}},
year = {2005},
volume = {25},
number = {5},
pages = {12-15}
}
@INPROCEEDINGS{Llosa-pact96,
author = {Josep Llosa and Antonio González and Eduard Ayguadé and Mateo Valero},
title = {Swing Modulo Scheduling: A Lifetime-Sensitive Approach},
booktitle = {In IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques (PACT'96},
year = {1996},
pages = {80--86}
}
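Since the question started from sed, the same result can probably be had there too. The sketch below is an assumption-laden variant, not something any answer above proposed: inside the range from a ### line to the next @ line it deletes every line except the one that starts with @. As far as I can tell, the first attempt in the question kept the ### lines because an empty // reuses the most recently applied regex, which on the first line of the range is /###/. The function name strip_with_sed is made up for illustration.

```shell
#!/bin/sh
# Sketch only: delete every line from a line starting with ### up to,
# but not including, the next line starting with @.
strip_with_sed() {
    # /^###/,/^@/  selects the range, closing @ line included
    # /^@/!d       deletes each line of the range that does not start with @
    sed '/^###/,/^@/{/^@/!d;}' "$@"
}
```

Usage would look like strip_with_sed bibliography.bib > cleaned.bib, where cleaned.bib is an arbitrary output name. If the file ends with a commented-out entry, the range simply runs to the end of the file and those lines are deleted as well.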