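I have 7000 files (sade1.pdbqt … sade7200.pdbqt). Only some of the files contain a second occurrence of the keyword TORSDOF. For a given file, if the keyword TORSDOF occurs a second time, I want to delete all lines after the first occurrence, while keeping the file name. Could somebody provide a sample snippet? Thank you very much.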
$ cat FileWith2ndOccurance.txt
ashu
vishu
jyoti
TORSDOF
Jatin
Vishal
Shivani
TORSDOF
Sushil
Kiran
After the function runs:
$ cat FileWith2ndOccurance.txt
ashu
vishu
jyoti
TORSDOF
EDIT1: A copy of an actual file:
REMARK Name = 17-DMAG.cdx
REMARK 8 active torsions:
REMARK status: ('A' for Active; 'I' for Inactive)
REMARK 1 A between atoms: C_1 and N_8
REMARK 2 A between atoms: N_8 and C_9
REMARK 3 A between atoms: C_9 and C_10
REMARK 4 A between atoms: C_10 and N_11
REMARK 5 A between atoms: C_15 and O_17
REMARK 6 A between atoms: C_25 and O_28
REMARK 7 A between atoms: C_27 and O_33
REMARK 8 A between atoms: O_28 and C_29
REMARK x y z vdW Elec q Type
REMARK _______ _______ _______ _____ _____ ______ ____
ROOT
ATOM 1 C UNL 1 7.579 11.905 0.000 0.00 0.00 +0.000 C
ATOM 2 C UNL 1 7.579 10.500 0.000 0.00 0.00 +0.000 C
ATOM 30 O UNL 1 8.796 8.398 0.000 0.00 0.00 +0.000 OA
ENDROOT
BRANCH 21 31
ATOM 31 O UNL 1 13.701 7.068 0.000 0.00 0.00 +0.000 OA
ATOM 32 C UNL 1 12.306 6.953 0.000 0.00 0.00 +0.000 C
ENDBRANCH 41 42
ENDBRANCH 19 41
TORSDOF 8
REMARK Name = 17-DMAG.cdx
REMARK 8 active torsions:
REMARK status: ('A' for Active; 'I' for Inactive)
REMARK 1 A between atoms: C_1 and N_8
REMARK 2 A between atoms: N_8 and C_9
REMARK x y z vdW Elec q Type
REMARK _______ _______ _______ _____ _____ ______ ____
ROOT
ATOM 1 CL UNL 1 0.000 11.656 0.000 0.00 0.00 +0.000 Cl
ENDROOT
TORSDOF 0
What I would do:
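#!/bin/bash
for file in sade*.pdbqt; do
    # Count the lines that start with TORSDOF
    count=$(grep -c '^TORSDOF' "$file")
    if ((count>1)); then
        # Print up to and including the first TORSDOF line, then stop;
        # the temporary file then replaces the original under the same name
        awk '/^TORSDOF/{print;exit}1' "$file" > /tmp/.torsdof &&
            mv /tmp/.torsdof "$file"
    fi
done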
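
Before running this over all the files, it may be worth trying the same logic on a throwaway copy of the sample from the question. The sketch below wraps the grep/awk steps in a helper function; the name truncate_after_first_torsdof and the demo_dir variable are made up for illustration. It also assumes you would rather write the result back with cat than with mv, so the original file keeps its permissions; that is a design choice, not a requirement.

```shell
#!/bin/bash
# Hypothetical helper (not part of any tool): cut "$1" off after its first
# TORSDOF line, but only when the keyword starts more than one line.
truncate_after_first_torsdof() {
    local file=$1 tmp
    # grep -c prints the number of matching lines (0 when there are none)
    if (( $(grep -c '^TORSDOF' "$file") > 1 )); then
        tmp=$(mktemp) || return 1
        # Print every line up to and including the first TORSDOF, then exit
        awk '/^TORSDOF/{print; exit} 1' "$file" > "$tmp" &&
            cat "$tmp" > "$file"   # overwrite in place: same name, same permissions
        rm -f "$tmp"
    fi
}

# Build the sample file from the question in a scratch directory
demo_dir=$(mktemp -d)
printf '%s\n' ashu vishu jyoti TORSDOF Jatin Vishal Shivani TORSDOF Sushil Kiran \
    > "$demo_dir/FileWith2ndOccurance.txt"

truncate_after_first_torsdof "$demo_dir/FileWith2ndOccurance.txt"
cat "$demo_dir/FileWith2ndOccurance.txt"   # prints ashu, vishu, jyoti, TORSDOF
```

Files with zero or one TORSDOF line are left untouched, so the function can be called on every sade*.pdbqt file in a loop.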