How to join multiple lines under the first occurrence of a header using Unix commands



I have a file like this:

Query=scaffold1_size75580
lcl|Os10t0535800-01
Query=scaffold1_size75580
lcl|Os10t0536000-02
Query=scaffold1_size75580
lcl|Os10t0536100-01
Query=scaffold1_size75580
lcl|Os10t0536400-01
Query=scaffold1_size75580
lcl|Os10t0536700-01
Query=scaffold1_size75580
lcl|Os10t0536700-01
Query=scaffold1_size75580
lcl|Os10t0536900-00
Query=scaffold1_size75580
lcl|Os10t0536700-01
Query=scaffold1_size75580
lcl|Os10t0536700-01
Query=scaffold2_size74975
lcl|Os11t0637501-00
Query=scaffold2_size74975
lcl|Os11t0637600-00
Query=scaffold2_size74975
lcl|Os11t0637800-01
Query=scaffold2_size74975
lcl|Os11t0637800-01
Query=scaffold2_size74975
lcl|Os11t0638200-00
Query=scaffold2_size74975
lcl|Os11t0638700-00
Query=scaffold2_size74975
lcl|Os11t0638700-00
Query=scaffold2_size74975
lcl|Os11t0638700-00
Query=scaffold2_size74975
lcl|Os11t0638700-00
Query=scaffold2_size74975
lcl|Os11t0638700-00
Query=scaffold2_size74975
lcl|Os11t0638900-01
Query=scaffold2_size74975
lcl|Os11t0638900-01
Query=scaffold3_size69500
lcl|Os06t0725100-01
Query=scaffold3_size69500
lcl|Os06t0724900-01
Query=scaffold3_size69500
lcl|Os06t0724900-01
Query=scaffold3_size69500
lcl|Os06t0724700-01
Query=scaffold3_size69500
lcl|Os06t0724700-01
Query=scaffold3_size69500
lcl|Os06t0724600-01
Query=scaffold3_size69500
lcl|Os06t0724100-02
Query=scaffold3_size69500
lcl|Os06t0724100-02
Query=scaffold3_size69500
lcl|Os06t0724100-02
Query=scaffold3_size69500
lcl|Os06t0724100-02
Query=scaffold4_size68019
lcl|Os01t0627550-00
Query=scaffold4_size68019
lcl|Os01t0626900-01
Query=scaffold4_size68019
lcl|Os01t0626400-01
Query=scaffold4_size68019
lcl|Os01t0626400-01
Query=scaffold4_size68019
lcl|Os01t0626400-01
Query=scaffold4_size68019
lcl|Os01t0626100-01
Query=scaffold4_size68019
lcl|Os01t0626100-01
Query=scaffold4_size68019
lcl|Os01t0626100-01
Query=scaffold4_size68019
lcl|Os01t0626032-01
Query=scaffold5_size66739
lcl|Os04t0653200-01
Query=scaffold5_size66739
lcl|Os04t0653400-01
Query=scaffold5_size66739
lcl|Os04t0653400-01
Query=scaffold5_size66739
lcl|Os04t0653600-01
Query=scaffold5_size66739
lcl|Os04t0654600-01
Query=scaffold5_size66739
lcl|Os04t0654600-01
Query=scaffold5_size66739
lcl|Os04t0654600-01
Query=scaffold5_size66739
lcl|Os04t0654600-01
Query=scaffold5_size66739
lcl|Os04t0654600-01
Query=scaffold5_size66739
lcl|Os04t0654600-01
Query=scaffold5_size66739
lcl|Os04t0654600-01
Query=scaffold5_size66739
lcl|Os04t0654600-01
Query=scaffold5_size66739
lcl|Os04t0654600-01
Query=scaffold6_size65486
lcl|Os01t0259900-00
Query=scaffold6_size65486
lcl|Os01t0259400-01
Query=scaffold6_size65486
lcl|Os01t0259400-01
Query=scaffold6_size65486
lcl|Os01t0259400-01
Query=scaffold6_size65486
lcl|Os01t0259400-01
Query=scaffold6_size65486
lcl|Os01t0259200-01
Query=scaffold7_size64123
lcl|Os04t0162100-01
Query=scaffold7_size64123
lcl|Os05t0325000-00
Query=scaffold7_size64123
lcl|Os05t0325000-00
Query=scaffold7_size64123
lcl|Os05t0325000-00
Query=scaffold7_size64123
lcl|Os05t0324600-01
Query=scaffold7_size64123
lcl|Os05t0324600-01

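and so on, up to around scaffold 66000. I want the repeated headers removed so that all the corresponding entries appear under a single header, i.e. I want this: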

Query=scaffold1_size75580
lcl|Os10t0535800-01
lcl|Os10t0536000-02
lcl|Os10t0536100-01
lcl|Os10t0536400-01
lcl|Os10t0536700-01
lcl|Os10t0536700-01
lcl|Os10t0536900-00
lcl|Os10t0536700-01
lcl|Os10t0536700-01
Query=scaffold2_size74975
lcl|Os11t0637501-00
lcl|Os11t0637600-00
lcl|Os11t0637800-01
lcl|Os11t0637800-01
lcl|Os11t0638200-00
lcl|Os11t0638700-00
lcl|Os11t0638700-00
lcl|Os11t0638700-00
lcl|Os11t0638700-00
lcl|Os11t0638700-00
lcl|Os11t0638900-01
lcl|Os11t0638900-01
Query=scaffold3_size69500
lcl|Os06t0725100-01
lcl|Os06t0724900-01
lcl|Os06t0724900-01
lcl|Os06t0724700-01
lcl|Os06t0724700-01
lcl|Os06t0724600-01
lcl|Os06t0724100-02
lcl|Os06t0724100-02
lcl|Os06t0724100-02
lcl|Os06t0724100-02

and so on. How can I do this?

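If you don't mind making multiple passes, I might suggest this regex search and replace. Each pass removes a repeated header only when it follows the entries of an identical header, so a header that repeats many times takes several passes to collapse fully:

Search:

^(Query=.*\n)((?:(?!Query=).*\n)+)\1

Replace:

\1\2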
