Extracting the PATH from a file name



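I need to copy more than 100,000 files named in a list (the list has the names without the .ABC/.DEF extensions!). Currently I run a while loop with find against the /opt/project/ directory to generate the full PATHs so that I can copy the files later.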

while IFS= read -r LINE; do find /opt/project/TOP3RST_0_/ -name "$LINE*"; done < TOP3RST_0_file.list > PATH_TOP3RST_0_file.list

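This is slow, because find walks the whole tree once per name. I wonder whether I can use awk, sed or something else to build the full PATHs directly from the file list. Being able to check that each file really exists would be a bonus.

From this: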

BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002
BT_SUPR_TOP3RST_0__20200716T005308_20200716T005352_0002
BT_SUPR_TOP3RST_0__20200716T005653_20200716T005748_0002
BT_SUPR_TOP3RST_0__20200716T005752_20200716T005824_0002
BT_SUPR_TOP3RST_0__20200716T010842_20200716T011051_0002

The expected output PATHs should look like this:

/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002/BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002.ABC
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002/BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002.DEF
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T005308_20200716T005352_0002/BT_SUPR_TOP3RST_0__20200716T005308_20200716T005352_0002.ABC
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T005308_20200716T005352_0002/BT_SUPR_TOP3RST_0__20200716T005308_20200716T005352_0002.DEF
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T005653_20200716T005748_0002/BT_SUPR_TOP3RST_0__20200716T005653_20200716T005748_0002.ABC
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T005653_20200716T005748_0002/BT_SUPR_TOP3RST_0__20200716T005653_20200716T005748_0002.DEF
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T005752_20200716T005824_0002/BT_SUPR_TOP3RST_0__20200716T005752_20200716T005824_0002.ABC
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T005752_20200716T005824_0002/BT_SUPR_TOP3RST_0__20200716T005752_20200716T005824_0002.DEF
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T010842_20200716T011051_0002/BT_SUPR_TOP3RST_0__20200716T010842_20200716T011051_0002.ABC
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T010842_20200716T011051_0002/BT_SUPR_TOP3RST_0__20200716T010842_20200716T011051_0002.DEF

Finally, I need to compute the time gap between the two timestamps in a name:

BT_SUPR_TOP3RST_0__20200716T003457_20200716T004736_0002.ABC

20200716T003457=2020-07-16 00:34:57

20200716T004736=2020-07-16 00:47:36

I guess something like datediff could compute the gap?
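
For the time gap, one possible approach that avoids an extra tool is to let date convert both stamps to epoch seconds and subtract them. The following is only a sketch under a few assumptions: bash and GNU date (for -d) are available, the stamps are treated as UTC, and "gap" means the difference between the two stamps inside one name. The function names to_epoch and gap_seconds are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: seconds between the two timestamps embedded in a file name.
# Assumes GNU date (-d) and treats the stamps as UTC.

# to_epoch STAMP: convert a stamp such as 20200716T003457 to epoch seconds.
to_epoch() {
    local s=$1
    date -u -d "${s:0:4}-${s:4:2}-${s:6:2} ${s:9:2}:${s:11:2}:${s:13:2}" +%s
}

# gap_seconds NAME: print (second stamp - first stamp) in seconds.
gap_seconds() {
    local re='__([0-9]{8}T[0-9]{6})_([0-9]{8}T[0-9]{6})'
    if ! [[ $1 =~ $re ]]; then
        printf 'no timestamps found in: %s\n' "$1" >&2
        return 1
    fi
    local start end
    start=$(to_epoch "${BASH_REMATCH[1]}") || return 1
    end=$(to_epoch "${BASH_REMATCH[2]}") || return 1
    printf '%d\n' "$(( end - start ))"
}

# 00:34:57 to 00:47:36 on the same day is 12 min 39 s.
gap_seconds 'BT_SUPR_TOP3RST_0__20200716T003457_20200716T004736_0002.ABC'   # prints 759
```

Working in epoch seconds keeps the arithmetic correct when the second stamp falls on the next day. Each call forks date twice, so for a very large list a single awk pass may be noticeably faster.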

The following sed line can get you started:

$ sed -E 's@.*__([0-9]{4})([0-9]{2})([0-9]{2}).*@/opt/project/TOP3RST_0_/\1/\2/\3/&/&@; s/.*/&.ABC\n&.DEF/' <<<'BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002'
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002/BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002.ABC
/opt/project/TOP3RST_0_/2020/07/16/BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002/BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002.DEF

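The first s command matches the whole line and captures the year, month and day in groups, which the replacement then reuses as backreferences to build the directory part of the path. The second s command turns that result into two lines, one per suffix. To practise regular expressions, I recommend the regex crosswords that are available online. An introductory sed tutorial is also well worth reading, although only the s command is used here. FAQ: & stands for the whole matched text, and the s command accepts any character as its delimiter, which is why @ can be used to avoid escaping the slashes in the path.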
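
To run the same substitution over the whole list and also check that each file exists, something along these lines might work. It is only a sketch: it assumes bash and GNU sed, and that the root directory contains no @ or & characters (they would clash with the sed expression). The function names make_paths and check_paths are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: build both candidate paths for each name and keep those that exist.
# Assumes GNU sed (-E, and \n in the replacement text).

# make_paths ROOT < names
# Prints ROOT/YYYY/MM/DD/NAME/NAME.ABC and the matching .DEF line for every
# name that contains "__YYYYMMDDT"; other lines are dropped.
make_paths() {
    local root=$1
    sed -E "/__[0-9]{8}T/!d
            s@.*__([0-9]{4})([0-9]{2})([0-9]{2}).*@${root}/\1/\2/\3/&/&@
            s/.*/&.ABC\n&.DEF/"
}

# check_paths < paths
# Existing paths go to stdout, missing ones are reported on stderr.
check_paths() {
    local path
    while IFS= read -r path; do
        if [[ -e $path ]]; then
            printf '%s\n' "$path"
        else
            printf 'missing: %s\n' "$path" >&2
        fi
    done
}

# Demo with a single name from the question (prints the two candidate paths).
make_paths /opt/project/TOP3RST_0_ <<<'BT_SUPR_TOP3RST_0__20200716T004902_20200716T005113_0002'
```

For the real list, the two functions could be chained as make_paths /opt/project/TOP3RST_0_ < TOP3RST_0_file.list | check_paths > PATH_TOP3RST_0_file.list 2> missing.list, where missing.list is a hypothetical file name. Since sed runs once and the existence test is a shell builtin, no process is started per file, unlike the find loop.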
