R - Replacing strings results in incorrect names



I would like to change several strings in a vector. In my case, I have the following object, all.images:

# Original character's list
all.images <-c("S2B2A_20171003_124_IndianaIIPR00911120170922_BOA_10.tif",             
"S2B2A_20181028_124_IndianaIIPR0065820181024_BOA_10.tif",              
"S2B2A_20170715_124_SantaMariaCalcasPR0033420170731_BOA_10.tif",       
"S2B2A_20180928_124_NSraAparecidaBortolettoPR0042720180912_BOA_10.tif",
"S2A2A_20170610_124_LagoaAmarelaPR0022020170619_BOA_10.tif",           
"S2A2A_20160705_124_AguaSumidaPR001320160629_BOA_10.tif",              
"S2A2A_20181023_124_SaoPedroGabrielGarciaPR001720181031_BOA_10.tif",   
"S2B2A_20180908_124_NSraAparecidaBortolettoPR001920180911_BOA_10.tif", 
"S2A2A_20180824_124_NSraAparecidaBortolettoPR0043320180911_BOA_10.tif",
"S2A2A_20170720_124_VoAnaPR001520170802_BOA_10.tif",                   
"S2B2A_20180322_124_SaoMateusPR0021920180314_BOA_10.tif",              
"S2A2A_20181212_124_NSradeFatimaJoaoBatistaPR002320181128_BOA_10.tif", 
"S2A2A_20180413_081_SantaFeSebastiaoFogacaPR0021920180427_BOA_10.tif", 
"S2B2A_20170913_124_PerdizesPR0034920170905_BOA_10.tif",               
"S2A2A_20170610_124_TresMeninasPR001820170601_BOA_10.tif",             
"S2B2A_20180428_081_SantaFeSebastiaoFogacaPR0021020180501_BOA_10.tif", 
"S2B2A_20180508_081_SantaFeSebastiaoFogacaPR0022320180427_BOA_10.tif", 
"S2A2A_20170809_124_VoAnaPR001620170803_BOA_10.tif",                   
"S2B2A_20180819_124_PontalIIPR0012220180801_BOA_10.tif",               
"S2B2A_20181214_081_NSradeFatimaJoaoBatistaPR002320181128_BOA_10.tif", 
"S2A2A_20180423_081_SantaFeSebastiaoFogacaPR0033920180427_BOA_10.tif", 
"S2A2A_20180814_124_PontalIIPR0012220180801_BOA_10.tif",               
"S2B2A_20170715_124_VoAnaPR0015A20170803_BOA_10.tif",                  
"S2A2A_20160615_124_AguaSumidaPR0011220160627_BOA_10.tif",            
"S2A2A_20170720_124_SantaMariaCalcasPR0022820170726_BOA_10.tif",       
"S2A2A_20180913_124_SantaMariaCalcasPR001620180829_BOA_10.tif",        
"S2B2A_20170804_124_NSraAparecidaBortolettoPR0035720170811_BOA_10.tif",
"S2A2A_20170809_124_SantaFeBaracatPR001920170801_BOA_10.tif",          
"S2B2A_20180322_124_NSradeFatimaGlebaAPR001320180403_BOA_10.tif",      
"S2B2A_20180508_081_SantaFeSebastiaoFogacaPR0021920180427_BOA_10.tif")
# 

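My idea is to: 1) remove S2B2A_ and _BOA_10.tif; 2) convert the 8 digits after S2B2A_ to a date (e.g. 2017-09-05); 3) move the three digits that follow the date to the end (e.g. 124, 081); and 4) split the remaining characters at the capital-letter code and the date (e.g. AguaSumidaPR0011220160627 to AguaSumida_PR00112_2016-06-27). But when I try: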

sub("^\\w+_(\\d+)_(\\d+)_([A-Za-z]+)([A-Z]{2}\\d{3})(\\d)(\\d{4})(\\d{2})(\\d+)_.*", 
"\\3_\\4_\\5_\\6-\\7-\\8_\\1_\\2", all.images)

[1] "IndianaII_PR009_1_1120-17-0922_20171003_124"             
[2] "IndianaII_PR006_5_8201-81-024_20181028_124"              
...
[28] "SantaFeBaracat_PR001_9_2017-08-01_20170809_124"          
[29] "NSradeFatimaGlebaA_PR001_3_2018-04-03_20180322_124"      
[30] "SantaFeSebastiaoFogaca_PR002_1_9201-80-427_20180508_081" 

I get wrong dates (e.g. 9201-80-427_20180508_081 in [30]). My desired output is:

[1] "IndianaII_PR009111_2017-09-22_2017-10-03_124"             
[2] "IndianaII_PR00658_2018-10-24_2018-10-28_124"              
...
[28] "SantaFeBaracat_PR0019_2017-08-01_2017-08-09_124"          
[29] "NSradeFatimaGlebaA_PR0013_2018-04-03_2018-03-22_124"      
[30] "SantaFeSebastiaoFogaca_PR00219_2018-04-27_2018-05-08_081"

Could you please help?
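A minimal sketch of why the dates come out wrong, assuming the ID after the two capital letters varies in length (four to six characters in the sample above). The pattern reserves exactly `\\d{3}` plus one more `\\d` for the ID, so the extra digits of any longer ID are pushed into the date groups. The vector `x` is a two-element excerpt of `all.images`, introduced here only for illustration.

```r
# Two file names from all.images: a 4-digit ID (0019) and a 6-digit ID (009111)
x <- c("S2A2A_20170809_124_SantaFeBaracatPR001920170801_BOA_10.tif",
       "S2B2A_20171003_124_IndianaIIPR00911120170922_BOA_10.tif")

# The pattern from the question: 3 digits + 1 digit are hard-coded for the ID
pat <- "^\\w+_(\\d+)_(\\d+)_([A-Za-z]+)([A-Z]{2}\\d{3})(\\d)(\\d{4})(\\d{2})(\\d+)_.*"
res <- sub(pat, "\\3_\\4_\\5_\\6-\\7-\\8_\\1_\\2", x)
print(res)

# With the 4-digit ID the date groups line up (2017-08-01).
# With the 6-digit ID the leftover "11" lands in the year group (1120-17-0922).
```

Any fix therefore has to let the ID take a variable number of characters and pin the date to the last eight digits before `_BOA`.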

I think this handles the exceptions raised in the comments on the other answer, using a lookahead:

sub("^\\w+_(\\d{4})(\\d{2})(\\d{2})_(\\d+)_([A-Za-z]+)([A-Z]{2}\\w+)(?=\\d{8})+(\\d{4})(\\d{2})(\\d+)_.*", 
"\\5_\\6_\\7-\\8-\\9_\\1-\\2-\\3_\\4", all.images, perl = TRUE)

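As a quick check of the lookahead approach, here is a small sketch run on three names from `all.images`, including the one whose ID contains a letter (`PR0015A`). The excerpt `x` and the helper names `out`, `parts`, and `dates_ok` are illustrative only. The pattern is the same as above except that the `+` after the lookahead is left out, on the assumption that it is not needed.

```r
# Excerpt of all.images: 6-digit ID, ID ending in a letter, and tile 081
x <- c("S2B2A_20171003_124_IndianaIIPR00911120170922_BOA_10.tif",
       "S2B2A_20170715_124_VoAnaPR0015A20170803_BOA_10.tif",
       "S2B2A_20180508_081_SantaFeSebastiaoFogacaPR0021920180427_BOA_10.tif")

# (?=\\d{8}) makes the greedy \\w+ stop where eight digits remain for the date
pat <- "^\\w+_(\\d{4})(\\d{2})(\\d{2})_(\\d+)_([A-Za-z]+)([A-Z]{2}\\w+)(?=\\d{8})(\\d{4})(\\d{2})(\\d+)_.*"
out <- sub(pat, "\\5_\\6_\\7-\\8-\\9_\\1-\\2-\\3_\\4", x, perl = TRUE)
print(out)

# Sanity check: fields 3 and 4 of every result should parse as dates
parts <- strsplit(out, "_", fixed = TRUE)
dates_ok <- vapply(parts,
                   function(p) !anyNA(as.Date(p[3:4], format = "%Y-%m-%d")),
                   logical(1))
stopifnot(all(dates_ok))
```

The `as.Date` check is a cheap way to spot misaligned groups across the full vector without reading every line of output.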