Replace or ignore multiple substrings with an Erlang/Perl regex



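I'm trying to use an Erlang regex to extract substrings from folder names (music album names with release years). I don't expect it to work for every folder name, but if it works for 90% of them, that is good enough. I need the album name and the release year. If there is a remaster year, I need that too. Basically, I want to exclude any special characters, such as brackets and parentheses, and strings such as "Remastered", "Live", "Recorded".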

The cases I want to handle so far are:

1985-An Album Title                        %return:  1985 An Album Title
1985-An Album Title (2003 Remastered)      %return:  1985 An Album Title 2003
An Album Title-1985                        %return:  An Album Title 1985
An Album Title 1985                        %this should be returned as is 
An Album Title                             %            "
1984                                       %            "
1985 An Album Title                        %            "

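My attempt starts by checking for a valid year format, but then I get stuck on the hyphen (-) after "1989". How can I ignore the hyphen or replace it with a space?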

test_regex() ->
    Str = "1989-Dr.Feelgood [2009, 2CD Deluxe Edition]",
    RegEx = "(^(?:19|20)\d{2})*  <--- What next?              %(?![-])D",
    case re:run(Str, RegEx, [{capture, first, list}]) of
        {match, Captured} -> io:format("Captured: ~p~n",[Captured]);
        nomatch -> io:format("no match ~n")
    end.

There is also a replace function, but I don't know how to use it correctly:

test_regex() ->
    Str = "1989-Dr.Feelgood [2009, 2CD Deluxe Edition]",
    RegEx = "(^(?:19|20)\d{2})*-.*",
    case re:replace(Str, RegEx, "s", [{return, list}]) of
        X -> io:format("X ~p ~n",[X])
    end.
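One detail worth noting about both attempts: inside an Erlang string literal, `"\d"` is the escape for the DEL character, so the regex escape has to be written as `"\\d"`. The following is a minimal sketch of splitting a leading year from the rest of the name at the hyphen. The module name `year_prefix` and the function `split/1` are made up for illustration, and the sketch assumes the name starts with a year followed directly by a hyphen.

```erlang
-module(year_prefix).
-export([split/1]).

%% Illustrative helper (not from the original post): split "1989-Title..."
%% into {Year, Rest}. The backslash is doubled because "\d" in an Erlang
%% string literal is the DEL character, not the regex digit class.
split(Str) ->
    RegEx = "^((?:19|20)\\d{2})-(.*)$",
    %% all_but_first returns only the two capture groups, not the whole match.
    case re:run(Str, RegEx, [{capture, all_but_first, list}]) of
        {match, [Year, Rest]} -> {Year, Rest};
        nomatch -> nomatch
    end.
```

Joining the two parts with a space (`Year ++ " " ++ Rest`) then gives the hyphen-free form.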

Erlang (without regex):

-module(a).
-compile(export_all).

clean(String) ->
    clean(String, _Result=[]).

clean([$-|T], Result) -> clean(T, [$ |Result]);  %%Replace hyphen with space
clean([H|T], Result) when H==$(;   %%Delete [,],(,)
                          H==$);
                          H==$[;
                          H==$] -> clean(T, Result);
clean([$ ,$R,$e,$m,$a,$s,$t,$e,$r,$e,$d | T], Result) ->
    clean(T, Result);
clean([$ ,$L,$i,$v,$e | T], Result) ->
    clean(T, Result);
clean([$ ,$R,$e,$c,$o,$r,$d,$e,$d | T], Result) ->
    clean(T, Result);
clean([H|T], Result) ->
    clean(T, [H|Result]);
clean([], Result) ->
    lists:reverse(Result).

test() ->
    "1985 An Album Title"      = clean("1985-An Album Title"),
    "1985 An Album Title 2003" = clean("1985-An Album Title (2003 Remastered)"),
    "An Album Title 1985"      = clean("An Album Title-1985"),
    "An Album Title 1985"      = clean("An Album Title 1985"),
    "An Album Title"           = clean("An Album Title"),
    "1984"                     = clean("1984"),
    "1985 An Album Title"      = clean("1985 An Album Title"),
    ok.

In the shell:

27> c(a).    
a.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,a}
28> a:test().
ok
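For comparison, the same three steps (hyphen to space, delete brackets, delete marker words) could probably be expressed with `re:replace/4` and the `global` option. This is only a sketch: the module name `album_re` is invented for illustration, and unlike the hand-written clauses above it removes "Remastered", "Live" and "Recorded" only when they appear as whole words.

```erlang
-module(album_re).
-export([clean/1]).

%% Regex-based variant of clean/1 (module name is illustrative only).
%% Each step is {Pattern, Replacement}; the steps are applied in order.
clean(String) ->
    Steps = [
        {"-", " "},                                  % hyphen -> space
        {"[\\[\\]()]", ""},                          % delete [, ], (, )
        {"\\s+(?:Remastered|Live|Recorded)\\b", ""}  % delete marker words
    ],
    lists:foldl(
        fun({Pattern, Replacement}, Acc) ->
            %% global replaces every occurrence, not just the first one
            re:replace(Acc, Pattern, Replacement, [global, {return, list}])
        end,
        String,
        Steps).
```

Keeping the patterns in a list makes it easy to add further words or characters to strip without touching the control flow.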

Here is a Perl answer:

use strict;
use warnings;
use 5.020;
use autodie;
use Data::Dumper;

sub clean {
    my $str = shift;
    $str =~ tr/-/ /;  #Replace hyphens with spaces
    $str =~ s/        #Delete content inside parentheses
        \s+
        \(
        ( [^)]* )
        \)
    //xms;
    #If parenthetical content found, extract a date:
    my $date = "";
    if (my $parens_content = $1) {
        #Test the match itself, so a stale $1 is never reused
        if ($parens_content =~ /(\d{4})/xms) {  #then found a date inside parens_content
            $date = " $1";
        }
    }
    "$str$date";
}

say clean("1985-An Album Title");
say clean("1985-An Album Title (2003 Remastered)");
say clean("An Album Title-1985");
say clean("An Album Title");
say clean("1984");
say clean("1985 An Album Title");

Output:

$ perl a.pl
1985 An Album Title
1985 An Album Title 2003
An Album Title 1985
An Album Title
1984
1985 An Album Title
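The same idea (drop the bracketed part, but keep a year found inside it) can be sketched in Erlang with `re:run/3`. The module name `album_year` and the helper `year_suffix/1` are invented for illustration. The sketch assumes at most one bracketed group, treats both `(...)` and `[...]` as brackets, and discards anything that follows the closing bracket.

```erlang
-module(album_year).
-export([clean/1]).

%% Sketch only: remove a "(...)" or "[...]" group from the name, but keep
%% a four-digit year (19xx or 20xx) found inside it.
clean(String) ->
    Str = re:replace(String, "-", " ", [global, {return, list}]),
    %% Group 1: text before the bracket; group 2: the bracket's content.
    Pattern = "^(.*?)\\s*[\\[(]([^\\])]*)[\\])]",
    case re:run(Str, Pattern, [{capture, all_but_first, list}]) of
        {match, [Title, Inside]} ->
            Title ++ year_suffix(Inside);
        nomatch ->
            Str
    end.

%% Returns " YYYY" if a year is present in the bracket content, else "".
year_suffix(Inside) ->
    case re:run(Inside, "(?:19|20)\\d{2}", [{capture, first, list}]) of
        {match, [Year]} -> " " ++ Year;
        nomatch -> ""
    end.
```

With this approach the example from the question, `"1989-Dr.Feelgood [2009, 2CD Deluxe Edition]"`, reduces to the title plus both years without having to list every word that may appear inside the brackets.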
