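I'm trying to use an Erlang regex to extract substrings from folder names (music album names with a release year). I don't expect it to work for every folder name, but if it works for 90% of them, that is good enough. I need the album name and the release year. If there is a remaster year, I need that too. Basically, I want to exclude any special characters such as CCD_ 1 and strings such as "Remastered", "Live", "Recorded".
So far, the cases I want to handle are: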
1985-An Album Title %return: 1985 An Album Title
1985-An Album Title (2003 Remastered) %return: 1985 An Album Title 2003
An Album Title-1985 %return: An Album Title 1985
An Album Title 1985 %this should be returned as is
An Album Title % "
1984 % "
1985 An Album Title % "
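My attempt starts by checking for a valid year format, but then I got stuck on the hyphen (-) that follows "1989". How can I ignore the hyphen or replace it with a space?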
test_regex() ->
    Str = "1989-Dr.Feelgood [2009, 2CD Deluxe Edition]",
    RegEx = "(^(?:19|20)\\d{2})* <--- What next? %(?![-])D",
    case re:run(Str, RegEx, [{capture, first, list}]) of
        {match, Captured} -> io:format("Captured: ~p~n",[Captured]);
        nomatch -> io:format("no match ~n")
    end.
There is also a replace function, but I don't know how to use it properly:
test_regex() ->
    Str = "1989-Dr.Feelgood [2009, 2CD Deluxe Edition]",
    RegEx = "(^(?:19|20)\\d{2})*-.*",
    case re:replace(Str, RegEx, "s", [{return, list}]) of
        X -> io:format("X ~p ~n",[X])
    end.
Erlang (without regex):
-module(a).
-compile(export_all).

clean(String) ->
    clean(String, _Result=[]).

clean([$-|T], Result) -> clean(T, [$ |Result]); %%Replace hyphen with space
clean([H|T], Result) when H==$(; %%Delete [,],(,)
                          H==$);
                          H==$[;
                          H==$] -> clean(T, Result);
clean([$ ,$R,$e,$m,$a,$s,$t,$e,$r,$e,$d | T], Result) ->
    clean(T, Result);
clean([$ ,$L,$i,$v,$e | T], Result) ->
    clean(T, Result);
clean([$ ,$R,$e,$c,$o,$r,$d,$e,$d | T], Result) ->
    clean(T, Result);
clean([H|T], Result) ->
    clean(T, [H|Result]);
clean([], Result) ->
    lists:reverse(Result).

test() ->
    "1985 An Album Title" = clean("1985-An Album Title"),
    "1985 An Album Title 2003" = clean("1985-An Album Title (2003 Remastered)"),
    "An Album Title 1985" = clean("An Album Title-1985"),
    "An Album Title 1985" = clean("An Album Title 1985"),
    "An Album Title" = clean("An Album Title"),
    "1984" = clean("1984"),
    "1985 An Album Title" = clean("1985 An Album Title"),
    ok.
In the shell:
27> c(a).
a.erl:2: Warning: export_all flag enabled - all functions will be exported
{ok,a}
28> a:test().
ok
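Since the question asked about the re module specifically, here is a rough regex-based sketch of the same cleanup. The module name clean_re is made up for this example, and the three-pass approach is just one way to do it, not necessarily the best one. Note that backslashes have to be doubled inside an Erlang string literal, because "\d" on its own is the escape for the DEL character rather than a regex digit class.

```erlang
-module(clean_re).
-export([clean/1]).

%% Sketch only: clean_re is a hypothetical module name.
clean(String) ->
    %% Pass 1: replace every hyphen with a space.
    S1 = re:replace(String, "-", " ", [global, {return, list}]),
    %% Pass 2: delete the characters ( ) [ ].
    %% "[()\\[\\]]" is the regex [()\[\]] once Erlang has read the string.
    S2 = re:replace(S1, "[()\\[\\]]", "", [global, {return, list}]),
    %% Pass 3: delete " Remastered", " Live" and " Recorded"
    %% (including the leading space, like the hand-written version).
    re:replace(S2, " (?:Remastered|Live|Recorded)", "", [global, {return, list}]).
```

The global option makes re:replace act on every match instead of only the first, and {return, list} gives back a plain string rather than an iolist. Because the passes run one after another, this may behave slightly differently from the single-pass function on unusual inputs, for example a bracket directly in front of "Live".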
Here is a Perl answer:
use strict;
use warnings;
use 5.020;
use autodie;
use Data::Dumper;
sub clean {
    my $str = shift;
    $str =~ tr/-/ /; #Replace hyphens with spaces
    $str =~ s/ #Delete content inside parentheses
        \s+
        \(
            ( [^)]* )
        \)
    //xms;
    #If parenthetical content found, extract a date:
    my $date = "";
    if(my $parens_content = $1) {
        if ($parens_content =~ /(\d{4})/xms) { #then found a date inside parens_content
            $date = " $1";
        }
    }
    "$str$date";
}
say clean("1985-An Album Title");
say clean("1985-An Album Title (2003 Remastered)");
say clean("An Album Title-1985");
say clean("An Album Title");
say clean("1984");
say clean("1985 An Album Title");
Output:
$ perl a.pl
1985 An Album Title
1985 An Album Title 2003
An Album Title 1985
An Album Title
1984
1985 An Album Title
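For comparison, the Perl idea (drop the whole parenthetical and keep only a four-digit year found inside it) could be sketched in Erlang roughly as follows. The names clean_year and year_suffix are hypothetical, and accepting square brackets as well as parentheses is my own assumption, added so that the sample string from the question is covered.

```erlang
-module(clean_year).
-export([clean/1]).

%% Sketch only: clean_year and year_suffix are hypothetical names.
clean(String) ->
    %% Replace every hyphen with a space.
    S = re:replace(String, "-", " ", [global, {return, list}]),
    %% Whitespace, then one (...) or [...] group; group 1 is its content.
    %% As a regex this reads: \s+[(\[]([^)\]]*)[)\]]
    Group = "\\s+[(\\[]([^)\\]]*)[)\\]]",
    case re:run(S, Group, [{capture, [1], list}]) of
        {match, [Inner]} ->
            %% Remove the first bracketed group, then append its year, if any.
            Title = re:replace(S, Group, "", [{return, list}]),
            Title ++ year_suffix(Inner);
        nomatch ->
            S
    end.

%% Returns " YYYY" for the first run of four digits in Inner, or "" if none.
year_suffix(Inner) ->
    case re:run(Inner, "\\d{4}", [{capture, first, list}]) of
        {match, [Year]} -> " " ++ Year;
        nomatch -> ""
    end.
```

With this sketch, clean_year:clean("1989-Dr.Feelgood [2009, 2CD Deluxe Edition]") should return "1989 Dr.Feelgood 2009". Only the first bracketed group is handled, and any four digits count as a year, so it is aimed at the "works for 90%" goal rather than at every possible folder name.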