PHP preg_split for new lines that start with uppercase letters and a custom character



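I need to parse the contents of an .ics file so that I can import the events into a MySQL database. In my case the event descriptions are long and wrap onto new lines, which is why I need a regular expression that detects a line break followed by uppercase letters and a colon (:). Here is a typical event: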

BEGIN:VEVENT
DTSTART:20181031T200000Z
DTSTAMP:20200507T084920Z
UID:Ical2b1c1757c3ad34668ba61907c0f0c280
CREATED:19000101T120000Z
DESCRIPTION:Lorem Ipsum is simply dummy text of the printing and typesettin
 g industry. Lorem Ipsum has been the industry standard dummy text ever sinc
 e the 1500s, when an unknown printer took a galley of type and scrambled i
 t to make a type specimen book. It has survived not only five centuries, b
 ut also the leap into electronic typesetting, remaining essentially unchan
 ged. It was popularised in the 1960s with the release of Letraset sheets co
 ntaining Lorem Ipsum passages, and more recently with desktop including ve
 rsions of Lorem Ipsum.
LAST-MODIFIED:20191119T090442Z
LOCATION:
SEQUENCE:0
STATUS:CONFIRMED
SUMMARY:Event 2
TRANSP:OPAQUE
END:VEVENT

This is my code. It returns an array that is split on every line, including the wrapped description lines.

$parsedEvent = preg_split('/\.\s*?(?=[A-Z])|(\r\n|\n|\r)/', trim($event));

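Here is a non-regex version. It splits the file into lines; if the first character of a line is a space, that line is a continuation of the previous entry. Otherwise, the line is split into a new entry and added to …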

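$values = [];
$currentKey = null;
// Split on any line ending: .ics files use CRLF, which PHP_EOL does not match on Linux.
foreach (preg_split('/\r\n|\r|\n/', trim($event)) as $line) {
    if ($line === '') {
        continue; // skip blank lines so $line[0] is always defined
    }
    if ($line[0] === ' ' || $line[0] === "\t") {
        // Continuation line: drop only the single leading whitespace
        // character and append the rest to the previous entry.
        if ($currentKey !== null) {
            $values[$currentKey] .= substr($line, 1);
        }
    } else {
        // New entry: split into name and value at the first colon only.
        $parts = explode(':', $line, 2);
        $currentKey = $parts[0];
        $values[$currentKey] = $parts[1] ?? '';
    }
}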

which gives...

Array
(
[BEGIN] => VEVENT
[DTSTART] => 20181031T200000Z
[DTSTAMP] => 20200507T084920Z
[UID] => Ical2b1c1757c3ad34668ba61907c0f0c280
[CREATED] => 19000101T120000Z
[DESCRIPTION] => Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop including versions of Lorem Ipsum.
[LAST-MODIFIED] => 20191119T090442Z
[LOCATION] => 
[SEQUENCE] => 0
[STATUS] => CONFIRMED
[SUMMARY] => Event 2
[TRANSP] => OPAQUE
[END] => VEVENT
)

Removing the line break together with the single space that follows it is what the specification describes:

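"Any sequence of CRLF followed immediately by a single linear white-space character is ignored (i.e., removed) when processing the content type."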
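
Since the question asked for a regular expression, the rule quoted above can also be applied directly with one. The following is only a minimal sketch, not part of the original answer: parseIcsEvent is a hypothetical helper name, and it assumes that $event holds a single VEVENT block and that each property name appears only once (a repeated name would overwrite the earlier value, and parameters such as ;TZID=... would stay part of the key).

```php
<?php
// Hypothetical helper for illustration; the name is not from the original post.
function parseIcsEvent(string $event): array
{
    // Unfold: remove every line break that is immediately followed by a
    // single space or tab. \R matches CRLF, LF, or CR, so files with
    // non-standard line endings are handled as well.
    $unfolded = preg_replace('/\R[ \t]/', '', trim($event));

    $values = [];
    foreach (preg_split('/\R/', $unfolded) as $line) {
        if ($line === '') {
            continue; // ignore blank lines
        }
        // Split at the first colon only, because values may contain colons.
        $parts = explode(':', $line, 2);
        $values[$parts[0]] = $parts[1] ?? '';
    }
    return $values;
}

// Shortened sample with one folded DESCRIPTION line.
$event = "BEGIN:VEVENT\r\n"
    . "DESCRIPTION:Lorem Ipsum is simply dummy text of the printing and typesettin\r\n"
    . " g industry.\r\n"
    . "SUMMARY:Event 2\r\n"
    . "END:VEVENT";

print_r(parseIcsEvent($event));
```

Unfolding the whole text first means the split afterwards only ever sees complete properties, so no lookahead for uppercase letters is needed. In the sample above, DESCRIPTION comes back as one string with "typesettin" and "g industry." joined.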
