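This is a slight variation on a common question: how do you split a string on whitespace unless the whitespace is inside a pair of quotes (" or ')? There are many questions like this here, but every answer I have found so far, including the best one, has the same problem: it includes the quotes in the match. For example:
"foo bar 'i went to a bar'".match(/[^\s"']+|"([^"]*)"|'([^']*)'/g);
results in: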
["foo", "bar", "'i went to a bar'"]
Is there a solution that instead results in:
["foo", "bar", "i went to a bar"]
Note that there is an edge case around the following:
"foo bar \"'Hi,' she said, 'how are you?'\"".match(...);
// => ["foo", "bar", "'Hi,' she said, 'how are you?'"]
That is, a substring should be able to contain its own quotes, which means that aggressively stripping every quote, like this, will not work:
"foo bar \"'Hi,' she said, 'how are you?'\"".match(...).map(function(string) {
  return string.replace(/'|"/g, '');
});
Update:
We can basically get it working with this:
"foo bar \"'Hi,' she said, 'how are you?'\"".match(/[^\s"']+|"([^"]*)"|'([^']*)'/g).map(function(string) {
  return string.replace(/^('|")|('|")$/g, '');
});
but that is ugly. (It would also break on an edge case such as "5ft 5feet 5'".)
Your regex is good enough. You just need to loop over the matches and pick the right capture group:
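var re = /'([^'\\]*(?:\\.[^'\\]*)*)'|"([^"\\]*(?:\\.[^"\\]*)*)"|[^\s"']+/g;
var arr = ['foo bar "\'Hi,\' she said, \'how are you?\'"',
  'foo bar \'i went to a bar\'',
  'foo bar \'"Hi," she said, "how are you?"\'',
  '\'"Hi," she \\\'said\\\', "how are you?"\''
];
for (var i = 0; i < arr.length; i++) {
  var m;
  var result = [];
  // exec() with the g flag resumes from re.lastIndex on every call
  while ((m = re.exec(arr[i])) !== null) {
    // guard against a zero-length match looping forever
    if (m.index === re.lastIndex)
      re.lastIndex++;
    // group 1: single-quoted body, group 2: double-quoted body, otherwise the bare token
    result.push(m[1] || m[2] || m[0]);
  }
  console.log(result);
}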
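As a rough sketch of the same idea in newer JavaScript, assuming an environment that has String.prototype.matchAll and the ?? operator. The function name splitKeepingQuoted is made up for illustration; the pattern is the one from the loop above.

```javascript
// Hypothetical helper name; the regex is the same pattern as in the loop above.
function splitKeepingQuoted(input) {
  // Alternatives: single-quoted body (group 1), double-quoted body (group 2), bare run.
  const re = /'([^'\\]*(?:\\.[^'\\]*)*)'|"([^"\\]*(?:\\.[^"\\]*)*)"|[^\s"']+/g;
  // ?? only falls through on undefined, so an empty capture ("") is kept.
  return Array.from(input.matchAll(re), (m) => m[1] ?? m[2] ?? m[0]);
}

console.log(splitKeepingQuoted("foo bar 'i went to a bar'"));
// ["foo", "bar", "i went to a bar"]
console.log(splitKeepingQuoted("foo bar \"'Hi,' she said, 'how are you?'\""));
// ["foo", "bar", "'Hi,' she said, 'how are you?'"]
console.log(splitKeepingQuoted("a '' b"));
// ["a", "", "b"]
```

One difference from the loop above: `m[1] || m[2] || m[0]` falls back to the full match when the quoted string is empty, so `''` comes back with its quotes, while `??` keeps the empty string.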
Quoted strings are always fun. You need to test for an even or odd number of escape characters to know when to terminate the string.
function quotedSplit(str) {
  let re = /'((?:(?:(?:\\\\)*\\')|[^'])*)'|"((?:(?:(?:\\\\)*\\")|[^"])*)"|(\w+)/g,
    arr = [],
    m;
  while (m = re.exec(str))
    arr.push(m[1] || m[2] || m[3]);
  return arr;
}
quotedSplit("fizz 'foo \\'bar\\'' buzz" + ' --- ' + 'fizz "foo \\"bar\\"" buzz');
// ["fizz", "foo \'bar\'", "buzz", "fizz", "foo \"bar\"", "buzz"] (raw token text; the escape backslashes are kept)
Here the first two capture groups find the quoted strings, and the third matches a "word".
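The even/odd escape test can also be done without a regex, by scanning one character at a time: a backslash always consumes the character after it, so a quote that follows an even number of backslashes is unescaped and ends the string. The following is only a sketch under some assumptions that are not in the answer above: the name scanTokens is made up, a backslash escapes any character (and is dropped from the output), an unterminated quote runs to the end of the input, and quoted text directly next to unquoted text is joined into one token.

```javascript
// Hypothetical helper; a character scanner rather than the regex approach above.
function scanTokens(input) {
  const tokens = [];
  let current = "";
  let quote = null;     // the quote character we are inside, or null
  let inToken = false;  // true once a token has started (so "" yields an empty token)

  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    if (ch === "\\" && i + 1 < input.length) {
      // Assumption: a backslash makes the next character literal.
      current += input[++i];
      inToken = true;
    } else if (quote !== null) {
      if (ch === quote) quote = null;  // unescaped closing quote
      else current += ch;              // the other quote type is plain text here
    } else if (ch === '"' || ch === "'") {
      quote = ch;
      inToken = true;
    } else if (/\s/.test(ch)) {
      // Whitespace outside quotes ends the current token, if any.
      if (inToken) tokens.push(current);
      current = "";
      inToken = false;
    } else {
      current += ch;
      inToken = true;
    }
  }
  if (inToken) tokens.push(current);
  return tokens;
}

console.log(scanTokens("fizz 'foo \\'bar\\'' buzz"));
// ["fizz", "foo 'bar'", "buzz"]
console.log(scanTokens("foo bar \"'Hi,' she said, 'how are you?'\""));
// ["foo", "bar", "'Hi,' she said, 'how are you?'"]
```

Unlike quotedSplit, this version removes the escape backslashes from the tokens and does not drop non-word characters such as `---`.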