How to get only the outermost angle-bracket nodes (algorithm)



Given this input: '<apple, <pear>>, <orange, <apple>, <cherry>, banana>, <banana, <pear>, <orange>>', and treating anything inside angle brackets as a node, the output should be: ['<apple, <pear>>', '<orange, <apple>, <cherry>, banana>', '<banana, <pear>, <orange>>']

I have tried a regex and splitting on it like this:

const regexBycomma = /([^,<]*(?:<[^>]*>[^,<]*)*),?/g;
str.split(regexBycomma);

That did not work well. I have also tried several algorithms like this one:

function getParentNodes(input) {
  const nodes = input.match(/<[^<>]+>/g); // Extract all nodes from the input
  const parentNodes = [];
  let currentParentNode = '';
  for (const node of nodes) {
    if (currentParentNode === '') {
      currentParentNode = node;
    } else if (node.startsWith(currentParentNode)) {
      currentParentNode += ',' + node;
    } else {
      parentNodes.push(currentParentNode);
      currentParentNode = node;
    }
  }

  if (currentParentNode !== '') {
    parentNodes.push(currentParentNode);
  }
  return parentNodes;
}
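For reference, a quick check of what the regex in that attempt actually extracts. Because `[^<>]+` cannot cross another bracket, the pattern only ever matches the innermost nodes, so the outer ones are lost before the loop starts. The variable names below are illustrative only.

```javascript
const input = '<apple, <pear>>, <orange, <apple>, <cherry>, banana>, <banana, <pear>, <orange>>';

// The character class [^<>] excludes both brackets, so a match can never
// contain a nested node: only innermost <...> pairs are found.
const innermost = input.match(/<[^<>]+>/g);

console.log(innermost);
// [ '<pear>', '<apple>', '<cherry>', '<pear>', '<orange>' ]
```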

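The idea here is to track how deep you are inside a node by incrementing the depth when you encounter a < and decrementing it when you encounter a >. Whenever the depth is 0 at a < or returns to 0 at a >, start a new string.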

const nodes_string = '<apple, <pear>>, <orange, <apple>, <cherry>, banana>, <banana, <pear>, <orange>>';
const nodes_string2 = 'doughnut, <apple, <pear>>, muffin, crumpet, <orange, <apple>, <cherry>, banana>, <banana, <pear>, <orange>>, pie';

// Bracketed nodes are kept as-is; text between them is split on commas,
// trimmed, and stripped of empty pieces.
const clean_text_nodes = (node) =>
  node.startsWith('<') ?
    node :
    node
      .split(',')
      .map( (str) => str.trim() )
      .filter( (str) => str !== '' );

const get_top_nodes = (nodes_string) => {
  let depth = 0;
  const result = [''];

  for(const char of nodes_string) {
    if(char === '<') {
      // A '<' at depth 0 opens a new top-level node.
      if(depth === 0) result.push('');
      depth++;
    }

    result[result.length - 1] += char;

    if(char === '>') {
      depth--;
      // Back at depth 0: the top-level node is complete.
      if(depth === 0) result.push('');
    }
  }

  return result.flatMap(clean_text_nodes);
};

console.log(get_top_nodes(nodes_string));
console.log(get_top_nodes(nodes_string2));

You could also just naively count the opening and closing brackets and push to the result whenever the two counts match:

const parse = str => {
  const result = [];
  let sub = '',
    opened = 0,
    closed = 0;
  for (const s of str) {
    if (s === '<') ++opened;
    else if (s === '>') ++closed;
    if (opened > 0) {
      sub += s;
      if (opened === closed) {
        opened = closed = 0;
        result.push(sub);
        sub = '';
      }
    }
  }
  return result;
}

console.log(parse('<apple, <pear>>, <orange, <apple>, <cherry>, banana>, <banana, <pear>, <orange>>'))
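Both approaches above assume the brackets are balanced. As a rough sketch only, the same depth-counting idea can be written with index slicing and an explicit check for unbalanced input. The name `getTopLevelNodes` and the error messages are hypothetical and do not come from either answer; like `parse`, this version drops any text outside the brackets.

```javascript
// Hypothetical helper (not from the answers above): depth counting with
// index slicing, plus a check that rejects unbalanced brackets.
function getTopLevelNodes(input) {
  const nodes = [];
  let depth = 0;
  let start = -1; // index of the '<' that opened the current top-level node

  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    if (ch === '<') {
      if (depth === 0) start = i; // entering a new top-level node
      depth++;
    } else if (ch === '>') {
      if (depth === 0) throw new Error(`Unmatched '>' at index ${i}`);
      depth--;
      if (depth === 0) nodes.push(input.slice(start, i + 1)); // node closed
    }
  }

  if (depth !== 0) throw new Error(`Unclosed '<' at index ${start}`);
  return nodes;
}

console.log(getTopLevelNodes('<apple, <pear>>, <orange, <apple>, <cherry>, banana>, <banana, <pear>, <orange>>'));
// [ '<apple, <pear>>', '<orange, <apple>, <cherry>, banana>', '<banana, <pear>, <orange>>' ]
```

Slicing by index avoids building each node one character at a time, and throwing on a mismatch makes malformed input visible instead of silently returning a partial result.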
