Parsing tags in a string



I am trying to parse a string that contains custom tags, like this:

[color value=0x000000]This house is [wave][color value=0xFF0000]haunted[/color][/wave]. 
I've heard about ghosts [shake]screaming[/shake] here after midnight.[/color]

I have already figured out which regular expressions to use:

/\[color value=(.*?)\](.*?)\[\/color\]/gs
/\[wave\](.*?)\[\/wave\]/gs
/\[shake\](.*?)\[\/shake\]/gs

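However, I need the correct range (startIndex, endIndex) of each of these groups in the resulting string so that I can apply them properly. This is where I am completely lost, because every time I replace a tag, the indices of everything after it can shift. It is especially hard with nested tags.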

So the input is the string

[color value=0x000000]This house is [wave][color value=0xFF0000]haunted[/color][/wave]. 
I've heard about ghosts [shake]screaming[/shake] here after midnight.[/color]

and as output I would like to get something like
Apply color 0x000000 from 0 to 75
Apply wave from 14 to 20
Apply color 0xFF0000 from 14 to 20
Apply shake from 46 to 51

Note that the indices refer to the resulting string, with the tags removed.

How should I parse this?

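Unfortunately I am not familiar with ActionScript, but the C# program below shows a solution using regular expressions. Rather than matching specific tags, I used one regular expression that matches any tag. And instead of trying to build a regular expression that matches a whole start tag and end tag including the text in between (which I do not think is possible with nested tags), I made the regular expression match only a single start tag or end tag. Some extra processing then pairs up the start and end tags and removes them from the string while keeping the essential information.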
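To illustrate why a whole-span pattern has trouble with nesting, here is a small TypeScript sketch (TypeScript is used only as a close ECMAScript relative of ActionScript; it is not ActionScript code). It runs the question's color pattern over the sample, assuming the two lines are joined by a single space. `[\s\S]*?` stands in for `.*?` under the `s` flag, and the names `nested`, `colorSpan`, and `spans` are made up for this example.

```typescript
// Sample from the question, with the line break assumed to be a single space.
const nested =
  "[color value=0x000000]This house is [wave][color value=0xFF0000]haunted[/color][/wave]. " +
  "I've heard about ghosts [shake]screaming[/shake] here after midnight.[/color]";

// Whole-span pattern from the question with the brackets escaped.
// [\s\S]*? plays the role of .*? with the `s` (dotall) flag.
const colorSpan = /\[color value=(.*?)\]([\s\S]*?)\[\/color\]/g;

// Collect the attribute value and inner text of every match.
const spans: { value: string; inner: string }[] = [];
let hit: RegExpExecArray | null;
while ((hit = colorSpan.exec(nested)) !== null) {
  spans.push({ value: hit[1], inner: hit[2] });
}

for (const s of spans) {
  console.log(s.value + " -> " + s.inner);
}
```

On this sample the loop finds a single match: the outer `0x000000` start tag is paired with the first `[/color]`, which belongs to the inner tag, and the inner color span is never reported on its own. That is the mispairing that matching individual tags avoids.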

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
class Program
{
   static void Main(string[] args)
   {
      string data = "[color value=0x000000]This house is [wave][color value=0xFF0000]haunted[/color][/wave]. " +
                    "I've heard about ghosts [shake]screaming[/shake] here after midnight.[/color]";
      ParsedData result = ParseData(data);
      foreach (TagInfo t in result.tags)
      {
         if (string.IsNullOrEmpty(t.attributeName))
         {
            Console.WriteLine("Apply {0} from {1} to {2}", t.name, t.start, t.start + t.length - 1);
         }
         else
         {
            Console.WriteLine("Apply {0} {1}={2} from {3} to {4}", t.name, t.attributeName, t.attributeValue, t.start, t.start + t.length - 1);
         }
         Console.WriteLine(result.data);
         Console.WriteLine("{0}{1}\n", new string(' ', t.start), new string('-', t.length));
      }
   }
   static ParsedData ParseData(string data)
   {
      List<TagInfo> tagList = new List<TagInfo>();
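      // Groups 1-4 capture a start tag (name, optional attribute name and value); group 5 captures an end tag such as "/color"
      Regex reTag = new Regex(@"\[(\w+)(\s+(\w+)\s*=\s*([^\]]+))?\]|\[(/\w+)\]");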
      Match m = reTag.Match(data);
      // Phase 1 - Collect all the start and end tags, noting their position in the original data string
      while (m.Success)
      {
         if (m.Groups[1].Success) // Matched a start tag
         {
            tagList.Add(new TagInfo()
            {
               name = m.Groups[1].Value,
               attributeName = m.Groups[3].Value,
               attributeValue = m.Groups[4].Value,
               tagLength = m.Groups[0].Length,
               start = m.Groups[0].Index
            });
         }
         else if (m.Groups[5].Success)
         {
            tagList.Add(new TagInfo()
            {
               name = m.Groups[5].Value,
               tagLength = m.Groups[0].Length,
               start = m.Groups[0].Index
            });
         }
         m = m.NextMatch();
      }
      // Phase 2 - match end tags to start tags
      List<TagInfo> unmatched = new List<TagInfo>();
      foreach (TagInfo t in tagList)
      {
         if (t.name.StartsWith("/"))
         {
            for (int i = unmatched.Count - 1; i >= 0; i--)
            {
               if (unmatched[i].name == t.name.Substring(1))
               {
                  t.otherEnd = unmatched[i];
                  unmatched[i].otherEnd = t;
                  unmatched.Remove(unmatched[i]);
                  break;
               }
            }
         }
         else
         {
            unmatched.Add(t);
         }
      }
      int subtractLength = 0;
      // Phase 3 - Remove tags from the string, updating start positions and calculating length in the process
      foreach (TagInfo t in tagList.ToArray())
      {
         t.start -= subtractLength;
         // If this is an end tag, calculate the length for the corresponding start tag
         // (when one was matched) and remove the end tag from the tag list.
         if (t.name.StartsWith("/"))
         {
            if (t.otherEnd != null)
               t.otherEnd.length = t.start - t.otherEnd.start;
            tagList.Remove(t);
         }
         // Keep track of how many characters in tags have been removed from the string so far
         subtractLength += t.tagLength;
      }
      return new ParsedData()
      {
         data = reTag.Replace(data, string.Empty),
         tags = tagList.ToArray()
      };
   }
   class TagInfo
   {
      public int start;
      public int length;
      public int tagLength;
      public string name;
      public string attributeName;
      public string attributeValue;
      public TagInfo otherEnd;
   }
   class ParsedData
   {
      public string data;
      public TagInfo[] tags;
   }
}

The output is:

Apply color value=0x000000 from 0 to 76
This house is haunted. I've heard about ghosts screaming here after midnight.
-----------------------------------------------------------------------------
Apply wave from 14 to 20
This house is haunted. I've heard about ghosts screaming here after midnight.
              -------
Apply color value=0xFF0000 from 14 to 20
This house is haunted. I've heard about ghosts screaming here after midnight.
              -------
Apply shake from 47 to 55
This house is haunted. I've heard about ghosts screaming here after midnight.
                                               ---------
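Since the question is about ActionScript, the same idea can also be sketched in TypeScript, a close ECMAScript relative (this is not ActionScript code, and not the C# program's exact algorithm). The sketch below is a minimal variant under stated assumptions: it removes the tags in a single pass and keeps a stack of open tags, so each range is recorded directly against the tag-free text. `TagRange` and `parseTags` are hypothetical names introduced for this example, and `end` is inclusive to match the output above.

```typescript
// One formatting range in the tag-free text; `end` is inclusive.
interface TagRange {
  name: string;
  value: string | null; // attribute value such as "0xFF0000", or null if the tag has none
  start: number;
  end: number;
}

// Hypothetical helper: strips [tag], [tag attr=value] and [/tag] markers and
// returns the plain text plus the range each tag pair covers in that text.
function parseTags(input: string): { text: string; ranges: TagRange[] } {
  // Group 1: start-tag name, group 2: optional attribute value, group 3: end-tag name.
  const tagPattern = /\[(?:(\w+)(?:\s+\w+\s*=\s*([^\]]+))?|\/(\w+))\]/g;
  const ranges: TagRange[] = [];
  const open: TagRange[] = []; // start tags still waiting for their end tag
  let text = "";
  let last = 0; // index in `input` just past the previous tag
  let m: RegExpExecArray | null;

  while ((m = tagPattern.exec(input)) !== null) {
    text += input.slice(last, m.index); // copy the plain text before this tag
    last = m.index + m[0].length;

    if (m[1] !== undefined) {
      // Start tag: the range begins at the current length of the stripped text.
      // An unclosed tag keeps end = start - 1, i.e. an empty range.
      const r: TagRange = {
        name: m[1],
        value: m[2] !== undefined ? m[2] : null,
        start: text.length,
        end: text.length - 1,
      };
      ranges.push(r);
      open.push(r);
    } else {
      // End tag: close the most recently opened tag with the same name.
      for (let i = open.length - 1; i >= 0; i--) {
        if (open[i].name === m[3]) {
          open[i].end = text.length - 1;
          open.splice(i, 1);
          break;
        }
      }
    }
  }
  text += input.slice(last); // trailing text after the final tag
  return { text, ranges };
}

const sample =
  "[color value=0x000000]This house is [wave][color value=0xFF0000]haunted[/color][/wave]. " +
  "I've heard about ghosts [shake]screaming[/shake] here after midnight.[/color]";

const parsed = parseTags(sample);
for (const r of parsed.ranges) {
  const label = r.value === null ? r.name : r.name + " " + r.value;
  console.log("Apply " + label + " from " + r.start + " to " + r.end);
}
```

Searching the stack from the top means an end tag closes the innermost open tag of that name, which is what keeps the two nested color tags apart.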

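Let me show you a parsing method that applies not only to the case above but to any string of this kind. The method is not limited to the tag names color, wave, and shake. Note that it returns each opening tag's name and value, not the index ranges.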

    private List<Tuple<string, string>> getVals(string input)
    {
        List<Tuple<string, string>> finals = new List<Tuple<string,string>>();
        // first parser
        var mts = Regex.Matches(input, @"\[[^\u005D]+\]");
        foreach (var mt in mts)
        {
            // has no value=
            if (!Regex.IsMatch(mt.ToString(), @"(?i)value[\n\r\t\s]*="))
            {
                // not closing tag
                if (!Regex.IsMatch(mt.ToString(), @"^\[[\n\r\t\s]*/"))
                {
                    try
                    {
                        finals.Add(new Tuple<string, string>(Regex.Replace(mt.ToString(), @"^\[|\]$", "").Trim(), ""));
                    }
                    catch (Exception es)
                    {
                        Console.WriteLine(es.ToString());
                    }
                }
            }
            // has value=
            else
            {
                try
                {
                    // Split "[color value=0x000000]" into the tag name part and the value part
                    var spls = Regex.Split(mt.ToString(), @"(?i)value[\n\r\t\s]*=");
                    finals.Add(new Tuple<string, string>(Regex.Replace(spls[0].ToString(), @"^\[", "").Trim(), Regex.Replace(spls[1].ToString(), @"\]$", "").Trim()));
                }
                catch (Exception es)
                {
                    Console.WriteLine(es.ToString());
                }
            }
        }
        return finals;
    }

I also have experience parsing JSON with a single regular expression. If you would like to see it, please visit my blog at www.mysplitter.com.
