Regex split string in VC++

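I am using VC++ 10 in a project. I just googled for C/C++, and it appears there is no regex in standard C++? VC++ 10 seems to have regex, though. But how do I do a regex split? Do I need Boost just for that?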

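Searching the web, I found that many people recommend Boost for many things: tokenizing/splitting strings, parsing (PEG), and now even regex (though this should be built in...). Can I conclude that Boost is a must-have? It's 180MB just for trivial things that many languages support out of the box?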

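The C++11 standard has std::regex. It is also included in TR1 for Visual Studio 2010. Actually, TR1 has been available since VS2008, where it is tucked away under the std::tr1 namespace. So you do not need Boost.Regex for VS2008 or later.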
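If the same source has to build on both an older TR1-only compiler and a C++11 one, a namespace alias is one way to hide the difference. The following is only a sketch: the alias name `re`, the helper `containsDigit`, and the `_MSC_VER < 1600` cut-off (assumed here to mean "older than VS2010") are illustrative assumptions, not part of the original answer.

```cpp
#include <regex>
#include <string>

// Assumption: _MSC_VER below 1600 means a compiler older than VS2010,
// where the regex classes are only reachable through std::tr1.
#if defined(_MSC_VER) && _MSC_VER < 1600
namespace re = std::tr1;
#else
namespace re = std;
#endif

// Hypothetical helper: true if `text` contains at least one decimal digit.
bool containsDigit(const std::string& text)
{
    static const re::regex digit("[0-9]");
    return re::regex_search(text, digit);
}
```

With this in place, the rest of the code can write `re::regex` and `re::regex_search` without caring which namespace actually provides them.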

Splitting can be done with regex_token_iterator:

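#include <iostream>
#include <string>
#include <regex>

int main()
{
   const std::string s("The-meaning-of-life-and-everything");
   // Every "-" is a separator
   const std::tr1::regex separator("-");
   // A default-constructed iterator marks the end of the sequence
   const std::tr1::sregex_token_iterator endOfSequence;
   // Submatch index -1 selects the text between matches, i.e. the tokens
   std::tr1::sregex_token_iterator token(s.begin(), s.end(), separator, -1);
   while(token != endOfSequence)
   {
      std::cout << *token++ << std::endl;
   }
   return 0;
}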
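The same idea can be wrapped in a reusable function. This is a minimal sketch, assuming a C++11 compiler where the classes live directly in `std` (on VS2008 the `std::tr1` names would be used instead); the function name `splitByRegex` is made up for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the original answer): splits `input`
// wherever `separator` matches and returns the pieces in order.
std::vector<std::string> splitByRegex(const std::string& input,
                                      const std::regex& separator)
{
    // Submatch index -1 asks for the text between matches.
    std::sregex_token_iterator token(input.begin(), input.end(), separator, -1);
    const std::sregex_token_iterator endOfSequence;
    std::vector<std::string> pieces;
    for (; token != endOfSequence; ++token)
    {
        pieces.push_back(token->str());
    }
    return pieces;
}
```

Calling `splitByRegex("The-meaning-of-life", std::regex("-"))` should give the four words as separate strings. Because the separator is a regex, something like `std::regex("[,;]\\s*")` would work the same way.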

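If you also need the separator itself, you can get it from the sub_match object that token points to; it is a pair holding the start and end iterators of the token.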

while(token != endOfSequence) 
{
   const std::tr1::sregex_token_iterator::value_type& subMatch = *token;
   if(subMatch.first != s.begin())
   {
      const char sep = *(subMatch.first - 1);
      std::cout << "Separator: " << sep << std::endl;
   }
   std::cout << *token++ << std::endl;
}

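This sample covers the case where the separator is a single char. If the separator itself can be an arbitrary substring, looking back one character is not enough. Alternatively, you can use regex groups, putting the separator in the first group and the real token in the second: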

const std::string s("The-meaning-of-life-and-everything");
const std::tr1::regex separatorAndStr("(-*)([^-]*)");
const std::tr1::sregex_token_iterator endOfSequence;
// Separators will be the 0th, 2nd, 4th... tokens
// Real tokens will be the 1st, 3rd, 5th... tokens
int subMatches[] = { 1, 2 };
std::tr1::sregex_token_iterator token(s.begin(), s.end(), separatorAndStr, subMatches);
while(token != endOfSequence) 
{
   std::cout << *token++ << std::endl;
}

I am not sure it is 100% correct; it is just meant to illustrate the idea.
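Another way to keep separators of any length, sketched here with `std::sregex_iterator` instead of the group trick above: each match is a separator, and the text between two matches is a token. The helper name `splitKeepingSeparators` is hypothetical, and the sketch assumes the separator regex never matches an empty string.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (separator, token) pairs. The separator is the
// text matched just before the token; it is empty for the first token.
// Assumption: `separator` never matches an empty string.
std::vector<std::pair<std::string, std::string> >
splitKeepingSeparators(const std::string& input, const std::regex& separator)
{
    std::vector<std::pair<std::string, std::string> > result;
    std::string previousSeparator;
    std::string::const_iterator tokenStart = input.begin();
    const std::sregex_iterator endOfSequence;
    for (std::sregex_iterator match(input.begin(), input.end(), separator);
         match != endOfSequence; ++match)
    {
        // (*match)[0] spans the separator; the token ends where it begins.
        result.push_back(std::make_pair(
            previousSeparator, std::string(tokenStart, (*match)[0].first)));
        previousSeparator = match->str();
        tokenStart = (*match)[0].second;
    }
    // Whatever follows the last separator is the final token.
    result.push_back(std::make_pair(
        previousSeparator, std::string(tokenStart, input.end())));
    return result;
}
```

For the input `"a--b-c"` with the separator `std::regex("-+")`, this should yield three pairs: an empty separator with `a`, then `--` with `b`, then `-` with `c`.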

Here is an example from this blog.

You will have all the matches in res:

std::tr1::cmatch res;
std::string str = "<h2>Egg prices</h2>";
std::tr1::regex rx("<h(.)>([^<]+)");
std::tr1::regex_search(str.c_str(), res, rx);
// res[1] is the heading level, res[2] the heading text
std::cout << res[1] << ". " << res[2] << "\n";
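The snippet above does not check whether the search succeeded. A slightly more defensive variant might look like the sketch below; it assumes C++11 `std::regex` and uses `std::smatch` so the `std::string` can be passed directly. The function name `findHeading` and its out-parameters are illustrative only.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: extracts the heading level and text from the first
// "<hN>text" occurrence in `html`. Returns false when nothing matches,
// in which case `level` and `text` are left untouched.
bool findHeading(const std::string& html, std::string& level, std::string& text)
{
    static const std::regex rx("<h(.)>([^<]+)");
    std::smatch res;
    if (!std::regex_search(html, res, rx))
    {
        return false;
    }
    level = res[1].str();  // first capture group: the character after "<h"
    text = res[2].str();   // second capture group: everything up to the next '<'
    return true;
}
```

For `"<h2>Egg prices</h2>"` the call should return true with `level` set to `2` and `text` set to `Egg prices`.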
