Boost Spirit X3: how to handle an alternative where an optional in one branch can be empty in the other?



In an alternative, how do I handle a branch that has nothing to match for the optional?

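Consider this MVCE. It is not my real case, but it is the smallest example I could think of that expresses what I am trying to do:

Parse into a foo AST with 3 fields. The second field is optional and may be empty. The third field, an int, has a parse-time validation rule that depends on whether the second field can be present.

In this example, if there can be a double, the int must be even; otherwise it must be odd.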

Valid cases 
foobar:3.14;4 
foobar;4 
foobar|5 
Invalid cases
foobar:3.14;5 
foobar;5 
foobar|4 
foobar:3.14|4 
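
Read literally, the case list above can be restated as a small predicate. This is only a sketch of one reading of the cases: the name is_valid_foo and the assumption that the separator (';' or '|') selects the "even" or "odd" form are illustrative and not part of the question's code.

```cpp
#include <optional>

// Hypothetical helper (not from the question): restates the valid/invalid
// case list as a plain predicate.
// Assumption: ';' selects the "even" form, '|' selects the "odd" form.
inline bool is_valid_foo(char separator, std::optional<double> od, int id)
{
    bool const is_even = (id % 2) == 0;
    switch (separator) {
    case ';': return is_even;                      // the double is optional here
    case '|': return !is_even && !od.has_value();  // no double allowed here
    default:  return false;                        // unknown separator
    }
}
```

For instance, is_valid_foo(';', 3.14, 4) corresponds to "foobar:3.14;4" and is_valid_foo('|', 3.14, 4) to "foobar:3.14|4".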
#include <iostream>
#include <string>
#include <optional>
#include <string_view>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/home/x3.hpp>
namespace x3 = boost::spirit::x3;
namespace ast{
struct foo {
std::string string_value;
std::optional<double> optional_double_value;
int int_value;
};

}
template <typename T>
std::ostream& operator<<(std::ostream& os, const std::optional<T> & opt)
{
return opt ? os << opt.value() : os << "nullopt";
};
std::ostream& operator<<(std::ostream& os, const ast::foo & foo)
{
return os << "string_value :"<<  foo.string_value << " optional_double : " << foo.optional_double_value << " int : " << foo.int_value;
};

BOOST_FUSION_ADAPT_STRUCT(ast::foo, string_value, optional_double_value,int_value)
namespace parser {

const auto even_int = x3::rule<struct even_int, int> {"even int"}
= x3::int_ [ ([](auto& ctx) {
auto& attr = x3::_attr(ctx);
auto& val  = x3::_val(ctx);
val = attr;
x3::_pass(ctx) = x3::_val(ctx) %2 == 0;
}) ];

const auto odd_int = x3::rule<struct odd_int, int> {"odd int"}
= x3::int_ [ ([](auto& ctx) {
auto& attr = x3::_attr(ctx);
auto& val  = x3::_val(ctx);
val = attr;
// Compare against 0 so that negative odd values (remainder -1) also pass.
x3::_pass(ctx) = x3::_val(ctx) %2 != 0;
}) ];

const auto foo =  ( *x3::alpha  >> -(':' >> x3::double_) >> ';' >> even_int )
;//|  (  *x3::alpha >>  '|' >> odd_int ) ;

}

template <typename Parser, typename Attr>
static inline bool parse(std::string_view in, Parser const& p, Attr& result)
{
return x3::parse(in.begin(), in.end(), p, result);
}
int main()
{
for (auto& input : { "foobar:3.14;4", "foobar;4","foobar|5"}) {
ast::foo result;
if (!parse(input, parser::foo, result))
std::cout << "parsing " << input << " failed" << std::endl;
else
std::cout << "parsing " << input << " success : " << result <<  std::endl;
}
}

Uncommenting the second alternative (the one for the odd int) raises:
/usr/local/include/boost/spirit/home/x3/operator/detail/sequence.hpp:144:25: error: static assertion failed: Size of the passed attribute is bigger than expected.
144 |             actual_size <= expected_size

I can understand that because, well, there are two "tokens" where there should be 3. How do I handle this?

Bonus question:

Why can't

auto even_int = x3::rule<struct even_int, int> {"even int"}
= ...

simply be defined as

auto even_int = ...;

(compilation fails in that case)?

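All of the symptoms (including your bonus question) are symptoms of the attribute propagation machinery being imperfect.

Automatic attribute propagation is very nice, but there are cases where you have to help the system.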

Looking at your desired rule and results:

const auto foo
= *x3::alpha >> -(':' >> x3::double_) >> ';' >> even_int
| *x3::alpha >> '|' >> odd_int
;

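I conclude that you want the same rule in both cases, except that the optional double is not allowed in the odd case, and that different separators distinguish even from odd.

I would try to stay closer to the declarative nature of parser expressions, and try to express that conclusion at a higher level. For example: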

#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iomanip>
#include <iostream>
#include <optional>
namespace ast {
enum class discriminator { even, odd };
struct foo {
std::string           s;
std::optional<double> od;
discriminator         ind;
int                   id;
bool is_valid() const {
bool is_even = 0 == (id % 2);
switch (ind) {
case discriminator::even: return is_even;
case discriminator::odd: return not(is_even or od.has_value());
default: return false;
}
}
};
std::ostream& operator<<(std::ostream& os, const foo& foo)
{
os << std::quoted(foo.s); //
if (foo.od.has_value())
os << "(" << *foo.od << ")";
return os << " " << foo.id //
<< " (" << (foo.is_valid() ? "valid" : "INVALID") << ")";
}
} // namespace ast
BOOST_FUSION_ADAPT_STRUCT(ast::foo, s, od, ind, id)
namespace parser {
namespace x3 = boost::spirit::x3;
static const auto indicator_ = [] {
x3::symbols<ast::discriminator> sym;
sym.add                        //
(";", ast::discriminator::even) //
("|", ast::discriminator::odd);
return sym;
}();
static const auto foo //
= +x3::alpha >> -(':' >> x3::double_) >> indicator_ >> x3::int_;
}
int main()
{
for (std::string const input : {
"foobar:3.14;4",
"foobar;4",
"foobar|5",
// Invalid cases
"foobar:3.14;5",
"foobar;5",
"foobar|4",
"foobar:3.14|4",
}) //
{
ast::foo result;
if (parse(input.begin(), input.end(), parser::foo, result))
std::cout << std::quoted(input) << " -> " << result << std::endl;
else
std::cout << std::quoted(input) << " Syntax error" << std::endl;
}
}

Prints

"foobar:3.14;4" -> "foobar"(3.14) 4 (valid)
"foobar;4" -> "foobar" 4 (valid)
"foobar|5" -> "foobar" 5 (valid)
"foobar:3.14;5" -> "foobar"(3.14) 5 (INVALID)
"foobar;5" -> "foobar" 5 (INVALID)
"foobar|4" -> "foobar" 4 (INVALID)
"foobar:3.14|4" -> "foobar"(3.14) 4 (INVALID)

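Note that you can view this approach as a separation of syntax and semantics.

Alternatives/Improvements From Here

Of course, you can now write the parse as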

return parse(input.begin(), input.end(), parser::foo, result)
&& result.is_valid();

Or, if you insist, you can wrap the check in a semantic action as before:

auto is_valid_ = [](auto& ctx) {
_pass(ctx) = _val(ctx).is_valid();
};
static const auto foo                              //
= x3::rule<struct foo_, ast::foo, true>{"foo"} //
= (+x3::alpha >> -(':' >> x3::double_) >> indicator_ >>
x3::int_)[is_valid_];

Now the output becomes:

"foobar:3.14;4" -> "foobar"(3.14) 4 (valid)
"foobar;4" -> "foobar" 4 (valid)
"foobar|5" -> "foobar" 5 (valid)
"foobar:3.14;5" Syntax error
"foobar;5" Syntax error
"foobar|4" Syntax error
"foobar:3.14|4" Syntax error
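
Without Fusion

Now, the above clearly still uses Fusion sequence adaptation and automatic attribute propagation. However, since you are already knee-deep in semantic actions¹, you can of course do the rest of the work there as well: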

#include <boost/spirit/home/x3.hpp>
#include <iomanip>
#include <iostream>
#include <optional>
namespace ast {
struct foo {
std::string           s;
std::optional<double> od;
int                   id;
};
std::ostream& operator<<(std::ostream& os, const foo& foo)
{
os << std::quoted(foo.s); //
if (foo.od.has_value())
os << "(" << *foo.od << ")";
return os << " " << foo.id;
}
} // namespace ast
namespace parser {
namespace x3 = boost::spirit::x3;
enum class discriminator { even, odd };
static const auto indicator_ = [] {
x3::symbols<discriminator> sym;
sym.add                        //
(";", discriminator::even) //
("|", discriminator::odd);
return sym;
}();
auto make_foo = [](auto& ctx) {
using boost::fusion::at_c;
auto& attr = _attr(ctx);
auto& s    = at_c<0>(attr); // where are
auto& od   = at_c<1>(attr); // structured bindings
auto& ind  = at_c<2>(attr); // when you
auto& id   = at_c<3>(attr); // need them? :|
bool  is_even = 0 == (id % 2);
if (ind == discriminator::even)
_pass(ctx) = is_even;
else
_pass(ctx) = not(is_even or od.has_value());
_val(ctx) = ast::foo{
std::move(s),
od.has_value() ? std::make_optional(*od) : std::nullopt, id};
};
static const auto foo = x3::rule<struct foo_, ast::foo> {}
= (+x3::alpha >> -(':' >> x3::double_) >> indicator_ >>
x3::int_)[make_foo];
} // namespace parser
int main()
{
for (std::string const input : {
"foobar:3.14;4",
"foobar;4",
"foobar|5",
// Invalid cases
"foobar:3.14;5",
"foobar;5",
"foobar|4",
"foobar:3.14|4",
}) //
{
ast::foo result;
if (parse(input.begin(), input.end(), parser::foo, result))
std::cout << std::quoted(input) << " -> " << result << std::endl;
else
std::cout << std::quoted(input) << " Syntax error" << std::endl;
}
}

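This has pros and cons. The pros are:

  • reduced compile time
  • discriminator is now private to the parser

Cons:

  • you are doing the propagation manually (e.g. boost::optional -> std::optional, which is clumsy)
  • semantic actions¹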

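Hybrid

You may already know that I am not a fan of this kind of hand-written attribute propagation. If you must hide the ind field from the final AST, maybe do it like this: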

#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iomanip>
#include <iostream>
#include <optional>
#include <stdexcept>
namespace ast {
struct foo {
std::string           s;
std::optional<double> od;
int                   id;
};
std::ostream& operator<<(std::ostream& os, const foo& foo)
{
os << std::quoted(foo.s); //
if (foo.od.has_value())
os << "(" << *foo.od << ")";
return os << " " << foo.id;
}
} // namespace ast
namespace parser {
namespace x3 = boost::spirit::x3;
enum class discriminator { even, odd };
struct p_foo : ast::foo {
discriminator ind;
struct semantic_error : std::runtime_error {
using std::runtime_error::runtime_error;
};
void check_semantics() const {
bool is_even = 0 == (id % 2);
switch (ind) {
case discriminator::even:
if (!is_even)
throw semantic_error("id should be even");
break;
case discriminator::odd:
if (is_even)
throw semantic_error("id should be odd");
if (od.has_value())
throw semantic_error("illegal double at odd foo");
break;
}
}
};
}
BOOST_FUSION_ADAPT_STRUCT(parser::p_foo, s, od, ind, id)
namespace parser {
static const auto indicator_ = [] {
x3::symbols<discriminator> sym;
sym.add                        //
(";", discriminator::even) //
("|", discriminator::odd);
return sym;
}();
static const auto raw_foo      //
= x3::rule<p_foo, p_foo>{} //
= +x3::alpha >> -(':' >> x3::double_) >> indicator_ >> x3::int_;
auto checked_ = [](auto& ctx) {
auto& _pf = _attr(ctx);
_pf.check_semantics();
_val(ctx) = std::move(_pf);
};
static const auto foo                   //
= x3::rule<struct foo_, ast::foo>{} //
= raw_foo[checked_];
} // namespace parser
int main()
{
for (std::string const input : {
"foobar:3.14;4",
"foobar;4",
"foobar|5",
// Invalid cases
"foobar:3.14;5",
"foobar;5",
"foobar|4",
"foobar:3.14|4",
"foobar:3.14|5",
}) //
{
ast::foo result;
try {
if (parse(input.begin(), input.end(), parser::foo, result))
std::cout << std::quoted(input) << " -> " << result << std::endl;
else
std::cout << std::quoted(input) << " Syntax error" << std::endl;
} catch(std::exception const& e) {
std::cout << std::quoted(input) << " Semantic error: " << e.what() << std::endl;
}
}
}

Prints

"foobar:3.14;4" -> "foobar"(3.14) 4
"foobar;4" -> "foobar" 4
"foobar|5" -> "foobar" 5
"foobar:3.14;5" Semantic error: id should be even
"foobar;5" Semantic error: id should be even
"foobar|4" Semantic error: id should be odd
"foobar:3.14|4" Semantic error: id should be odd
"foobar:3.14|5" Semantic error: illegal double at odd foo

Note the richer diagnostic information.


Post Scriptum: Minimal Change

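Later, re-reading your question, it dawned on me that a small change would already help your grammar. As I said above:

Automatic attribute propagation is very nice, but there are cases where you have to help the system

Here, you can help it by giving both branches the same structure. So instead of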

const auto foo
= *x3::alpha >> -(':' >> x3::double_) >> ';' >> even_int
| *x3::alpha >> '|' >> odd_int
;

you can manually insert an empty optional double in the middle of the odd branch:

const auto foo                                               //
= +x3::alpha >> -(':' >> x3::double_) >> ';' >> even_int //
| +x3::alpha >> x3::attr(ast::optdbl{}) >> '|' >> odd_int;

(where optdbl is an alias for std::optional<double>).

Now, if you refactor the odd_int/even_int rules slightly, I would say this approach has the edge over the other options above:

#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iomanip>
#include <iostream>
#include <optional>
namespace ast{
using optdbl = std::optional<double>;
struct foo {
std::string s;
optdbl      od;
int         id;
};
std::ostream& operator<<(std::ostream& os, const foo& foo)
{
os << std::quoted(foo.s); //
if (foo.od.has_value())
os << "(" << *foo.od << ")";
return os << " " << foo.id;
}
}
BOOST_FUSION_ADAPT_STRUCT(ast::foo, s, od,id)
namespace parser {
namespace x3 = boost::spirit::x3;
static auto mod2check(int remainder) {
return [=](auto& ctx) { //
_pass(ctx) = _val(ctx) % 2 == remainder;
};
}
static auto mod2int(int remainder) {
return x3::rule<struct _, int, true>{} = x3::int_[mod2check(remainder)];
}
const auto foo                                           //
= +x3::alpha >>                                      //
(-(':' >> x3::double_) | x3::attr(ast::optdbl{})) >> //
(';' >> mod2int(0) | '|' >> mod2int(1))              //
;
} // namespace parser
int main()
{
for (std::string const input : {
"foobar:3.14;4",
"foobar;4",
"foobar|5",
// Invalid cases
"foobar:3.14;5",
"foobar;5",
"foobar|4",
"foobar:3.14|4",
}) //
{
ast::foo result;
if (parse(input.begin(), input.end(), parser::foo, result))
std::cout << std::quoted(input) << " -> " << result << std::endl;
else
std::cout << std::quoted(input) << " Syntax error" << std::endl;
}
}

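¹ Boost Spirit: "Semantic actions are evil"?

In an alternative, how do I handle a branch that has nothing to match for the optional?

For this case there is the attr(x) parser. Every time it "parses", it produces a copy of x without consuming any input.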

So the answer to

how do I deal with an optional that can be empty in the other case?

is to use attr(std::nullopt), like this:

const auto foo =  ( *x3::alpha  >> -(':' >> x3::double_) >> ';' >> even_int )
|  (  *x3::alpha >> x3::attr(std::nullopt) >>  '|' >> odd_int ) ;
https://godbolt.org/z/E5jM6s6vW
