C++: How do I parse a statement in boost::spirit that essentially switches parsers?

vpfxa7rd  posted on 2023-07-01  in  Other

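The language I'm trying to write a parser for has a statement that essentially sets properties for the text that follows it. These properties include

  • case sensitivity
  • format (including different comment styles)

I can only imagine implementing this by switching to a different parser. I think that would require terminating the current parser and returning, via its attribute, how to handle the remaining unmatched input. How can this be done?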

5ssjco0h  #1

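In the statement parser, use semantic actions and the qi::lazy directive to invoke the appropriate parser based on the specified properties.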

kfgdxczn  #2

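Switching to a different parser is one approach.
The most notable pattern for this is the Nabialek Trick, which builds on the qi::lazy directive.
However, since you mention several flags, it may not scale well: it can lead to unnecessary duplication and/or combinatorial explosion.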
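I'd suggest using some parser state instead. You could do that with semantic actions that hold the logic, but that makes the state inside the parser mutable, which can hurt reentrancy, thread safety, and reusability. These are fairly general drawbacks of semantic actions.
Instead, Qi offers local attributes, which live in the runtime parser context.
For example, let's switch case sensitivity:
// sample coming up, also making dinner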

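Post-Dinner Update

Time is always a good teacher. I tried to get local attributes/inherited attributes to work for reentrancy, but they don't work the way I remembered.
So let's embrace mutable state and put the option state in the grammar instance. That keeps things at a workable level of complexity, although you can't always share parser instances.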

Live On Coliru

// #define BOOST_SPIRIT_DEBUG
#include <boost/phoenix.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iomanip>
#include <iostream>
#include <string>
#include <string_view>
namespace qi = boost::spirit::qi;
namespace px = boost::phoenix;
using namespace std::string_literals;

template <typename It> struct DemoParser : qi::grammar<It> {
    DemoParser() : DemoParser::base_type(start) {
        using namespace qi::labels;

        // shorthand mnemonics for accessing option state
        auto _case_option   = px::ref(case_opt);
        auto _strict_option = px::ref(strict_opt);
        qi::_r1_type kw_text; // another mnemonic, for the inherited attribute

        // handy phoenix actor (custom "directives")
        auto const _cs = qi::eps(_case_option == Sensitive);
        auto const _ci = qi::eps(_case_option == Insensitive);
     // auto const _sm = qi::eps(_strict_option == StrictOn);

        start = qi::skip(qi::space)[demo];

        demo = qi::eps[_case_option = Case::Sensitive]    // initialize
                      [_strict_option = Strict::StrictOn] // defaults?
            >> -(option | hello) % ';'                    //
            >> qi::eoi;

        option = kw("Option"s) >> (switch_case | switch_strict);
        hello                             //
            = _cs >> "Hello"              //
            | _ci >> qi::no_case["hello"] //
            ;

        _case_sym.add("sensitive", Case::Sensitive)("insensitive", Case::Insensitive);
        _strict_sym.add("on", Strict::StrictOn)("off", Strict::StrictOff);

        _case         = _cs >> _case_sym | _ci >> qi::no_case[_case_sym];
        _strict       = _cs >> _strict_sym | _ci >> qi::no_case[_strict_sym];
        switch_case   = kw("case"s) >> _case[_case_option = _1];
        switch_strict = kw("strict"s) >> _strict[_strict_option = _1];

        px::function c_str = [](std::string const& s) { return s.c_str(); };

        kw = (_cs >> qi::lit(c_str(kw_text))                 // case sensitive
              | _ci >> qi::no_case[qi::lit(c_str(kw_text))]) // case insensitive
            >> !qi::char_("a-zA-Z0-9._"); // lookahead assertion to avoid parsing partial identifiers

        BOOST_SPIRIT_DEBUG_NODES((start)(demo)(option)(hello)(switch_case)(switch_strict)(_case)(_strict)(kw))
    }

  private:
    qi::rule<It> start;

    enum Case { Sensitive, Insensitive } case_opt = Sensitive;
    enum Strict { StrictOff, StrictOn } strict_opt        = StrictOn;
    qi::symbols<char, Case>   _case_sym;
    qi::symbols<char, Strict> _strict_sym;

    using Skipper = qi::space_type;
    qi::rule<It, Skipper> demo, hello, option, switch_case, switch_strict;

    // lexeme
    qi::rule<It, Case()> _case;
    qi::rule<It, Strict()> _strict;
    qi::rule<It, std::string(std::string kw_text)> kw; // using inherited attribute
};

int main() {
    for (std::string_view input :
         {
             "",
             "bogus;", // FAIL
             "Hello;",
             "hello;",
             "Option case insensitive; heLlO;",
             "Option strict off;",
             "Option STRICT off;",
             "Option case insensitive; Option STRICT off;",
             "Option case insensitive; oPTION STRICT off;",
             "Option case insensitive; oPTION STRICT ON;",
             "Option case insensitive; HeLlO; OPTION CASE SENSitive ; HelLO;", // FAIL
             "Option case insensitive; HeLlO; OPTION CASE SENSitive ; Hello;",
         }) //
    {
        DemoParser<std::string_view::const_iterator> p; // mutable instance now
                                                        //
        bool ok = parse(begin(input), end(input), p);
        std::cout << quoted(input) << " -> " << (ok ? "PASS" : "FAIL") << std::endl;
    }
}

Prints the expected output for the test cases:

"" -> PASS
"bogus;" -> FAIL
"Hello;" -> PASS
"hello;" -> FAIL
"Option case insensitive; heLlO;" -> PASS
"Option strict off;" -> PASS
"Option STRICT off;" -> FAIL
"Option case insensitive; Option STRICT off;" -> PASS
"Option case insensitive; oPTION STRICT off;" -> PASS
"Option case insensitive; oPTION STRICT ON;" -> PASS
"Option case insensitive; HeLlO; OPTION CASE SENSitive ; HelLO;" -> FAIL
"Option case insensitive; HeLlO; OPTION CASE SENSitive ; Hello;" -> PASS

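A side note on the mutable state: the `qi::eps[...]` actions at the start of `demo` reset the options on every parse, and that matters once the options live in the grammar object. The following Spirit-free sketch illustrates why. `MiniParser` and its `reset_defaults` parameter are hypothetical, and the statement syntax is deliberately simplified (every statement must end in `;` and is compared as a whole string), so treat it as an illustration of the state handling only, not of the Qi grammar.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical hand-written stand-in for a stateful grammar: the option state
// is a data member, just like case_opt in the grammar instance above.
class MiniParser {
  public:
    // Accepts ';'-terminated statements: "Option case insensitive",
    // "Option case sensitive", or the greeting "Hello".
    bool parse(std::string_view input, bool reset_defaults = true) {
        if (reset_defaults)
            case_sensitive_ = true; // what the eps actions do in `demo`

        input = trim(input);
        while (!input.empty()) {
            auto const end = input.find(';');
            if (end == std::string_view::npos)
                return false; // simplification: every statement needs ';'

            std::string_view const stmt = trim(input.substr(0, end));
            input = trim(input.substr(end + 1));

            if (matches(stmt, "Option case insensitive"))
                case_sensitive_ = false;
            else if (matches(stmt, "Option case sensitive"))
                case_sensitive_ = true;
            else if (!matches(stmt, "Hello"))
                return false;
        }
        return true;
    }

  private:
    bool case_sensitive_ = true; // mutable option state

    static std::string_view trim(std::string_view s) {
        auto const first = s.find_first_not_of(" \t\n");
        if (first == std::string_view::npos)
            return {};
        auto const last = s.find_last_not_of(" \t\n");
        return s.substr(first, last - first + 1);
    }

    static std::string lower(std::string_view s) {
        std::string out(s);
        std::transform(out.begin(), out.end(), out.begin(), [](unsigned char c) {
            return static_cast<char>(std::tolower(c));
        });
        return out;
    }

    // Compares under the current case-sensitivity option.
    bool matches(std::string_view text, std::string_view keyword) const {
        return case_sensitive_ ? text == keyword : lower(text) == lower(keyword);
    }
};
```

If the same `MiniParser` object first parses `"Option case insensitive;"` and then parses `"hello;"` with `reset_defaults = false`, the second parse is accepted because the option leaked from the first one. With the reset in place, `"hello;"` is rejected as expected. The reset does not help when two parses run concurrently on one instance, which is the sharing limitation mentioned above.
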
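Improving Compile Times: X3

Honestly, I think X3 is more convenient for dynamically parameterized/composed rules. It also compiles much faster, and it's easier to add some debugging side effects if needed: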

Live On Coliru

// #define BOOST_SPIRIT_X3_DEBUG
#include <boost/spirit/home/x3.hpp>
#include <iomanip>
#include <iostream>
#include <string>
#include <string_view>
namespace x3 = boost::spirit::x3;
using namespace std::string_literals;

namespace DemoParser {
    enum Case { Insensitive, Sensitive };
    enum Strict { StrictOff, StrictOn };
    struct Options {
        enum Case   case_opt   = Sensitive;
        enum Strict strict_opt = StrictOn;
    };

    // custom "directives"
    auto const _cs = x3::eps[([](auto& ctx) { _pass(ctx) = get<Options>(ctx).case_opt == Sensitive; })];
    auto const _ci = x3::eps[([](auto& ctx) { _pass(ctx) = get<Options>(ctx).case_opt == Insensitive; })];
 // auto const _sm = x3::eps[([](auto& ctx) { _pass(ctx) = get<Options>(ctx).strict_opt == StrictOn; })];

    auto set_opt = [](auto member) {
        return [member](auto& ctx) {
            auto& opt = get<Options>(ctx).*member;
            x3::traits::move_to(_attr(ctx), opt); 
        };
    };

    static inline auto variable_case(auto p, char const* name = "variable_case") {
        using Attr = x3::traits::attribute_of<decltype(p), x3::unused_type, void>::type;
        return x3::rule<struct _, Attr, true>{name} = //
            (_cs >> x3::as_parser(p) |                //
             _ci >> x3::no_case[x3::as_parser(p)]);
    }

    static inline auto kw(char const* kw_text) {
        // using lookahead assertion to avoid parsing partial identifiers
        return x3::rule<struct kw, std::string>{kw_text} = x3::lexeme[ //
                   variable_case(x3::lit(kw_text), kw_text)            //
                   >> !x3::char_("a-zA-Z0-9._")                        //
        ];
    }

    auto _case_sym = x3::symbols<Case>{}.add("sensitive", Case::Sensitive)("insensitive", Case::Insensitive).sym;
    auto _strict_sym = x3::symbols<Strict>{}.add("on", Strict::StrictOn)("off", Strict::StrictOff).sym;

    auto switch_case   = kw("case") >> variable_case(_case_sym)[set_opt(&Options::case_opt)];
    auto switch_strict = kw("strict") >> variable_case(_strict_sym)[set_opt(&Options::strict_opt)];

    auto option = kw("Option") >> (switch_case | switch_strict);
    auto hello  = _cs >> "Hello"      //
        | _ci >> x3::no_case["hello"] //
        ;

    auto demo  = -(option | hello) % ';' >> x3::eoi;
    auto start = x3::skip(x3::space)[demo];
}

int main() {
    auto const p = DemoParser::start; // stateless parser
    using DemoParser::Options;

    for (std::string_view input :
         {
             "",
             "bogus;", // FAIL
             "Hello;",
             "hello;",
             "Option case insensitive; heLlO;",
             "Option strict off;",
             "Option STRICT off;",
             "Option case insensitive; Option STRICT off;",
             "Option case insensitive; oPTION STRICT off;",
             "Option case insensitive; oPTION STRICT ON;",
             "Option case insensitive; HeLlO; OPTION CASE SENSitive ; HelLO;", // FAIL
             "Option case insensitive; HeLlO; OPTION CASE SENSitive ; Hello;",
         }) //
    {
        Options opts;

        bool ok = parse(begin(input), end(input), x3::with<Options>(opts)[p]);
        std::cout << quoted(input) << " -> " << (ok ? "PASS" : "FAIL") << std::endl;
    }
}

Still prints the same test output:

"" -> PASS
"bogus;" -> FAIL
"Hello;" -> PASS
"hello;" -> FAIL
"Option case insensitive; heLlO;" -> PASS
"Option strict off;" -> PASS
"Option STRICT off;" -> FAIL
"Option case insensitive; Option STRICT off;" -> PASS
"Option case insensitive; oPTION STRICT off;" -> PASS
"Option case insensitive; oPTION STRICT ON;" -> PASS
"Option case insensitive; HeLlO; OPTION CASE SENSitive ; HelLO;" -> FAIL
"Option case insensitive; HeLlO; OPTION CASE SENSitive ; Hello;" -> PASS

As for the qi::lazy approach, I'm a bit out of practice with it, so I'll refer you to the existing examples on this site.
