Header lexy/dsl/scan.hpp

Manually parse input.

Rule lexy::dsl::scan

lexy/dsl/scan.hpp
namespace lexy
{
    template <typename Context, typename Reader>
    class rule_scanner
    : public scanner-common
    {
    public:
        using production      = current-production;

        // Current recursion depth.
        constexpr std::size_t recursion_depth() const noexcept;

        // Position where dsl::scan was reached.
        constexpr auto begin() const noexcept -> typename Reader::iterator;
    };
}

namespace lexy::dsl
{
    constexpr rule auto scan;
}

Parses input using a user-defined function.

Requires

The current production P has a static member function scan() that returns a lexy::scan_result<T>. It must take at least one argument, which is a reference to a lexy::rule_scanner. It can optionally take a second argument, which is a reference to the parse state passed to the top-level action.

Parsing

Invokes P::scan(), passing it an object of type lexy::rule_scanner that starts at the current reader position, optionally the parse state, and then all values already produced by previous rules. The rule then consumes everything that P::scan() parsed and consumed. Succeeds if P::scan() returns a lexy::scan_result<T> that contains a value; fails otherwise.

Errors

All errors raised by P::scan(), either directly by calling .error() or indirectly by parsing a rule that fails. Recovers if P::scan() recovers and is able to return a value.

Values

If the value_type of the returned lexy::scan_result<T> is non-void, produces that value. Otherwise, produces no value. Values produced by previous rules are discarded.

Parse tree

Creates all nodes created from the rules parsed by P::scan() as if those rules were parsed directly.

lexy::rule_scanner implements lexy::scanner-common. While parsing a rule, everything is forwarded to the top-level parse action, and the rule has access to the same context. The use of dsl::scan is purely an implementation detail of a production; the parse action cannot observe it in any way. Regardless of the top-level parse action, the overloads of .parse() and .branch() taking a lexy::scan_result<T> ensure that values are produced.

Example 1. Parse a Rust-style raw string literal
struct production
// We produce a UTF-8 lexeme (i.e. a string view) of the contents.
: lexy::scan_production<lexy::buffer_lexeme<lexy::utf8_encoding>>,
  lexy::token_production
{
    struct unterminated
    {
        static constexpr auto name = "unterminated raw string literal";
    };

    template <typename Context, typename Reader>
    static constexpr scan_result scan(lexy::rule_scanner<Context, Reader>& scanner)
    {
        auto open_hash_count = 0;

        // Parse the opening delimiter.
        scanner.parse(dsl::lit_c<'r'>);
        while (scanner.branch(dsl::lit_c<'#'>))
            ++open_hash_count;
        scanner.parse(dsl::lit_c<'"'>);
        if (!scanner)
            return lexy::scan_failed;

        // Parse the contents of the string.
        auto content_begin = scanner.position();
        for (auto closing_hash_count = -1; closing_hash_count != open_hash_count;)
        {
            // Handle the next character.
            if (scanner.branch(dsl::lit_c<'"'>))
            {
                // We have a quote, count the next hashes.
                closing_hash_count = 0;
            }
            else if (scanner.branch(dsl::lit_c<'#'>))
            {
                // Count a hash if necessary.
                if (closing_hash_count != -1)
                    ++closing_hash_count;
            }
            else if (scanner.is_at_eof())
            {
                // Report an error for an unterminated literal.
                scanner.fatal_error(unterminated{}, scanner.begin(), scanner.position());
                return lexy::scan_failed;
            }
            else
            {
                // Parse any other code point, and invalidate hash count.
                scanner.parse(dsl::code_point);
                closing_hash_count = -1;
            }

            // Check for any errors we might have gotten.
            if (!scanner)
                return lexy::scan_failed;
        }
        auto content_end = scanner.position();

        return lexy::buffer_lexeme<lexy::utf8_encoding>(content_begin, content_end);
    }
};
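The delimiter-counting loop in the example can be easier to follow in isolation. The following is a minimal sketch in plain C++17 that does not use lexy at all: raw_string_contents is a hypothetical helper introduced only for illustration, it works on bytes rather than code points (which is safe here because '"' and '#' are ASCII), and it reports failure through std::nullopt instead of a lexy error. Unlike the lexeme in the example, which ends at the scanner position after the loop, this sketch trims the closing delimiter from the result.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper, not part of lexy: returns the contents of a Rust-style
// raw string literal at the start of `input`, or std::nullopt if the literal
// is malformed or unterminated.
std::optional<std::string_view> raw_string_contents(std::string_view input)
{
    std::size_t pos = 0;

    // Opening delimiter: 'r', any number of '#', then '"'.
    if (pos == input.size() || input[pos] != 'r')
        return std::nullopt;
    ++pos;
    int open_hash_count = 0;
    while (pos != input.size() && input[pos] == '#')
    {
        ++open_hash_count;
        ++pos;
    }
    if (pos == input.size() || input[pos] != '"')
        return std::nullopt;
    ++pos;

    const std::size_t content_begin = pos;
    // -1 means that no candidate closing quote is currently being tracked.
    for (int closing_hash_count = -1; closing_hash_count != open_hash_count;)
    {
        if (pos == input.size())
            return std::nullopt; // Unterminated literal.

        const char c = input[pos++];
        if (c == '"')
        {
            // A quote starts a new candidate closing delimiter.
            closing_hash_count = 0;
        }
        else if (c == '#')
        {
            // Only count hashes that directly follow a candidate quote.
            if (closing_hash_count != -1)
                ++closing_hash_count;
        }
        else
        {
            // Any other character invalidates the candidate.
            closing_hash_count = -1;
        }
    }

    // pos is now just past the closing delimiter: one quote plus the hashes.
    const std::size_t content_end = pos - 1 - static_cast<std::size_t>(open_hash_count);
    return input.substr(content_begin, content_end - content_begin);
}
```

Under these assumptions, raw_string_contents("r#\"a\"b\"#") yields the contents a"b, because the inner quote is not followed by a hash, while an input such as r#"abc" yields std::nullopt, because the closing hash is missing.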
Tip
Use lexy::scan_production to remove boilerplate for a production whose rule is entirely dsl::scan.
Tip
Use condition >> dsl::scan to turn a scanner into a branch. The values produced by condition are passed to the scan function as additional arguments. You can use lexy::dsl::parse_as to ensure that a rule produces a value with all parse actions.
Note
Use lexy::scan if you want to manually parse the root production.
Caution
If P::scan() only provides the overload that takes a specific parse state, this parse state must be passed to all actions that are used with the grammar. Likewise, if it requires values, the rule must always produce a value.

Class lexy::scan_production

lexy/dsl/scan.hpp
namespace lexy
{
    template <typename T>
    struct scan_production
    {
        // The return type of the scan() function.
        using scan_result = lexy::scan_result<T>;

        static constexpr auto rule  = dsl::scan;
        static constexpr auto value = lexy::forward<T>;
    };
}

Convenience class for productions that are parsed entirely manually and produce a value of type T.

See also