Reference

To use lexy, you need to do three things:

  1. Define a grammar using the rule DSL and callbacks.

  2. Create an input object that represents the input you want to parse.

  3. Call an action that will parse the grammar on the input.

lexy is designed to work with your own types and data structures. As such, its usage can be hidden away in the implementation of a parse function whose interface does not mention lexy-specific types. Nevertheless, lexy defines some infrastructure types that can be used as vocabulary types in your project interfaces.

Infrastructure

lexy::code_point

A Unicode code point.

lexy::lexeme

A lexeme, i.e. a subrange of the input.

lexy::token_kind and lexy::token

Identify and store tokens, i.e. concrete realizations of token rules.

lexy::parse_tree

A parse tree.

lexy::error and lexy::error_context

The parse errors.

Inputs and Encodings

template <typename T>
concept reader = …;

template <typename T>
concept input = requires(const T& obj) {
  { obj.reader() } -> reader;
};

An input defines the input that will be parsed by lexy. It has a corresponding encoding that controls, among other things, its character type and whether certain rules are available. The input itself is unchanging and it produces a reader which remembers the current position of the input during parsing.
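
The split between an unchanging input and a position-tracking reader can be modelled with a small standalone sketch. Everything in the `toy` namespace below is hypothetical and is not lexy's real reader or input interface; the member names `peek`, `bump` and `position` are assumptions chosen for illustration. Since lexy targets C++17, the `input` concept is approximated here with a detection trait rather than concept syntax.

```cpp
#include <cstddef>
#include <string_view>
#include <type_traits>
#include <utility>

namespace toy // hypothetical model, not lexy's real reader or input interface
{
// The reader owns the current position.
class string_reader
{
public:
    explicit string_reader(std::string_view text) : text_(text) {}

    // Current character, or -1 once the end is reached.
    int peek() const
    {
        return pos_ < text_.size() ? static_cast<unsigned char>(text_[pos_]) : -1;
    }
    void bump() { ++pos_; }
    std::size_t position() const { return pos_; }

private:
    std::string_view text_;
    std::size_t      pos_ = 0;
};

// The input is immutable; reader() hands out a fresh reader at position 0.
class string_input
{
public:
    explicit string_input(std::string_view text) : text_(text) {}
    string_reader reader() const { return string_reader(text_); }

private:
    std::string_view text_;
};

// C++17 stand-in for the `input` concept:
// true if `obj.reader()` is valid on a const object.
template <typename T, typename = void>
struct is_input : std::false_type {};
template <typename T>
struct is_input<T, std::void_t<decltype(std::declval<const T&>().reader())>>
    : std::true_type {};

static_assert(is_input<string_input>::value);
static_assert(!is_input<int>::value);

// Consumes leading decimal digits and returns how many were read.
template <typename Input>
std::size_t count_leading_digits(const Input& input)
{
    static_assert(is_input<Input>::value, "Input must provide reader()");
    auto reader = input.reader();
    while (reader.peek() >= '0' && reader.peek() <= '9')
        reader.bump();
    return reader.position();
}
} // namespace toy
```

Because each call to `reader()` starts from the beginning, several readers over the same input advance independently and the input object itself never changes.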

The supported encodings:
lexy::default_encoding

An unknown single byte encoding.

lexy::ascii_encoding

ASCII

lexy::utf8_encoding, lexy::utf16_encoding, lexy::utf32_encoding

UTF-8, UTF-16, UTF-32

lexy::byte_encoding

Bytes, not text.

The pre-defined inputs:
lexy::range_input

Use a generic iterator range as input.

lexy::string_input and lexy::zstring_input

Use a string as input.

lexy::buffer

Create a buffer that contains the input.

lexy::read_file

Use a file as input.

lexy::argv_input

Use the command-line arguments as input.

The grammar DSL

template <typename T>
concept production = requires {
      { T::rule } -> rule;
      { T::whitespace } -> rule;        // optional
      { T::value } -> callback-or-sink; // optional
};

The grammar is specified by productions: classes with a ::rule member that specifies the rule. They can optionally have a ::whitespace member that controls automatic whitespace skipping. Both of those are specified in a C++ DSL. The action lexy::parse also needs a ::value member that controls the value produced by parsing the production. It is specified using callbacks, special function objects, and sinks.

lexy/grammar.hpp

Type traits for defining and interacting with the grammar.

lexy/dsl.hpp

The DSL for specifying parse rules.

lexy/callback.hpp

The predefined callbacks and sinks for specifying the value.

Tip
It is recommended to put all the productions into a separate namespace.

Actions

Actions take a grammar and an input, and process the input according to the grammar. They differ in what they return.

lexy::match

Matches a grammar on an input and returns a true/false result.

lexy::validate

Validates that a grammar matches on an input, and returns the errors if it does not.

lexy::parse

Parses a grammar on an input and returns its value.

lexy::parse_as_tree

Parses a grammar on an input and returns the parse tree.