Reference

To use lexy, you need to do three things:

  1. Define a grammar using the rule DSL and callbacks.

  2. Create an input object that represents the input you want to parse.

  3. Call an action that will parse the grammar on the input.

lexy is designed to work with your own types and data structures. As such, its usage can be hidden away in the implementation of a parse function whose interface does not mention lexy-specific types. Nevertheless, lexy defines some infrastructure types that can be used as vocabulary types in your project interfaces.

Infrastructure

lexy/code_point.hpp

A Unicode code point.

lexy/lexeme.hpp

A lexeme, i.e. a subrange of the input.

lexy/token.hpp

Identify and store tokens, i.e. concrete realizations of token rules.

lexy/parse_tree.hpp

A parse tree.

lexy/error.hpp

The parse errors.

lexy/input_location.hpp

Compute human readable line/column numbers for a position of the input.

lexy/visualize.hpp

Visualize the data structures.
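
To picture what computing a line/column pair for an input position involves, here is a minimal standard-library sketch. It is not lexy's implementation or API: `toy_location` and `toy_locate` are hypothetical names, the position is assumed to be a plain offset into a `std::string_view`, and columns are counted in code units.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical result type; not a lexy type.
struct toy_location {
    std::size_t line;   // 1-based line number
    std::size_t column; // 1-based column, counted in code units
};

// Walk the input up to `offset`, counting newlines along the way.
// Offsets past the end are clamped to the end of the input.
inline toy_location toy_locate(std::string_view input, std::size_t offset) {
    if (offset > input.size())
        offset = input.size();

    toy_location loc{1, 1};
    for (std::size_t i = 0; i < offset; ++i) {
        if (input[i] == '\n') {
            ++loc.line;     // a newline starts the next line
            loc.column = 1; // and resets the column
        } else {
            ++loc.column;
        }
    }
    return loc;
}
```

With the input "ab\ncd", offset 4 (the character 'd') maps to line 2, column 2. The walk is linear in the offset, so a real implementation may want to cache line starts when many positions are looked up.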

Inputs and Encodings

template <typename T>
concept reader = …;

template <typename T>
concept input = requires(const T& obj) {
  { obj.reader() } -> reader;
};

An input defines the input that will be parsed by lexy. It has a corresponding encoding that controls, among other things, its character type and whether certain rules are available. The input itself is unchanging and it produces a reader which remembers the current position of the input during parsing.
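
The split between an unchanging input and a position-tracking reader can be sketched with standard-library types only. The classes below are hypothetical and far simpler than lexy's real inputs and readers (no encoding, no iterators). They only illustrate that the input hands out fresh readers and that each reader advances independently.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical reader: a cheap cursor over characters it does not own.
class toy_reader {
public:
    explicit toy_reader(std::string_view text) : text_(text) {}

    // Current character, or -1 once the end of the input is reached.
    int peek() const {
        return pos_ < text_.size() ? static_cast<unsigned char>(text_[pos_]) : -1;
    }

    // Advance by one character; does nothing at the end of the input.
    void bump() {
        if (pos_ < text_.size())
            ++pos_;
    }

    // Offset of the current character from the start of the input.
    std::size_t position() const { return pos_; }

private:
    std::string_view text_;
    std::size_t pos_ = 0;
};

// Hypothetical input: never modified, produces a new reader on demand.
class toy_string_input {
public:
    explicit toy_string_input(std::string_view text) : text_(text) {}

    toy_reader reader() const { return toy_reader(text_); }

private:
    std::string_view text_;
};
```

Because `reader()` is a `const` member that returns a new cursor each time, parsing the same input twice, or backtracking by keeping a copy of a reader, never disturbs the input itself.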

The supported encodings of lexy/encoding.hpp:
lexy::default_encoding

An unknown single byte encoding.

lexy::ascii_encoding

ASCII

lexy::utf8_encoding, lexy::utf16_encoding, lexy::utf32_encoding

UTF-8, UTF-16, UTF-32

lexy::byte_encoding

Bytes, not text.

The pre-defined inputs:
lexy/range_input.hpp

Use a generic iterator range as input.

lexy/string_input.hpp

Use a string as input.

lexy/buffer.hpp

Create a buffer that contains the input.

lexy/file.hpp

Use a file as input.

lexy/argv_input.hpp

Use the command-line arguments as input.

The grammar DSL

template <typename T>
concept production = requires {
      { T::rule } -> rule;
      { T::whitespace } -> rule;        // optional
      { T::value } -> callback-or-sink; // optional
};

The grammar is specified by productions: classes with a ::rule member that specifies the rule. They can optionally have a ::whitespace member that controls automatic whitespace skipping. Both of those are specified in a C++ DSL. The action lexy::parse also needs a ::value member that controls the value produced by parsing the production. It is specified using callbacks, special function objects, and sinks.
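
The division of labor between the two members can be illustrated without the DSL. In the sketch below, `toy_number`, `toy_match` and `toy_parse` are hypothetical stand-ins: `rule` is an ordinary function rather than a DSL expression, and `value` is a plain function rather than a lexy callback. The point is only that a matching action needs `rule` alone, while a value-producing action needs both.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical production: `rule` says what to match, `value` what to produce.
struct toy_number {
    // Returns the length of the leading run of digits, or 0 if there is none.
    static std::size_t rule(std::string_view input) {
        std::size_t n = 0;
        while (n < input.size() && std::isdigit(static_cast<unsigned char>(input[n])))
            ++n;
        return n;
    }

    // Converts the matched digits into the production's value.
    static int value(std::string_view digits) {
        int result = 0;
        for (char c : digits)
            result = result * 10 + (c - '0');
        return result;
    }
};

// An action that only needs `rule`.
template <typename Production>
bool toy_match(std::string_view input) {
    return Production::rule(input) > 0;
}

// An action that needs both `rule` and `value`.
template <typename Production>
auto toy_parse(std::string_view input)
    -> std::optional<decltype(Production::value(input))> {
    const std::size_t n = Production::rule(input);
    if (n == 0)
        return std::nullopt;
    return Production::value(input.substr(0, n));
}
```

A production without a `value` member would still compile with `toy_match` but not with `toy_parse`, which mirrors why the member is optional in general yet required by the value-producing action.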

lexy/grammar.hpp

Type traits for defining and interacting with the grammar.

lexy/dsl.hpp

The DSL for specifying parse rules.

lexy/callback.hpp

The predefined callbacks and sinks for specifying the value.

Tip
It is recommended to put all the productions into a separate namespace.

Actions

Actions take a grammar and an input, and do something with the grammar on that input.

They also optionally take a parse state argument. If provided, it will be forwarded as the state parameter to the callbacks, and to rules such as lexy::dsl::scan. It can be used to pass additional metadata, allocators, etc. to the grammar.
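
How a state argument reaches a callback can be sketched as follows. Everything here is hypothetical and uses only the standard library: `toy_state`, `toy_variable` and `toy_parse_with_state` are illustration names, and handing the state over as the callback's first parameter is an assumption of this sketch, not a statement about lexy's exact calling convention.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <string_view>

// Hypothetical state: data the grammar needs but the input does not contain.
struct toy_state {
    std::map<std::string, int> variables;
};

// Hypothetical production whose value depends on the state.
struct toy_variable {
    // Returns the length of the leading run of lowercase ASCII letters.
    static std::size_t rule(std::string_view input) {
        std::size_t n = 0;
        while (n < input.size() && input[n] >= 'a' && input[n] <= 'z')
            ++n;
        return n;
    }

    // Looks the matched name up in the state; unknown names yield 0.
    static int value(const toy_state& state, std::string_view name) {
        const auto it = state.variables.find(std::string(name));
        return it == state.variables.end() ? 0 : it->second;
    }
};

// Hypothetical action: forwards the caller's state to the callback unchanged.
template <typename Production, typename State>
int toy_parse_with_state(std::string_view input, const State& state) {
    const std::size_t n = Production::rule(input);
    return Production::value(state, input.substr(0, n));
}
```

Given a state that maps "x" to 7, parsing "x+1" with `toy_variable` yields 7. The production itself stays stateless, so the same grammar can be reused with different symbol tables or allocators.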

lexy/action/match.hpp

Matches a grammar on an input and returns a true/false result.

lexy/action/validate.hpp

Validates that a grammar matches on an input, and returns the errors if it does not.

lexy/action/parse.hpp

Parses a grammar on an input and returns its value.

lexy/action/parse_as_tree.hpp

Parses a grammar on an input and returns the parse tree.

lexy/action/scan.hpp

Parses a grammar manually by dispatching to other rules.

lexy/action/trace.hpp

Traces parse events to visualize and debug the parsing process.