Header lexy/dsl/expression.hpp
Support for operator precedence parsing.
Some terminology:
- Operator
A
lexy::dsl::op
rule (or multiple composed withlexy::dsl::operator/ (operator)
).- Operand
The objects manipulated by an operator.
- Operation
An operator together with arity, associativity and operands. Its binding power (precedence) is determined implicitly by its relation to other operations in its operand.
- Chain
The same operation applied multiple times in direct sequence.
- Expression
Multiple chains of different operations, potentially nested.
Class lexy::expression_production
lexy/dsl/expression.hpp
namespace lexy
{
struct max_operator_nesting_exceeded {};
struct operator_chain_error {};
struct operator_group_error {};
struct expression_production
{
using operator_nesting_error = lexy::max_operator_nesting_exceeded;
using operator_chain_error = lexy::operator_chain_error;
using operator_group_error = lexy::operator_group_error;
static constexpr std::size_t max_operator_nesting;
static constexpr auto rule = see-below;
};
}
namespace lexy::dsl
{
struct atom {};
}
expression_production
is a base class of a production that wants to parse an expression.
struct operation : operation-base
{
static constexpr auto op = operator-rule;
using operand = operation-or-dsl::atom;
};
struct expression : lexy::expression_production
{
static constexpr auto atom = rule;
using operation = operation;
};
The derived class must have the same members as Expression
shown above.
Optionally, it can shadow each member of lexy::expression_production
indicated in the interface, except for rule
, to override them to custom values.
atom
specifies the rule that is used to parse an atomic operand, i.e. one that isn’t another operation in turn.
operation
specifies the top-level operation, i.e. the one with the lowest binding power.
It is a type with the same members as Operation
shown above:
the operator rule (lexy::dsl::op
or lexy::dsl::operator/ (operator)
) and the operand,
which is either another Operation
, lexy::dsl::groups
, or dsl::atom
, which indicates that the operand is the rule described in the atom
member.
The behavior of the generated rule
of the production is as follows:
- Parsing
Parses an arbitrary operation chain consisting of the operators mentioned in the expression (by starting with the root
operation
and following theoperands
). The rules for operation parsing are described in the individual base classes (lexy::dsl::infix_op_left
,lexy::dsl::infix_op_right
,lexy::dsl::infix_op_single
,lexy::dsl::infix_op_list
,lexy::dsl::postfix_op
,lexy::dsl::prefix_op
). If the input does not contain any operator, parses a singleatom
. If there is a trailing operator, it will attempt to parse a following operand greedily.- Errors
All errors raised by parsing
atom
or an operator (only possible if the operator is a branch rule).Expression::operator_nesting_error
: if the nesting level of operators exceedsExpression::max_operator_nesting_exceeded
, at the location of the operator that first exceeds it. The rule then fails.Expression::operator_chain_error
: seelexy::dsl::infix_op_single
.Expression::operator_group_error
: seelexy::dsl::groups
.
- Values
As the rule is the only rule of the entire production, it does not lazily produce values like the other rules, but passes them directly to the callback of the production.
For each instance of the atomic rule, all values it produces are passed to the callback.
For an operation, the value of the operands and of the operator are passed to the callback. The value of an operand is always the result of another callback invocation.
- Parse tree
An operation creates a production node labeled with the type of the operation, its children are the operands and operator nodes.
The atomic operand creates its nodes as it normally would.
struct production : lexy::expression_production
{
static constexpr auto whitespace = dsl::ascii::space;
static constexpr auto atom = dsl::integer<int>;
struct prefix : dsl::prefix_op
{
static constexpr auto op = dsl::op(dsl::lit_c<'-'>);
using operand = dsl::atom;
};
struct product : dsl::infix_op_left
{
static constexpr auto op = dsl::op(dsl::lit_c<'*'>) / dsl::op(dsl::lit_c<'/'>);
using operand = prefix;
};
struct sum : dsl::infix_op_left
{
static constexpr auto op = dsl::op(dsl::lit_c<'+'>) / dsl::op(dsl::lit_c<'-'>);
using operand = product;
};
using operation = sum;
};
constexpr auto op_plus = dsl::op(dsl::lit_c<'+'>);
constexpr auto op_minus = dsl::op(dsl::lit_c<'-'>);
struct production : lexy::expression_production
{
static constexpr auto atom = dsl::integer<int>;
struct operation : dsl::infix_op_left
{
static constexpr auto op = op_plus / op_minus;
using operand = dsl::atom;
};
static constexpr auto value
= lexy::callback<int>([](int value) { return value; },
[](int lhs, lexy::op<op_plus>, int rhs) { return lhs + rhs; },
[](int lhs, lexy::op<op_minus>, int rhs) { return lhs - rhs; });
};
Tip | See calculator.cpp
for a bigger example. |
Caution | If two operators at different binding powers share a common prefix (e.g. and
* or = and == ),
it might be necessary to use lexy::dsl::not_followed_by . |
Class lexy::subexpression_production
lexy/dsl/expression.hpp
namespace lexy::dsl
{
template <expression Expr, operation RootOperation>
struct subexpression_production { … };
}
subexpression_production
is a base class of a production that wants to parse a subexpression.
It will parse the same expression as Expr
, but instead of starting with Expr::operation
, it starts with RootOperation
, which must be an operation of the expression.
All operators with a binding power lower than RootOperation
are not recognized.
Operation lexy::dsl::groups
lexy/dsl/expression.hpp
namespace lexy::dsl
{
template <operation... Operands>
struct groups
{};
}
groups
is a special operation that selects one of the specified operations as operand.
- Parsing
When attempting to parse an operand in the current operation, it will parse one of the specified
Operands
(a "group"), which can be other operations with their own operators. Parsing fails, if operators from distinct groups are mixed.- Errors
All errors raised by regular expression parsing.
Expression::operator_group_error
: if an operator from group B was parsed after an operator from group 1, at the position of operator B.
struct production : lexy::expression_production
{
static constexpr auto whitespace = dsl::ascii::space;
static constexpr auto atom = dsl::integer<int>;
struct math : dsl::infix_op_left
{
static constexpr auto op = dsl::op(dsl::lit_c<'+'>) / dsl::op(dsl::lit_c<'-'>);
using operand = dsl::atom;
};
struct bits : dsl::infix_op_left
{
static constexpr auto op = dsl::op(dsl::lit_c<'|'>) / dsl::op(dsl::lit_c<'&'>);
using operand = dsl::atom;
};
// Either a math or a bit operator, but not mixing.
using operation = dsl::groups<math, bits>;
};
Note | groups can be used in the top-level operation of an expression as well. |
Operation base lexy::dsl::infix_op_left
lexy/dsl/expression.hpp
namespace lexy::dsl
{
struct infix_op_left {};
}
infix_op_left
is an operation base that specifies a left-associative infix operator.
- Parsing
It will parse the chain
operand op operand
.a op b op c
is treated as(a op b) op c
.- Errors
All errors raised by parsing the operand or operator.
- Values
It will invoke the callback with the value of the left operand, followed by the values of the operator, followed by the value of the right operand. In
a op b op c
, it will invokecallback(callback(a, op, b), op, c)
.- Parse tree
A production node labeled with the type of the operation. Its children are all nodes created from the left operand, followed by the nodes for the operator, followed by the nodes from the right operand.
Operation base lexy::dsl::infix_op_right
lexy/dsl/expression.hpp
namespace lexy::dsl
{
struct infix_op_right {};
}
infix_op_right
is an operation base that specifies a right-associative infix operator.
- Parsing
It will parse the chain
operand op operand
.a op b op c
is treated asa op (b op c)
.- Errors
All errors raised by parsing the operand or operator.
- Values
It will invoke the callback with the value of the left operand, followed by the values of the operator, followed by the value of the right operand. In
a op b op c
, it will invokecallback(a, op, callback(b, op, c))
.- Parse tree
A production node labeled with the type of the operation. Its children are all nodes created from the left operand, followed by the nodes for the operator, followed by the nodes from the right operand.
Operation base lexy::dsl::infix_op_single
lexy/dsl/expression.hpp
namespace lexy::dsl
{
struct infix_op_single {};
}
infix_op_single
is an operation base that specifies a non-associative infix operator.
- Parsing
It will parse the chain
operand op operand
.a op b op c
is an error.- Errors
All errors raised by parsing the operand or operator.
Expression::operator_chain_error
: if the operator occurs multiple times in the chain, at the second location. It then recovers, treating it as a left-associative operator.
- Values
It will invoke the callback with the value of the left operand, followed by the values of the operator, followed by the value of the right operand.
- Parse tree
A production node labeled with the type of the operation. Its children are all nodes created from the left operand, followed by the nodes for the operator, followed by the nodes from the right operand.
Operation base lexy::dsl::infix_op_list
lexy/dsl/expression.hpp
namespace lexy::dsl
{
struct infix_op_list {};
}
infix_op_list
is an operation base that specifies an associative infix operator.
- Requires
The callback of the expression production is a sink.
- Parsing
It will parse the chain
operand op operand
.a op b op c
is treated asa op b op c
, i.e. no implicit grouping occurs.- Errors
All errors raised by parsing the operand or operator.
- Values
It will use the callback as a sink. The sink is passed the value of the initial operand, then it is passed all values of the first operator, the value of the next operand, all values of the second operator, and so on. In
a op b op c
, it will invokesink(a), sink(op), sink(b), sink(op), sink(c)
.- Parse tree
A production node labeled with the type of the operation. Its children are all nodes created from the initial operand, followed by the nodes for the first operator, followed by the nodes from the next operand, followed by all nodes from the second operator, and so on.
Tip | This can be used to implement chained comparisons, as seen in calculator.cpp
. |
Operation base lexy::dsl::postfix_op
lexy/dsl/expression.hpp
namespace lexy::dsl
{
struct postfix_op {};
}
postfix_op
is an operation base that specifies a postfix operator.
- Parsing
It will parse the chain
operand op
.a op op c
is treated as(a op) op
.- Errors
All errors raised by parsing the operand or operator.
- Values
It will invoke the callback with the value of the operand, followed by the values of the operator. In
a op op
, it will invokecallback(callback(a, op), op)
.- Parse tree
A production node labeled with the type of the operation. Its children are all nodes created from the operand, followed by the nodes for the operator.
Operation base lexy::dsl::prefix_op
lexy/dsl/expression.hpp
namespace lexy::dsl
{
struct prefix_op {};
}
prefix_op
is an operation base that specifies a prefix operator.
- Parsing
It will parse the chain
op operand
.op op a
is treated asop (op a)
.- Errors
All errors raised by parsing the operand or operator.
- Values
It will invoke the callback with the value of the operator, followed by the value of the operand. In
op op a
, it will invokecallback(op, callback(op, a))
.- Parse tree
A production node labeled with the type of the operation. Its children are the nodes for the operator, followed by the nodes from the operand.