Header lexy/input_location.hpp

Converts a position (iterator) into a human readable location (line/column number).

Class lexy::input_location

lexy/input_location.hpp
namespace lexy
{
    template <input Input, typename Counting = see-below>
    class input_location
    {
    public:
        // Location of the beginning.
        constexpr explicit input_location(const Input& input);

        constexpr input_location_anchor<Input> anchor() const;

        constexpr unsigned line_nr() const;
        constexpr unsigned column_nr() const;

        constexpr auto position() const
          -> lexy::input_reader<Input>::iterator;

        // operator==, operator!=, operator<, operator<=, operator>=, operator>
    };
}

A human readable location in the input.

It combines the iterator-based position with line/column numbers, according to some Counting strategy. The Counting strategy defaults to lexy::code_unit_location_counting for text and lexy::byte_location_counting for bytes.

The constructor creates the input location for the beginning of the input. anchor() returns the latest anchor before that location; pass it to lexy::get_input_location to start the search from there instead of from the beginning.

Note
Use lexy::get_input_location to create location objects.

Function lexy::get_input_location

lexy/input_location.hpp
namespace lexy
{
    template <input Input>
    class input_location_anchor
    {
    public:
        // Anchor to the beginning.
        constexpr explicit input_location_anchor(const Input& input);
    };

    template <typename Counting = see-below>
    constexpr auto get_input_location(const input auto& input,
                                lexy::input_reader<Input>::iterator position,
                                input_location_anchor<Input> anchor = see-below)
        -> input_location<Input, Counting>;
}

Gets the lexy::input_location for a position.

This is a linear search over the input that starts at the anchor. If none is provided, it starts at the beginning. As an optimization, pass an anchor of an existing location that is before position.

The result is the location for the position as determined by the Counting strategy. The Counting strategy defaults to lexy::code_unit_location_counting for text and lexy::byte_location_counting for bytes. Calling position() on the resulting location returns the beginning of the last column that starts at or before position.

Note
If position points inside a \r\n sequence, for example, the resulting position() will point to the initial \r.
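The search and the rounding behavior can be illustrated with a standalone sketch. It does not use lexy and is not lexy's implementation: location, anchor, newline_length, find_location and anchor_of are hypothetical names, positions are plain offsets instead of iterators, columns are counted per code unit, and only "\n" and "\r\n" are assumed to be newlines.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-ins for illustration; these are not lexy types.
struct location
{
    std::size_t position;  // offset of the first code unit of the column
    unsigned    line_nr;   // 1-based
    unsigned    column_nr; // 1-based
};

struct anchor
{
    std::size_t line_begin = 0; // offset at which some line starts
    unsigned    line_nr    = 1; // number of that line
};

// Length of the newline starting at offset i, or 0 if there is none.
// Assumption: "\n" and "\r\n" are the only newlines.
inline std::size_t newline_length(std::string_view text, std::size_t i)
{
    if (text.substr(i, 2) == "\r\n")
        return 2;
    return i < text.size() && text[i] == '\n' ? 1 : 0;
}

// Linear search from the anchor (default: beginning of the input) to target,
// counting one column per code unit.
inline location find_location(std::string_view text, std::size_t target, anchor from = {})
{
    assert(from.line_begin <= target && target <= text.size());

    location cur{from.line_begin, from.line_nr, 1};
    while (cur.position < target)
    {
        std::size_t nl   = newline_length(text, cur.position);
        std::size_t step = nl == 0 ? 1 : nl;
        if (cur.position + step > target)
            break; // target is inside this column: stay at its beginning

        if (nl != 0)
            cur = {cur.position + step, cur.line_nr + 1, 1};
        else
            cur = {cur.position + step, cur.line_nr, cur.column_nr + 1};
    }
    return cur;
}

// With code unit counting, the line of a location starts column_nr - 1
// code units before it.
inline anchor anchor_of(const location& loc)
{
    return {loc.position - (loc.column_nr - 1), loc.line_nr};
}
```

In this sketch, looking up a second position on the same or a later line with anchor_of(previous) skips all earlier lines, which is the optimization described above.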

Counting strategies lexy::code_unit_location_counting, lexy::code_point_location_counting, lexy::byte_location_counting

lexy/input_location.hpp
namespace lexy
{
    struct code_unit_location_counting {};
    struct code_point_location_counting {};

    template <std::size_t LineWidth = 16>
    struct byte_location_counting {};
}

Strategies for counting column and line numbers.

code_unit_location_counting

Increments the column for every code unit; increments the line for every lexy::dsl::newline. For example, UTF-8 encoded "abä" is a single line with four columns: one each for a and b, and two for the two code units of ä. This is the default for text.

code_point_location_counting

Increments the column for every lexy::dsl::code_point; increments the line for every lexy::dsl::newline. For example, UTF-8 encoded "abä" is a single line with three columns.

byte_location_counting

Increments the column for every byte (requires lexy::byte_encoding); increments the line for every LineWidth bytes. This is the default for byte input.

See my blog post for an in-depth discussion about the choice of column units.
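The difference between the three strategies can be shown with a small standalone sketch. It does not use lexy: columns_by_code_unit, columns_by_code_point, byte_location and location_of_byte are hypothetical helpers, the text is assumed to be well-formed UTF-8 without newlines, and line and column numbers are assumed to start at 1.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// One column per code unit: every char of the UTF-8 string counts.
inline std::size_t columns_by_code_unit(std::string_view utf8)
{
    return utf8.size();
}

// One column per code point: count every byte that is not a UTF-8
// continuation byte (bit pattern 10xxxxxx).
// Assumption: the input is well-formed UTF-8.
inline std::size_t columns_by_code_point(std::string_view utf8)
{
    std::size_t count = 0;
    for (unsigned char c : utf8)
        if ((c & 0xC0) != 0x80)
            ++count;
    return count;
}

// Hypothetical result type for byte input.
struct byte_location
{
    unsigned line_nr;   // 1-based
    unsigned column_nr; // 1-based
};

// Byte counting: a new line begins every LineWidth bytes, as in a hex dump.
template <std::size_t LineWidth = 16>
constexpr byte_location location_of_byte(std::size_t offset)
{
    static_assert(LineWidth > 0);
    return {static_cast<unsigned>(offset / LineWidth) + 1,
            static_cast<unsigned>(offset % LineWidth) + 1};
}
```

For "ab\xC3\xA4" (the UTF-8 encoding of "abä"), the first helper yields four columns and the second yields three, matching the examples above. With the default LineWidth of 16, the byte at offset 17 lands on line 2, column 2.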

Function lexy::get_input_line_annotation

lexy/input_location.hpp
namespace lexy
{
    template <input Input>
    struct input_line_annotation
    {
        lexy::lexeme_for<Input> before;
        lexy::lexeme_for<Input> annotated;
        lexy::lexeme_for<Input> after;

        bool truncated_multiline;
        bool rounded_end;
    };

    template <input Input, typename Counting>
    constexpr auto get_input_line_annotation(const Input& input,
                                 const input_location<Input, Counting>& begin_location,
                                 lexy::input_reader<Input>::iterator end)
      -> input_line_annotation<Input>;

    template <typename Input, typename Counting>
    constexpr auto get_input_line_annotation(const Input& input,
                                 const input_location<Input, Counting>& location,
                                 std::size_t                            size)
      -> input_line_annotation<Input>
    {
        auto end = std::next(location.position(), size);
        return get_input_line_annotation(input, location, end);
    }
}

Computes the part of the input referenced by the range [begin_location.position(), end), together with the surrounding input on the same line.

The result is an object of type input_line_annotation with the following values:

before

A lexy::lexeme for the range [line_begin, begin_location.position()), where line_begin is the beginning of the line of begin_location, as determined by Counting.

annotated

A lexy::lexeme for the range [begin_location.position(), modified_end). If begin_location.position() == end, modified_end is an incremented end:

  • If end points to the beginning or inside of the newline, as determined by Counting, it is set to the end of the newline.

  • Otherwise, it is set to the end of the current code point.

If end is on a different line than begin_location, modified_end is the end of the newline, as determined by Counting. Otherwise, modified_end is the end of the code point or multi-character newline that end points into.

after

A lexy::lexeme for the range [modified_end, line_end), where line_end is either the position of the newline or the end of the newline, whichever ensures modified_end <= line_end, as determined by Counting.

truncated_multiline

true if end was on a different line than begin_location, false otherwise.

rounded_end

true if end pointed inside a code point and had to be adjusted, false otherwise.
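The rules for before, annotated, and after can be sketched without lexy as follows. This is only an illustration under simplifying assumptions, not lexy's implementation: line_annotation and annotate_line are hypothetical names, positions are offsets into a std::string_view, the text is ASCII (so rounded_end never applies and is omitted), and '\n' is the only newline.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for the result type, using string_views
// instead of lexemes.
struct line_annotation
{
    std::string_view before, annotated, after;
    bool             truncated_multiline;
};

// Assumptions: ASCII text, '\n' is the only newline.
inline line_annotation annotate_line(std::string_view text, std::size_t begin, std::size_t end)
{
    assert(begin <= end && end <= text.size());

    // Beginning of the line that contains begin.
    std::size_t prev_nl
        = begin == 0 ? std::string_view::npos : text.rfind('\n', begin - 1);
    std::size_t line_begin = prev_nl == std::string_view::npos ? 0 : prev_nl + 1;

    // Position of the newline that ends that line, or the end of the input.
    std::size_t newline = std::min(text.find('\n', begin), text.size());

    // end is on a later line if it lies past the start of that newline.
    bool multiline = end > newline;

    std::size_t modified_end = end;
    if (multiline)
        modified_end = newline + 1; // truncate at the end of the newline
    else if (begin == end && end < text.size())
        modified_end = end + 1; // empty range: cover one character or the newline

    // Either the position of the newline or its end, so that
    // modified_end <= line_end.
    std::size_t line_end = std::max(newline, modified_end);

    return {text.substr(line_begin, begin - line_begin),
            text.substr(begin, modified_end - begin),
            text.substr(modified_end, line_end - modified_end),
            multiline};
}
```

One choice made only for this sketch: an empty range at the very end of the input stays empty, because there is no character left to extend it with.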

Tip
Use this function for error message generation. Use lexy::visualize to print before, annotated, after; and lexy::visualization_display_width to compute the indent below before and the number of underline characters for annotated.
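The layout described in the tip can be sketched as follows. This standalone example does not call lexy: render_annotation is a hypothetical helper that takes the three parts as plain std::string_view values, and it assumes ASCII text without tabs, so the number of chars stands in for the display width that lexy::visualization_display_width would compute.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Prints the source line, then an underline below the annotated part.
// Assumption: ASCII without tabs, so one char occupies one terminal column.
inline std::string render_annotation(std::string_view before,
                                     std::string_view annotated,
                                     std::string_view after)
{
    // Do not print a trailing newline that is part of the annotated range.
    std::string_view shown = annotated;
    if (!shown.empty() && shown.back() == '\n')
        shown.remove_suffix(1);

    std::string out;
    out += before;
    out += shown;
    out += after;
    out += '\n';

    // Indent by the width of before, then underline the annotated part;
    // always emit at least one marker so an annotated newline stays visible.
    out.append(before.size(), ' ');
    out.append(shown.empty() ? 1 : shown.size(), '^');
    out += '\n';
    return out;
}
```

For before = "int x = ", annotated = ";" and an empty after, this produces the line "int x = ;" followed by eight spaces and a single ^.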

See also