JSON Validation Benchmark

This benchmark measures the time it takes to validate JSON, i.e. to check whether it is well-formed. Validation was chosen over parsing because parsing speed also depends on the data structure used to store the parsed JSON. Implementing an efficient JSON container is out of scope for lexy, so a parsing benchmark would put it at a disadvantage compared to the specialized JSON libraries.

The average validation times for each input are shown in the boxplots below. Lower values are better.

The implementations

baseline: This simply sums all input characters of the JSON document without performing any actual validation.
lexy: A JSON validator using the lexy grammar from the example. It uses the regular lexy::buffer as input, enabling SWAR and other optimizations. For maximum performance, this is the recommended input.
lexy (no SWAR): Same as above, but it uses a special input where SWAR optimization has been manually disabled.
lexy (no buffer): Same as above, but it uses lexy::string_input as the input. This is a stand-in for a generic non-buffer input where no input-specific optimizations are possible.
pegtl: A JSON validator using the PEGTL JSON grammar.
nlohmann/json: A JSON validator using JSON for Modern C++, implemented with nlohmann::json::accept().
rapidjson: A JSON validator using rapidjson, implemented as a SAX parser with rapidjson::BaseReaderHandler.
Boost.JSON: A JSON validator using Boost.JSON, implemented with a custom parse handler.
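
To illustrate what a SWAR ("SIMD within a register") optimization looks like and why it benefits from a contiguous buffer, here is a minimal sketch. It is not lexy's actual implementation; the function name all_ascii_swar is hypothetical, and the check it performs (whether all bytes are 7-bit ASCII) is just one simple example of testing eight characters with a single 64-bit operation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>

// Hypothetical helper, not part of lexy: returns true if every byte of
// `input` is 7-bit ASCII, i.e. no byte has its high bit set.
inline bool all_ascii_swar(std::string_view input)
{
    // One 0x80 mask per byte lane of the 64-bit word.
    constexpr std::uint64_t high_bits = 0x8080808080808080ull;

    std::size_t i = 0;
    // Main loop: load 8 bytes into one word and test all 8 high bits at once.
    // This requires contiguous memory, which a buffer input can guarantee.
    for (; i + 8 <= input.size(); i += 8)
    {
        std::uint64_t word;
        std::memcpy(&word, input.data() + i, sizeof(word));
        if ((word & high_bits) != 0)
            return false;
    }

    // Tail: fewer than 8 bytes remain, so fall back to a byte-wise check.
    for (; i < input.size(); ++i)
    {
        if ((static_cast<unsigned char>(input[i]) & 0x80u) != 0)
            return false;
    }
    return true;
}
```

A generic input that only exposes one character at a time cannot perform the 8-byte load in the main loop, so it is limited to the byte-wise path. That is the kind of difference the "no SWAR" and "no buffer" variants are meant to expose.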

The inputs

canada.json: Contains lots of 2-element arrays holding floating-point coordinate pairs. Taken from https://github.com/miloyip/nativejson-benchmark.
citm_catalog.json: Big JSON file with some variety. Taken from https://github.com/miloyip/nativejson-benchmark.
twitter.json: Some data from Twitter’s API. Taken from https://github.com/miloyip/nativejson-benchmark.

The Methodology

The input data is read using lexy::read_file(). The resulting buffer is then passed to the various implementations using their memory inputs. Benchmarking is done by nanobench on a 2020 Mac Mini with an M1 processor.
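
The general shape of such a measurement can be sketched as follows. This is only an illustration under stated assumptions: it uses std::chrono as a simplified stand-in for nanobench, the helper name average_ns is hypothetical, and it omits the warm-up and statistical analysis that a real benchmarking library performs.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper, not nanobench: runs `fn` `iterations` times and
// returns the average wall-clock time per call in nanoseconds.
// `fn` must return a value (e.g. the validation result), and
// `iterations` must be greater than zero.
template <typename Fn>
double average_ns(Fn&& fn, std::size_t iterations)
{
    using clock = std::chrono::steady_clock;

    auto const start = clock::now();
    for (std::size_t i = 0; i != iterations; ++i)
    {
        // Store the result in a volatile object so the compiler cannot
        // discard the call as unused work.
        volatile auto sink = fn();
        (void)sink;
    }
    auto const stop = clock::now();

    std::chrono::duration<double, std::nano> const elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

The input is loaded into memory once, outside the timed region, and each implementation is then invoked through a callable such as a lambda that validates the in-memory buffer and returns whether it was accepted. That way file I/O does not contribute to the measured times.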