
Project: SIMD-based JSON data processing in a dynamic language VM

Description

The JSON data format is one of the most popular human-readable data formats, and is widely used in web and data-intensive applications. Unfortunately, reading (i.e., parsing) and processing JSON data is often a performance bottleneck because JSON is inherently textual. Recent research has tried to improve JSON parsing performance by means of JIT compilation [1] or parallel data processing [2].

In this project, we want to push the boundary of JSON data processing performance on a dynamic language runtime such as the GraalVM JavaScript engine [3]. To this end, we want to explore how modern Single-Instruction Multiple-Data (SIMD) hardware capabilities (such as Intel AVX) can be used to speed up JSON data processing. While existing research on SIMD-based JSON processing has mostly focused on building auxiliary indexes that speed up and simplify data access [2], the goal of this project is to explore alternative approaches that could enable better interaction with the internal just-in-time (JIT) compiler of the language VM executing the application.
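To give a rough idea of what an auxiliary index looks like, the sketch below records the byte offsets of JSON structural characters by comparing eight input bytes at a time. It is only an illustration under several assumptions: it is written in Python and emulates SIMD lanes inside a 64-bit integer (a technique usually called SWAR) rather than using real AVX instructions, the names `match_byte` and `structural_index` are hypothetical, and it ignores string literals, so structural characters inside quoted strings would also be reported. It is not the approach of [2] nor the approach this project is expected to take.

```python
# Minimal SWAR sketch: each 64-bit word is treated as 8 byte-wide lanes.
ONES = 0x0101010101010101    # 0x01 in every lane, used to broadcast a byte
LOW7 = 0x7F7F7F7F7F7F7F7F    # low 7 bits of every lane
MASK64 = 0xFFFFFFFFFFFFFFFF  # Python ints are unbounded, so clamp to 64 bits
STRUCTURAL = b"{}[]:,"       # JSON structural characters


def match_byte(word, byte):
    """Return a word with 0x80 set in every lane of `word` equal to `byte`."""
    y = word ^ (ONES * byte)   # lanes equal to `byte` become zero
    t = (y & LOW7) + LOW7      # high bit set where the low 7 bits are non-zero
    return ~(t | y | LOW7) & MASK64  # high bit set only in all-zero lanes


def structural_index(data):
    """Return the offsets of all structural characters in `data` (bytes)."""
    positions = []
    for base in range(0, len(data), 8):
        # Pad the last chunk with NUL bytes, which are never structural.
        chunk = data[base:base + 8].ljust(8, b"\0")
        word = int.from_bytes(chunk, "little")  # byte i lands in lane i
        hits = 0
        for ch in STRUCTURAL:
            hits |= match_byte(word, ch)
        while hits:
            low = hits & -hits  # isolate the lowest set bit
            positions.append(base + (low.bit_length() - 1) // 8)
            hits ^= low
    return positions
```

Calling `structural_index(b'{"a":[1,2]}')` yields the offsets of the braces, brackets, colon and comma, which a parser could then use to jump directly between values instead of scanning every byte. On real hardware, the inner loop over `STRUCTURAL` would typically be replaced by a handful of vector compare instructions over 32 or 64 bytes at once.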

[1] http://www.vldb.org/pvldb/vol10/p1778-bonetta.pdf

[2] https://arxiv.org/abs/1902.08318

[3] https://www.graalvm.org and https://github.com/oracle/graaljs 

Details
Supervisor
Daniele Bonetta