rasdaman newsletter 09/2019


SQL Now With Datacubes

The most recent enhancement to the SQL query language has been published by the ISO standardization body: Multi-Dimensional Arrays (MDA). MDA allows for powerful, efficient queries on massive "datacubes" such as those found in spatio-temporal satellite image time series, human brain imaging, astrophysical simulations, and business and financial data, to name but a few. SQL/MDA is seen as a game changer in Big Data Analytics.

For decades, SQL has been the lingua franca for the management of large and complex data. One reason for this sustained success is continued innovation, driven by the ISO SQL working group. "We continuously incorporate new ideas if we see that they are adding substantial value and are elaborated with sufficient rigour," explains Keith Hare, SQL working group convenor and a leading SQL expert.

When the rasdaman team approached SC32 WG3 in 2014 about adding declarative array support, WG3 members recognized the value proposition. "Array support closes a gap in large-scale data services; it is a groundbreaking enhancement to the SQL database language," underlines Hare. After its own investigation, ISO chose the rasdaman query language for its substantial advantages: it has formal semantics, it is powerful and proven in practice, and it fits seamlessly into the relational model. Over several years, SQL/MDA was developed in close collaboration between the rasdaman team and the SQL working group. Released in summer 2019, it allows for powerful, efficient queries on massive "datacubes" as they appear in science, engineering, business, and other fields. The full designation for this part of SQL is: ISO/IEC 9075-15:2019 Information technology -- Database languages -- SQL -- Part 15: Multi-dimensional arrays (SQL/MDA).

Historically, image metadata has been fast to access and relationally queryable, whereas the data itself (such as image pixels) has been slow to access, not queryable, and available only for download as format-encoded files. With SQL/MDA, a single query can address both data and metadata. "This integration overcomes the age-old distinction between data and metadata, leading to a new quality of data services. This is an exciting perspective," explains Baumann, who adds, "not least, providing analysis-ready data boosts Machine Learning to new scalability."