On average, less than half of an organization’s structured data is actively used in making decisions.
Storing data is easier than ever, and many companies want to keep as much of it as possible so they can make more “data-driven” business decisions. The problem is that, in practice, more and more data gets stored but it is rarely accessed.
One of the problems is that the data is often not stored in a single huge database, but scattered across multiple data stores with different schemas, and sometimes kept as files in formats such as JSON, CSV, or Parquet, often with no enforced schema. As a result, it is not easy to retrieve all this information from the different storage systems, and it is even harder to combine it.
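To make the difficulty concrete, here is a minimal Python sketch of what combining two sources by hand can look like. The data, field names, and the `total_spent_by_customer` helper are all made up for illustration; the point is only that even a simple join across a JSON source and a CSV source needs custom glue code.

```python
import csv
import io
import json

# Hypothetical sample data: orders come from a JSON document store,
# customers come from a CSV export. Both are inlined here so the
# example runs on its own.
orders_json = (
    '[{"order_id": 1, "customer_id": "c1", "total": 30.0},'
    ' {"order_id": 2, "customer_id": "c2", "total": 12.5},'
    ' {"order_id": 3, "customer_id": "c1", "total": 7.5}]'
)
customers_csv = "customer_id,name\nc1,Alice\nc2,Bob\n"


def total_spent_by_customer(orders_text, customers_text):
    """Join orders (JSON) with customers (CSV) and sum totals per customer name."""
    orders = json.loads(orders_text)
    # Build a lookup table from customer_id to name out of the CSV rows.
    reader = csv.DictReader(io.StringIO(customers_text))
    names = {row["customer_id"]: row["name"] for row in reader}

    totals = {}
    for order in orders:
        name = names.get(order["customer_id"], "unknown")
        totals[name] = totals.get(name, 0.0) + order["total"]
    return totals


result = total_spent_by_customer(orders_json, customers_csv)
print(result)
```

Every new source or format means more code of this kind, which is the gap a common query language tries to close.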
AWS has now come up with a (potential) solution to the problem:
PartiQL, a SQL-compatible query language that makes it easy to efficiently query data, regardless of where or in what format it is stored.
This is a very hard problem, but AWS has invested a lot of work in it: PartiQL is already supported by multiple data stores in AWS, and even external parties like Couchbase are working on supporting it in their database.
How does it work? Essentially, PartiQL is an abstraction on top of different query languages: the query you write in PartiQL is parsed and compiled into the query language of the data store you want to query.
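The idea of one query being compiled for different backends can be illustrated with a toy Python sketch. This is not how PartiQL itself is implemented, and it does not parse PartiQL text. The `Query` class and the two `to_*` functions are hypothetical names, and the "document store" output is only loosely modeled on a MongoDB-style filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Hypothetical, already-parsed query:
    SELECT <fields> FROM <source> WHERE <where_field> = <where_value>."""
    source: str
    fields: tuple
    where_field: str
    where_value: object


def to_sql(q):
    """Compile the abstract query into a parameterized SQL string plus its parameters."""
    columns = ", ".join(q.fields)
    # Use a placeholder rather than inlining the value into the SQL text.
    text = f"SELECT {columns} FROM {q.source} WHERE {q.where_field} = ?"
    return text, (q.where_value,)


def to_document_filter(q):
    """Compile the same query into a filter/projection request for a document store."""
    return {
        "collection": q.source,
        "filter": {q.where_field: q.where_value},
        "projection": {field: 1 for field in q.fields},
    }


query = Query("orders", ("order_id", "total"), "customer_id", "c1")
sql_text, sql_params = to_sql(query)
doc_request = to_document_filter(query)

print(sql_text, sql_params)
print(doc_request)
```

The caller writes the query once, and each backend gets it in the form it understands. A real implementation also has to handle nested data, joins, and type differences between stores, which is where most of the difficulty lies.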
The reference implementation is open source and requires the JVM. You can easily install it and try it locally.