28msec - query data from any source in real time

Derrick Harris, writing about the still-in-stealth-mode 28msec and its generic query language:

Their solution was to create a platform able to extract data from any of these sources, transform it into a standard format, and then let users analyze it using a single query language that looks a lot like the SQL they already know. 28msec is based on the open source JSONiq and Zorba query languages and will be available as a cloud service.

This sounds like a variant of an ETL process: Extract-Transform-Query. But it got me thinking of what Daniel Abadi has written about the difference between Hadapt and the likes of PolyBase and HAWQ. Just replace Hadoop with another source of data and SQL with JSONiq:

[…] they all can access data in Hadoop, but there needs to be some sort of structured schema defined in order for the database to understand how to access it via SQL. So, bottom line, Polybase/SQL-H/Hawq let you dynamically get at data in Hadoop/HDFS that could theoretically have been stored in the DBMS all along, but for some reason is being stored in Hadoop instead of the DBMS.

The question is not whether this process will work (ETL processes have been around for quite a while), but what you can do to optimize this extract-transform-query process.
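To make the extract-transform-query idea concrete, here is a minimal Python sketch. It is not 28msec's platform and it does not use JSONiq or Zorba; the sample data, the field names, and the helper functions (`extract_csv`, `extract_json`, `transform`, `query`) are all assumptions made up for illustration. It only shows the shape of the pipeline: pull records from two differently structured sources, map them onto one common format, and run a single query over the result.

```python
import csv
import io
import json

# Hypothetical sample sources; neither comes from the article.
CSV_SOURCE = "name,city\nAda,London\nLinus,Helsinki\n"
JSON_SOURCE = '[{"name": "Grace", "location": {"city": "New York"}}]'


def extract_csv(text):
    """Extract: read CSV rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_json(text):
    """Extract: parse a JSON array of objects."""
    return json.loads(text)


def transform(record):
    """Transform: map either source shape onto one common format."""
    city = record.get("city") or record.get("location", {}).get("city")
    return {"name": record["name"], "city": city}


def query(records, predicate):
    """Query: one filter interface over the normalized records."""
    return [r for r in records if predicate(r)]


# Every record is extracted and transformed before the query runs.
records = [transform(r) for r in extract_csv(CSV_SOURCE) + extract_json(JSON_SOURCE)]
result = query(records, lambda r: r["city"].startswith("L"))
print(result)
```

In this naive version every record is extracted and transformed before the query sees it. Pushing the filter down toward the sources, so that less data has to be moved and converted, is one obvious place where the optimization question comes in.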

via: http://gigaom.com/2013/06/11/stealth-mode-28msec-wants-to-build-a-tower-of-babel-for-databases/

Ref:  http://nosql.mypopescu.com/post/52723058696/28msec-query-data-from-any-source-in-real-time  
