What is Big Data SQL?
Oracle Big Data SQL runs on the Big Data Appliance and lets an Oracle database use a single SQL query to pull data from disparate sources such as Hadoop, NoSQL and relational databases.
Leverage Existing Skills, Better Use of Skilled Resources
Big Data SQL lets developers, BI users and power users apply their existing SQL skills and experience rather than learn multiple new, not-yet-established languages and tools to extract, transform and load data between the various data stores of a Big Data environment.
Developers and users no longer have to write customized ETL processes to bring data from Hadoop or NoSQL into the database in a “usable” format, so they can spend their time processing and analyzing the data. This enables enterprises to implement Big Data much faster and to focus on high-value business analysis rather than on maintaining an additional processing layer.
Understanding the Data
Big Data SQL works by deploying SQL engines to the data sources and extracting the data using a uniform SQL dialect (Oracle’s). Inside the database, improvements in how external tables are integrated allow the database to gather and use expanded metadata and to treat these external tables as part of the database.
This extra, uniform metadata, combined with metadata from the Hadoop cluster, allows the database optimizer to plan queries using the location, shape and parallelism of the data and to treat Big Data as though it were data within the database itself.
Very Fast Access to Big Data
Big Data SQL leverages technologies proven in Exadata to allow for fast access to these disparate data sources. Columnar technology allows for faster analytics while storage cell offloading uses SmartScan on Hadoop and storage indexes to filter data at the storage layer.
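To make the storage index idea concrete, here is a minimal sketch in Python. It is an illustration of the general min/max technique only, not Oracle's implementation: the `Block` class, the `scan_with_index` function and the sample values are all hypothetical. Each block of storage keeps the smallest and largest value it holds, so a scan can skip any block whose range cannot satisfy the filter.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Block:
    """One hypothetical storage block holding values of a single column."""
    rows: List[int]
    min_value: int = field(init=False)
    max_value: int = field(init=False)

    def __post_init__(self) -> None:
        # The "storage index" here is just a min/max summary per block.
        self.min_value = min(self.rows)
        self.max_value = max(self.rows)


def scan_with_index(blocks: List[Block], low: int, high: int) -> Tuple[List[int], int]:
    """Return values with low <= value <= high, plus how many blocks were read."""
    matches: List[int] = []
    blocks_read = 0
    for block in blocks:
        # The summary proves no row in this block can match, so skip the read.
        if block.max_value < low or block.min_value > high:
            continue
        blocks_read += 1
        matches.extend(v for v in block.rows if low <= v <= high)
    return matches, blocks_read


blocks = [Block([1, 5, 9]), Block([20, 25, 30]), Block([8, 40, 12])]
matches, blocks_read = scan_with_index(blocks, 10, 26)
print(matches, blocks_read)  # [20, 25, 12] 2
```

Only two of the three blocks are read in this example. The first block is skipped on the strength of its summary alone, which is the kind of saving a storage index aims for.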
Instead of very large data sets being pulled from the storage layer by the database layer for query processing, Big Data SQL queries the data on the Hadoop storage nodes and brings only the result set back to the database layer, creating huge network bandwidth savings and significant performance gains: less data movement = faster query speed.
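The difference in data movement can be sketched with a toy model in Python. This is an assumption-laden illustration, not how Big Data SQL is actually built: the storage nodes are plain lists, the "network" is just a count of rows handed to the database layer, and the function names `pull_then_filter` and `filter_at_storage` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


def pull_then_filter(nodes: List[List[Row]], predicate: Callable[[Row], bool]) -> Tuple[List[Row], int]:
    """Database layer pulls every row, then filters locally."""
    moved = [row for node in nodes for row in node]  # every row crosses the network
    result = [row for row in moved if predicate(row)]
    return result, len(moved)


def filter_at_storage(nodes: List[List[Row]], predicate: Callable[[Row], bool]) -> Tuple[List[Row], int]:
    """Each storage node applies the filter and ships only matching rows."""
    moved: List[Row] = []
    for node in nodes:
        moved.extend(row for row in node if predicate(row))  # filter runs on the node
    return moved, len(moved)


# Two hypothetical storage nodes holding made-up sales rows.
nodes = [
    [{"region": "EU", "amount": 10}, {"region": "US", "amount": 20}, {"region": "EU", "amount": 30}],
    [{"region": "US", "amount": 40}, {"region": "APAC", "amount": 50}, {"region": "EU", "amount": 60}],
]


def is_eu(row: Row) -> bool:
    return row["region"] == "EU"


pulled_result, pulled_rows = pull_then_filter(nodes, is_eu)
offloaded_result, offloaded_rows = filter_at_storage(nodes, is_eu)
print(pulled_rows, offloaded_rows)  # 6 3
```

Both approaches return the same three rows, but the offloaded version moves half as many rows in this example. On real data sets, where a filter may keep only a small fraction of the rows, the gap would be much larger.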
The Big Data Appliance connects its Hadoop storage to an Exadata machine over InfiniBand instead of the typical TCP/IP network, producing data transfer rates of 15Tb/sec.
Expanding Security to Big Data
By integrating these data sources into the database, security features – such as auditing, users, roles, privileges and Advanced Security – are expanded to cover Big Data as well. Oracle’s Big Data Appliance also encrypts data in the Hadoop cluster at the storage level.
Oracle Big Data SQL runs on Oracle’s Big Data Appliance and can work in conjunction with Exadata. It will be available in the third quarter of 2014.