Partner Integration: Apache Hive + StarRocks

Publish date: Sep 15, 2023 3:20:37 PM

What is Apache Hive?

Apache Hive is an open-source data warehouse project built on top of Apache Hadoop that provides data query and analysis. Hive offers a SQL-like interface for querying data stored in the various databases and file systems that integrate with Hadoop.

What is StarRocks?

StarRocks is a next-generation, blazing-fast massively parallel processing (MPP) database designed to make real-time analytics easy for enterprises. It is built to power sub-second queries at scale. StarRocks can read data stored in Apache Hive.

StarRocks + Apache Hive = The Modern Open Data Lake

Technical Benefits

Our performance tests have shown that StarRocks can achieve near-local-disk performance when querying data stored in Apache Hive.
No lock-in at the query layer. You can replace the query layer if it no longer meets your technical or financial requirements.
Get all the capabilities of an OLAP database, such as JOINs and materialized views, on the data within Apache Hive (you can also JOIN across Apache Iceberg, Apache Hudi, and Apache Hive tables).
Many database tools work out of the box thanks to StarRocks' support for the MySQL wire-compatible protocol.
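
To make the catalog and JOIN points above more concrete, here is a minimal Python sketch that only builds the SQL text involved: one statement that registers a Hive metastore as an external catalog, and one JOIN across two catalogs using catalog.database.table names. The helper names, the catalog names (hive_catalog, iceberg_catalog), the table names, and the metastore address are all hypothetical placeholders; see the Hive External Catalog documentation listed under Resources for the full set of properties your environment may need.

```python
def create_hive_catalog_sql(catalog: str, metastore_uri: str) -> str:
    """Build a CREATE EXTERNAL CATALOG statement pointing at a Hive metastore.

    Illustration only: values are interpolated without escaping, so do not
    pass untrusted input.
    """
    return (
        f"CREATE EXTERNAL CATALOG {catalog}\n"
        "PROPERTIES (\n"
        '    "type" = "hive",\n'
        f'    "hive.metastore.uris" = "{metastore_uri}"\n'
        ")"
    )


def cross_catalog_join_sql(left: str, right: str, key: str) -> str:
    """Build a JOIN between two fully qualified catalog.database.table names."""
    for name in (left, right):
        # Require the three-part form so each side names its catalog explicitly.
        if name.count(".") != 2:
            raise ValueError(f"expected catalog.database.table, got {name!r}")
    return (
        "SELECT l.*, r.*\n"
        f"FROM {left} AS l\n"
        f"JOIN {right} AS r ON l.{key} = r.{key}"
    )


if __name__ == "__main__":
    # Hypothetical metastore address and table names, for illustration only.
    print(create_hive_catalog_sql("hive_catalog", "thrift://metastore-host:9083"))
    print()
    print(
        cross_catalog_join_sql(
            "hive_catalog.sales.orders",
            "iceberg_catalog.crm.customers",
            "customer_id",
        )
    )
```

Because StarRocks speaks the MySQL wire protocol, the generated statements could be sent with any MySQL-compatible client or driver; the sketch deliberately stops at producing the SQL so it runs without a cluster.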

Try out our hands-on lab!

One of the best ways to understand our product is through our hands-on labs at https://killercoda.com/starrocks/

Resources

Documentation: StarRocks Hive External Catalog

Documentation: StarRocks on Apache Hive's Wiki