What is Apache Hudi?

Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. Hudi replaces slow, old-school batch data processing with an incremental processing framework for low-latency, minute-level analytics.

What is StarRocks?

StarRocks is a next-generation, blazing-fast massively parallel processing (MPP) database designed to make real-time analytics easy for enterprises. It is built to power sub-second queries at scale. StarRocks can read data stored in Apache Hudi.

 

StarRocks + Apache Hudi = The Modern Open Data Lake

Technical Benefits

  • Our performance tests show that StarRocks comes close to local-disk performance when querying data stored in Apache Hudi.

  • No lock-in at the query layer. You can swap out the query engine when it no longer meets your technical or financial requirements.

  • Get all the capabilities of an OLAP database, such as JOINs and materialized views, on the data within Apache Hudi. You can also JOIN across Apache Iceberg, Apache Hudi and Apache Hive tables in a single query.

  • Many database tools just work out of the box, because StarRocks supports the MySQL wire protocol.
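
To make the catalog and cross-format JOIN points above concrete, here is a minimal sketch in Python that only renders the SQL you might send to StarRocks through any MySQL-compatible client. The property names follow the StarRocks Hudi external catalog documentation as we understand it, so check them against your StarRocks version. The catalog names, metastore URI, table names and both helper functions are hypothetical and exist only for illustration.

```python
def create_hudi_catalog_sql(catalog: str, metastore_uri: str) -> str:
    """Render a CREATE EXTERNAL CATALOG statement for a Hudi catalog.

    Assumes the Hudi tables are registered in a Hive metastore.
    """
    return (
        f"CREATE EXTERNAL CATALOG {catalog}\n"
        "PROPERTIES (\n"
        '    "type" = "hudi",\n'
        '    "hive.metastore.type" = "hive",\n'
        f'    "hive.metastore.uris" = "{metastore_uri}"\n'
        ");"
    )


def cross_catalog_join_sql(left: str, right: str, key: str) -> str:
    """Render a JOIN between two tables named as catalog.database.table."""
    for name in (left, right):
        # Each side must be fully qualified so it can live in a different catalog.
        if name.count(".") != 2:
            raise ValueError(f"expected catalog.database.table, got {name!r}")
    return (
        "SELECT l.*, r.*\n"
        f"FROM {left} AS l\n"
        f"JOIN {right} AS r ON l.{key} = r.{key};"
    )


if __name__ == "__main__":
    # Hypothetical names: replace with your own catalog, metastore and tables.
    print(create_hudi_catalog_sql("hudi_lake", "thrift://metastore.example.internal:9083"))
    print()
    print(cross_catalog_join_sql(
        "hudi_lake.sales.orders",        # Hudi table
        "iceberg_lake.crm.customers",    # Iceberg table in a separate catalog
        "customer_id",
    ))
```

The rendered statements are plain SQL, so the same text can be pasted into a MySQL command-line client or a BI tool connected to StarRocks.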

 

Apache Hudi + StarRocks Webinar

Try out our hands-on lab!

One of the best ways to understand our product is through our hands-on labs at https://killercoda.com/starrocks/

Resources

Tutorial: How to query data in Apache Hudi using StarRocks

Video: Apache Hudi Community Call May 2023

Documentation: StarRocks Apache Hudi External Catalog

Documentation: StarRocks querying on hudi.apache.org