Technology Partners
Apache Iceberg is one of our supported open table formats that you can use with StarRocks. You can either have StarRocks read and write in Apache Iceberg format on your own S3 buckets or connect to an Apache Iceberg External Catalog.
Apache Hudi is one of our supported open table formats that you can use with StarRocks. You can have StarRocks connect to an Apache Hudi External Catalog and query the data in Apache Hudi.
Apache Hive is one of our supported open table formats that you can use with StarRocks. You can have StarRocks connect to an Apache Hive External Catalog and query the data in Apache Hive.
Delta Lake is one of our supported open table formats that you can use with StarRocks. You can have StarRocks connect to a Delta Lake External Catalog and query the data in Delta Lake.
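As a rough illustration of what connecting to one of these external catalogs involves, the sketch below renders a `CREATE EXTERNAL CATALOG` statement for a catalog backed by a Hive metastore. The helper `create_catalog_sql`, the catalog name, and the metastore URI are hypothetical placeholders. The property names reflect our understanding of the StarRocks catalog syntax and should be checked against the catalog reference for your StarRocks version.

```python
# Sketch only: renders a CREATE EXTERNAL CATALOG statement as a string.
# The catalog name and metastore URI used below are hypothetical placeholders,
# and the property names are assumptions to verify against the StarRocks docs.

# Iceberg names its metastore-type property differently from the other formats.
METASTORE_TYPE_KEY = {
    "iceberg": "iceberg.catalog.type",
    "hive": "hive.metastore.type",
    "hudi": "hive.metastore.type",
    "deltalake": "hive.metastore.type",
}


def create_catalog_sql(name: str, catalog_type: str, metastore_uri: str) -> str:
    """Build a CREATE EXTERNAL CATALOG statement for a Hive-metastore-backed catalog."""
    if catalog_type not in METASTORE_TYPE_KEY:
        raise ValueError(f"unsupported catalog type: {catalog_type}")
    properties = {
        "type": catalog_type,
        METASTORE_TYPE_KEY[catalog_type]: "hive",
        "hive.metastore.uris": metastore_uri,
    }
    body = ",\n".join(f'    "{key}" = "{value}"' for key, value in properties.items())
    return f"CREATE EXTERNAL CATALOG {name}\nPROPERTIES (\n{body}\n);"


print(create_catalog_sql("hive_catalog", "hive", "thrift://metastore.example.com:9083"))
```

The generated statement can be run from any MySQL-compatible client connected to StarRocks. Catalogs that use a different metastore, such as AWS Glue, need different properties and are not covered by this sketch.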
Our most tested solution for real-time data streaming is Apache Kafka. We have several methods to load and sink data into StarRocks.
Within the community, Airbyte seems to be the most used option for scheduled ETL / ELT.
Apache Superset is very popular with our community. Superset can connect to StarRocks using the MySQL driver or the StarRocks dialect.
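To make the two connection options concrete, here is a minimal sketch that builds a SQLAlchemy-style URI of the kind Superset accepts. The helper `build_uri`, the host name, and the credentials are hypothetical. The URI layouts (`starrocks://user:password@host:port/catalog.database` and `mysql://user:password@host:port/database`) and the default query port 9030 are assumptions based on common StarRocks setups, so confirm them for your deployment.

```python
from typing import Optional
from urllib.parse import quote_plus


def build_uri(dialect: str, user: str, password: str, host: str,
              database: str, port: int = 9030,
              catalog: Optional[str] = None) -> str:
    """Build a SQLAlchemy-style URI for the StarRocks dialect or the MySQL driver."""
    if dialect not in ("starrocks", "mysql"):
        raise ValueError(f"unsupported dialect: {dialect}")
    # Percent-encode credentials so characters like '@' do not break the URI.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    # Assumption: only the StarRocks dialect accepts a catalog-qualified database.
    if dialect == "starrocks" and catalog:
        path = f"{catalog}.{database}"
    else:
        path = database
    return f"{dialect}://{credentials}@{host}:{port}/{path}"


# Hypothetical host and credentials, for illustration only.
print(build_uri("starrocks", "admin", "p@ss", "fe.example.com", "sales",
                catalog="default_catalog"))
print(build_uri("mysql", "admin", "p@ss", "fe.example.com", "sales"))
```

The MySQL form works because StarRocks speaks the MySQL wire protocol. The StarRocks dialect requires the StarRocks SQLAlchemy package to be installed in the Superset environment.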
MinIO is a high-performance, open-source object storage platform designed for cloud-native environments and large-scale data storage. StarRocks can use MinIO to store data, gain performance from its higher IOPS, and support high availability and disaster recovery (HADR) through MinIO's S3 replication technology.
dbt Labs is a company developing tools and workflows for data transformation in data warehouses. They aim to democratize data transformation and make it more accessible for analysts and engineers, even those without extensive coding experience. StarRocks provides a dbt connector.
Vendors
This page contains some of the vendors who provide commercial support and services for StarRocks, including:
Alibaba Cloud Elastic MapReduce (EMR) is a big data processing solution that runs on the Alibaba Cloud platform. EMR is built on Alibaba Cloud ECS instances and is based on open-source Apache Hadoop and Apache Spark. EMR allows you to use the Hadoop and Spark ecosystem components, such as Apache Hive, Apache Kafka, Flink, Druid, and TensorFlow, to analyze and process data. You can use EMR to process data stored on different Alibaba Cloud data storage services, such as Object Storage Service (OSS), Log Service (SLS), and Relational Database Service (RDS).
Alibaba Cloud EMR StarRocks is available to you on a semi-managed hosting basis, giving you full administrative access to the cluster.
Learn more about StarRocks on Alibaba Cloud EMR:
This Quick Start deploys StarRocks on the Amazon Web Services (AWS) Cloud. It's for organizations that want a data service layer that supports real-time analytics and high concurrency while simplifying data pipelines.
With this Quick Start, you can connect your applications to a highly available StarRocks architecture deployed to Amazon Elastic Compute Cloud (Amazon EC2) instances. You can analyze large volumes of data without having to maintain fault tolerance or scale StarRocks instances.
Volcano Engine is a cloud service platform that aims to provide services to enterprises. Its services include the growth methods, technical capabilities, and application tools accumulated during ByteDance's business development. Volcano Engine E-MapReduce (EMR) is a cloud-native, open-source big data platform that provides enterprise-level big data ecosystem components such as Hadoop, Spark, Flink, Hive, Presto, Kafka, StarRocks, and Hudi. It is 100% open-source compatible and enables the quick construction of an enterprise-level big data platform while lowering operational barriers.
CelerData enables enterprises to quickly and easily grow their business with a real-time analytical engine that is 3X the performance/cost of any other solution on the market.
Powered by StarRocks, CelerData is the only platform uniquely designed for the next generation real-time Enterprise, unleashing the power of business intelligence to help accelerate Enterprise digital transformation. Used worldwide by market-leading brands including Airbnb, Lenovo, and Trip.com, CelerData generates critical new insights for these data-driven companies.
Powered by StarRocks, Mirrorship is a high-performance analytical data warehouse that uses vectorization, an MPP architecture, a real-time updatable columnar storage engine, and other technologies to achieve multi-dimensional, real-time, and highly concurrent data analysis. The Mirrorship database supports both efficient data import from various real-time and offline data sources and direct analysis of data in various formats on data lakes.
Combining cloud computing and community open-source technologies such as Hadoop, Hive, Spark, HBase, Presto, and Storm, Tencent Cloud Elastic MapReduce (EMR) provides secure and cost-effective cloud-based Hadoop services featuring high reliability and elastic scalability. Using EMR, you can create a secure and reliable Hadoop cluster in just minutes to analyze petabytes of data stored on the data nodes in the cluster or in Cloud Object Storage (COS).
Tencent Cloud EMR StarRocks is available to you on a semi-managed hosting basis, giving you full administrative access to the cluster.
Learn more about StarRocks on Tencent Cloud EMR: