Apache Kylin
Developer(s) | Apache Kylin Committee |
---|---|
Initial release | June 10, 2015[1] |
Stable release | |
Repository | Kylin Repository |
Written in | Java |
License | Apache License 2.0 |
Website | kylin |
Apache Kylin is an open-source distributed analytics engine designed to provide a SQL interface and multi-dimensional analysis (OLAP) on Hadoop and Alluxio, supporting extremely large datasets.
It was originally developed by eBay, and is now a project of the Apache Software Foundation.[3]
History
The Kylin project was started in 2013 at eBay's R&D center in Shanghai, China. In October 2014, Kylin v0.6 was open-sourced on github.com under the name "KylinOLAP".[4]
In November 2014, Kylin joined the Apache Software Foundation Incubator.
In December 2015, Apache Kylin graduated to become a Top-Level Project.[3]
In March 2016, Kyligence, Inc. was founded by the creators of Apache Kylin.[5][6] Kyligence provides a commercial analytics platform based on Apache Kylin for on-premises and cloud-based datasets.[7]
Architecture
Apache Kylin is built on top of Apache Hadoop, Apache Hive, Apache HBase, Apache Parquet, Apache Calcite, Apache Spark and other technologies.[8] These technologies enable Kylin to easily scale to support massive data loads.[9]
Kylin has the following core components:[10][8]
- REST Server: Receives and responds to user and API requests
- Metadata: Persists and manages system metadata, especially cube metadata
- Query Engine: Parses SQL queries into execution plans, then communicates with the storage engine
- Storage Engine: Pushes down operations and scans the underlying cube storage (HBase by default)
- Job Engine: Generates and executes MapReduce or Spark jobs to build source data into cubes
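The division of labor between the Job Engine, Storage Engine and Query Engine can be illustrated with a toy sketch of cube pre-aggregation. This is not Kylin's implementation: it runs in memory on a single machine, whereas Kylin builds cubes with MapReduce or Spark jobs and stores them in HBase by default. The sample rows, dimension names and the `build_cube` and `query` functions are all hypothetical and exist only for illustration. The sketch assumes a cube is a set of aggregates precomputed for each combination of dimensions, so that a grouped query becomes a lookup rather than a scan of the source data.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows; in Kylin the source data would typically come from Hive.
ROWS = [
    {"country": "CN", "category": "toys", "year": 2014, "sales": 10},
    {"country": "CN", "category": "books", "year": 2014, "sales": 5},
    {"country": "US", "category": "toys", "year": 2015, "sales": 7},
    {"country": "US", "category": "toys", "year": 2014, "sales": 3},
]
DIMENSIONS = ("country", "category", "year")
MEASURE = "sales"


def build_cube(rows, dimensions, measure):
    """Job Engine stand-in: precompute SUM(measure) for every subset of dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            aggregates = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in dims)
                aggregates[key] += row[measure]
            cube[dims] = dict(aggregates)
    return cube


def query(cube, dimensions, group_by):
    """Query Engine stand-in: answer a GROUP BY from the matching precomputed
    aggregate table instead of rescanning the source rows."""
    dims = tuple(d for d in dimensions if d in group_by)
    return cube[dims]


cube = build_cube(ROWS, DIMENSIONS, MEASURE)
print(len(cube))                              # number of dimension combinations
print(query(cube, DIMENSIONS, {"country"}))   # total sales per country
print(query(cube, DIMENSIONS, set()))         # grand total
```

With three dimensions the sketch materializes 2^3 = 8 aggregate tables. The number of combinations doubles with each added dimension, which is why build jobs for large cubes are distributed across a cluster.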
Users
Apache Kylin has been adopted by many companies as their production OLAP platform. Typical users include eBay, Meituan, XiaoMi, NetEase, Beike, and Yahoo! Japan.
Roadmap
Apache Kylin roadmap (from the Kylin website[11]):
- Hadoop 3.0 support (Erasure Coding) - completed (v2.5)
- Fully on Spark Cube engine - completed (v2.5)
- Connect more data sources (MySQL, Oracle, SparkSQL, etc.) - completed (v2.6)
- Real-time analytics with Lambda Architecture - completed (v3.0)
- Cloud-native storage (Parquet) - in progress (v4.0.0-alpha)
- Ad hoc queries without Cubing
References
- ^ "Previous Release". v0.7.1-incubating (First Apache Release). Retrieved 15 June 2019.
- ^ a b "Apache Kylin - Release Notes". Retrieved 27 September 2022.
- ^ a b Apache Software Foundation. "The Apache Software Foundation Announces Apache Kylin as a Top-Level Project", 8 December 2015
- ^ "Announcing Kylin: Extreme OLAP Engine for Big Data". www.ebayinc.com. 2014-10-20. Retrieved 2018-11-08.
- ^ "Apache Kylin Through the Eyes of the Founders - Part One". Kyligence. 2020-06-12. Retrieved 2020-09-30.
- ^ "Big Data Analytics Platform | Learn More About Kyligence". Kyligence. Retrieved 2020-09-30.
- ^ "Big Data Analytics Platform: Apache Kylin vs. Kyligence". Kyligence. Retrieved 2020-09-30.
- ^ a b "Apache Kylin | Analytical Data Warehouse for Big Data". kylin.apache.org. Retrieved 2020-09-30.
- ^ Knorr, Eric (2016-03-07). "What eBay looks like under the hood". InfoWorld. Retrieved 2020-09-30.
- ^ "Apache Kylin Adds Real-time OLAP". www.i-programmer.info. Retrieved 2020-09-30.
- ^ Kylin, Apache. "Apache Kylin | Development Quick Guide". kylin.apache.org. Retrieved 2020-09-30.