TechAE Blogs - Explore now for new leading-edge technologies

TechAE Blogs - a global platform designed to promote the latest technologies like artificial intelligence, big data analytics, and blockchain.

Kylin Install

If you are looking to improve your big data processing, an extreme OLAP engine offers an easy path to success. Try Apache Kylin now!

What is Apache Kylin?

Apache Kylin is an open-source, distributed analytical data warehouse for big data. It was designed to provide a SQL interface and multi-dimensional analysis (OLAP) on Hadoop while supporting extremely large datasets. At the time of writing, the latest release is Kylin 4.0.1.

Latest Features of Kylin 4.0:

  • It uses Parquet as storage rather than HBase, which was used in Apache Kylin 3.x.
  • It uses a new Spark-based build engine instead of MapReduce.
  • It uses Spark SQL as a query engine instead of Hive/JDBC.

Why Has Apache Kylin Proved to Be Better?

  • MOLAP Cube Precalculation
  • Cloud Friendly due to the inclusion of Parquet
  • Cubing duration and cube size
  • Interactive SQL Query Interface
  • Query performance
  • High concurrency
  • Seamless integration with BI tools
  • Real-time OLAP (soon)

Step 1: Installation

Before installing Kylin, start the Hadoop cluster and the other services that Kylin depends on.
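Before moving on, it can help to confirm that the Hadoop tooling is actually reachable from your shell. The snippet below is a minimal sketch, not an official Kylin check: `missing_commands` is a hypothetical helper name, and the list of tools (hadoop, hdfs, yarn, hive) is an assumption about a typical setup that you should adjust to your own cluster.

```shell
# Print every command from the argument list that is NOT found on PATH.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Assumed tool list; adjust it to match your cluster.
missing=$(missing_commands hadoop hdfs yarn hive)
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "All required commands were found."
fi
```

If everything is found, running `jps` afterwards lists the running Java processes, so you can see whether daemons such as NameNode, DataNode, and ResourceManager are up.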

Step 2: Kylin setup

Download the Apache Kylin 4.0.1 binary package from the Apache Kylin download site.
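As a rough sketch, the download could look like the snippet below. The URL is an assumption based on the usual layout of the Apache archive, and `download_kylin` is a hypothetical helper name, so verify the link on the download page before relying on it. The package name matches the tarball used in the next step.

```shell
# Assumed version and archive location; verify both on the Apache Kylin download page.
KYLIN_VERSION="4.0.1"
KYLIN_PACKAGE="apache-kylin-${KYLIN_VERSION}-bin-spark3.tar.gz"
KYLIN_URL="https://archive.apache.org/dist/kylin/apache-kylin-${KYLIN_VERSION}/${KYLIN_PACKAGE}"

# Download the tarball only if it is not already in the current directory.
download_kylin() {
  [ -f "$KYLIN_PACKAGE" ] || wget "$KYLIN_URL"
}
```

Call `download_kylin` from the directory where you want the tarball to land.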


Extract the tarball and set the $KYLIN_HOME environment variable to the Kylin folder. To make the setting permanent, add the export line to your .bashrc file.

# Extract the package and point KYLIN_HOME at the extracted folder
tar -zxvf apache-kylin-4.0.1-bin-spark3.tar.gz
cd apache-kylin-4.0.1-bin-spark3
export KYLIN_HOME="$(pwd)"
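The export above only lasts for the current terminal session. A minimal sketch of how the variable could be persisted in .bashrc is shown below; `persist_kylin_home` is a hypothetical helper name, and the second argument exists only so the function can be pointed at a different startup file.

```shell
# Append an export line for KYLIN_HOME to a shell startup file,
# unless exactly that line is already present.
persist_kylin_home() {
  local home_dir="$1" rc_file="${2:-$HOME/.bashrc}"
  local line="export KYLIN_HOME=\"$home_dir\""
  grep -qxF "$line" "$rc_file" 2>/dev/null || printf '%s\n' "$line" >> "$rc_file"
}
```

From inside the extracted Kylin folder you would run `persist_kylin_home "$(pwd)"` and then `source ~/.bashrc`. Running it twice does not add a duplicate line.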

Step 3: Configure MySQL metastore

Kylin 4.0 uses MySQL as metadata storage, so the MySQL connection must be configured in Kylin's configuration before you start it.
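A minimal sketch of what that configuration might look like, written as a small shell helper that appends the settings to a properties file. The property names `kylin.metadata.url` and `kylin.env.zookeeper-connect-string` follow the Kylin 4 documentation as far as I know; the helper name `append_mysql_metastore`, the database name, and the credentials are placeholders, and the target file is assumed to be `$KYLIN_HOME/conf/kylin.properties`.

```shell
# Append MySQL metastore settings to a Kylin properties file.
# Arguments: properties file, JDBC URL, MySQL user, MySQL password, ZooKeeper host.
append_mysql_metastore() {
  local props_file="$1" jdbc_url="$2" user="$3" password="$4" zk_host="$5"
  cat >> "$props_file" <<EOF
kylin.metadata.url=kylin_metadata@jdbc,url=${jdbc_url},username=${user},password=${password},maxActive=10,maxIdle=10
kylin.env.zookeeper-connect-string=${zk_host}
EOF
}
```

For a single-node setup the call could be `append_mysql_metastore "$KYLIN_HOME/conf/kylin.properties" "jdbc:mysql://localhost:3306/kylin" root your_password localhost`, with the user, password, and database replaced by your own values.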


Also put the MySQL JDBC connector into $KYLIN_HOME/ext/. If this directory does not exist, create it. The connector can be downloaded from the MySQL website.
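Creating the directory and copying the jar can be sketched as follows. `install_jdbc_connector` is a hypothetical helper name, and the jar file name in the usage line is a placeholder for whichever connector version you downloaded.

```shell
# Copy a JDBC connector jar into the ext/ folder of a Kylin installation,
# creating the folder first if it does not exist.
install_jdbc_connector() {
  local jar="$1" kylin_home="${2:-$KYLIN_HOME}"
  mkdir -p "$kylin_home/ext" && cp "$jar" "$kylin_home/ext/"
}
```

With $KYLIN_HOME already set, the call would be along the lines of `install_jdbc_connector ~/Downloads/mysql-connector-java-<version>.jar`.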

Two more jar files need to be placed in $KYLIN_HOME/tomcat/webapps/kylin/WEB-INF/lib.

Step 4: Start Kylin

Run the Kylin script in $KYLIN_HOME/bin/ with the start argument to start Kylin. The console output is as follows:
Retrieving hadoop conf dir...
KYLIN_HOME is set to /home/hdoop/apache-kylin-4.0.1-bin-spark3
A new Kylin instance is started by hdoop. To stop it, run ' stop'
Check the log at /usr/local/apache-kylin-4.0.1-bin-spark3/logs/kylin.log
Web UI is at http://localhost:7070/kylin

Once Kylin has started, the Web UI becomes available at the address shown in the output. The initial username is ADMIN and the initial password is KYLIN.
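Kylin may need a little time before the Web UI actually responds. The snippet below is a minimal sketch of a generic retry helper you could use to wait for it; `wait_until` is a hypothetical name, and polling with curl is an assumption about the tools installed on your machine.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
  local attempts="$1" delay="$2" i=0
  shift 2
  while [ "$i" -lt "$attempts" ]; do
    "$@" && return 0      # stop as soon as the command succeeds
    i=$((i + 1))
    sleep "$delay"
  done
  return 1                # every attempt failed
}
```

For example, `wait_until 30 2 curl -sf -o /dev/null http://localhost:7070/kylin` tries the Web UI address up to 30 times with a 2-second pause and returns success once the page answers.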

Stop Kylin

Run the same script in $KYLIN_HOME/bin/ with the stop argument to stop Kylin. The console output is as follows:

Stopping Kylin: 14818
Stopping in progress. Will check after 2 secs again...
Kylin with pid 14818 has been stopped.

HDFS Storage

Kylin generates files on HDFS. The default root directory is “/kylin/”, and the metadata table name of the Kylin cluster is used as the second-level directory name. The default name is “kylin_metadata” (this can be customized in the configuration under conf/).
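To see what Kylin has written, you can list that directory. The following is a minimal sketch that assumes the default root /kylin and the default metadata name kylin_metadata; `kylin_hdfs_dir` is a hypothetical helper name, and the listing only runs if the `hdfs` command is on your PATH.

```shell
# Build the Kylin working directory path from the HDFS root and the metadata table name.
kylin_hdfs_dir() {
  local root="${1:-/kylin}" metadata_name="${2:-kylin_metadata}"
  # Strip one trailing slash from the root so the path has a single separator.
  printf '%s/%s\n' "${root%/}" "$metadata_name"
}

if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -ls "$(kylin_hdfs_dir)"
else
  echo "hdfs is not on PATH; expected directory: $(kylin_hdfs_dir)"
fi
```

If you changed the root directory or the metadata name, pass them as arguments, for example `kylin_hdfs_dir /data/kylin prod_meta`.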

Enable Dashboard

In conf/, add the following configuration:
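A minimal sketch, assuming the dashboard is switched on with the `kylin.web.dashboard-enabled` property described in the Kylin dashboard documentation and that the file to edit is `$KYLIN_HOME/conf/kylin.properties`. `enable_dashboard` is a hypothetical helper name, and the dashboard may need further setup beyond this one switch.

```shell
# Append the dashboard switch to a Kylin properties file,
# unless exactly that line is already present.
enable_dashboard() {
  local props_file="$1"
  local line="kylin.web.dashboard-enabled=true"
  grep -qxF "$line" "$props_file" 2>/dev/null || printf '%s\n' "$line" >> "$props_file"
}
```

You would call it as `enable_dashboard "$KYLIN_HOME/conf/kylin.properties"` and then restart Kylin so the change is picked up.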



To conclude, Apache Kylin is an advanced open-source data warehouse with many features to explore. In this article I explained how to install it. In my next article, I will explain how to create models and cubes.

If you found this article useful, feel free to read more articles on this topic.

