TechAE Blogs - Explore now for new leading-edge technologies

TechAE Blogs - a global platform designed to promote the latest technologies like artificial intelligence, big data analytics, and blockchain.

Tez Setup

After learning about Tez in the previous blog, we are now going to set up Apache Tez as the execution engine for Hive so that our queries run more efficiently.

PREREQUISITES: 

• Apache Hadoop-3.3.1 with Yarn.
• Maven 3
• Protocol Buffers 2.5 or later
• Hive-3.1.2 release

STEP 1: Installing Tez 

$ wget https://dlcdn.apache.org/tez/0.9.2/apache-tez-0.9.2-bin.tar.gz
# Download the Tez binary release
$ tar xzf apache-tez-0.9.2-bin.tar.gz
# Extract the archive
$ cd apache-tez-0.9.2-bin/conf
# Change to the conf folder of the extracted directory

STEP 2: Create the tez-site.xml file

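Now we need to create a new tez-site.xml file in the conf folder with the properties below. The key property is tez.lib.uris, which points to the HDFS directory that holds the Tez jars.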

Tez-site config
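
The property listing itself did not survive here, so the following is only a minimal sketch of what such a file might contain, not the exact configuration used in this setup. It assumes the Tez jars are uploaded to /user/tez on HDFS (the directory used in Step 8), and it lists the lib subdirectory separately on the assumption that directories in tez.lib.uris are not traversed recursively. The ./conf fallback for TEZ_CONF_DIR is also an assumption.

```shell
# Sketch only: write a minimal tez-site.xml into the Tez conf directory.
# TEZ_CONF_DIR falls back to ./conf (an assumption) when it is not set.
TEZ_CONF_DIR="${TEZ_CONF_DIR:-./conf}"
mkdir -p "$TEZ_CONF_DIR"

# The quoted 'EOF' keeps the shell from expanding ${fs.defaultFS};
# Hadoop substitutes it with its own fs.defaultFS setting when loading the file.
cat > "$TEZ_CONF_DIR/tez-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <!-- Assumed location: the HDFS directory the Tez jars are copied to -->
    <name>tez.lib.uris</name>
    <value>${fs.defaultFS}/user/tez,${fs.defaultFS}/user/tez/lib</value>
  </property>
</configuration>
EOF
```

If you upload the jars to a different HDFS directory, change both paths in the value accordingly.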

STEP 3: Modifying .bashrc file 

Add the following Tez configuration to your .bashrc file:

# TEZ_HOME should be an absolute path; $HOME is an assumption, adjust it to wherever you extracted the tarball
export TEZ_HOME="$HOME/apache-tez-0.9.2-bin"
export TEZ_CONF_DIR="$TEZ_HOME/conf"
export TEZ_JARS="$TEZ_HOME"
if [ -z "$HIVE_AUX_JARS_PATH" ]; then
  export HIVE_AUX_JARS_PATH="$TEZ_JARS"
else
  export HIVE_AUX_JARS_PATH="$HIVE_AUX_JARS_PATH:$TEZ_JARS"
fi
# Quoted so the shell passes the wildcards to Hadoop instead of expanding them
export HADOOP_CLASSPATH="${TEZ_CONF_DIR}:${TEZ_JARS}/*:${TEZ_JARS}/lib/*"

$ source ~/.bashrc
# Apply the changes to the current shell

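STEP 4: Modifying pom.xml

In the pom.xml at the root of the Tez source tree, change the Hadoop version to match the cluster: <hadoop.version>3.3.1</hadoop.version>. Note that pom.xml is part of the Tez source release, not the binary tarball downloaded in Step 1.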
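
If you prefer to make this edit from the command line, one possible approach is sketched below. The helper name set_hadoop_version is hypothetical, and the sketch assumes the version is declared on a single line as a <hadoop.version> element in pom.xml.

```shell
# Hypothetical helper: rewrite the <hadoop.version> element of a pom.xml.
# $1 = path to pom.xml, $2 = Hadoop version to set
set_hadoop_version() {
  # -i.bak edits in place and keeps a backup copy next to the original
  sed -i.bak -E "s|<hadoop.version>[^<]*</hadoop.version>|<hadoop.version>$2</hadoop.version>|" "$1"
}

# Apply it only when a pom.xml is present in the current directory
if [ -f pom.xml ]; then
  set_hadoop_version pom.xml 3.3.1
  grep '<hadoop.version>' pom.xml   # show the resulting line
fi
```

The backup file pom.xml.bak lets you restore the original if the build fails with the new version.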

STEP 5: Modifying mapred-site.xml

In mapred-site.xml, change the value of mapreduce.framework.name from yarn to yarn-tez, then restart the Hadoop cluster so that the new setting takes effect.
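
Before restarting, it can help to confirm the value that is actually in the file. The helper below is a rough sketch: get_prop is a hypothetical name, and it assumes the <value> element directly follows the <name> element inside each property. The fallback path for mapred-site.xml is an assumption about your installation.

```shell
# Hypothetical helper: print the value of one property from a Hadoop XML config.
# $1 = config file, $2 = property name
get_prop() {
  # Join the file into one line, then capture the <value> that follows <name>
  tr -d '\n' < "$1" |
    sed -n "s|.*<name>$2</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}

# Assumed location of mapred-site.xml; adjust to your installation
conf="${HADOOP_CONF_DIR:-$HADOOP_HOME/etc/hadoop}/mapred-site.xml"
if [ -f "$conf" ]; then
  get_prop "$conf" mapreduce.framework.name
fi
```

After the change in this step, the command should print yarn-tez.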

STEP 6: Build binary tar ball 

$ mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true
# Build the binary tarball, skipping tests and Javadoc generation

When the build succeeds, move tez-0.9.2-full.tar.gz to the Tez directory and extract it there.

STEP 7: Copying Jar files in HDFS 

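$ hadoop fs -mkdir -p /apps/apache-tez-0.9.2/
# Create the target directory on HDFS (the same path used by the put command below)
$ export TEZ_HOME="$HOME/apache-tez-0.9.2-bin"
# $HOME is an assumption; adjust the path to wherever you extracted Tez
$ hadoop fs -put $TEZ_HOME/* /apps/apache-tez-0.9.2/
# Copy the relevant jar files into the HDFS directory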

STEP 8: Copying the relevant files 

$ hdfs dfs -mkdir /user
# Make a new directory “user”
$ hdfs dfs -mkdir /user/tez
$ hdfs dfs -chmod g+w /user/tez
$ cd $TEZ_HOME
# Moving to the tez home directory
$ hdfs dfs -put * /user/tez
$ hadoop fs -put $HIVE_HOME/lib/hive-exec-3.1.2.jar /user/tez
# Copy hive-exec-3.1.2.jar from the $HIVE_HOME/lib directory into the HDFS
# directory specified by the tez.lib.uris property in tez-site.xml

STEP 9: Integrating Tez on Hive 

To run queries on the Tez engine, either set hive.execution.engine=tez at the start of each Hive session or change the value permanently in hive-site.xml.

hive> set hive.execution.engine=tez;
# Set the Hive execution engine to Tez for the current session

Alternatively, make the change permanent by setting this property in hive-site.xml:

<property>
  <name>hive.execution.engine</name>
  <value>tez</value>
</property>

Tez running

Conclusion:

In this way, you can set up Tez on your Hadoop cluster and use it as the Hive execution engine without much hassle.
