Hive Installation Guide and Practical Example
Lecturer: Prof. Kyungbaek Kim
Presenter: Alvin Prayuda Juniarta Dwiyantoro
Installation Guide(1)
• How to install Hive v0.13.1
• Requirements
    • Java 1.6 (the example uses java-7-openjdk)
    • Hadoop 0.20.x, 0.23.x, or 2.0.x (the example uses Hadoop 2.5.1 in pseudo-distributed mode)
Installation Guide(2)
• Download Hive from a stable release
    http://apache.mirror.cdnetworks.com/hive-0.13.1/apache-hive-0.13.1-bin.tar.gz
• Extract the tar file and move it to your preferred location (the example uses /usr/local/hive)
    tar -xvzf hive-x.y.z.tar.gz
    mv hive-x.y.z /usr/local/hive
• Edit ~/.bashrc and append the following lines at the end
    export HIVE_HOME=/usr/local/hive
    export PATH=$HIVE_HOME/bin:$PATH
• Reload the file
    source ~/.bashrc
Configuration Guide(1)
• Hive runs on top of Hadoop, so either add Hadoop to the PATH in ~/.bashrc or add the following line
    export HADOOP_HOME=<hadoop-installation-dir>    (the example uses /usr/local/hadoop)
• Start Hadoop DFS and YARN
    start-dfs.sh
    start-yarn.sh
Configuration Guide(2)
• Create /tmp and /user/hive/warehouse in HDFS and make them group-writable (chmod g+w); -p creates any missing parent directories
    hadoop fs -mkdir -p /tmp
    hadoop fs -mkdir -p /user/hive/warehouse
    hadoop fs -chmod g+w /tmp
    hadoop fs -chmod g+w /user/hive/warehouse
Configuration Guide(3)
• Go to /usr/local/hive/conf
    cd /usr/local/hive/conf
• Rename the configuration templates by dropping the .template suffix
    hive-env.sh.template -> hive-env.sh
    hive-default.xml.template -> hive-default.xml
    hive-exec-log4j.properties.template -> hive-exec-log4j.properties
    hive-log4j.properties.template -> hive-log4j.properties
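The renames in this step can be done in a single loop. The following is only a sketch under stated assumptions, not part of the original slides: it copies each template instead of moving it, so the originals stay available for reference, and it skips any target that already exists. The function name activate_templates is hypothetical.

```shell
#!/bin/sh
# Copy every *.template file in a Hive conf directory to the same name
# without the .template suffix. Existing targets are left untouched.
# activate_templates is a hypothetical helper name, not a Hive tool.
activate_templates() {
    conf_dir=$1
    for template in "$conf_dir"/*.template; do
        [ -e "$template" ] || continue      # glob matched nothing
        target=${template%.template}        # strip the suffix
        if [ ! -e "$target" ]; then
            cp "$template" "$target"
        fi
    done
}

# Usage (directory taken from the slides):
# activate_templates /usr/local/hive/conf
```

Copying rather than renaming is a design choice here: if a later edit breaks hive-env.sh, the untouched template is still there to compare against.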
Configuration Guide(4)
• Create a new file with the content below and save it as hive-site.xml
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
      </property>
      <property>
        <name>mapred.job.tracker</name>
        <value>localhost:50030</value>
      </property>
    </configuration>
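One way to create this file without an editor is a shell here-document. This is a minimal sketch, assuming the Hadoop 2 property name fs.defaultFS and the host and port values shown on the slide; the function name write_hive_site is hypothetical, and the function overwrites any existing hive-site.xml in the target directory.

```shell
#!/bin/sh
# Write a minimal hive-site.xml into the given conf directory.
# write_hive_site is a hypothetical helper name; the values mirror the
# slide and assume a pseudo-distributed Hadoop on localhost.
write_hive_site() {
    conf_dir=$1
    cat > "$conf_dir/hive-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:50030</value>
  </property>
</configuration>
EOF
}

# Usage (directory taken from the slides):
# write_hive_site /usr/local/hive/conf
```

Each name/value pair sits inside its own property element; an unbalanced property tag makes the file invalid XML.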
Configuration Guide(5)
• Open the file hive-env.sh
• Uncomment HADOOP_HOME and HIVE_CONF_DIR and set them as follows
    export HADOOP_HOME=/usr/local/hadoop
    export HIVE_CONF_DIR=/usr/local/hive/conf
• Run the Hive CLI
    hive
Note: if the configuration is correct, every table you create will appear under /user/hive/warehouse in HDFS
Practical Example(1)
• Download the example data from http://seanlahman.com/files/database/lahman591-csv.zip
• Extract the archive; we will use the Batting.csv data
• Copy the data into HDFS
    hadoop fs -put /home/hduser/Downloads/Batting.csv /user/hive
• Enter the Hive CLI
Practical Example(2)
• Create the table temp_batting
    create table temp_batting(col_value string);
• Load the data from Batting.csv into temp_batting
    load data inpath '/user/hive/Batting.csv' overwrite into table temp_batting;
• Inspect the data format
    select * from temp_batting;
Practical Example(2), continued
• Create the new table batting
    create table batting(player_id string, year int, runs int);
• Extract the player ID, year, and runs (fields 1, 2, and 9 of each line) from temp_batting into batting
    insert overwrite table batting
    select
      regexp_extract(col_value, '^(?:([^,]*),?){1}', 1) player_id,
      regexp_extract(col_value, '^(?:([^,]*),?){2}', 1) year,
      regexp_extract(col_value, '^(?:([^,]*),?){9}', 1) runs
    from temp_batting;
• View the resulting table
    select * from batting;
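Each regexp_extract call in this step returns the Nth comma-separated field of a line, where N is the repetition count in braces. To sanity-check the column positions before loading, the same three fields can be previewed locally with awk. This is a sketch rather than part of the slides: the sample row is made up for illustration, and preview_fields is a hypothetical helper name.

```shell
#!/bin/sh
# Print fields 1, 2 and 9 (player_id, year, runs) of comma-separated
# input, mirroring what the three regexp_extract calls select.
# preview_fields is a hypothetical helper name.
preview_fields() {
    awk -F',' '{ print $1 "\t" $2 "\t" $9 }'
}

# Made-up sample row; the 9th column stands in for runs, as in the query.
sample='playerA,1990,1,XXX,NL,10,10,30,7,9'
preview=$(printf '%s\n' "$sample" | preview_fields)
printf '%s\n' "$preview"
```

On the real file this could be run as `head -n 5 Batting.csv | preview_fields`. If the file starts with a header row, that row appears as the first line of the preview, and it would likewise be loaded into the Hive table as an ordinary row.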
Practical Example(3)
• Find the highest number of runs for each year
    select year, max(runs) from batting group by year;
• Find the player with the highest number of runs in each year
    select a.year, a.player_id, a.runs
    from batting a
    join (select year, max(runs) runs from batting group by year) b
      on (a.year = b.year and a.runs = b.runs);
• Delete the table temp_batting
    drop table temp_batting;
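The join query first computes the maximum runs per year, then keeps every row whose runs equal that maximum, so tied players all appear. The same logic can be sketched locally with awk to check expectations on a small sample. The rows below are made up for illustration, and top_scorers is a hypothetical helper name; this is not how Hive executes the query.

```shell
#!/bin/sh
# For tab-separated rows of player_id, year, runs, print the player(s)
# with the most runs in each year as: year, player_id, runs.
# top_scorers is a hypothetical helper name.
top_scorers() {
    awk -F'\t' '
        {
            id[NR] = $1; yr[NR] = $2; runs[NR] = $3 + 0
            # Track the per-year maximum (the inner "group by" query).
            if (!($2 in best) || runs[NR] > best[$2]) best[$2] = runs[NR]
        }
        END {
            # Keep rows matching their year maximum (the join condition).
            for (i = 1; i <= NR; i++)
                if (runs[i] == best[yr[i]])
                    print yr[i] "\t" id[i] "\t" runs[i]
        }
    ' | sort
}

# Made-up rows for illustration only; p3 and p4 tie in 1991.
result=$(printf 'p1\t1990\t50\np2\t1990\t80\np3\t1991\t60\np4\t1991\t60\n' | top_scorers)
printf '%s\n' "$result"
```

Both tied players for 1991 are printed, which matches the behaviour of the join: it returns one row per player who reached the yearly maximum, not one row per year.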
Screenshot of Practical Example