Cluster environment: CDH5.3.0
The specific component versions are as follows:
Spark version: 1.2.0-cdh5.3.0
Hive version: 0.13.1-cdh5.3.0
Hadoop version: 2.5.0-cdh5.3.0
Start JDBC server
cd /etc/spark/conf
ln -s /etc/hive/conf/hive-site.xml hive-site.xml
cd /opt/cloudera/parcels/CDH/lib/spark/
chmod -R 777 logs/
cd /opt/cloudera/parcels/CDH/lib/spark/sbin
./start-thriftserver.sh --master yarn --hiveconf hive.server2.thrift.port=10008
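Before connecting, it can be useful to confirm that the Thrift server is actually listening. This is a small Python sketch (not part of the original walkthrough); the hostname and port below are the ones assumed in this article, so adjust them for your cluster:

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (host/port from the steps above; adjust for your cluster):
# port_open("hadoop04", 10008)
```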
Connect to the JDBC server with Beeline
cd /opt/cloudera/parcels/CDH/lib/spark/bin
beeline -u jdbc:hive2://hadoop04:10000
[root@hadoop04 bin]# beeline -u jdbc:hive2://hadoop04:10000
scan complete in 2ms
Connecting to jdbc:hive2://hadoop04:10000
Connected to: Spark SQL (version 1.2.0)
Driver: Hive JDBC (version 0.13.1-cdh5.3.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 0.13.1-cdh5.3.0 by Apache Hive
0: jdbc:hive2://hadoop04:10000>
Using Beeline
In the Beeline client, you can use standard HiveQL commands to create, list, and query tables. Full details of HiveQL are in the HiveQL Language Manual; here we show a few common operations.
# Create a table with fields delimited by ','
CREATE TABLE IF NOT EXISTS mytable (key INT, value STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

# Create a table with fields delimited by '#'
CREATE TABLE IF NOT EXISTS mytable (name STRING, addr STRING, status STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '#';
# Load a local file
LOAD DATA LOCAL INPATH '/external/tmp/data.txt' INTO TABLE mytable;
# Load an HDFS file
LOAD DATA INPATH 'hdfs://ju51nn/external/tmp/data.txt' INTO TABLE mytable;
DESCRIBE mytable;
EXPLAIN SELECT * FROM mytable WHERE name = 'Zhang San';
SELECT * FROM mytable WHERE name = 'Zhang San';
CACHE TABLE mytable;
SELECT count(*) total, count(distinct addr) num1, count(distinct status) num2 FROM mytable WHERE addr = 'gz';
UNCACHE TABLE mytable;
Sample data used above
Zhang San#Guangzhou#student
Li Si#Guizhou#teacher
Wang Wu#Wuhan#lecturer
Zhao Liu#Chengdu#student
Lisa#Guangzhou#student
Lily#gz#student
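As a sanity check on the aggregate query shown earlier, here is a small Python sketch (not part of the original walkthrough) that applies the same filter and distinct counts to the '#'-delimited sample rows:

```python
# Sample rows in the same '#'-delimited format as data.txt.
lines = [
    "Zhang San#Guangzhou#student",
    "Li Si#Guizhou#teacher",
    "Wang Wu#Wuhan#lecturer",
    "Zhao Liu#Chengdu#student",
    "Lisa#Guangzhou#student",
    "Lily#gz#student",
]

# Equivalent of:
#   SELECT count(*) total, count(distinct addr) num1,
#          count(distinct status) num2
#   FROM mytable WHERE addr = 'gz';
rows = [line.split("#") for line in lines]
gz = [(name, addr, status) for name, addr, status in rows if addr == "gz"]
total = len(gz)
num1 = len({addr for _, addr, _ in gz})
num2 = len({status for _, _, status in gz})
print(total, num1, num2)  # only Lily matches: 1 1 1
```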
Standalone Spark SQL shell
Spark SQL also supports a simple shell that runs as a single process: spark-sql.
It is mainly useful for local development; in a cluster environment, please use the JDBC server instead.
cd /opt/cloudera/parcels/CDH/lib/spark/bin
./spark-sql