How to use the Spark SQL JDBC server
Runtime environment

Cluster environment: CDH5.3.0

The specific JAR versions are as follows:

Spark version: 1.2.0-cdh5.3.0

Hive version: 0.13.1-cdh5.3.0

Hadoop version: 2.5.0-cdh5.3.0

Start the JDBC server

cd /etc/spark/conf

ln -s /etc/hive/conf/hive-site.xml hive-site.xml

cd /opt/cloudera/parcels/CDH/lib/spark/

chmod -R 777 log/

cd /opt/cloudera/parcels/CDH/lib/spark/sbin

./start-thriftserver.sh --master yarn --hiveconf hive.server2.thrift.port=10008
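The port passed through --hiveconf is the port the Thrift server listens on, so clients have to use the same port in their JDBC URL (10000 is the Hive default when the property is not set). The following is a minimal sketch that derives both the start command and the client URL from one variable so the two cannot drift apart. The function names start_cmd and jdbc_url are hypothetical helpers introduced for illustration, and the host name hadoop04 is taken from the connection example in this article.

```shell
#!/bin/sh
# Values taken from this article; adjust for your cluster.
SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
THRIFT_PORT=10008

# Hypothetical helper: print the command that starts the Thrift JDBC server
# on THRIFT_PORT (it only prints the command, it does not run it).
start_cmd() {
  printf '%s/sbin/start-thriftserver.sh --master yarn --hiveconf hive.server2.thrift.port=%s\n' \
    "$SPARK_HOME" "$THRIFT_PORT"
}

# Hypothetical helper: print the JDBC URL a client should use for host $1.
jdbc_url() {
  printf 'jdbc:hive2://%s:%s\n' "$1" "$THRIFT_PORT"
}

start_cmd
jdbc_url hadoop04   # prints jdbc:hive2://hadoop04:10008
```

Running the printed command on the server host and passing the printed URL to beeline -u keeps both sides on the same port.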

Connect to the JDBC server with Beeline

cd /opt/cloudera/parcels/CDH/lib/spark/bin

beeline -u jdbc:hive2://hadoop04:10000

[root@hadoop04 bin]# beeline -u jdbc:hive2://hadoop04:10000

scan complete in 2ms

Connecting to jdbc:hive2://hadoop04:10000

Connected to: Spark SQL (version 1.2.0)

Driver: Hive JDBC (version 0.13.1-cdh5.3.0)

Transaction isolation: TRANSACTION_REPEATABLE_READ

Beeline version 0.13.1-cdh5.3.0 by Apache Hive

0: jdbc:hive2://hadoop04:10000>

Using Beeline

In the Beeline client, you can use standard HiveQL commands to create, list, and query tables. The full details of HiveQL are in the HiveQL language manual; here we show some common operations.

CREATE TABLE IF NOT EXISTS mytable (key INT, value STRING)

ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

create table mytable (name string, addr string, status string) row format delimited fields terminated by '#';

# Load a local file

load data local inpath '/external/tmp/data.txt' into table mytable;

# Load an HDFS file

load data inpath 'hdfs://ju51nn/external/tmp/data.txt' into table mytable;

describe mytable;

explain select * from mytable where name = 'Zhang San';

select * from mytable where name = 'Zhang San';

cache table mytable;

select count(*) total, count(distinct addr) num1, count(distinct status) num2 from mytable where addr = 'gz';

uncache table mytable;
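The same statements can also be run without an interactive session. As a rough sketch, the script below writes a few of them to a file and passes that file to Beeline with its -f option. The file path /tmp/mytable_demo.sql is an assumption chosen for illustration, and the JDBC URL is the one used in the connection example above; the script only calls beeline if it is found on the PATH.

```shell
#!/bin/sh
# Assumed values for illustration; adjust for your cluster.
SQL_FILE=/tmp/mytable_demo.sql
JDBC_URL=jdbc:hive2://hadoop04:10000

# Write the HiveQL statements to a script file.
# Every statement must end with a semicolon for Beeline to execute it.
cat > "$SQL_FILE" <<'EOF'
CREATE TABLE IF NOT EXISTS mytable (name STRING, addr STRING, status STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '#';
LOAD DATA LOCAL INPATH '/external/tmp/data.txt' INTO TABLE mytable;
SELECT count(*) total, count(DISTINCT addr) num1, count(DISTINCT status) num2
FROM mytable WHERE addr = 'gz';
EOF

# Run the file through Beeline when it is installed; otherwise just report
# where the statements were written.
if command -v beeline >/dev/null 2>&1; then
  beeline -u "$JDBC_URL" -f "$SQL_FILE"
else
  echo "beeline not found on PATH; statements written to $SQL_FILE"
fi
```

For a single statement, beeline -u "$JDBC_URL" -e "describe mytable;" does the same thing without a file.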

Sample data

Zhang San#Guangzhou#student

Li Si#Guizhou#teacher

Wu Wang#Wuhan#Lecturer

Liu Zhao#Chengdu#student

Lisa#Guangzhou#student

Lily#gz#Studin
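To sanity-check what the aggregate query should return, the same counts can be computed locally with awk before the file is loaded. This is only a sketch: it writes a copy of the sample rows to /tmp/data.txt (an assumed scratch path, not the /external/tmp path used in the load statements), with no spaces around the '#' delimiter and with the second row assumed to read Li Si, Guizhou, teacher. The function count_by_addr is a hypothetical helper that mimics the query select count(*), count(distinct addr), count(distinct status) ... where addr = ?.

```shell
#!/bin/sh
# Assumed scratch location for a copy of the sample data.
DATA_FILE=/tmp/data.txt

# Fields are separated by '#': name, addr, status.
cat > "$DATA_FILE" <<'EOF'
Zhang San#Guangzhou#student
Li Si#Guizhou#teacher
Wu Wang#Wuhan#Lecturer
Liu Zhao#Chengdu#student
Lisa#Guangzhou#student
Lily#gz#Studin
EOF

# Hypothetical helper: for rows whose addr equals $1, print
# "total distinct_addr distinct_status" on one line.
count_by_addr() {
  awk -F'#' -v addr="$1" '
    $2 == addr { total++; statuses[$3] = 1 }
    END {
      n = 0
      for (s in statuses) n++
      # All matching rows share one addr, so the distinct count is 1 or 0.
      printf "%d %d %d\n", total, (total > 0 ? 1 : 0), n
    }' "$DATA_FILE"
}

count_by_addr gz          # prints: 1 1 1
count_by_addr Guangzhou   # prints: 2 1 1
```

If the query in Beeline returns different numbers for the same filter, stray whitespace around the delimiter in the loaded file is a likely cause.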

Standalone Spark SQL shell

Spark SQL also provides a simple shell that runs as a single process: spark-sql.

It is mainly intended for local development. In a cluster environment, use the JDBC server instead.

cd /opt/cloudera/parcels/CDH/lib/spark/bin

./spark-sql