Appendix 5: Move Data from MySQL to HDFS
sqoop import
syntax:
sqoop import --connect jdbc:mysql://localhost/db_name --table table_name --m 1
Import tables from the employees MySQL database into HDFS (the employees table itself holds about 300k records). Start with the small departments table:
sqoop import \
--connect jdbc:mysql://localhost/employees \
--username root \
--table departments --m 1
22:13:50 ~ 22:15:33, duration: 1 min 43 sec, 9 records
For the employees table, which has 300,024 records:
sqoop import \
--connect jdbc:mysql://localhost/employees \
--username root \
--table employees --m 1
The MapReduce job can be monitored through the YARN ResourceManager web UI:
localhost:8088
22:19:00 ~ 22:20:48, duration: 1 min 48 sec, 300,024 records
For the salaries table: 2,844,047 records in 1 min 24 sec.
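The timings above give a rough sense of single-mapper throughput. A back-of-the-envelope calculation (using the durations recorded above, converted to seconds):

```shell
# Records per second for each single-mapper import, from the logged timings.
emp=$(awk 'BEGIN { printf "%d", 300024 / 108 }')    # employees: 1 min 48 sec
sal=$(awk 'BEGIN { printf "%d", 2844047 / 84 }')    # salaries:  1 min 24 sec
echo "employees: ${emp} records/sec"
echo "salaries: ${sal} records/sec"
```

The salaries table imports an order of magnitude faster per record, since its rows are much narrower; record count alone does not predict duration.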
Append data
sqoop import --connect jdbc:mysql://localhost/employees --username root -P --table dept_manager --append --m 1
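At the file level, --append writes a new part file alongside the existing ones rather than overwriting them, so re-running the same import duplicates every record. A local sketch of that behavior, simulated with plain files (the part-file names and sample rows are illustrative assumptions):

```shell
# Simulate what a repeated --append import leaves in the target directory:
# the re-run adds a second part file with the same content as the first.
dir=$(mktemp -d)
printf 'd001,Marketing\nd002,Finance\n' > "$dir/part-m-00000"   # first import
printf 'd001,Marketing\nd002,Finance\n' > "$dir/part-m-00001"   # appended re-run
total=$(cat "$dir"/part-m-* | wc -l)
echo "total records after append: $((total))"                   # twice the source count
```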
Verification
mysql: count the source rows with SELECT COUNT(*) on the dept_manager table.
hadoop: the second --append run writes a part file identical to the first import, so the records are duplicated in HDFS rather than merged.
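One way to compare the two sides is to count records in MySQL and in the HDFS target directory. A sketch, assuming the default Sqoop target path /user/root/dept_manager and the root MySQL account used above:

```shell
# Source row count (prompts for the MySQL password).
mysql -u root -p -e 'SELECT COUNT(*) FROM employees.dept_manager;'
# HDFS record count: concatenate all part files and count lines.
# After one --append re-run this will be double the MySQL count.
hdfs dfs -cat /user/root/dept_manager/part-m-* | wc -l
```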
The same import for the titles table:
sqoop import \
--connect jdbc:mysql://localhost/employees \
--username root \
--table titles --m 1