Appendix 5: Move Data from MySQL to HDFS

sqoop import

Syntax:

sqoop import --connect jdbc:mysql://localhost/db_name --table table_name --m 1

Import tables from the employees database in MySQL to HDFS (the employees table alone has about 300k records):

sqoop import --connect jdbc:mysql://localhost/employees --table departments --m 1

sqoop import \
  --connect jdbc:mysql://localhost/employees \
  --username root \
  --table departments --m 1

22:13:50 ~ 22:15:33, duration: 1 min 43 sec, 9 records
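A quick way to check the result is to look at the files Sqoop wrote to HDFS. A minimal sketch, assuming the default target directory /user/$USER/departments (the actual path depends on the user running the import):

```shell
# List the output; --m 1 produces a single part file plus a _SUCCESS marker.
hdfs dfs -ls /user/$USER/departments

# Count imported rows; the count should match the 9 records Sqoop reported.
hdfs dfs -cat /user/$USER/departments/part-m-00000 | wc -l
```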

For the employees table, which has 300,024 records:

sqoop import \
  --connect jdbc:mysql://localhost/employees \
  --username root \
  --table employees --m 1

The MapReduce job can be monitored through the YARN web GUI:

localhost:8088
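The same information is also available from the command line; a sketch using the YARN CLI (the application ID is a placeholder, cluster-specific):

```shell
# List MapReduce applications currently running on the cluster.
yarn application -list -appStates RUNNING

# Show the final status of a finished application (ID is a placeholder):
# yarn application -status application_1234567890123_0001
```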

22:19:00 ~ 22:20:48, duration: 1 min 48 sec, 300,024 records

For the salaries table: 2,844,047 records, 1 min 24 sec.
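All the imports above use a single mapper (--m 1). For a table the size of salaries, Sqoop can split the work across several mappers; a hedged sketch (emp_no is the integer key of salaries in the standard employees sample database):

```shell
# Import with 4 parallel mappers; --split-by tells Sqoop which column to
# partition the key range on. Each mapper handles roughly
# 2,844,047 / 4 = ~711,011 rows and writes its own part-m-0000N file.
sqoop import \
  --connect jdbc:mysql://localhost/employees \
  --username root -P \
  --table salaries \
  --split-by emp_no \
  -m 4
```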

Append data

sqoop import --connect jdbc:mysql://localhost/employees --username root -P --table dept_manager --append --m 1

Verification

MySQL:

Hadoop: the second, appended import writes records identical to the first import, so the HDFS output now holds two copies of the table.
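The doubling after --append can be checked by comparing counts on both sides; a minimal sketch (paths and credentials are assumptions):

```shell
# Row count in the MySQL source table.
mysql -u root -p -N -e 'SELECT COUNT(*) FROM employees.dept_manager;'

# Line count across all part files in HDFS; after an initial import plus one
# --append run, this should be exactly double the MySQL count, since --append
# adds a second full copy rather than deduplicating.
hdfs dfs -cat /user/$USER/dept_manager/part-m-* | wc -l
```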

sqoop import \
  --connect jdbc:mysql://localhost/employees \
  --username root \
  --table titles --m 1
