Finding

When Sqoop imports a MySQL table into Hive, it actually uses HDFS as a staging area and writes a temporary directory there. If the same MySQL table has already been imported to HDFS, the import into Hive fails with a "file already exists" error.

The staging location on HDFS is /user/<the current user>; in this case it is /user/root.
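To confirm what is sitting in that staging location, listing the current user's HDFS home directory shows the directory left over from the earlier import (the employees path used in the steps below comes from this listing):

hdfs dfs -ls /user/root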

Solution:

Remove or rename the directory created by the first Sqoop import to HDFS:

1. hdfs dfs -mv /user/root/employees /user/root/employeesHDFS

2. Re-run the Sqoop import:

sqoop import --connect jdbc:mysql://localhost/employees --username root -P --table employees --hive-import --hive-table employeesImported2Hive

Note: this time, give the Hive table a name different from the source table name in MySQL.

Change the table's name

alter table employeesimported2hive rename to emp;

get the table schema
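For example, either of these Hive statements prints the columns and types of the renamed table; the partitioned table created in the next step mirrors this schema:

describe emp;
show create table emp;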

Create a new partitioned table

Create the table partitionedemp:

CREATE TABLE partitionedemp (
    emp_no int,
    birth_date string,
    first_name string,
    last_name string
)
PARTITIONED BY (gender string, hire_date string);


Set the dynamic partition mode to nonstrict
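In Hive this is done with the properties below; hive.exec.dynamic.partition.mode is the nonstrict setting, and enabling hive.exec.dynamic.partition as well is a safe default since some Hive versions ship with it turned off:

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;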

Insert data into the partitioned table
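A dynamic-partition insert along these lines loads the data from emp; the partition columns (gender, hire_date) must come last in the select list, matching the PARTITIONED BY clause above:

insert overwrite table partitionedemp partition (gender, hire_date)
select emp_no, birth_date, first_name, last_name, gender, hire_date
from emp;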

troubleshooting
