Setting Up a Data Transfer with Sqoop
According to its page on the Apache website, Sqoop is "a tool designed for efficiently transferring data between Apache Hadoop and structured datastores such as relational databases." Most of this process is automated: as the user guide explains, Sqoop relies on the database schema to determine how the data should be imported. In this post, we're going to import data from a MySQL database and then transfer it to Hive.

Setting Up the MySQL Data

The first step is to set up the MySQL database and tables that will be used for the transfer. First, we'll create a database called "sqoop_test":

mysql> CREATE DATABASE sqoop_test;
mysql> USE sqoop_test;

Next, we'll create the table "stocks", which will later be populated from a text file of stock data:

mysql> CREATE TABLE stocks (
    id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    symbol VARCHAR(100),
    quote_date VARCHAR(100),
    open_price DOUBLE PRECISION,
    high_price DOUBLE PRECISIO...