Apache Sqoop documentation

Building Sqoop2 from source code



This guide shows how to build Sqoop2 from source code. Sqoop uses Maven as its build system. You will need at least Maven 3.0, as older versions will not work correctly. Maven downloads all other dependencies automatically, with the exception of special JDBC drivers that are needed only for advanced integration tests.
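
Before building, it can help to confirm that the installed Maven meets the 3.0 minimum. The following is a minimal sketch, not part of the Sqoop build itself. The helper name version_at_least is hypothetical, and the script assumes a sort that supports -V (as in GNU coreutils) and that mvn -version prints a first line starting with "Apache Maven" followed by the version number.

```shell
#!/bin/sh
# version_at_least MIN ACTUAL: succeeds when ACTUAL is MIN or newer.
# Hypothetical helper; relies on `sort -V` (version sort).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Only query Maven if it is on the PATH.
if command -v mvn >/dev/null 2>&1; then
    # Assumes the version is the third field of the "Apache Maven ..." line.
    maven_version=$(mvn -version 2>/dev/null | awk '/^Apache Maven/ {print $3; exit}')
    if version_at_least 3.0 "$maven_version"; then
        echo "Maven $maven_version is recent enough"
    else
        echo "Maven '$maven_version' is older than 3.0 or could not be detected" >&2
    fi
else
    echo "mvn not found on PATH" >&2
fi
```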

Downloading source code

The Sqoop project uses git as its revision control system, hosted at the Apache Software Foundation. You can clone the entire repository using the following command:

git clone https://git-wip-us.apache.org/repos/asf/sqoop.git sqoop2

Sqoop2 is currently developed in a special branch, sqoop2, that you need to check out after cloning:

cd sqoop2
git checkout sqoop2
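
The clone and checkout steps above can also be combined, since git clone accepts a --branch option that checks out the named branch right after cloning. The sketch below wraps this in a hypothetical helper, clone_branch, and leaves the actual call commented out so that nothing is downloaded until you choose to run it.

```shell
#!/bin/sh
# clone_branch URL BRANCH DIR: clone URL into DIR and check out BRANCH.
# Hypothetical helper built on the standard `git clone --branch` option.
clone_branch() {
    git clone --branch "$2" "$1" "$3"
}

# Uncomment to clone the Sqoop repository with the sqoop2 branch checked out:
# clone_branch https://git-wip-us.apache.org/repos/asf/sqoop.git sqoop2 sqoop2
```

Running git rev-parse --abbrev-ref HEAD inside the new directory afterwards should print the name of the branch that was checked out.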

Building project

You can use the usual Maven targets such as compile or package to build the project. Sqoop currently supports two major Hadoop revisions, 1.x and 2.x. Because code compiled for one Hadoop major version can't be used on another, you must compile Sqoop against the appropriate Hadoop version. You can change the target Hadoop version by specifying -Dhadoop.profile=$hadoopVersion on the Maven command line. Possible values of $hadoopVersion are 100 and 200, for Hadoop 1.x and 2.x respectively. Sqoop compiles against Hadoop 2 by default. The following example compiles Sqoop against Hadoop 1.x:

mvn compile -Dhadoop.profile=100
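
If a build script needs to pick the profile from a Hadoop version string, the mapping described above (1.x to 100, 2.x to 200) can be sketched as a small shell function. The name hadoop_profile is hypothetical and not part of Sqoop. The script only prints the resulting Maven command rather than running it.

```shell
#!/bin/sh
# hadoop_profile VERSION: print the -Dhadoop.profile value for a Hadoop
# version such as "1", "1.2.1" or "2.7.3". Hypothetical helper.
hadoop_profile() {
    case "$1" in
        1|1.*) echo 100 ;;
        2|2.*) echo 200 ;;
        *) echo "unsupported Hadoop version: $1" >&2; return 1 ;;
    esac
}

# Print (not run) the compile command for a Hadoop 1.x target.
echo "mvn compile -Dhadoop.profile=$(hadoop_profile 1.2.1)"
```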

The Maven target package can be used to create Sqoop packages similar to the ones that are officially available for download. By default, Sqoop builds only the source tarball; specify -Pbinary to build the binary distribution. You might need to explicitly specify the Hadoop version if the default does not match your environment.

mvn package -Pbinary

Running tests

Sqoop supports two sets of tests. The first set, smaller and much faster, consists of unit tests and is executed by the Maven target test. The second, larger set consists of integration tests and is executed by the Maven target integration-test. Note that integration tests might require manual steps to install various JDBC drivers into your local Maven cache.

Example for running unit tests:

mvn test

Example for running integration tests:

mvn integration-test
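
The manual JDBC driver step mentioned above is typically done with Maven's install:install-file goal, which copies a jar into the local Maven cache under the coordinates you give it. The sketch below only prints such a command. The helper name print_install_command is hypothetical, and the jar path, group ID, artifact ID and version are placeholders, since the coordinates the Sqoop integration tests expect are not stated here and must be taken from the project's build files.

```shell
#!/bin/sh
# print_install_command JAR GROUP ARTIFACT VERSION
# Hypothetical helper: prints an install:install-file command line for a
# driver jar. It does not run Maven.
print_install_command() {
    jar=$1
    group=$2
    artifact=$3
    version=$4
    echo "mvn install:install-file -Dfile=$jar -DgroupId=$group -DartifactId=$artifact -Dversion=$version -Dpackaging=jar"
}

# Placeholder values only; replace them with the real driver coordinates.
print_install_command /path/to/driver.jar com.example.jdbc example-driver 1.0
```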