Sqoop REST API Guide¶
This document explains how to use the Sqoop REST API to build applications that interact with the Sqoop server. The REST API covers all aspects of managing Sqoop jobs and lets you build an application in any programming language using JSON over HTTP.
Table of Contents
- Sqoop REST API Guide
- Initialization
- Understand Connector, Driver, Link and Job
- Objects
- Header Parameters
- REST APIs
- /version - [GET] - Get Sqoop Version
- /v1/connectors - [GET] Get all Connectors
- /v1/connector/[cname] or /v1/connector/[cid] - [GET] - Get Connector
- /v1/driver - [GET]- Get Sqoop Driver
- /v1/links/ - [GET] Get all links
- /v1/links?cname=[cname] - [GET] Get all links by Connector
- /v1/link/[lname] or /v1/link/[lid] - [GET] - Get Link
- /v1/link - [POST] - Create Link
- /v1/link/[lname] or /v1/link/[lid] - [PUT] - Update Link
- /v1/link/[lname] or /v1/link/[lid] - [DELETE] - Delete Link
- /v1/link/[lid]/enable or /v1/link/[lname]/enable - [PUT] - Enable Link
- /v1/link/[lid]/disable - [PUT] - Disable Link
- /v1/jobs/ - [GET] Get all jobs
- /v1/jobs?cname=[cname] - [GET] Get all jobs by connector
- /v1/job/[jname] or /v1/job/[jid] - [GET] - Get Job
- /v1/job - [POST] - Create Job
- /v1/job/[jid] - [PUT] - Update Job
- /v1/job/[jid] - [DELETE] - Delete Job
- /v1/job/[jid]/enable - [PUT] - Enable Job
- /v1/job/[jid]/disable - [PUT] - Disable Job
- /v1/job/[jid]/start or /v1/job/[jname]/start - [PUT]- Start Job
- /v1/job/[jid]/stop or /v1/job/[jname]/stop - [PUT]- Stop Job
- /v1/job/[jid]/status or /v1/job/[jname]/status - [GET]- Get Job Status
- /v1/submissions? - [GET] - Get all job Submissions
- /v1/submissions?jname=[jname] - [GET] - Get Submissions by Job
Initialization¶
Before continuing further, make sure that the Sqoop server is running.
Then find out the details of the Sqoop server: host, port and webapp, and keep them in mind. Note that the Sqoop server runs on Apache Tomcat. To exercise a REST API for Sqoop, assemble and send an HTTP request to the URL corresponding to that API. Generally, the URL contains the host on which the Sqoop server is running, the port on which the Sqoop server is listening, and the webapp, which is the context path at which the Sqoop server is registered in the Apache Tomcat engine.
Certain requests might need additional query parameters and post data. These can be given via the HTTP headers, the request body, or both. All content in the HTTP body is in JSON format.
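As a minimal sketch of how such a URL could be assembled, the helper below joins the host, port, webapp context path and an API path. The function name `sqoop_url` and the example values (`localhost`, `12000`, `sqoop`) are assumptions made for illustration; substitute the details of your own server.

```python
def sqoop_url(host, port, webapp, path):
    """Join host, port, webapp context path and API path into one URL."""
    # Strip stray slashes so the pieces are joined with exactly one "/".
    parts = [webapp.strip("/"), path.strip("/")]
    return "http://{}:{}/{}".format(host, port, "/".join(p for p in parts if p))


# Hypothetical server details; replace them with your own.
print(sqoop_url("localhost", 12000, "sqoop", "/v1/connectors"))
```

Keeping the URL construction in one place means the rest of a client only needs to know the API paths listed in this guide.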
Understand Connector, Driver, Link and Job¶
To create and run a Sqoop job, we need to provide config values for connecting to a data source and then processing the data in that data source. Processing might be either reading from or writing to the data source. Thus we have configurable entities, such as the From and To parts of the connectors and the driver, each of which exposes configs with one or more inputs within them.
For instance, a connector that represents a relational data source such as MySQL will expose config classes for connecting to the database. Some of the relevant inputs are the connection string, the driver class, and the username and password used to connect to the database. These configs remain the same when reading data from any of the tables within that database. Hence they are grouped under LinkConfiguration.
Each connector can support reading from and/or writing to the data source it represents. Reading from and writing to a data source are represented by From and To respectively. Specific configurations are required to perform the job of reading from or writing to the data source. These are grouped in the FromJobConfiguration and ToJobConfiguration objects of the connector.
For instance, a connector that represents a relational data source such as MySQL will expose the table name to read from or the SQL query to use while reading data as a FromJobConfiguration. Similarly, a connector that represents a data source such as HDFS will expose the output directory to write to as a ToJobConfiguration.
Objects¶
This section covers all the objects that might exist in an API request and/or API response.
Configs and Inputs¶
Before creating any link for a connector or a job with associated From and To links, first get familiar with all the configurations that the connector exposes.
Each config consists of the following information:
Field | Description |
---|---|
id | The id of this config |
inputs | An array of inputs of this config |
name | The unique name of this config per connector |
type | The type of this config (LINK/JOB) |
A typical config object is shown below:
{
id:7,
inputs:[
{
id: 25,
name: "throttlingConfig.numExtractors",
type: "INTEGER",
sensitive: false
},
{
id: 26,
name: "throttlingConfig.numLoaders",
type: "INTEGER",
sensitive: false
}
],
name: "throttlingConfig",
type: "JOB"
}
Each input object in a config is structured as follows:
Field | Description |
---|---|
id | The id of this input |
name | The unique name of this input per config |
type | The data type of this input field |
size | The length of this input field |
sensitive | Whether this input contains sensitive information |
To send a filled config in the request, always use the config id and input id to map the values to their corresponding names. For example, the following request contains an input value com.mysql.jdbc.Driver with input id 7 inside a config with id 4 that belongs to a link with id 3:
link: {
id: 3,
enabled: true,
link-config-values: [{
id: 4,
inputs: [{
id: 7,
name: "linkConfig.jdbcDriver",
value: "com.mysql.jdbc.Driver",
type: "STRING",
size: 128,
sensitive: false
}, {
id: 8,
name: "linkConfig.connectionString",
value: "jdbc%3Amysql%3A%2F%2Fmysql.ent.cloudera.com%2Fsqoop",
type: "STRING",
size: 128,
sensitive: false
},
...
}
}
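The following sketch shows one way to fill input values into a config before sending it. It assumes that values are percent-encoded, as the connection string in the example above appears to be; the helper name `fill_inputs` and the trimmed-down config are hypothetical.

```python
from urllib.parse import quote


def fill_inputs(config, values):
    """Return a copy of `config` with `value` set on the inputs named in `values`."""
    filled = dict(config)
    filled["inputs"] = []
    for item in config["inputs"]:
        item = dict(item)  # copy so the original config stays untouched
        if item["name"] in values:
            # Assumption: values are percent-encoded like the strings in the examples.
            item["value"] = quote(str(values[item["name"]]), safe="")
        filled["inputs"].append(item)
    return filled


# A trimmed-down config with the ids and names from the example above.
link_config = {
    "id": 4,
    "inputs": [
        {"id": 7, "name": "linkConfig.jdbcDriver", "type": "STRING", "size": 128, "sensitive": False},
        {"id": 8, "name": "linkConfig.connectionString", "type": "STRING", "size": 128, "sensitive": False},
    ],
}
filled = fill_inputs(link_config, {
    "linkConfig.jdbcDriver": "com.mysql.jdbc.Driver",
    "linkConfig.connectionString": "jdbc:mysql://mysql.ent.cloudera.com/sqoop",
})
print(filled["inputs"][1]["value"])
```

The config and input ids are carried over unchanged, so the server can map each value back to its input.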
Exception Response¶
Each operation on the Sqoop server might return an exception in the HTTP response. Remember to take this into account. The exception code and message can be found in both the header and the body of the response.
Please jump to “Header Parameters” section to find how to get exception information from header.
In the body, the exception is expressed in JSON format. An example of the exception is:
{
"message":"DERBYREPO_0030:Unable to load specific job metadata from repository - Couldn't find job with id 2",
"stack-trace":[
{
"file":"DerbyRepositoryHandler.java",
"line":1111,
"class":"org.apache.sqoop.repository.derby.DerbyRepositoryHandler",
"method":"findJob"
},
{
"file":"JdbcRepository.java",
"line":451,
"class":"org.apache.sqoop.repository.JdbcRepository$16",
"method":"doIt"
},
{
"file":"JdbcRepository.java",
"line":90,
"class":"org.apache.sqoop.repository.JdbcRepository",
"method":"doWithConnection"
},
{
"file":"JdbcRepository.java",
"line":61,
"class":"org.apache.sqoop.repository.JdbcRepository",
"method":"doWithConnection"
},
{
"file":"JdbcRepository.java",
"line":448,
"class":"org.apache.sqoop.repository.JdbcRepository",
"method":"findJob"
},
{
"file":"JobRequestHandler.java",
"line":238,
"class":"org.apache.sqoop.handler.JobRequestHandler",
"method":"getJobs"
}
],
"class":"org.apache.sqoop.common.SqoopException"
}
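A client will usually want a compact summary of such a body rather than the full trace. The sketch below is one possible approach; it assumes the message has the `CODE:explanation` shape seen in the example, and the helper name `summarize_exception` is hypothetical.

```python
def summarize_exception(body):
    """Split a Sqoop exception body into error code, text and top stack frame."""
    message = body.get("message", "")
    # Assumption: messages look like "CODE:explanation"; split on the first colon only.
    code, sep, text = message.partition(":")
    if not sep:
        code, text = "", message
    frames = body.get("stack-trace") or []
    location = None
    if frames:
        top = frames[0]
        location = "{}.{}({}:{})".format(top["class"], top["method"], top["file"], top["line"])
    return {"code": code, "text": text, "class": body.get("class"), "location": location}


# The first frame of the example above is enough to demonstrate the helper.
body = {
    "message": "DERBYREPO_0030:Unable to load specific job metadata from repository - Couldn't find job with id 2",
    "stack-trace": [
        {
            "file": "DerbyRepositoryHandler.java",
            "line": 1111,
            "class": "org.apache.sqoop.repository.derby.DerbyRepositoryHandler",
            "method": "findJob",
        }
    ],
    "class": "org.apache.sqoop.common.SqoopException",
}
print(summarize_exception(body)["code"])
```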
Config and Input Validation Status Response¶
The configs and inputs associated with the connectors also provide custom validation rules for the values given to these input fields. Sqoop applies these custom validators and their corresponding validation logic when config values for the LINK and JOB are posted.
An example of an OK status with the persisted ID:
{
"id": 3,
"validation-result": [
{}
]
}
An example of ERROR status:
{
"validation-result": [
{
"linkConfig": [
{
"message": "Invalid URI. URI must either be null or a valid URI. Here are a few valid example URIs: hdfs://example.com:8020/, hdfs://example.com/, file:///, file:///tmp, file://localhost/tmp",
"status": "ERROR"
}
]
}
]
}
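To decide whether a create or update call went through, a client can scan the `validation-result` array for entries with an ERROR status. The sketch below assumes the two response shapes shown above are representative; the helper name `validation_messages` is hypothetical.

```python
def validation_messages(response, status="ERROR"):
    """Collect (config name, message) pairs that carry the given status."""
    found = []
    for entry in response.get("validation-result", []):
        # Each entry maps a config name to a list of {message, status} objects;
        # an empty object means nothing was reported for it.
        for config_name, results in entry.items():
            for result in results:
                if result.get("status") == status:
                    found.append((config_name, result.get("message")))
    return found


ok_response = {"id": 3, "validation-result": [{}]}
error_response = {
    "validation-result": [
        {"linkConfig": [{"message": "Invalid URI.", "status": "ERROR"}]}
    ]
}
print(validation_messages(ok_response))
print(validation_messages(error_response))
```

An empty list means no entry with the requested status was found, which for the OK example is the expected outcome.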
Job Submission Status Response¶
After starting a job, you can look up its running status. There are 7 possible statuses:
Status | Description |
---|---|
BOOTING | In the middle of submitting the job |
FAILURE_ON_SUBMIT | Unable to submit this job to remote cluster |
RUNNING | The job is running now |
SUCCEEDED | Job finished successfully |
FAILED | Job failed |
NEVER_EXECUTED | The job has never been executed since created |
UNKNOWN | The status is unknown |
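Client code often needs to know whether a status means "keep waiting" or "done". The grouping below is an assumption derived from the descriptions in the table, not something the server reports; the names `IN_PROGRESS`, `FAILURES` and `classify` are hypothetical.

```python
# Assumed grouping of the seven statuses, based on their descriptions above.
IN_PROGRESS = frozenset({"BOOTING", "RUNNING"})
FAILURES = frozenset({"FAILURE_ON_SUBMIT", "FAILED"})
ALL_STATUSES = IN_PROGRESS | FAILURES | {"SUCCEEDED", "NEVER_EXECUTED", "UNKNOWN"}


def classify(status):
    """Map a submission status to a coarse category for client logic."""
    if status not in ALL_STATUSES:
        raise ValueError("unexpected status: {!r}".format(status))
    if status in IN_PROGRESS:
        return "in progress"
    if status in FAILURES:
        return "failure"
    if status == "SUCCEEDED":
        return "success"
    # NEVER_EXECUTED and UNKNOWN carry no result either way.
    return "no result"


print(classify("RUNNING"))
```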
Header Parameters¶
For all Sqoop requests, the following header parameters are supported:
Parameter | Required | Description |
---|---|---|
sqoop-user-name | true | The name of the user who makes the requests |
For all the responses, the following parameters in the HTTP message header are available:
Parameter | Required | Description |
---|---|---|
sqoop-error-code | false | The error code when an error happens on the server side for this request |
sqoop-error-message | false | The explanation for an error code |
So far, these are the only 2 parameters in the header of the response message. They exist only when something goes wrong on the server, and they always come along with an exception message in the response body.
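The sketch below shows how these headers might be handled with Python's standard library: one helper attaches `sqoop-user-name` to a request object, the other reads the two error headers from a response header mapping. Nothing is sent over the network here. The helper names and the URL are assumptions made for illustration.

```python
import urllib.request


def build_request(url, user_name, method="GET"):
    """Create a request object that carries the sqoop-user-name header."""
    return urllib.request.Request(url, headers={"sqoop-user-name": user_name}, method=method)


def error_from_headers(headers):
    """Return (code, message) if the response headers report an error, else None."""
    # HTTP header names are case-insensitive, so compare them in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    code = lowered.get("sqoop-error-code")
    if code is None:
        return None
    return code, lowered.get("sqoop-error-message")


# Hypothetical URL; the request is only constructed, not sent.
request = build_request("http://localhost:12000/sqoop/version", "root")
print(request.get_method(), request.full_url)
```

The request object can then be passed to `urllib.request.urlopen`, and the response headers to `error_from_headers`.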
REST APIs¶
This section describes all the REST APIs that are supported by the Sqoop server.
/version - [GET] - Get Sqoop Version¶
Get all the version metadata of the Sqoop software on the server side.
- Method: GET
- Format: JSON
- Request Content: None
- Fields of Response:
Field | Description |
---|---|
source-revision | The revision number of Sqoop source code |
api-versions | The version of network protocol |
build-date | The Sqoop release date |
user | The user who made the release |
source-url | The url of the source code trunk |
build-version | The version of Sqoop on the server side |
- Response Example:
{
source-url: "git://vbasavaraj.local/Users/vbasavaraj/Projects/SqoopRefactoring/sqoop2/common",
source-revision: "418c5f637c3f09b94ea7fc3b0a4610831373a25f",
build-version: "2.0.0-SNAPSHOT",
api-versions: [
"v1"
],
user: "vbasavaraj",
build-date: "Mon Nov 3 08:18:21 PST 2014"
}
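A client could use this response to check, before issuing any other call, that the server offers the API version it was written against. A minimal sketch, with a hypothetical helper name and a reduced copy of the response above:

```python
def supports_api(version_info, api_version="v1"):
    """Return True if the /version response lists the given API version."""
    return api_version in version_info.get("api-versions", [])


# Reduced copy of the response example above.
version_info = {"build-version": "2.0.0-SNAPSHOT", "api-versions": ["v1"]}
if not supports_api(version_info):
    raise RuntimeError("server does not offer the v1 REST API")
```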
/v1/connectors - [GET] Get all Connectors¶
Get all the connectors registered in Sqoop
- Method: GET
- Format: JSON
- Request Content: None
- Response Example
{
connectors: [{
id: 1,
link-config: [],
job-config: {},
name: "hdfs-connector",
class: "org.apache.sqoop.connector.hdfs.HdfsConnector",
all-config-resources: {},
version: "2.0.0-SNAPSHOT"
}, {
id: 2,
link-config: [],
job-config: {},
name: "generic-jdbc-connector",
class: "org.apache.sqoop.connector.jdbc.GenericJdbcConnector",
all-config-resources: {},
version: "2.0.0-SNAPSHOT"
}]
}
/v1/connector/[cname] or /v1/connector/[cid] - [GET] - Get Connector¶
Provide the id or unique name of the connector in the url [cid] or [cname] part.
- Method: GET
- Format: JSON
- Request Content: None
- Fields of Response:
Field | Description |
---|---|
id | The id for the connector ( registered as a configurable ) |
job-config | Connector job config and inputs for both FROM and TO |
link-config | Connector link config and inputs |
all-config-resources | All config inputs labels and description for the given connector |
version | The build version required for config and input data upgrades |
- Response Example:
{
connector: {
id: 1,
job-config: {
TO: [{
id: 3,
inputs: [{
id: 3,
values: "TEXT_FILE,SEQUENCE_FILE",
name: "toJobConfig.outputFormat",
type: "ENUM",
sensitive: false
}, {
id: 4,
values: "NONE,DEFAULT,DEFLATE,GZIP,BZIP2,LZO,LZ4,SNAPPY,CUSTOM",
name: "toJobConfig.compression",
type: "ENUM",
sensitive: false
}, {
id: 5,
name: "toJobConfig.customCompression",
type: "STRING",
size: 255,
sensitive: false
}, {
id: 6,
name: "toJobConfig.outputDirectory",
type: "STRING",
size: 255,
sensitive: false
}],
name: "toJobConfig",
type: "JOB"
}],
FROM: [{
id: 2,
inputs: [{
id: 2,
name: "fromJobConfig.inputDirectory",
type: "STRING",
size: 255,
sensitive: false
}],
name: "fromJobConfig",
type: "JOB"
}]
},
link-config: [{
id: 1,
inputs: [{
id: 1,
name: "linkConfig.uri",
type: "STRING",
size: 255,
sensitive: false
}],
name: "linkConfig",
type: "LINK"
}],
name: "hdfs-connector",
class: "org.apache.sqoop.connector.hdfs.HdfsConnector",
all-config-resources: {
fromJobConfig.label: "From Job configuration",
toJobConfig.ignored.label: "Ignored",
fromJobConfig.help: "Specifies information required to get data from Hadoop ecosystem",
toJobConfig.ignored.help: "This value is ignored",
toJobConfig.label: "ToJob configuration",
toJobConfig.storageType.label: "Storage type",
fromJobConfig.inputDirectory.label: "Input directory",
toJobConfig.outputFormat.label: "Output format",
toJobConfig.outputDirectory.label: "Output directory",
toJobConfig.outputDirectory.help: "Output directory for final data",
toJobConfig.compression.help: "Compression that should be used for the data",
toJobConfig.outputFormat.help: "Format in which data should be serialized",
toJobConfig.customCompression.label: "Custom compression format",
toJobConfig.compression.label: "Compression format",
linkConfig.label: "Link configuration",
toJobConfig.customCompression.help: "Full class name of the custom compression",
toJobConfig.storageType.help: "Target on Hadoop ecosystem where to store data",
linkConfig.help: "Here you supply information necessary to connect to HDFS",
linkConfig.uri.help: "HDFS URI used to connect to HDFS",
linkConfig.uri.label: "HDFS URI",
fromJobConfig.inputDirectory.help: "Directory that should be exported",
toJobConfig.help: "You must supply the information requested in order to get information where you want to store your data."
},
version: "2.0.0-SNAPSHOT"
}
}
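Since the connector response is the place to discover which inputs have to be filled, a small helper that lists the input names per section can be handy. The sketch below assumes the response layout shown above; `input_names` is a hypothetical name and the sample connector is trimmed to a few fields.

```python
def input_names(connector):
    """List input names grouped by LINK, FROM and TO for one connector object."""
    names = {"LINK": [], "FROM": [], "TO": []}
    for config in connector.get("link-config", []):
        names["LINK"].extend(item["name"] for item in config["inputs"])
    # job-config maps a direction (FROM or TO) to a list of configs.
    for direction, configs in connector.get("job-config", {}).items():
        for config in configs:
            names.setdefault(direction, []).extend(item["name"] for item in config["inputs"])
    return names


# Trimmed copy of the hdfs-connector example above.
connector = {
    "id": 1,
    "name": "hdfs-connector",
    "link-config": [{"id": 1, "name": "linkConfig", "inputs": [{"id": 1, "name": "linkConfig.uri"}]}],
    "job-config": {
        "FROM": [{"id": 2, "name": "fromJobConfig", "inputs": [{"id": 2, "name": "fromJobConfig.inputDirectory"}]}],
        "TO": [{"id": 3, "name": "toJobConfig", "inputs": [{"id": 6, "name": "toJobConfig.outputDirectory"}]}],
    },
}
print(input_names(connector))
```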
/v1/driver - [GET]- Get Sqoop Driver¶
Driver exposes configurations required for the job execution.
- Method: GET
- Format: JSON
- Request Content: None
- Fields of Response:
Field | Description |
---|---|
id | The id for the driver ( registered as a configurable ) |
job-config | Driver job config and inputs |
version | The build version of the driver |
all-config-resources | Driver exposed config and input labels and description |
- Response Example:
{
id: 3,
job-config: [{
id: 7,
inputs: [{
id: 25,
name: "throttlingConfig.numExtractors",
type: "INTEGER",
sensitive: false
}, {
id: 26,
name: "throttlingConfig.numLoaders",
type: "INTEGER",
sensitive: false
}],
name: "throttlingConfig",
type: "JOB"
}],
all-config-resources: {
throttlingConfig.numExtractors.label: "Extractors",
throttlingConfig.numLoaders.help: "Number of loaders that Sqoop will use",
throttlingConfig.numLoaders.label: "Loaders",
throttlingConfig.label: "Throttling resources",
throttlingConfig.numExtractors.help: "Number of extractors that Sqoop will use",
throttlingConfig.help: "Set throttling boundaries to not overload your systems"
},
version: "1"
}
/v1/links/ - [GET] Get all links¶
Get all the links created in Sqoop
- Method: GET
- Format: JSON
- Request Content: None
- Response Example
{
links: [
{
id: 1,
enabled: true,
update-user: "root",
link-config-values: [],
name: "First Link",
creation-date: 1415309361756,
connector-id: 1,
update-date: 1415309361756,
creation-user: "root"
},
{
id: 2,
enabled: true,
update-user: "root",
link-config-values: [],
name: "Second Link",
creation-date: 1415309390807,
connector-id: 2,
update-date: 1415309390807,
creation-user: "root"
}
]
}
/v1/links?cname=[cname] - [GET] Get all links by Connector¶
Get all the links for a given connector identified by [cname] part.
/v1/link/[lname] or /v1/link/[lid] - [GET] - Get Link¶
Provide the id or unique name of the link in the url [lid] or [lname] part.
Get all the details of the link including the id, name, type and the corresponding config input values for the link
- Method: GET
- Format: JSON
- Request Content: None
- Response Example:
{
link: {
id: 1,
enabled: true,
link-config-values: [{
id: 1,
inputs: [{
id: 1,
name: "linkConfig.uri",
value: "hdfs%3A%2F%2Fnamenode%3A8090",
type: "STRING",
size: 255,
sensitive: false
}],
name: "linkConfig",
type: "LINK"
}],
update-user: "root",
name: "First Link",
creation-date: 1415287846371,
connector-id: 1,
update-date: 1415287846371,
creation-user: "root"
}
}
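The input values in this response are percent-encoded. A minimal sketch of turning a link object into a plain name-to-value mapping follows; the helper name `link_values` is hypothetical, and the sample is a reduced copy of the response above.

```python
from urllib.parse import unquote


def link_values(link):
    """Return {input name: decoded value} for every input that has a value."""
    values = {}
    for config in link.get("link-config-values", []):
        for item in config.get("inputs", []):
            if "value" in item:
                # Undo the percent-encoding seen in the response example.
                values[item["name"]] = unquote(item["value"])
    return values


link = {
    "id": 1,
    "name": "First Link",
    "link-config-values": [
        {"id": 1, "inputs": [{"id": 1, "name": "linkConfig.uri", "value": "hdfs%3A%2F%2Fnamenode%3A8090"}]}
    ],
}
print(link_values(link))
```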
/v1/link - [POST] - Create Link¶
Create a new link object. Provide values to the link config inputs for the ones that are required.
- Method: POST
- Format: JSON
- Fields of Request:
Field | Description |
---|---|
link | The root of the post data in JSON |
id | The id of the link can be left blank in the post data |
enabled | Whether to enable this link (true/false) |
update-date | The last updated time of this link |
creation-date | The creation time of this link |
update-user | The user who updated this link |
creation-user | The user who created this link |
name | The name of this link |
link-config-values | Config input values for link config for the corresponding connector |
connector-id | The id of the connector used for this link |
- Request Example:
{
link: {
id: -1,
enabled: true,
link-config-values: [{
id: 1,
inputs: [{
id: 1,
name: "linkConfig.uri",
value: "hdfs%3A%2F%2Fvbsqoop-1.ent.cloudera.com%3A8020%2Fuser%2Froot%2Fjob1",
type: "STRING",
size: 255,
sensitive: false
}],
name: "testInput",
type: "LINK"
}],
update-user: "root",
name: "testLink",
creation-date: 1415202223048,
connector-id: 1,
update-date: 1415202223048,
creation-user: "root"
}
}
- Fields of Response:
Field | Description |
---|---|
id | The id assigned to the newly created link |
validation-result | The validation status for the link config inputs given in the post data |
- ERROR Response Example:
{
"validation-result": [
{
"linkConfig": [
{
"message": "Invalid URI. URI must either be null or a valid URI. Here are a few valid example URIs: hdfs://example.com:8020/, hdfs://example.com/, file:///, file:///tmp, file://localhost/tmp",
"status": "ERROR"
}
]
}
]
}
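Putting the request fields together, a Create Link call could be prepared as sketched below. The body mirrors the request example above, with `id` set to -1 as in that example; whether the server needs every one of these fields is not stated here, so treat the field set as an assumption. The function name `create_link_request` and the base URL are hypothetical, and the request is only built, not sent.

```python
import json
import time
import urllib.request


def create_link_request(base_url, user, name, connector_id, config_values):
    """Build (but do not send) a POST request that creates a link."""
    now = int(time.time() * 1000)  # epoch milliseconds, like the timestamps in the example
    payload = {
        "link": {
            "id": -1,  # placeholder id, as in the request example
            "enabled": True,
            "name": name,
            "connector-id": connector_id,
            "link-config-values": config_values,
            "creation-user": user,
            "creation-date": now,
            "update-user": user,
            "update-date": now,
        }
    }
    return urllib.request.Request(
        base_url + "/v1/link",
        data=json.dumps(payload).encode("utf-8"),
        headers={"sqoop-user-name": user},
        method="POST",
    )


# Hypothetical base URL; config_values would normally come from the connector's link-config.
request = create_link_request("http://localhost:12000/sqoop", "root", "testLink", 1, [])
print(request.get_method(), request.full_url)
```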
/v1/link/[lname] or /v1/link/[lid] - [PUT] - Update Link¶
Update an existing link object with name [lname] or id [lid]. To make filling in the inputs easier, the general practice is to get the link first and then change some of the input values.
- Method: PUT
- Format: JSON
- OK Response Example:
{
"validation-result": [
{}
]
}
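The get-then-modify practice described for this call can be sketched as follows. The code assumes that the PUT body has the same shape as the object returned by Get Link, and that values are percent-encoded as in the examples; neither is confirmed by this guide. The function name `update_link_request` and the base URL are hypothetical, and the request is only built, not sent.

```python
import copy
import json
import urllib.request
from urllib.parse import quote


def update_link_request(base_url, user, fetched, new_values):
    """Build a PUT request from a fetched link with some input values changed."""
    body = copy.deepcopy(fetched)  # leave the fetched object untouched
    link = body["link"]
    for config in link["link-config-values"]:
        for item in config["inputs"]:
            if item["name"] in new_values:
                # Assumption: values are percent-encoded like the ones in the examples.
                item["value"] = quote(new_values[item["name"]], safe="")
    url = "{}/v1/link/{}".format(base_url, link["id"])
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"sqoop-user-name": user},
        method="PUT",
    )


fetched = {
    "link": {
        "id": 1,
        "name": "First Link",
        "link-config-values": [
            {"id": 1, "inputs": [{"id": 1, "name": "linkConfig.uri", "value": "hdfs%3A%2F%2Fnamenode%3A8090"}]}
        ],
    }
}
request = update_link_request(
    "http://localhost:12000/sqoop", "root", fetched, {"linkConfig.uri": "hdfs://namenode:8020"}
)
print(request.get_method(), request.full_url)
```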
/v1/link/[lname] or /v1/link/[lid] - [DELETE] - Delete Link¶
Delete a link with name [lname] or id [lid]
- Method: DELETE
- Format: JSON
- Request Content: None
- Response Content: None
/v1/link/[lid]/enable or /v1/link/[lname]/enable - [PUT] - Enable Link¶
Enable a link with id lid or name lname
- Method: PUT
- Format: JSON
- Request Content: None
- Response Content: None
/v1/link/[lid]/disable - [PUT] - Disable Link¶
Disable a link with id lid or name lname
- Method: PUT
- Format: JSON
- Request Content: None
- Response Content: None
/v1/jobs/ - [GET] Get all jobs¶
Get all the jobs created in Sqoop
- Method: GET
- Format: JSON
- Request Content: None
- Response Example:
{
jobs: [{
driver-config-values: [],
enabled: true,
from-connector-id: 1,
update-user: "root",
to-config-values: [],
to-connector-id: 2,
creation-date: 1415310157618,
update-date: 1415310157618,
creation-user: "root",
id: 1,
to-link-id: 2,
from-config-values: [],
name: "First Job",
from-link-id: 1
},{
driver-config-values: [],
enabled: true,
from-connector-id: 2,
update-user: "root",
to-config-values: [],
to-connector-id: 1,
creation-date: 1415310650600,
update-date: 1415310650600,
creation-user: "root",
id: 2,
to-link-id: 1,
from-config-values: [],
name: "Second Job",
from-link-id: 2
}]
}
/v1/jobs?cname=[cname] - [GET] Get all jobs by connector¶
Get all the jobs for a given connector identified by [cname] part.
/v1/job/[jname] or /v1/job/[jid] - [GET] - Get Job¶
Provide the name or the id of the job in the url [jname] part or [jid] part.
- Method: GET
- Format: JSON
- Request Content: None
- Response Example:
{
job: {
driver-config-values: [{
id: 7,
inputs: [{
id: 25,
name: "throttlingConfig.numExtractors",
value: "3",
type: "INTEGER",
sensitive: false
}, {
id: 26,
name: "throttlingConfig.numLoaders",
value: "3",
type: "INTEGER",
sensitive: false
}],
name: "throttlingConfig",
type: "JOB"
}],
enabled: true,
from-connector-id: 1,
update-user: "root",
to-config-values: [{
id: 6,
inputs: [{
id: 19,
name: "toJobConfig.schemaName",
type: "STRING",
size: 50,
sensitive: false
}, {
id: 20,
name: "toJobConfig.tableName",
value: "text",
type: "STRING",
size: 2000,
sensitive: false
}, {
id: 21,
name: "toJobConfig.sql",
type: "STRING",
size: 50,
sensitive: false
}, {
id: 22,
name: "toJobConfig.columns",
type: "STRING",
size: 50,
sensitive: false
}, {
id: 23,
name: "toJobConfig.stageTableName",
type: "STRING",
size: 2000,
sensitive: false
}, {
id: 24,
name: "toJobConfig.shouldClearStageTable",
type: "BOOLEAN",
sensitive: false
}],
name: "toJobConfig",
type: "JOB"
}],
to-connector-id: 2,
creation-date: 1415310157618,
update-date: 1415310157618,
creation-user: "root",
id: 1,
to-link-id: 2,
from-config-values: [{
id: 2,
inputs: [{
id: 2,
name: "fromJobConfig.inputDirectory",
value: "hdfs%3A%2F%2Fvbsqoop-1.ent.cloudera.com%3A8020%2Fuser%2Froot%2Fjob1",
type: "STRING",
size: 255,
sensitive: false
}],
name: "fromJobConfig",
type: "JOB"
}],
name: "First Job",
from-link-id: 1
}
}
/v1/job - [POST] - Create Job¶
Create a new job object with the corresponding config values.
- Method: POST
- Format: JSON
- Fields of Request:
Field | Description |
---|---|
job | The root of the post data in JSON |
from-link-id | The id of the from link for the job |
to-link-id | The id of the to link for the job |
id | The id of the job can be left blank in the post data |
enabled | Whether to enable this job (true/false) |
update-date | The last updated time of this job |
creation-date | The creation time of this job |
update-user | The user who updated this job |
creation-user | The user who created this job |
name | The name of this job |
from-config-values | Config input values for FROM part of the job |
to-config-values | Config input values for TO part of the job |
driver-config-values | Config input values for driver |
connector-id | The id of the connector used for this link |
- Request Example:
{
job: {
driver-config-values: [
{
id: 7,
inputs: [
{
id: 25,
name: "throttlingConfig.numExtractors",
value: "3",
type: "INTEGER",
sensitive: false
},
{
id: 26,
name: "throttlingConfig.numLoaders",
value: "3",
type: "INTEGER",
sensitive: false
}
],
name: "throttlingConfig",
type: "JOB"
}
],
enabled: true,
from-connector-id: 1,
update-user: "root",
to-config-values: [
{
id: 6,
inputs: [
{
id: 19,
name: "toJobConfig.schemaName",
type: "STRING",
size: 50,
sensitive: false
},
{
id: 20,
name: "toJobConfig.tableName",
value: "text",
type: "STRING",
size: 2000,
sensitive: false
},
{
id: 21,
name: "toJobConfig.sql",
type: "STRING",
size: 50,
sensitive: false
},
{
id: 22,
name: "toJobConfig.columns",
type: "STRING",
size: 50,
sensitive: false
},
{
id: 23,
name: "toJobConfig.stageTableName",
type: "STRING",
size: 2000,
sensitive: false
},
{
id: 24,
name: "toJobConfig.shouldClearStageTable",
type: "BOOLEAN",
sensitive: false
}
],
name: "toJobConfig",
type: "JOB"
}
],
to-connector-id: 2,
creation-date: 1415310157618,
update-date: 1415310157618,
creation-user: "root",
id: -1,
to-link-id: 2,
from-config-values: [
{
id: 2,
inputs: [
{
id: 2,
name: "fromJobConfig.inputDirectory",
value: "hdfs%3A%2F%2Fvbsqoop-1.ent.cloudera.com%3A8020%2Fuser%2Froot%2Fjob1",
type: "STRING",
size: 255,
sensitive: false
}
],
name: "fromJobConfig",
type: "JOB"
}
],
name: "Test Job",
from-link-id: 1
}
}
- Fields of Response:
Field | Description |
---|---|
id | The id assigned to the newly created job |
validation-result | The validation status for the job config and driver config inputs in the post data |
- ERROR Response Example:
{
"validation-result": [
{
"linkConfig": [
{
"message": "Invalid URI. URI must either be null or a valid URI. Here are a few valid example URIs: hdfs://example.com:8020/, hdfs://example.com/, file:///, file:///tmp, file://localhost/tmp",
"status": "ERROR"
}
]
}
]
}
/v1/job/[jid] - [PUT] - Update Job¶
Update an existing job object with id [jid]. To make filling in the inputs easier, the general practice is to get the existing job object first and then change some of the inputs.
- Method: PUT
- Format: JSON
The request content is the same as for Create Job.
- OK Response Example:
{
"validation-result": [
{}
]
}
/v1/job/[jid] - [DELETE] - Delete Job¶
Delete a job with id jid.
- Method: DELETE
- Format: JSON
- Request Content: None
- Response Content: None
/v1/job/[jid]/enable - [PUT] - Enable Job¶
Enable a job with id jid.
- Method: PUT
- Format: JSON
- Request Content: None
- Response Content: None
/v1/job/[jid]/disable - [PUT] - Disable Job¶
Disable a job with id jid.
- Method: PUT
- Format: JSON
- Request Content: None
- Response Content: None
/v1/job/[jid]/start or /v1/job/[jname]/start - [PUT]- Start Job¶
Start a job with name [jname] or with id [jid] to trigger the job execution
- Method: PUT
- Format: JSON
- Request Content: None
- Response Content: Submission Record
- BOOTING Response Example
{
"submission": {
"progress": -1,
"last-update-date": 1415312531188,
"external-id": "job_1412137947693_0004",
"status": "BOOTING",
"job": 2,
"creation-date": 1415312531188,
"to-schema": {
"created": 1415312531426,
"name": "HDFS file",
"columns": []
},
"external-link": "http://vbsqoop-1.ent.cloudera.com:8088/proxy/application_1412137947693_0004/",
"from-schema": {
"created": 1415312531342,
"name": "text",
"columns": [
{
"name": "id",
"nullable": true,
"unsigned": null,
"type": "FIXED_POINT",
"size": null
},
{
"name": "txt",
"nullable": true,
"type": "TEXT",
"size": null
}
]
}
}
}
- SUCCEEDED Response Example
{
submission: {
progress: -1,
last-update-date: 1415312809485,
external-id: "job_1412137947693_0004",
status: "SUCCEEDED",
job: 2,
creation-date: 1415312531188,
external-link: "http://vbsqoop-1.ent.cloudera.com:8088/proxy/application_1412137947693_0004/",
counters: {
org.apache.hadoop.mapreduce.JobCounter: {
SLOTS_MILLIS_MAPS: 373553,
MB_MILLIS_MAPS: 382518272,
TOTAL_LAUNCHED_MAPS: 10,
MILLIS_MAPS: 373553,
VCORES_MILLIS_MAPS: 373553,
OTHER_LOCAL_MAPS: 10
},
org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter: {
BYTES_WRITTEN: 0
},
org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter: {
BYTES_READ: 0
},
org.apache.hadoop.mapreduce.TaskCounter: {
MAP_INPUT_RECORDS: 0,
MERGED_MAP_OUTPUTS: 0,
PHYSICAL_MEMORY_BYTES: 4065599488,
SPILLED_RECORDS: 0,
COMMITTED_HEAP_BYTES: 3439853568,
CPU_MILLISECONDS: 236900,
FAILED_SHUFFLE: 0,
VIRTUAL_MEMORY_BYTES: 15231422464,
SPLIT_RAW_BYTES: 1187,
MAP_OUTPUT_RECORDS: 1000000,
GC_TIME_MILLIS: 7282
},
org.apache.hadoop.mapreduce.FileSystemCounter: {
FILE_WRITE_OPS: 0,
FILE_READ_OPS: 0,
FILE_LARGE_READ_OPS: 0,
FILE_BYTES_READ: 0,
HDFS_BYTES_READ: 1187,
FILE_BYTES_WRITTEN: 1191230,
HDFS_LARGE_READ_OPS: 0,
HDFS_WRITE_OPS: 10,
HDFS_READ_OPS: 10,
HDFS_BYTES_WRITTEN: 276389736
},
org.apache.sqoop.submission.counter.SqoopCounters: {
ROWS_READ: 1000000
}
}
}
}
- ERROR Response Example
{
"submission": {
"progress": -1,
"last-update-date": 1415312390570,
"status": "FAILURE_ON_SUBMIT",
"exception": "org.apache.sqoop.common.SqoopException: GENERIC_HDFS_CONNECTOR_0000:Error occurs during partitioner run",
"job": 1,
"creation-date": 1415312390570,
"to-schema": {
"created": 1415312390797,
"name": "text",
"columns": [
{
"name": "id",
"nullable": true,
"unsigned": null,
"type": "FIXED_POINT",
"size": null
},
{
"name": "txt",
"nullable": true,
"type": "TEXT",
"size": null
}
]
},
"from-schema": {
"created": 1415312390778,
"name": "HDFS file",
"columns": [
]
},
"exception-trace": "org.apache.sqoop.common.SqoopException: GENERIC_HDFS_CONNECTOR_00"
}
}
/v1/job/[jid]/stop or /v1/job/[jname]/stop - [PUT]- Stop Job¶
Stop a job with name [jname] or with id [jid] to abort the running job.
- Method: PUT
- Format: JSON
- Request Content: None
- Response Content: Submission Record
/v1/job/[jid]/status or /v1/job/[jname]/status - [GET]- Get Job Status¶
Get the status of the running job with name [jname] or with id [jid].
- Method: GET
- Format: JSON
- Request Content: None
- Response Content: Submission Record
{
"submission": {
"progress": 0.25,
"last-update-date": 1415312603838,
"external-id": "job_1412137947693_0004",
"status": "RUNNING",
"job": 2,
"creation-date": 1415312531188,
"external-link": "http://vbsqoop-1.ent.cloudera.com:8088/proxy/application_1412137947693_0004/"
}
}
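A typical use of the status call is to poll it until the job leaves the BOOTING and RUNNING states. The sketch below keeps the HTTP call out of the loop by taking a `fetch_submission` callable, which is expected to return the parsed JSON of a status response; that callable, the function name `wait_for_job` and the default intervals are all assumptions made for illustration.

```python
import time


def wait_for_job(fetch_submission, poll_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Poll the job status until it is no longer BOOTING or RUNNING."""
    for _ in range(max_polls):
        submission = fetch_submission()["submission"]
        if submission["status"] not in ("BOOTING", "RUNNING"):
            return submission
        sleep(poll_seconds)
    raise TimeoutError("job still running after {} polls".format(max_polls))


# Canned responses stand in for real status calls in this example.
responses = iter([
    {"submission": {"status": "BOOTING", "progress": -1, "job": 2}},
    {"submission": {"status": "RUNNING", "progress": 0.25, "job": 2}},
    {"submission": {"status": "SUCCEEDED", "progress": -1, "job": 2}},
])
final = wait_for_job(lambda: next(responses), sleep=lambda seconds: None)
print(final["status"])
```

Passing `sleep` in as a parameter keeps the loop easy to test without real waiting.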
/v1/submissions? - [GET] - Get all job Submissions¶
Get all the submissions for every job started in Sqoop.
/v1/submissions?jname=[jname] - [GET] - Get Submissions by Job¶
Retrieve all job submissions in the past for the given job. Each submission record will have details such as the status, counters and urls for those submissions.
Provide the name of the job in the url [jname] part.
- Method: GET
- Format: JSON
- Request Content: None
- Fields of Response:
Field | Description |
---|---|
progress | The progress of the running Sqoop job |
job | The id of the Sqoop job |
creation-date | The submission timestamp |
last-update-date | The timestamp of the last status update |
status | The status of this job submission |
external-id | The job id of Sqoop job running on Hadoop |
external-link | The link to track the job status on Hadoop |
- Response Example:
{
submissions: [
{
progress: -1,
last-update-date: 1415312809485,
external-id: "job_1412137947693_0004",
status: "SUCCEEDED",
job: 2,
creation-date: 1415312531188,
external-link: "http://vbsqoop-1.ent.cloudera.com:8088/proxy/application_1412137947693_0004/",
counters: {
org.apache.hadoop.mapreduce.JobCounter: {
SLOTS_MILLIS_MAPS: 373553,
MB_MILLIS_MAPS: 382518272,
TOTAL_LAUNCHED_MAPS: 10,
MILLIS_MAPS: 373553,
VCORES_MILLIS_MAPS: 373553,
OTHER_LOCAL_MAPS: 10
},
org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter: {
BYTES_WRITTEN: 0
},
org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter: {
BYTES_READ: 0
},
org.apache.hadoop.mapreduce.TaskCounter: {
MAP_INPUT_RECORDS: 0,
MERGED_MAP_OUTPUTS: 0,
PHYSICAL_MEMORY_BYTES: 4065599488,
SPILLED_RECORDS: 0,
COMMITTED_HEAP_BYTES: 3439853568,
CPU_MILLISECONDS: 236900,
FAILED_SHUFFLE: 0,
VIRTUAL_MEMORY_BYTES: 15231422464,
SPLIT_RAW_BYTES: 1187,
MAP_OUTPUT_RECORDS: 1000000,
GC_TIME_MILLIS: 7282
},
org.apache.hadoop.mapreduce.FileSystemCounter: {
FILE_WRITE_OPS: 0,
FILE_READ_OPS: 0,
FILE_LARGE_READ_OPS: 0,
FILE_BYTES_READ: 0,
HDFS_BYTES_READ: 1187,
FILE_BYTES_WRITTEN: 1191230,
HDFS_LARGE_READ_OPS: 0,
HDFS_WRITE_OPS: 10,
HDFS_READ_OPS: 10,
HDFS_BYTES_WRITTEN: 276389736
},
org.apache.sqoop.submission.counter.SqoopCounters: {
ROWS_READ: 1000000
}
}
},
{
progress: -1,
last-update-date: 1415312390570,
status: "FAILURE_ON_SUBMIT",
exception: "org.apache.sqoop.common.SqoopException: GENERIC_HDFS_CONNECTOR_0000:Error occurs during partitioner run",
job: 1,
creation-date: 1415312390570,
exception-trace: "org.apache.sqoop.common.SqoopException: GENERIC_HDFS_CONNECTOR_0000:Error occurs during partitioner...."
}
]
}
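As a sketch of how a client might digest this response, the helper below reduces each submission to its job id, status and the `ROWS_READ` counter when present. The counter group name is taken from the example above; the function name `summarize_submissions` is hypothetical.

```python
SQOOP_COUNTERS = "org.apache.sqoop.submission.counter.SqoopCounters"


def summarize_submissions(response):
    """Return (job id, status, rows read or None) for each submission."""
    summary = []
    for submission in response.get("submissions", []):
        # Counters are absent in the failed submission of the example.
        counters = submission.get("counters", {})
        rows = counters.get(SQOOP_COUNTERS, {}).get("ROWS_READ")
        summary.append((submission["job"], submission["status"], rows))
    return summary


# Reduced copy of the response example above.
response = {
    "submissions": [
        {"job": 2, "status": "SUCCEEDED", "counters": {SQOOP_COUNTERS: {"ROWS_READ": 1000000}}},
        {"job": 1, "status": "FAILURE_ON_SUBMIT"},
    ]
}
print(summarize_submissions(response))
```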