Sqoop REST API Guide

This document explains how to use the Sqoop REST API to build applications that interact with the Sqoop server. The REST API covers all aspects of managing Sqoop jobs and lets you build an application in any programming language using JSON over HTTP.

Initialization

Before continuing further, make sure that the Sqoop server is running.

Then find out the details of the Sqoop server: host, port and webapp, and keep them at hand. Note that the Sqoop server runs on Apache Tomcat. To exercise a REST API for Sqoop, assemble and send an HTTP request to the URL corresponding to that API. Generally, the URL contains the host on which the Sqoop server is running, the port on which the Sqoop server is listening, and the webapp, which is the context path under which the Sqoop server is registered in the Apache Tomcat engine.
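As a minimal sketch of how the three server details combine into a request URL, the helper below joins them with an API path. The function name `build_url`, the plain `http` scheme, and the sample host, port and webapp values are assumptions for illustration; substitute the values of your own deployment.

```python
def build_url(host: str, port: int, webapp: str, path: str) -> str:
    """Join the server details and an API path into a request URL."""
    # Strip stray slashes so "sqoop", "/sqoop" and "sqoop/" behave the same.
    webapp = webapp.strip("/")
    path = path.lstrip("/")
    # Assumption: the server is reached over plain HTTP.
    return f"http://{host}:{port}/{webapp}/{path}"


# Hypothetical server details, used only for illustration.
print(build_url("localhost", 12000, "sqoop", "/v1/connectors"))
```

Keeping the URL assembly in one place means the host, port and webapp only need to be configured once for all the APIs described later.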

Certain requests might need to contain additional query parameters and post data. These parameters can be passed in the HTTP headers, the request body, or both. All the content in the HTTP body is in JSON format.

Understand Connector, Driver, Link and Job

To create and run a Sqoop job, we need to provide config values for connecting to a data source and then processing the data in that data source. Processing might be either reading from or writing to the data source. Thus there are configurable entities, namely the From and To parts of the connectors and the driver, each of which exposes configs with one or more inputs within them.

For instance, a connector that represents a relational data source such as MySQL will expose config classes for connecting to the database. Some of the relevant inputs are the connection string, the driver class, and the username and password used to connect to the database. These configs remain the same when reading data from any of the tables within that database. Hence they are grouped under LinkConfiguration.

Each connector can support reading from and/or writing to the data source it represents. Reading from and writing to a data source are represented by From and To respectively. Specific configurations are required to perform the job of reading from or writing to the data source. These are grouped in the FromJobConfiguration and ToJobConfiguration objects of the connector.

For instance, a connector that represents a relational data source such as MySQL will expose the table name to read from, or the SQL query to use while reading data, as a FromJobConfiguration. Similarly, a connector that represents a data source such as HDFS will expose the output directory to write to as a ToJobConfiguration.

Objects

This section covers all the objects that might exist in an API request and/or API response.

Configs and Inputs

Before creating any link for a connector, or a job with associated From and To links, first get familiar with all the configurations that the connector exposes.

Each config consists of the following information:

Field	Description
`id`	The id of this config
`inputs`	An array of inputs of this config
`name`	The unique name of this config per connector
`type`	The type of this config (LINK or JOB)

A typical config object is shown below:

 {
  id:7,
  inputs:[
    {
       id: 25,
       name: "throttlingConfig.numExtractors",
       type: "INTEGER",
       sensitive: false
    },
    {
       id: 26,
       name: "throttlingConfig.numLoaders",
       type: "INTEGER",
       sensitive: false
     }
  ],
  name: "throttlingConfig",
  type: "JOB"
}

Each input object in a config is structured as follows:

Field	Description
`id`	The id of this input
`name`	The unique name of this input per config
`type`	The data type of this input field
`size`	The length of this input field
`sensitive`	Whether this input contains sensitive information

To send a filled config in a request, always use the config id and input id to map the values to their corresponding names. For example, the following request contains the input value com.mysql.jdbc.Driver for the input with id 7, inside a config with id 4 that belongs to a link with id 3:

link: {
      id: 3,
      enabled: true,
      link-config-values: [{
          id: 4,
          inputs: [{
              id: 7,
              name: "linkConfig.jdbcDriver",
              value: "com.mysql.jdbc.Driver",
              type: "STRING",
              size: 128,
              sensitive: false
          }, {
              id: 8,
              name: "linkConfig.connectionString",
              value: "jdbc%3Amysql%3A%2F%2Fmysql.ent.cloudera.com%2Fsqoop",
              type: "STRING",
              size: 128,
              sensitive: false
          },
          ...
          ]
      }]
}
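The mapping of a value to a config id and an input id can be sketched as a small helper that walks a link dictionary shaped like the example above. The function name `set_input_value` is hypothetical. Percent-encoding the value is an assumption inferred from the encoded connection string shown in the example; this guide does not state whether the server requires it.

```python
from urllib.parse import quote


def set_input_value(link: dict, config_id: int, input_id: int, value: str) -> bool:
    """Set the value of one input, located by config id and input id.

    Returns True if the input was found and updated, False otherwise.
    """
    for config in link.get("link-config-values", []):
        if config.get("id") != config_id:
            continue
        for item in config.get("inputs", []):
            if item.get("id") == input_id:
                # Assumption: values are percent-encoded, as in the examples.
                item["value"] = quote(value, safe="")
                return True
    return False


link = {
    "id": 3,
    "enabled": True,
    "link-config-values": [
        {"id": 4, "inputs": [{"id": 8, "name": "linkConfig.connectionString",
                              "type": "STRING", "size": 128, "sensitive": False}]}
    ],
}
set_input_value(link, 4, 8, "jdbc:mysql://mysql.ent.cloudera.com/sqoop")
print(link["link-config-values"][0]["inputs"][0]["value"])
```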

Exception Response

Each operation on the Sqoop server might return an exception in the HTTP response. Remember to take this into account. The exception code and message can be found in both the header and the body of the response.

See the “Header Parameters” section for how to get exception information from the header.

In the body, the exception is expressed in JSON format. An example of an exception is:

{
  "message":"DERBYREPO_0030:Unable to load specific job metadata from repository - Couldn't find job with id 2",
  "stack-trace":[
    {
      "file":"DerbyRepositoryHandler.java",
      "line":1111,
      "class":"org.apache.sqoop.repository.derby.DerbyRepositoryHandler",
      "method":"findJob"
    },
    {
      "file":"JdbcRepository.java",
      "line":451,
      "class":"org.apache.sqoop.repository.JdbcRepository$16",
      "method":"doIt"
    },
    {
      "file":"JdbcRepository.java",
      "line":90,
      "class":"org.apache.sqoop.repository.JdbcRepository",
      "method":"doWithConnection"
    },
    {
      "file":"JdbcRepository.java",
      "line":61,
      "class":"org.apache.sqoop.repository.JdbcRepository",
      "method":"doWithConnection"
    },
    {
      "file":"JdbcRepository.java",
      "line":448,
      "class":"org.apache.sqoop.repository.JdbcRepository",
      "method":"findJob"
    },
    {
      "file":"JobRequestHandler.java",
      "line":238,
      "class":"org.apache.sqoop.handler.JobRequestHandler",
      "method":"getJobs"
    }
  ],
  "class":"org.apache.sqoop.common.SqoopException"
}
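A client will usually want a short summary of such a body rather than the full stack trace. The sketch below is one possible way to extract it; the function names are hypothetical, and splitting the message at the first colon assumes that messages follow the `CODE:text` shape seen in the example.

```python
import json


def format_frame(frame: dict) -> str:
    """Render one stack-trace entry in the familiar Java style."""
    return f"{frame['class']}.{frame['method']}({frame['file']}:{frame['line']})"


def summarize_exception(body: str) -> dict:
    """Extract the error code, text, class and top stack frame from an exception body."""
    error = json.loads(body)
    # Assumption: the message looks like "<CODE>:<text>", as in the example.
    code, sep, text = error.get("message", "").partition(":")
    if not sep:
        code, text = "", code  # no colon found, so treat everything as text
    frames = error.get("stack-trace", [])
    return {
        "code": code,
        "text": text,
        "class": error.get("class"),
        "top_frame": format_frame(frames[0]) if frames else None,
    }


body = json.dumps({
    "message": "DERBYREPO_0030:Unable to load specific job metadata from repository - Couldn't find job with id 2",
    "stack-trace": [{"file": "DerbyRepositoryHandler.java", "line": 1111,
                     "class": "org.apache.sqoop.repository.derby.DerbyRepositoryHandler",
                     "method": "findJob"}],
    "class": "org.apache.sqoop.common.SqoopException",
})
print(summarize_exception(body)["code"])
```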

Config and Input Validation Status Response

The configs and inputs associated with the connectors also provide custom validation rules for the values given to these input fields. Sqoop applies these custom validators and their corresponding validation logic when config values for the LINK and JOB are posted.

An example of an OK status with the persisted id:

{
   "id": 3,
   "validation-result": [
       {}
   ]
}

An example of ERROR status:

{
  "validation-result": [
    {
     "linkConfig": [
       {
         "message": "Invalid URI. URI must either be null or a valid URI. Here are a few valid example URIs: hdfs://example.com:8020/, hdfs://example.com/, file:///, file:///tmp, file://localhost/tmp",
         "status": "ERROR"
       }
     ]
   }
  ]
}
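Both shapes above can be handled by one routine that flattens `validation-result` into a list of problems. This is a sketch under the assumption that every entry whose `status` is not `OK` should be reported; the function name `validation_errors` is hypothetical.

```python
def validation_errors(response: dict) -> list:
    """Return (config name, status, message) tuples for every non-OK entry."""
    problems = []
    for result in response.get("validation-result", []):
        # An OK response carries an empty object, so this loop is skipped.
        for config_name, messages in result.items():
            for entry in messages:
                if entry.get("status") != "OK":
                    problems.append((config_name, entry.get("status"), entry.get("message")))
    return problems


ok_response = {"id": 3, "validation-result": [{}]}
error_response = {
    "validation-result": [
        {"linkConfig": [{"message": "Invalid URI.", "status": "ERROR"}]}
    ]
}
print(validation_errors(ok_response))
print(validation_errors(error_response))
```

An empty list means the config values were accepted, in which case the `id` field of the response holds the persisted id.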

Job Submission Status Response

After starting a job, you can look up its running status. There are 7 possible statuses:

Status	Description
`BOOTING`	In the middle of submitting the job
`FAILURE_ON_SUBMIT`	Unable to submit this job to remote cluster
`RUNNING`	The job is running now
`SUCCEEDED`	Job finished successfully
`FAILED`	Job failed
`NEVER_EXECUTED`	The job has never been executed since created
`UNKNOWN`	The status is unknown
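In client code, the seven values in the table map naturally onto an enumeration. The sketch below is illustrative only: the class and function names are hypothetical, and treating `BOOTING` and `RUNNING` as the "active" statuses is an interpretation of the descriptions above, not something the API defines.

```python
from enum import Enum


class SubmissionStatus(Enum):
    """The seven status values listed in the table above."""
    BOOTING = "BOOTING"
    FAILURE_ON_SUBMIT = "FAILURE_ON_SUBMIT"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    NEVER_EXECUTED = "NEVER_EXECUTED"
    UNKNOWN = "UNKNOWN"

    @property
    def is_active(self) -> bool:
        # Interpretation, not part of the API: only these two describe work in flight.
        return self in (SubmissionStatus.BOOTING, SubmissionStatus.RUNNING)


def parse_status(submission: dict) -> SubmissionStatus:
    """Map the "status" field of a submission record to the enum."""
    try:
        return SubmissionStatus(submission.get("status"))
    except ValueError:
        # Fall back to UNKNOWN for missing or unrecognized values.
        return SubmissionStatus.UNKNOWN


print(parse_status({"status": "RUNNING"}).is_active)
```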

Header Parameters

For all the responses, the following parameters in the HTTP message header are available:

Parameter	Required	Description
`sqoop-error-code`	false	The error code when an error happens on the server side for this request
`sqoop-error-message`	false	The explanation for an error code

So far, these are the only 2 parameters in the header of the response message. They exist only when something goes wrong on the server, and they always come along with an exception message in the response body.
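Checking for these two headers can be sketched as follows. The helper name `error_from_headers` is hypothetical, and the header values in the usage lines are illustrative stand-ins. Any object with a case-insensitive `get` method works, which includes the headers object of a `urllib` response.

```python
from email.message import Message
from typing import Optional, Tuple


def error_from_headers(headers) -> Optional[Tuple[str, Optional[str]]]:
    """Return (code, message) if the Sqoop error headers are present, else None."""
    code = headers.get("sqoop-error-code")
    if code is None:
        return None  # no error reported for this request
    return code, headers.get("sqoop-error-message")


# Stand-in for the headers of a real response; the values are illustrative.
headers = Message()
headers["sqoop-error-code"] = "DERBYREPO_0030"
headers["sqoop-error-message"] = "Unable to load specific job metadata from repository"
print(error_from_headers(headers))
```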

REST APIs

This section describes all the REST APIs that are supported by the Sqoop server.

For all Sqoop requests, the following request parameter is added automatically. However, this user name is only used in simple mode. In Kerberos mode, the Sqoop server ignores it and instead uses the user name in the UGI, which is authenticated by the Kerberos server.

Parameter	Description
`user.name`	The name of the user who makes the requests
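A hand-written client has to attach this parameter itself. The sketch below assumes that `user.name` is passed as a query-string parameter, which this guide does not state explicitly; the function name `with_user` and the sample URL are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_user(url: str, user: str) -> str:
    """Return the URL with a single user.name query parameter set to `user`."""
    parts = urlsplit(url)
    # Keep existing parameters, but drop any earlier user.name value.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "user.name"]
    query.append(("user.name", user))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_user("http://localhost:12000/sqoop/v1/links?cname=hdfs-connector", "root"))
```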

/version - [GET] - Get Sqoop Version

Get all the version metadata of the Sqoop software on the server side.

Method: GET
Format: JSON
Request Content: None
Fields of Response:

Field	Description
`source-revision`	The revision number of Sqoop source code
`api-versions`	The version of network protocol
`build-date`	The Sqoop release date
`user`	The user who made the release
`source-url`	The URL of the source code trunk
`build-version`	The version of Sqoop on the server side

Response Example:

{
 source-url: "git://vbasavaraj.local/Users/vbasavaraj/Projects/SqoopRefactoring/sqoop2/common",
 source-revision: "418c5f637c3f09b94ea7fc3b0a4610831373a25f",
 build-version: "2.0.0-SNAPSHOT",
 api-versions: [
    "v1"
  ],
 user: "vbasavaraj",
 build-date: "Mon Nov 3 08:18:21 PST 2014"
}
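A minimal sketch of calling this endpoint with the Python standard library is shown below. The function names, the `opener` parameter and the base URL are assumptions for illustration. The `opener` argument exists only so that the example can run offline against a canned response; by default the real `urlopen` is used.

```python
import io
import json
from urllib.request import urlopen


def get_version(base_url: str, opener=urlopen) -> dict:
    """GET <base_url>/version and return the decoded JSON object."""
    with opener(base_url.rstrip("/") + "/version") as response:
        return json.load(response)


def supports_api(version_info: dict, api: str = "v1") -> bool:
    """Check whether the server lists the given API version."""
    return api in version_info.get("api-versions", [])


def fake_opener(url):
    # Offline stand-in for the server, so the sketch runs without a network.
    return io.BytesIO(b'{"build-version": "2.0.0-SNAPSHOT", "api-versions": ["v1"]}')


info = get_version("http://localhost:12000/sqoop", opener=fake_opener)
print(info["build-version"], supports_api(info))
```

Checking `api-versions` before issuing `/v1/...` requests gives a client an early and clear failure if the server does not offer that API version.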

/v1/connectors - [GET] - Get all Connectors

Get all the connectors registered in Sqoop.

Method: GET
Format: JSON
Request Content: None
Response Example:

{
  connectors: [{
      id: 1,
      link-config: [],
      job-config: {},
      name: "hdfs-connector",
      class: "org.apache.sqoop.connector.hdfs.HdfsConnector",
      all-config-resources: {},
      version: "2.0.0-SNAPSHOT"
  }, {
      id: 2,
      link-config: [],
      job-config: {},
      name: "generic-jdbc-connector",
      class: "org.apache.sqoop.connector.jdbc.GenericJdbcConnector",
      all-config-resources: {},
      version: "2.0.0-SNAPSHOT"
  }]
}

/v1/connector/[cname] or /v1/connector/[cid] - [GET] - Get Connector

Provide the id or unique name of the connector in the [cid] or [cname] part of the URL.

Method: GET
Format: JSON
Request Content: None
Fields of Response:

Field	Description
`id`	The id for the connector (registered as a configurable)
`job-config`	Connector job config and inputs for both FROM and TO
`link-config`	Connector link config and inputs
`all-config-resources`	All config inputs labels and description for the given connector
`version`	The build version required for config and input data upgrades

Response Example:

{
 connector: {
     id: 1,
     job-config: {
         TO: [{
             id: 3,
             inputs: [{
                 id: 3,
                 values: "TEXT_FILE,SEQUENCE_FILE",
                 name: "toJobConfig.outputFormat",
                 type: "ENUM",
                 sensitive: false
             }, {
                 id: 4,
                 values: "NONE,DEFAULT,DEFLATE,GZIP,BZIP2,LZO,LZ4,SNAPPY,CUSTOM",
                 name: "toJobConfig.compression",
                 type: "ENUM",
                 sensitive: false
             }, {
                 id: 5,
                 name: "toJobConfig.customCompression",
                 type: "STRING",
                 size: 255,
                 sensitive: false
             }, {
                 id: 6,
                 name: "toJobConfig.outputDirectory",
                 type: "STRING",
                 size: 255,
                 sensitive: false
             }],
             name: "toJobConfig",
             type: "JOB"
         }],
         FROM: [{
             id: 2,
             inputs: [{
                 id: 2,
                 name: "fromJobConfig.inputDirectory",
                 type: "STRING",
                 size: 255,
                 sensitive: false
             }],
             name: "fromJobConfig",
             type: "JOB"
         }]
     },
     link-config: [{
         id: 1,
         inputs: [{
             id: 1,
             name: "linkConfig.uri",
             type: "STRING",
             size: 255,
             sensitive: false
         }],
         name: "linkConfig",
         type: "LINK"
     }],
     name: "hdfs-connector",
     class: "org.apache.sqoop.connector.hdfs.HdfsConnector",
     all-config-resources: {
             fromJobConfig.label: "From Job configuration",
             toJobConfig.ignored.label: "Ignored",
             fromJobConfig.help: "Specifies information required to get data from Hadoop ecosystem",
             toJobConfig.ignored.help: "This value is ignored",
             toJobConfig.label: "ToJob configuration",
             toJobConfig.storageType.label: "Storage type",
             fromJobConfig.inputDirectory.label: "Input directory",
             toJobConfig.outputFormat.label: "Output format",
             toJobConfig.outputDirectory.label: "Output directory",
             toJobConfig.outputDirectory.help: "Output directory for final data",
             toJobConfig.compression.help: "Compression that should be used for the data",
             toJobConfig.outputFormat.help: "Format in which data should be serialized",
             toJobConfig.customCompression.label: "Custom compression format",
             toJobConfig.compression.label: "Compression format",
             linkConfig.label: "Link configuration",
             toJobConfig.customCompression.help: "Full class name of the custom compression",
             toJobConfig.storageType.help: "Target on Hadoop ecosystem where to store data",
             linkConfig.help: "Here you supply information necessary to connect to HDFS",
             linkConfig.uri.help: "HDFS URI used to connect to HDFS",
             linkConfig.uri.label: "HDFS URI",
             fromJobConfig.inputDirectory.help: "Directory that should be exported",
             toJobConfig.help: "You must supply the information requested in order to get information where you want to store your data."
     },
     version: "2.0.0-SNAPSHOT"
  }
}
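To get an overview of what a connector expects, the response can be flattened into one row per input, with the human-readable label looked up in `all-config-resources`. This is a sketch; the function name `describe_inputs` is hypothetical, and the `<input name>.label` key pattern is taken from the example above.

```python
def describe_inputs(connector: dict) -> list:
    """Return (section, input name, type, label) rows for every input of a connector."""
    labels = connector.get("all-config-resources", {})
    sections = {"LINK": connector.get("link-config", [])}
    sections.update(connector.get("job-config", {}))  # adds the FROM and TO sections
    rows = []
    for section, configs in sections.items():
        for config in configs:
            for item in config.get("inputs", []):
                label = labels.get(item["name"] + ".label")
                rows.append((section, item["name"], item["type"], label))
    return rows


# Trimmed version of the response above, with keys quoted as valid JSON requires.
connector = {
    "id": 1,
    "link-config": [{"id": 1, "name": "linkConfig", "type": "LINK", "inputs": [
        {"id": 1, "name": "linkConfig.uri", "type": "STRING", "size": 255, "sensitive": False}]}],
    "job-config": {"FROM": [{"id": 2, "name": "fromJobConfig", "type": "JOB", "inputs": [
        {"id": 2, "name": "fromJobConfig.inputDirectory", "type": "STRING", "size": 255,
         "sensitive": False}]}]},
    "all-config-resources": {"linkConfig.uri.label": "HDFS URI",
                             "fromJobConfig.inputDirectory.label": "Input directory"},
}
for row in describe_inputs(connector):
    print(row)
```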

/v1/driver - [GET] - Get Sqoop Driver

The driver exposes the configurations required for job execution.

Method: GET
Format: JSON
Request Content: None
Fields of Response:

Field	Description
`id`	The id for the driver (registered as a configurable)
`job-config`	Driver job config and inputs
`version`	The build version of the driver
`all-config-resources`	Driver exposed config and input labels and description

Response Example:

{
   id: 3,
   job-config: [{
       id: 7,
       inputs: [{
           id: 25,
           name: "throttlingConfig.numExtractors",
           type: "INTEGER",
           sensitive: false
       }, {
           id: 26,
           name: "throttlingConfig.numLoaders",
           type: "INTEGER",
           sensitive: false
       }],
       name: "throttlingConfig",
       type: "JOB"
   }],
   all-config-resources: {
           throttlingConfig.numExtractors.label: "Extractors",
           throttlingConfig.numLoaders.help: "Number of loaders that Sqoop will use",
           throttlingConfig.numLoaders.label: "Loaders",
           throttlingConfig.label: "Throttling resources",
           throttlingConfig.numExtractors.help: "Number of extractors that Sqoop will use",
           throttlingConfig.help: "Set throttling boundaries to not overload your systems"
   },
   version: "1"
}

/v1/links/ - [GET] - Get all links

Get all the links created in Sqoop.

Method: GET
Format: JSON
Request Content: None
Response Example:

{
  links: [
    {
      id: 1,
      enabled: true,
      update-user: "root",
      link-config-values: [],
      name: "First Link",
      creation-date: 1415309361756,
      connector-id: 1,
      update-date: 1415309361756,
      creation-user: "root"
    },
    {
      id: 2,
      enabled: true,
      update-user: "root",
      link-config-values: [],
      name: "Second Link",
      creation-date: 1415309390807,
      connector-id: 2,
      update-date: 1415309390807,
      creation-user: "root"
    }
  ]
}

/v1/links?cname=[cname] - [GET] - Get all links by Connector

Get all the links for a given connector identified by the [cname] part.

/v1/link/[lname] or /v1/link/[lid] - [GET] - Get Link

Provide the id or unique name of the link in the [lid] or [lname] part of the URL.

Get all the details of the link, including the id, name, type and the corresponding config input values for the link.

Method: GET
Format: JSON
Request Content: None
Response Example:

{
   link: {
       id: 1,
       enabled: true,
       link-config-values: [{
           id: 1,
           inputs: [{
               id: 1,
               name: "linkConfig.uri",
               value: "hdfs%3A%2F%2Fnamenode%3A8090",
               type: "STRING",
               size: 255,
               sensitive: false
           }],
           name: "linkConfig",
           type: "LINK"
       }],
       update-user: "root",
       name: "First Link",
       creation-date: 1415287846371,
       connector-id: 1,
       update-date: 1415287846371,
       creation-user: "root"
   }
}

/v1/link - [POST] - Create Link

Create a new link object. Provide values for the link config inputs that are required.

Method: POST
Format: JSON
Fields of Request:

Field	Description
`link`	The root of the post data in JSON
`id`	The id of the link can be left blank in the post data
`enabled`	Whether to enable this link (true/false)
`update-date`	The last updated time of this link
`creation-date`	The creation time of this link
`update-user`	The user who updated this link
`creation-user`	The user who created this link
`name`	The name of this link
`link-config-values`	Config input values for link config for the corresponding connector
`connector-id`	The id of the connector used for this link

Request Example:

{
  link: {
      id: -1,
      enabled: true,
      link-config-values: [{
          id: 1,
          inputs: [{
              id: 1,
              name: "linkConfig.uri",
              value: "hdfs%3A%2F%2Fvbsqoop-1.ent.cloudera.com%3A8020%2Fuser%2Froot%2Fjob1",
              type: "STRING",
              size: 255,
              sensitive: false
          }],
          name: "testInput",
          type: "LINK"
      }],
      update-user: "root",
      name: "testLink",
      creation-date: 1415202223048,
      connector-id: 1,
      update-date: 1415202223048,
      creation-user: "root"
  }
}
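The request above could be assembled in Python roughly as follows. This is a sketch, not a definitive client: the function name `create_link_request` and the base URL are hypothetical, and the payload leaves out the date and user fields shown in the example on the assumption that the server fills them in, which this guide does not confirm.

```python
import json
from urllib.request import Request


def create_link_request(base_url: str, name: str, connector_id: int,
                        config_values: list) -> Request:
    """Build (but do not send) the POST request that creates a link."""
    payload = {
        "link": {
            "id": -1,  # placeholder id, as in the request example
            "enabled": True,
            "name": name,
            "connector-id": connector_id,
            "link-config-values": config_values,
        }
    }
    return Request(
        base_url.rstrip("/") + "/v1/link",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


config_values = [{
    "id": 1, "name": "linkConfig", "type": "LINK",
    "inputs": [{"id": 1, "name": "linkConfig.uri", "type": "STRING", "size": 255,
                "sensitive": False, "value": "hdfs%3A%2F%2Fnamenode%3A8020"}],
}]
request = create_link_request("http://localhost:12000/sqoop", "testLink", 1, config_values)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; building and sending are kept separate here so that the payload can be inspected first.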

Fields of Response:

Field	Description
`id`	The id assigned to this newly created link
`validation-result`	The validation status for the link config inputs given in the post data

ERROR Response Example:

{
  "validation-result": [
      {
          "linkConfig": [
              {
                  "message": "Invalid URI. URI must either be null or a valid URI. Here are a few valid example URIs: hdfs://example.com:8020/, hdfs://example.com/, file:///, file:///tmp, file://localhost/tmp",
                  "status": "ERROR"
              }
          ]
      }
  ]
}

/v1/link/[lname] or /v1/link/[lid] - [PUT] - Update Link

Update an existing link object with name [lname] or id [lid]. To make filling in the inputs easier, the general practice is to get the link first and then change some of the input values.

Method: PUT
Format: JSON
OK Response Example:

{
  "validation-result": [
      {}
  ]
}
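The get-then-modify practice described for this API can be sketched as below. The function name `update_link_request` and the base URL are hypothetical, and the input is located by name purely for convenience. The `fetched` argument is assumed to be the decoded body returned by the Get Link API.

```python
import copy
import json
from urllib.request import Request


def update_link_request(base_url: str, fetched: dict, input_name: str,
                        new_value: str) -> Request:
    """Build a PUT request from a previously fetched link with one input changed."""
    body = copy.deepcopy(fetched)  # leave the caller's copy untouched
    link = body["link"]
    for config in link["link-config-values"]:
        for item in config["inputs"]:
            if item["name"] == input_name:
                item["value"] = new_value
    url = f"{base_url.rstrip('/')}/v1/link/{link['id']}"
    return Request(url, data=json.dumps(body).encode("utf-8"),
                   headers={"Content-Type": "application/json"}, method="PUT")


fetched = {"link": {"id": 1, "name": "First Link", "link-config-values": [
    {"id": 1, "inputs": [{"id": 1, "name": "linkConfig.uri", "value": "old"}]}]}}
request = update_link_request("http://localhost:12000/sqoop", fetched, "linkConfig.uri", "new")
print(request.get_method(), request.full_url)
```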

/v1/link/[lname] or /v1/link/[lid] - [DELETE] - Delete Link

Delete a link with name [lname] or id [lid].

Method: DELETE
Format: JSON
Request Content: None
Response Content: None

/v1/link/[lid]/enable or /v1/link/[lname]/enable - [PUT] - Enable Link

Enable a link with id [lid] or name [lname].

Method: PUT
Format: JSON
Request Content: None
Response Content: None

/v1/link/[lid]/disable - [PUT] - Disable Link

Disable a link with id [lid] or name [lname].

Method: PUT
Format: JSON
Request Content: None
Response Content: None

/v1/jobs/ - [GET] - Get all jobs

Get all the jobs created in Sqoop.

Method: GET
Format: JSON
Request Content: None
Response Example:

{
   jobs: [{
          driver-config-values: [],
          enabled: true,
          from-connector-id: 1,
          update-user: "root",
          to-config-values: [],
          to-connector-id: 2,
          creation-date: 1415310157618,
          update-date: 1415310157618,
          creation-user: "root",
          id: 1,
          to-link-id: 2,
          from-config-values: [],
          name: "First Job",
          from-link-id: 1
     },{
          driver-config-values: [],
          enabled: true,
          from-connector-id: 2,
          update-user: "root",
          to-config-values: [],
          to-connector-id: 1,
          creation-date: 1415310650600,
          update-date: 1415310650600,
          creation-user: "root",
          id: 2,
          to-link-id: 1,
          from-config-values: [],
          name: "Second Job",
          from-link-id: 2
     }]
}

/v1/jobs?cname=[cname] - [GET] - Get all jobs by connector

Get all the jobs for a given connector identified by the [cname] part.

/v1/job/[jname] or /v1/job/[jid] - [GET] - Get Job

Provide the name or the id of the job in the [jname] or [jid] part of the URL.

Method: GET
Format: JSON
Request Content: None
Response Example:

 {
   job: {
       driver-config-values: [{
               id: 7,
               inputs: [{
                   id: 25,
                   name: "throttlingConfig.numExtractors",
                   value: "3",
                   type: "INTEGER",
                   sensitive: false
               }, {
                   id: 26,
                   name: "throttlingConfig.numLoaders",
                   value: "3",
                   type: "INTEGER",
                   sensitive: false
               }],
               name: "throttlingConfig",
               type: "JOB"
           }],
           enabled: true,
           from-connector-id: 1,
           update-user: "root",
           to-config-values: [{
               id: 6,
               inputs: [{
                   id: 19,
                   name: "toJobConfig.schemaName",
                   type: "STRING",
                   size: 50,
                   sensitive: false
               }, {
                   id: 20,
                   name: "toJobConfig.tableName",
                   value: "text",
                   type: "STRING",
                   size: 2000,
                   sensitive: false
               }, {
                   id: 21,
                   name: "toJobConfig.sql",
                   type: "STRING",
                   size: 50,
                   sensitive: false
               }, {
                   id: 22,
                   name: "toJobConfig.columns",
                   type: "STRING",
                   size: 50,
                   sensitive: false
               }, {
                   id: 23,
                   name: "toJobConfig.stageTableName",
                   type: "STRING",
                   size: 2000,
                   sensitive: false
               }, {
                   id: 24,
                   name: "toJobConfig.shouldClearStageTable",
                   type: "BOOLEAN",
                   sensitive: false
               }],
               name: "toJobConfig",
               type: "JOB"
           }],
           to-connector-id: 2,
           creation-date: 1415310157618,
           update-date: 1415310157618,
           creation-user: "root",
           id: 1,
           to-link-id: 2,
           from-config-values: [{
               id: 2,
               inputs: [{
                   id: 2,
                   name: "fromJobConfig.inputDirectory",
                   value: "hdfs%3A%2F%2Fvbsqoop-1.ent.cloudera.com%3A8020%2Fuser%2Froot%2Fjob1",
                   type: "STRING",
                   size: 255,
                   sensitive: false
               }],
               name: "fromJobConfig",
               type: "JOB"
           }],
           name: "First Job",
           from-link-id: 1
   }
}

/v1/job - [POST] - Create Job

Create a new job object with the corresponding config values.

Method: POST
Format: JSON
Fields of Request:

Field	Description
`job`	The root of the post data in JSON
`from-link-id`	The id of the from link for the job
`to-link-id`	The id of the to link for the job
`id`	The id of the job; can be left blank in the post data
`enabled`	Whether to enable this job (true/false)
`update-date`	The last updated time of this job
`creation-date`	The creation time of this job
`update-user`	The user who updated this job
`creation-user`	The user who created this job
`name`	The name of this job
`from-config-values`	Config input values for FROM part of the job
`to-config-values`	Config input values for TO part of the job
`driver-config-values`	Config input values for driver
`connector-id`	The id of the connector used for this link

Request Example:

{
  job: {
    driver-config-values: [
      {
        id: 7,
        inputs: [
          {
            id: 25,
            name: "throttlingConfig.numExtractors",
            value: "3",
            type: "INTEGER",
            sensitive: false
          },
          {
            id: 26,
            name: "throttlingConfig.numLoaders",
            value: "3",
            type: "INTEGER",
            sensitive: false
          }
        ],
        name: "throttlingConfig",
        type: "JOB"
      }
    ],
    enabled: true,
    from-connector-id: 1,
    update-user: "root",
    to-config-values: [
      {
        id: 6,
        inputs: [
          {
            id: 19,
            name: "toJobConfig.schemaName",
            type: "STRING",
            size: 50,
            sensitive: false
          },
          {
            id: 20,
            name: "toJobConfig.tableName",
            value: "text",
            type: "STRING",
            size: 2000,
            sensitive: false
          },
          {
            id: 21,
            name: "toJobConfig.sql",
            type: "STRING",
            size: 50,
            sensitive: false
          },
          {
            id: 22,
            name: "toJobConfig.columns",
            type: "STRING",
            size: 50,
            sensitive: false
          },
          {
            id: 23,
            name: "toJobConfig.stageTableName",
            type: "STRING",
            size: 2000,
            sensitive: false
          },
          {
            id: 24,
            name: "toJobConfig.shouldClearStageTable",
            type: "BOOLEAN",
            sensitive: false
          }
        ],
        name: "toJobConfig",
        type: "JOB"
      }
    ],
    to-connector-id: 2,
    creation-date: 1415310157618,
    update-date: 1415310157618,
    creation-user: "root",
    id: -1,
    to-link-id: 2,
    from-config-values: [
      {
        id: 2,
        inputs: [
          {
            id: 2,
            name: "fromJobConfig.inputDirectory",
            value: "hdfs%3A%2F%2Fvbsqoop-1.ent.cloudera.com%3A8020%2Fuser%2Froot%2Fjob1",
            type: "STRING",
            size: 255,
            sensitive: false
          }
        ],
        name: "fromJobConfig",
        type: "JOB"
      }
    ],
    name: "Test Job",
    from-link-id: 1
   }
 }

Fields of Response:

Field	Description
`id`	The id assigned to this newly created job
`validation-result`	The validation status for the job config and driver config inputs in the post data

ERROR Response Example:

{
  "validation-result": [
      {
          "linkConfig": [
              {
                  "message": "Invalid URI. URI must either be null or a valid URI. Here are a few valid example URIs: hdfs://example.com:8020/, hdfs://example.com/, file:///, file:///tmp, file://localhost/tmp",
                  "status": "ERROR"
              }
          ]
      }
  ]
}

/v1/job/[jid] - [PUT] - Update Job

Update an existing job object with id [jid]. To make filling in the inputs easier, the general practice is to get the existing job object first and then change some of the inputs.

Method: PUT
Format: JSON

Fields of Request: The same as Create Job.

OK Response Example:

{
  "validation-result": [
      {}
  ]
}

/v1/job/[jid] - [DELETE] - Delete Job

Delete a job with id jid.

Method: DELETE
Format: JSON
Request Content: None
Response Content: None

/v1/job/[jid]/enable - [PUT] - Enable Job

Enable a job with id jid.

Method: PUT
Format: JSON
Request Content: None
Response Content: None

/v1/job/[jid]/disable - [PUT] - Disable Job

Disable a job with id jid.

Method: PUT
Format: JSON
Request Content: None
Response Content: None

/v1/job/[jid]/start or /v1/job/[jname]/start - [PUT] - Start Job

Start a job with name [jname] or with id [jid] to trigger the job execution.

Method: PUT
Format: JSON
Request Content: None
Response Content: Submission Record
BOOTING Response Example:

{
  "submission": {
    "progress": -1,
    "last-update-date": 1415312531188,
    "external-id": "job_1412137947693_0004",
    "status": "BOOTING",
    "job": 2,
    "creation-date": 1415312531188,
    "to-schema": {
      "created": 1415312531426,
      "name": "HDFS file",
      "columns": []
    },
    "external-link": "http://vbsqoop-1.ent.cloudera.com:8088/proxy/application_1412137947693_0004/",
    "from-schema": {
      "created": 1415312531342,
      "name": "text",
      "columns": [
        {
          "name": "id",
          "nullable": true,
          "unsigned": null,
          "type": "FIXED_POINT",
          "size": null
        },
        {
          "name": "txt",
          "nullable": true,
          "type": "TEXT",
          "size": null
        }
      ]
    }
  }
}

SUCCEEDED Response Example:

{
  submission: {
    progress: -1,
    last-update-date: 1415312809485,
    external-id: "job_1412137947693_0004",
    status: "SUCCEEDED",
    job: 2,
    creation-date: 1415312531188,
    external-link: "http://vbsqoop-1.ent.cloudera.com:8088/proxy/application_1412137947693_0004/",
    counters: {
      org.apache.hadoop.mapreduce.JobCounter: {
        SLOTS_MILLIS_MAPS: 373553,
        MB_MILLIS_MAPS: 382518272,
        TOTAL_LAUNCHED_MAPS: 10,
        MILLIS_MAPS: 373553,
        VCORES_MILLIS_MAPS: 373553,
        OTHER_LOCAL_MAPS: 10
      },
      org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter: {
        BYTES_WRITTEN: 0
      },
      org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter: {
        BYTES_READ: 0
      },
      org.apache.hadoop.mapreduce.TaskCounter: {
        MAP_INPUT_RECORDS: 0,
        MERGED_MAP_OUTPUTS: 0,
        PHYSICAL_MEMORY_BYTES: 4065599488,
        SPILLED_RECORDS: 0,
        COMMITTED_HEAP_BYTES: 3439853568,
        CPU_MILLISECONDS: 236900,
        FAILED_SHUFFLE: 0,
        VIRTUAL_MEMORY_BYTES: 15231422464,
        SPLIT_RAW_BYTES: 1187,
        MAP_OUTPUT_RECORDS: 1000000,
        GC_TIME_MILLIS: 7282
      },
      org.apache.hadoop.mapreduce.FileSystemCounter: {
        FILE_WRITE_OPS: 0,
        FILE_READ_OPS: 0,
        FILE_LARGE_READ_OPS: 0,
        FILE_BYTES_READ: 0,
        HDFS_BYTES_READ: 1187,
        FILE_BYTES_WRITTEN: 1191230,
        HDFS_LARGE_READ_OPS: 0,
        HDFS_WRITE_OPS: 10,
        HDFS_READ_OPS: 10,
        HDFS_BYTES_WRITTEN: 276389736
      },
      org.apache.sqoop.submission.counter.SqoopCounters: {
        ROWS_READ: 1000000
      }
    }
  }
}

ERROR Response Example:

{
  "submission": {
    "progress": -1,
    "last-update-date": 1415312390570,
    "status": "FAILURE_ON_SUBMIT",
    "error-summary": "org.apache.sqoop.common.SqoopException: GENERIC_HDFS_CONNECTOR_0000:Error occurs during partitioner run",
    "job": 1,
    "creation-date": 1415312390570,
    "to-schema": {
      "created": 1415312390797,
      "name": "text",
      "columns": [
        {
          "name": "id",
          "nullable": true,
          "unsigned": null,
          "type": "FIXED_POINT",
          "size": null
        },
        {
          "name": "txt",
          "nullable": true,
          "type": "TEXT",
          "size": null
        }
      ]
    },
    "from-schema": {
      "created": 1415312390778,
      "name": "HDFS file",
      "columns": [
      ]
    },
    "error-details": "org.apache.sqoop.common.SqoopException: GENERIC_HDFS_CONNECTOR_00"
  }
}
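
A client usually needs to tell a failed submission apart from a healthy one. The following is a minimal Python sketch, assuming (based only on the examples in this guide) that a failed submission carries an `error-summary` field. The function name `describe_submission` is hypothetical and not part of Sqoop.

```python
import json


def describe_submission(response_text):
    """Summarize the submission record found in a JSON response body."""
    submission = json.loads(response_text)["submission"]
    summary = f"job {submission['job']}: {submission['status']}"
    # Assumption: "error-summary" appears only on failed submissions,
    # as in the error example above.
    error = submission.get("error-summary")
    if error is not None:
        summary += f" ({error})"
    return summary


# Trimmed-down copy of the error response shown above.
error_response = json.dumps({
    "submission": {
        "progress": -1,
        "status": "FAILURE_ON_SUBMIT",
        "error-summary": "org.apache.sqoop.common.SqoopException: "
                         "GENERIC_HDFS_CONNECTOR_0000:Error occurs during partitioner run",
        "job": 1,
    }
})
print(describe_submission(error_response))
```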

/v1/job/[jid]/stop or /v1/job/[jname]/stop - [PUT] - Stop Job ¶

Stop the job with name [jname] or with id [jid], aborting it if it is currently running.

Method: PUT
Format: JSON
Request Content: None
Response Content: Submission Record
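
As an illustration, the stop request could be assembled in Python roughly as follows. This is a sketch, not a definitive client: the base URL (host `localhost`, port `12000`, webapp `sqoop`) is an assumption to replace with your own server details, and `build_job_action_request` is a hypothetical helper name. The request is only built here, not sent.

```python
import urllib.request
from urllib.parse import quote

# Assumed host, port and webapp; adjust to your Sqoop server.
BASE_URL = "http://localhost:12000/sqoop"


def build_job_action_request(job, action, base_url=BASE_URL):
    """Build (but do not send) a PUT request for /v1/job/[jid or jname]/[action]."""
    # Percent-encode the job name so spaces or slashes cannot break the path.
    url = f"{base_url}/v1/job/{quote(str(job), safe='')}/{action}"
    return urllib.request.Request(url, method="PUT")


request = build_job_action_request(2, "stop")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; the body of the response is the submission record described above.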

/v1/job/[jid]/status or /v1/job/[jname]/status - [GET] - Get Job Status ¶

Get the status of the running job with name [jname] or with id [jid].

Method: GET
Format: JSON
Request Content: None
Response Content: Submission Record

{
    "submission": {
        "progress": 0.25,
        "last-update-date": 1415312603838,
        "external-id": "job_1412137947693_0004",
        "status": "RUNNING",
        "job": 2,
        "creation-date": 1415312531188,
        "external-link": "http://vbsqoop-1.ent.cloudera.com:8088/proxy/application_1412137947693_0004/"
    }
}
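
A common use of this endpoint is to poll until the job leaves the `RUNNING` state. The sketch below shows one way to do that in Python. It assumes that any status other than `RUNNING` means polling can stop, which is a simplification. The `fetch_submission` callable is a hypothetical stand-in for whatever code performs the GET request and parses the JSON, so the loop can be exercised without a server.

```python
import time


def wait_until_not_running(fetch_submission, interval_seconds=5.0, max_polls=100):
    """Poll until the submission status is no longer RUNNING.

    fetch_submission must return the parsed JSON response of the status call.
    """
    for _ in range(max_polls):
        submission = fetch_submission()["submission"]
        # Assumption: every status other than RUNNING ends the polling.
        if submission["status"] != "RUNNING":
            return submission
        time.sleep(interval_seconds)
    raise TimeoutError("job is still RUNNING after the polling limit")


# Canned responses standing in for two consecutive status calls.
responses = iter([
    {"submission": {"status": "RUNNING", "progress": 0.25, "job": 2}},
    {"submission": {"status": "SUCCEEDED", "progress": -1, "job": 2}},
])
final = wait_until_not_running(lambda: next(responses), interval_seconds=0)
print(final["status"])
```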

/v1/submissions? - [GET] - Get all job Submissions ¶

Get all the submissions for every job started in Sqoop.

/v1/submissions?jname=[jname] - [GET] - Get Submissions by Job ¶

Retrieve all past submissions for the given job. Each submission record includes details such as the status, counters and URLs of that submission.

Provide the name of the job in the [jname] part of the URL.

Method: GET
Format: JSON
Request Content: None
Fields of Response:

Field	Description
`progress`	The progress of the running Sqoop job
`job`	The id of the Sqoop job
`creation-date`	The submission timestamp
`last-update-date`	The timestamp of the last status update
`status`	The status of this job submission
`external-id`	The job id of Sqoop job running on Hadoop
`external-link`	The link to track the job status on Hadoop

Response Example:

{
  submissions: [
    {
      progress: -1,
      last-update-date: 1415312809485,
      external-id: "job_1412137947693_0004",
      status: "SUCCEEDED",
      job: 2,
      creation-date: 1415312531188,
      external-link: "http://vbsqoop-1.ent.cloudera.com:8088/proxy/application_1412137947693_0004/",
      counters: {
        org.apache.hadoop.mapreduce.JobCounter: {
          SLOTS_MILLIS_MAPS: 373553,
          MB_MILLIS_MAPS: 382518272,
          TOTAL_LAUNCHED_MAPS: 10,
          MILLIS_MAPS: 373553,
          VCORES_MILLIS_MAPS: 373553,
          OTHER_LOCAL_MAPS: 10
        },
        org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter: {
          BYTES_WRITTEN: 0
        },
        org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter: {
          BYTES_READ: 0
        },
        org.apache.hadoop.mapreduce.TaskCounter: {
          MAP_INPUT_RECORDS: 0,
          MERGED_MAP_OUTPUTS: 0,
          PHYSICAL_MEMORY_BYTES: 4065599488,
          SPILLED_RECORDS: 0,
          COMMITTED_HEAP_BYTES: 3439853568,
          CPU_MILLISECONDS: 236900,
          FAILED_SHUFFLE: 0,
          VIRTUAL_MEMORY_BYTES: 15231422464,
          SPLIT_RAW_BYTES: 1187,
          MAP_OUTPUT_RECORDS: 1000000,
          GC_TIME_MILLIS: 7282
        },
        org.apache.hadoop.mapreduce.FileSystemCounter: {
          FILE_WRITE_OPS: 0,
          FILE_READ_OPS: 0,
          FILE_LARGE_READ_OPS: 0,
          FILE_BYTES_READ: 0,
          HDFS_BYTES_READ: 1187,
          FILE_BYTES_WRITTEN: 1191230,
          HDFS_LARGE_READ_OPS: 0,
          HDFS_WRITE_OPS: 10,
          HDFS_READ_OPS: 10,
          HDFS_BYTES_WRITTEN: 276389736
        },
        org.apache.sqoop.submission.counter.SqoopCounters: {
          ROWS_READ: 1000000
        }
      }
    },
    {
      progress: -1,
      last-update-date: 1415312390570,
      status: "FAILURE_ON_SUBMIT",
      error-summary: "org.apache.sqoop.common.SqoopException: GENERIC_HDFS_CONNECTOR_0000:Error occurs during partitioner run",
      job: 1,
      creation-date: 1415312390570,
      error-details: "org.apache.sqoop.common.SqoopException: GENERIC_HDFS_CONNECTOR_0000:Error occurs during partitioner...."
    }
  ]
}
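
The response above can be condensed into a short report once it is parsed. The following Python sketch builds the query URL and then counts submissions per status and sums the `ROWS_READ` counter. The base URL and the helper names `submissions_url` and `summarize` are assumptions for illustration, and the sample data is a trimmed-down version of the example response.

```python
from collections import Counter
from urllib.parse import urlencode

# Assumed host, port and webapp; adjust to your Sqoop server.
BASE_URL = "http://localhost:12000/sqoop"
SQOOP_COUNTERS = "org.apache.sqoop.submission.counter.SqoopCounters"


def submissions_url(job_name=None, base_url=BASE_URL):
    """URL for all submissions, or for one job's submissions if job_name is given."""
    url = f"{base_url}/v1/submissions"
    if job_name is not None:
        url += "?" + urlencode({"jname": job_name})
    return url


def summarize(response):
    """Return (number of submissions per status, total ROWS_READ)."""
    submissions = response["submissions"]
    per_status = Counter(s["status"] for s in submissions)
    # Failed submissions in the example carry no counters, so default to 0.
    rows_read = sum(
        s.get("counters", {}).get(SQOOP_COUNTERS, {}).get("ROWS_READ", 0)
        for s in submissions
    )
    return dict(per_status), rows_read


sample = {
    "submissions": [
        {"status": "SUCCEEDED", "job": 2,
         "counters": {SQOOP_COUNTERS: {"ROWS_READ": 1000000}}},
        {"status": "FAILURE_ON_SUBMIT", "job": 1},
    ]
}
print(submissions_url("importJob"))
print(summarize(sample))
```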

/v1/authorization/roles/create - [POST] - Create Role ¶

Create a new role object. Provide the name of the role in the request body.

Method: POST
Format: JSON
Fields of Request:

Field	Description
`role`	The root of the post data in JSON
`name`	The name of this role

Request Example:

{
  "role": {
      "name": "testRole"
  }
}
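
In Python, such a request could be prepared along these lines. This is a hedged sketch: the base URL, the `Content-Type` header and the helper name `build_create_role_request` are assumptions, and the request is built without being sent.

```python
import json
import urllib.request

# Assumed host, port and webapp; adjust to your Sqoop server.
BASE_URL = "http://localhost:12000/sqoop"


def build_create_role_request(role_name, base_url=BASE_URL):
    """Build (but do not send) the POST request that creates a role."""
    body = json.dumps({"role": {"name": role_name}}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/authorization/roles/create",
        data=body,
        # Assumption: the server expects the body to be declared as JSON.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_role_request("testRole")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```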

/v1/authorization/role/[role-name] - [DELETE] - Delete Role ¶

Delete the role with name [role-name].

Method: DELETE
Format: JSON
Request Content: None
Response Content: None

/v1/authorization/roles?principal_type=[principal-type]&principal_name=[principal-name] - [GET] - Get all Roles by Principal ¶

Get all the roles, or only the roles of a given principal identified by the [principal-type] and [principal-name] parts.

/v1/authorization/principals?role_name=[rname] - [GET] - Get all Principals by Role ¶

Get all the principals for a given role identified by the [rname] part.

/v1/authorization/roles/grant - [PUT] - Grant a Role to a Principal ¶

Grant a role with [role-name] to a principal with [principal-type] and [principal-name].

Method: PUT
Format: JSON
Fields of Request:

The role fields are the same as in Create Role (given as a `roles` list, as in the example), plus the following:

Field	Description
`principals`	The root of the post data in JSON
`name`	The name of this principal
`type`	The type of this principal (“USER”, “GROUP” or “ROLE”)

Request Example:

{
  "roles": [{
      "name": "testRole"
  }],
  "principals": [{
      "name": "testPrincipalName",
      "type": "USER"
  }]
}

Response Content: None
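
Because the grant and revoke requests share the same body, a small helper can build it for both. The sketch below is one possible shape in Python. The helper name `role_grant_body` is hypothetical, and the validation against the three principal types is a client-side convenience based on the field table above, not documented server behavior.

```python
import json

# Principal types listed in the field table above.
PRINCIPAL_TYPES = ("USER", "GROUP", "ROLE")


def role_grant_body(role_names, principals):
    """Build the JSON body used when granting or revoking roles.

    principals is an iterable of (name, type) pairs.
    """
    principal_items = []
    for name, principal_type in principals:
        if principal_type not in PRINCIPAL_TYPES:
            raise ValueError(f"unknown principal type: {principal_type!r}")
        principal_items.append({"name": name, "type": principal_type})
    return json.dumps({
        "roles": [{"name": role_name} for role_name in role_names],
        "principals": principal_items,
    })


body = role_grant_body(["testRole"], [("testPrincipalName", "USER")])
print(body)
```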

/v1/authorization/roles/revoke - [PUT] - Revoke a Role from a Principal ¶

Revoke a role with [role-name] from a principal with [principal-type] and [principal-name].

Method: PUT
Format: JSON
Fields of Request:

The same as Grant Role

Response Content: None

/v1/authorization/privileges/grant - [PUT] - Grant a Privilege to a Principal ¶

Grant a privilege with [resource-name], [resource-type], [action] and [with-grant-option] to a principal with [principal-type] and [principal-name].

Method: PUT
Format: JSON
Fields of Request:

The principal fields are the same as in Grant Role, plus the following:

Field	Description
`privileges`	The root of the post data in JSON
`resource-name`	The resource name of this privilege
`resource-type`	The resource type of this privilege (“CONNECTOR”, “LINK” or “JOB”)
`action`	The action type of this privilege (“READ”, “WRITE” or “ALL”)
`with-grant-option`	Whether this privilege is given with the grant option (boolean)

Request Example:

{
  "privileges": [{
      "resource-name": "testResourceName",
      "resource-type": "LINK",
      "action": "READ",
      "with-grant-option": false
  }],
  "principals": [{
      "name": "testPrincipalName",
      "type": "USER"
  }]
}

Response Content: None
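
The privilege body can be built the same way as the role body. The following Python sketch validates the enumerated fields before serializing. The helper name `privilege_grant_body` is hypothetical, and the checks are a client-side convenience derived from the field table, not a description of what the server enforces.

```python
import json

# Allowed values taken from the field tables in this guide.
RESOURCE_TYPES = ("CONNECTOR", "LINK", "JOB")
ACTIONS = ("READ", "WRITE", "ALL")
PRINCIPAL_TYPES = ("USER", "GROUP", "ROLE")


def privilege_grant_body(resource_name, resource_type, action,
                         principal_name, principal_type,
                         with_grant_option=False):
    """Build the JSON body used when granting or revoking one privilege."""
    if resource_type not in RESOURCE_TYPES:
        raise ValueError(f"unknown resource type: {resource_type!r}")
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if principal_type not in PRINCIPAL_TYPES:
        raise ValueError(f"unknown principal type: {principal_type!r}")
    return json.dumps({
        "privileges": [{
            "resource-name": resource_name,
            "resource-type": resource_type,
            "action": action,
            "with-grant-option": bool(with_grant_option),
        }],
        "principals": [{"name": principal_name, "type": principal_type}],
    })


body = privilege_grant_body("testResourceName", "LINK", "READ",
                            "testPrincipalName", "USER")
print(body)
```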

/v1/authorization/privileges/revoke - [PUT] - Revoke a Privilege from a Principal ¶

Revoke a privilege with [resource-name], [resource-type], [action] and [with-grant-option] from a principal with [principal-type] and [principal-name].

Method: PUT
Format: JSON
Fields of Request:

The same as Grant Privilege

Response Content: None

/v1/authorization/privileges?principal_type=[principal-type]&principal_name=[principal-name]&resource_type=[resource-type]&resource_name=[resource-name] - [GET] - Get all Privileges by Principal (and Resource) ¶

Get all the privileges, or only the privileges of a given principal identified by [principal-type] and [principal-name] (optionally restricted to a given resource identified by [resource-type] and [resource-name]).
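
Since the resource parameters are optional, the query string is easiest to build programmatically. The Python sketch below shows one way to do so. The base URL and the helper name `privileges_url` are assumptions, and the path is spelled `privileges` to match the grant and revoke endpoints, which is also an assumption to verify against your server.

```python
from urllib.parse import urlencode

# Assumed host, port and webapp; adjust to your Sqoop server.
BASE_URL = "http://localhost:12000/sqoop"


def privileges_url(principal_type, principal_name,
                   resource_type=None, resource_name=None,
                   base_url=BASE_URL):
    """Build the GET URL listing a principal's privileges.

    The resource filter is added only when both resource values are given.
    """
    params = {
        "principal_type": principal_type,
        "principal_name": principal_name,
    }
    if resource_type is not None and resource_name is not None:
        params["resource_type"] = resource_type
        params["resource_name"] = resource_name
    # urlencode percent-encodes values and keeps the insertion order.
    return f"{base_url}/v1/authorization/privileges?{urlencode(params)}"


print(privileges_url("USER", "testPrincipalName"))
print(privileges_url("USER", "testPrincipalName", "LINK", "testResourceName"))
```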