IBM Cloud Docs
Assembling and compiling a custom Cloud Pak for Data connector

You package a number of component files together to create a custom connector.

IBM Cloud Pak for Data only

This information applies only to installed deployments.

Custom connector components

A custom connector package is a compressed file that contains the following components:

Connector components
Path                          Description
config/template.xml           A configuration template
config/messages.properties    A properties file for UI messages
lib/*.jar                     JAR files required by the custom connector, not including the connector code that you write

Configuration template

The configuration template is an XML file that is divided into sections. Each section contains related settings. The XML snippets in this topic are taken from the example template.xml file whose location is listed in Understanding the custom-crawler-docs.zip file.

Declaration settings

Declared settings are represented by the <declare /> element. The element has the following attributes:

Declare element attributes
Attribute name    Description
type              Data type; one of string, long, boolean, list of strings, or enum. The examples in this topic also use password.
name              The name of the setting
initial-value     The initial value of the setting
enum-values       A list of enum values separated by vertical bars (|)
required          Indicates that the setting is required
hidden            Indicates whether to hide the setting from the UI. Specify a value of true to hide the setting.

In the current release, the required and hidden attributes are not applied in the Discovery product user interface.

Declaration setting examples

To declare an enum type, use code similar to the following snippet:

<declare type="enum" name="type" enum-values="PROXY|BASIC|NTLM" initial-value="BASIC"/>

To declare a hidden string with an initial value, use code similar to the following snippet:

<declare type="string" name="custom_config_class" hidden="true" initial-value="com.example.ExampleCrawlerConfig" />

To declare a required long, use code similar to the following snippet:

<declare type="long" name="port" required="required" initial-value="22"/>

Conditional settings

Conditional settings are represented by the <condition /> element. A conditional setting is displayed only if the condition is satisfied. The element has the following attributes:

Condition element attributes
Attribute name    Description
name              The name of the setting that the condition tests
enabled           Enable the enclosed settings if the value of the setting identified by name equals the value of the enabled attribute
in                Enable the enclosed settings if the value of the setting identified by name is included in the specified list of values, separated by vertical bars (|)

In the current release, conditional settings are not applied in the Discovery product user interface.
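
The matching rules for the enabled and in attributes can be sketched in Java as follows. This is only an illustration of the behavior that the table describes, not the product's implementation. The ConditionSketch class and its method names are hypothetical and are not part of the connector SDK, and the sketch assumes that the in attribute separates its values with vertical bars, as the examples in this topic do.

```java
import java.util.Arrays;

// Hypothetical illustration; not part of the connector SDK.
final class ConditionSketch {

    /** enabled="X": matches when the tested setting's current value equals X. */
    static boolean matchesEnabled(String settingValue, String enabled) {
        return enabled != null && enabled.equals(settingValue);
    }

    /** in="A|B": matches when the tested setting's current value is one of the listed values. */
    static boolean matchesIn(String settingValue, String in) {
        if (settingValue == null || in == null) {
            return false;
        }
        // The vertical bar is a regex metacharacter, so it must be escaped for split().
        return Arrays.asList(in.split("\\|")).contains(settingValue);
    }

    public static void main(String[] args) {
        System.out.println(matchesEnabled("true", "true"));   // prints true
        System.out.println(matchesIn("BASIC", "BASIC|NTLM")); // prints true
        System.out.println(matchesIn("PROXY", "BASIC|NTLM")); // prints false
    }
}
```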

Conditional setting examples

To enable a section by using a boolean condition, use code similar to the following snippet:

<declare type="boolean" name="use_key" initial-value="true" />

<condition name="use_key" enabled="true">
  <declare type="string" name="key" hidden="false" />
</condition>

To enable a section by using an enum condition, use code similar to the following snippet:

<declare type="enum" name="type" enum-values="PROXY|BASIC|NTLM" initial-value="BASIC"/>

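<condition name="type" in="BASIC|NTLM|PROXY">
  <!-- Settings that apply when type is BASIC, NTLM, or PROXY -->
</condition>

<condition name="type" in="PROXY">
  <!-- Settings that apply only when type is PROXY -->
</condition>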

Template sections

Each section includes one <declare /> element for each of its settings.

Template sections
XPath expression                     Description
/function/@name                      The name (type) of the crawler. Not a display name for the UI. Cannot contain spaces.
/function/prototype/proto-section    A section of the configuration.

Section: general_settings

The XPath expression is /function/prototype/proto-section[@section="general_settings"]. It includes common settings for all crawlers, including the following settings:

<declare type="string" name="crawler_name" />
<declare type="string" name="description" />
<declare type="long" name="fetch_interval" initial-value="0" />
<declare type="long" name="number_of_max_threads" initial-value="10" />
<declare type="long" name="number_of_max_documents" initial-value="2000000000" />
<declare type="long" name="max_page_length" initial-value="32768" />

The custom crawler is initialized with the following settings in the general_settings section. For information about the interfaces, see Developing custom connector code.

General settings section defaults
Name                                 Value
custom_config_class                  The name of a class that implements the com.ibm.es.ama.custom.crawler.CustomCrawlerConfiguration interface
custom_crawler_class                 The name of a class that implements the com.ibm.es.ama.custom.crawler.CustomCrawler interface
custom_security_class                The name of a class that implements the com.ibm.es.ama.custom.crawler.CustomCrawlerSecurityHandler interface
document_level_security_supported    Specifies whether document-level security is enabled (true) or disabled (false)

To specify the interfaces, use code similar to the following snippet:


  <declare type="string" name="custom_config_class" hidden="true" initial-value="com.ibm.es.ama.custom.crawler.sample.sftp.SftpCrawler" />

  <declare type="string" name="custom_crawler_class" hidden="true" initial-value="com.ibm.es.ama.custom.crawler.sample.sftp.SftpCrawler" />

  <declare type="string" name="custom_security_class" hidden="true" initial-value="com.ibm.es.ama.custom.crawler.sample.sftp.SftpSecurityHandler" />

  <declare type="boolean" name="document_level_security_supported" initial-value="true" hidden="true"/>

If you built a custom connector with an SDK package that was bundled with version 2.2.1 or earlier, document_level_security_supported must be disabled (set to false). Document-level security is not supported in 2.2.1 and earlier releases. However, the Enable Document Level Security option is displayed in Discovery even when document-level security is not supported. Do not select this option when you create a new collection.

To hide the Enable Document Level Security option from Discovery if the custom connector was built with an SDK package that was bundled with version 2.2.1 or earlier, complete the following steps:

  1. Change the document_level_security_supported parameter in the config/template.xml file to read as follows:

    <declare type="boolean" name="document_level_security_supported" hidden="true" initial-value="false"/>
    
  2. Rebuild the connector package, and then upload it again.
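
The three class settings in the general_settings section are plain strings, so nothing in the XML itself verifies that the named classes exist. The following Java sketch shows one way to check them before you package the connector. It uses only JDK classes. The ClassNameCheck class and its methods are hypothetical helpers, not part of the connector SDK, and the sketch assumes that you run it with your compiled connector classes and the JAR files from the lib directory on the classpath.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; not part of the connector SDK.
final class ClassNameCheck {

    /** Returns name -> initial-value for every custom_*_class declaration in the XML. */
    static Map<String, String> classSettings(String templateXml) {
        try {
            NodeList declares = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(templateXml)))
                .getElementsByTagName("declare");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < declares.getLength(); i++) {
                Element declare = (Element) declares.item(i);
                String name = declare.getAttribute("name");
                if (name.startsWith("custom_") && name.endsWith("_class")) {
                    result.put(name, declare.getAttribute("initial-value"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse template XML", e);
        }
    }

    /** Returns true if the named class can be found on the current classpath. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: locate the class without running its static initializers.
            Class.forName(className, false, ClassNameCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Single-quoted attributes are valid XML and avoid escaping in Java strings.
        // com.example.NoSuchCrawler is a made-up class name used to show a failure.
        String xml = "<proto-section section='general_settings'>"
            + "<declare type='string' name='custom_crawler_class' hidden='true'"
            + " initial-value='com.example.NoSuchCrawler' />"
            + "</proto-section>";
        for (Map.Entry<String, String> e : classSettings(xml).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue()
                + (isLoadable(e.getValue()) ? " (found)" : " (NOT FOUND)"));
        }
    }
}
```

A check of this kind confirms only that each class can be located. It does not confirm that the class implements the expected interface.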

Section: datasource_settings

The XPath expression is /function/prototype/proto-section[@section="datasource_settings"]. It includes settings specific to the data source.


  <proto-section section="datasource_settings">
    
      <declare type="string" name="host" required="required" initial-value="localhost"/>
      <declare type="long" name="port" required="required" initial-value="22"/>
      <declare type="string" name="user" required="required" />
    
      <declare type="boolean" name="use_key" initial-value="true" />
    
      <condition name="use_key" enabled="true">
        <declare type="string" name="key" hidden="false" />
        <declare type="password" name="passphrase" hidden="false" />
      </condition>
    
      <condition name="use_key" enabled="false">
        <declare type="password" name="secret_key" hidden="false" />
      </condition>
  </proto-section>

Section: crawlspace_settings

The XPath expression is /function/prototype/proto-section[@section="crawlspace_settings"]. The section contains only one <declare /> element to specify the path. The value of the path is provided by the connector code.


  <proto-section section="crawlspace_settings" cardinality="multiple">
    <declare type="string" name="path" hidden="true" />
  </proto-section>
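
Taken together, the XPath expressions in this topic imply that the proto-section elements are nested inside a prototype element, which is nested inside the function element. The following Java sketch embeds a skeleton with that nesting and evaluates the same XPath expressions against it by using the JDK XML APIs. The skeleton is an assumption that is pieced together from the snippets in this topic, not a copy of the example template.xml file. The crawler name example_crawler and the TemplateOutline class are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; not part of the connector SDK.
final class TemplateOutline {

    // Assumed skeleton. Single-quoted attributes are valid XML and avoid escaping.
    static final String SKELETON =
          "<function name='example_crawler'>"
        + " <prototype>"
        + "  <proto-section section='general_settings'>"
        + "   <declare type='string' name='crawler_name' />"
        + "  </proto-section>"
        + "  <proto-section section='datasource_settings'>"
        + "   <declare type='boolean' name='use_key' initial-value='true' />"
        + "   <condition name='use_key' enabled='true'>"
        + "    <declare type='string' name='key' hidden='false' />"
        + "   </condition>"
        + "  </proto-section>"
        + "  <proto-section section='crawlspace_settings' cardinality='multiple'>"
        + "   <declare type='string' name='path' hidden='true' />"
        + "  </proto-section>"
        + " </prototype>"
        + "</function>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse template XML", e);
        }
    }

    /** Evaluates /function/@name. */
    static String crawlerName(Document doc) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate("/function/@name", doc);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Lists the names of all settings declared in a section, including those inside conditions. */
    static List<String> settingNames(Document doc, String section) {
        String expr = "/function/prototype/proto-section[@section='" + section + "']//declare/@name";
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeValue());
            }
            return names;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SKELETON);
        System.out.println(crawlerName(doc));
        System.out.println(settingNames(doc, "general_settings"));
        System.out.println(settingNames(doc, "datasource_settings"));
        System.out.println(settingNames(doc, "crawlspace_settings"));
    }
}
```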

Properties file

For an example of a properties file, see the example messages.properties file whose location is listed in Understanding the custom-crawler-docs.zip file.

JAR files

The lib directory contains the JAR files for any interfaces that your custom connector code uses, including the ama-zing-custom-crawler-{version_numbers}.jar file whose location is listed in Understanding the custom-crawler-docs.zip file. The ama-zing-custom-crawler-{version_numbers}.jar file includes the com.ibm.es.ama.custom.crawler Java package that is described in Developing custom connector code.

Compiling and packaging the custom connector

After you write the source code and configuration files for your custom connector, you need to compile and package it.

Prerequisites

To compile a custom connector, you need to have the following items on your local system. See Custom connector example for details.

  • Java SDK 1.8 or higher

  • Gradle

  • The custom-crawler-docs.zip file from an installed Discovery instance

  • The JSch package

  • The following files for the example custom connector:

    • Java source code (SftpCrawler.java and SftpSecurityHandler.java)
    • XML definition file (template.xml)
    • Properties file (messages.properties)

    Do not change the names or paths of the example custom connector files. Doing so can result in problems, including build failures.

Compiling and packaging the source code

  1. Ensure that you are in the custom connector development directory on your local system:

    cd {local_directory}
    
  2. Use Gradle to compile your Java source code and to create a compressed file that includes all of the required components for the custom connector:

    gradle build packageCustomCrawler
    

Gradle creates the file {local_directory}/build/distributions/{built_connector_zip_file}, where the name of the {built_connector_zip_file} is based on the rootProject.name value in the settings.gradle file. For example, if settings.gradle contains the following line, Gradle generates a file that is named {local_directory}/build/distributions/my-sftp-connector.zip.

rootProject.name = 'my-sftp-connector'
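
Before you install the package, you might want to confirm that the generated file contains the components that are listed in Custom connector components. The following Java sketch is one way to do that with the JDK java.util.zip classes. The PackageCheck class is a hypothetical helper, not part of the connector SDK or the Gradle build, and it assumes that the entries are stored at the listed paths relative to the root of the archive.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper; not part of the connector SDK or the Gradle build.
final class PackageCheck {

    /** Returns the listed components that are absent from the given entry names. */
    static List<String> missingComponents(Collection<String> entryNames) {
        List<String> missing = new ArrayList<>();
        if (!entryNames.contains("config/template.xml")) {
            missing.add("config/template.xml");
        }
        if (!entryNames.contains("config/messages.properties")) {
            missing.add("config/messages.properties");
        }
        boolean hasJar = false;
        for (String name : entryNames) {
            if (name.startsWith("lib/") && name.endsWith(".jar")) {
                hasJar = true;
                break;
            }
        }
        if (!hasJar) {
            missing.add("lib/*.jar");
        }
        return missing;
    }

    /** Reads the entry names from a compressed file on disk. */
    static List<String> entryNames(String zipPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(zipPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("Usage: java PackageCheck.java <path to connector zip file>");
            return;
        }
        System.out.println("Missing components: " + missingComponents(entryNames(args[0])));
    }
}
```

For example, you might pass build/distributions/my-sftp-connector.zip as the argument. An empty list means that all three kinds of component were found.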

Next step

Proceed to Installing and uninstalling a custom connector to install the custom connector to your Discovery instance.