Getting started with IBM Cloud Data Engine

IBM Cloud® Data Engine is deprecated. As of 18 February 2024 you can't create new instances, and access to free instances will be removed. Existing Standard plan instances are supported until 18 January 2025. Any instances that still exist on that date will be deleted.

This Data Engine tutorial shows you how to start querying rectangular data in IBM Cloud® Object Storage and how to store the query results in Object Storage.

Before you begin

Before you can run SQL queries, you need one or more Cloud Object Storage buckets to hold the data to be analyzed and the query results. Cloud Object Storage offers several plans, including a Lite plan at no cost. To create a Cloud Object Storage instance, complete the following steps.

Go to the IBM Cloud catalog > Storage > Object Storage page.
Select one of the plans and create the service. If you want to find your Cloud Object Storage instance later, go to your IBM Cloud resource list > Storage.

You can now manage and browse the buckets and data that the instance contains. For more information about how to use Cloud Object Storage, see the Cloud Object Storage documentation.

Create your Data Engine service instance

Go to the IBM Cloud catalog and search for Data Engine.
Click Data Engine to open the catalog details page.
Select the Lite plan and click Create to create an instance of the service.
Click Launch Data Engine UI on the Dashboard page to open the Data Engine Console.

When you open the Console for the first time, the Data Engine service automatically creates a bucket for you in your Cloud Object Storage instance. It uses this bucket as the default target for your query results.

Run one of the samples to see how to use the service

Select a sample query. The query automatically gets loaded into the editor. The input data that is used by the sample query is available in a publicly accessible bucket.
The Target field is automatically completed with the uniform resource identifier (URI) of your default bucket. You can use this bucket or specify a different target with an INTO clause in your query.
Click Run to run the query. The query result is displayed.
Each sample data set is available in each of the supported input data formats (CSV, JSON, ORC, and Parquet). If you want to experiment with different formats, edit the selected sample query and change the specified file name and format. For example, change orders.parquet STORED AS PARQUET to orders.orc STORED AS ORC.
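
The format change described above can be sketched as a small text rewrite. The following Python snippet is only an illustration: the helper name switch_sample_format and the sample query string (including its cos:// URI) are assumptions made for this example, not text taken from the console. It replaces the file extension and the STORED AS clause together so that the two stay consistent.

```python
import re

# Input formats that the tutorial lists as supported.
SUPPORTED_FORMATS = {"CSV", "JSON", "ORC", "PARQUET"}

# Matches "<name>.<ext> STORED AS <FORMAT>" for the supported formats.
_FORMAT_PATTERN = re.compile(
    r"(\w+)\.(csv|json|orc|parquet)\s+STORED\s+AS\s+(CSV|JSON|ORC|PARQUET)",
    re.IGNORECASE,
)


def switch_sample_format(query: str, new_format: str) -> str:
    """Return the query with its file extension and STORED AS clause changed.

    Hypothetical helper for illustration; it only edits the query text.
    """
    fmt = new_format.upper()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {new_format}")
    return _FORMAT_PATTERN.sub(
        lambda m: f"{m.group(1)}.{fmt.lower()} STORED AS {fmt}", query
    )


# Assumed sample query; the URI is illustrative only.
sample = "SELECT * FROM cos://us-geo/sql/orders.parquet STORED AS PARQUET LIMIT 10"
converted = switch_sample_format(sample, "orc")
print(converted)
```

Paste the rewritten query into the editor and click Run to compare results across formats. The helper changes only the query text, so the object with the new extension must already exist in the sample bucket.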

Next steps

To analyze your own data (CSV, JSON, ORC, or Parquet), upload it to a Cloud Object Storage instance and run SQL queries.
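
As a rough sketch of what a query over your own data might look like, the following Python snippet assembles a statement that reads one object and optionally writes the result to a bucket of your choice with an INTO clause. The function build_query, the endpoint alias us-south, and the bucket and object names are all hypothetical placeholders. The cos://<endpoint>/<bucket>/<path> URI layout is assumed here, so verify it against the URI that the Target field shows for your default bucket.

```python
from typing import Optional


def build_query(
    select_list: str,
    source_uri: str,
    source_format: str,
    target_uri: Optional[str] = None,
    target_format: str = "CSV",
) -> str:
    """Assemble a query string; hypothetical helper, not part of any SDK."""
    query = f"SELECT {select_list} FROM {source_uri} STORED AS {source_format.upper()}"
    if target_uri is not None:
        # Without an INTO clause, results go to the default target bucket.
        query += f" INTO {target_uri} STORED AS {target_format.upper()}"
    return query


# Placeholder URIs; replace them with your own endpoint, bucket, and object.
own_data_query = build_query(
    "*",
    "cos://us-south/my-bucket/sales.csv",
    "csv",
    target_uri="cos://us-south/my-results/sales/",
    target_format="parquet",
)
print(own_data_query)
```

Omit target_uri to keep the default target that the console fills in for you.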

For more advanced capabilities, check out the following video tutorials.