Amazon Redshift

Prev Next

Overview

Syniti Replicate allows you to replicate data from relational database tables to Amazon Redshift using the following replication modes:

  • Refresh or Snapshot Replication: A one-time complete replication from any major relational database source to Amazon Redshift as a target, according to replication settings and scripts. You can schedule the replication, identify the columns to be replicated, and add scripts to transform data during replication. Source databases include Oracle, Microsoft SQL Server, IBM Db2 for i, IBM Db2 LUW, Sybase, Informix, MySQL, and so on.

  • Continous Refresh: A regularly scheduled refresh replication as described above. The schedule is defined in the replication settings.

  • One-Way Mirroring (Change Data Capture (CDC)): A continuous update of replicated tables based on changes to the source database that have been recorded in the database server log. Typically, this involves an initial refresh operation, as described above, to set up the target table. Then you can define the replication settings to check the transaction log on the source database at regular intervals. Any changes found in the log are applied to data on the Amazon Redshift platform.

Prerequisites

Before you begin, ensure that you complete the following actions:

  • Download and install the Amazon Redshift 64-bit ODBC v1.5.16 driver. Click here to download.

  • Install the AWS SDK for .NET toolkit. Refer to AWS SDK for .NET for more information.

  • Syniti Replicate will later need path names for two of the DLLs in the SDK. Make sure you have the path names available when you configure the target connection to Amazon Redshift.

  • Set up your AWS account with an Access Key and store both the Access Key and Secret Access Key where you will be able to find them when configuring your target connection to Amazon Redshift below. The Access Key is needed to access an S3 bucket for bulk insert of data using the COPY command.

Limitations

  • Mirroring (CDC) mode is not supported as source with Amazon Redshift.

  • Access via Proxy is not supported. Syniti Replicate needs to connect directly with Amazon Redshift.

  • Data Encryption is not supported for Amazon Redshift.

  • Data Masking is not supported for this database in Syniti Replicate.

User Permissions

Syniti Replicate require a grant READ user permission, for example:

Grant COPY
Grant INSERT
Grant DELETE
Grant UPDATE
Grant SELECT
Create SCHEMA
Grant CREATE TABLE
Grant ALTER TABLE
Grant DROP TABLE

Add Target Connection Wizard

The following screens require specific information for Amazon Redshift.

Select Provider Screen

Enter the following values accordingly:

  • Database: Select Amazon Redshift from the drop-down list.

  • Provider: The default value is Amazon Redshift ODBC Driver.

  • Assembly: For .NET Data Provider connections only. No value is needed.

Set Connection String Screen

  1. Click the Connection String value field to open the Amazon Redshift ODBC Connection dialog box.
    Values entered in this dialog box should reflect those needed to make a successful connection to an Amazon Redshift cluster via an ODBC connection string, with or without Syniti Replicate. There are no fields specific to Syniti Replicate.

  2. Enter the following values accordingly:

    • Server: Specify the Amazon Redshift cluster endpoint, as defined in the Amazon Redshift console.

    • Port: Type the port number for the database. The default is 5439.

    • Database: Type the name of the Amazon Redshift database.

    • Auth Type: Syniti Replicate has been tested using the Standard authentication type, with an AWS user and password.
      However, Syniti Replicate uses the values entered here as part of an ODBC connection string passed to the server. Therefore, any set of values that can connect to the server are supported. Check the AWS documentation for all the connection options.

  3. Click Next.

Set Staging Connection String

Amazon Redshift setup requires a staging connection in Amazon S3. Add values for properties using the following table:

Property

Description

Output Folder

Local path to temporarily store replication output files.

Output Folder Archive

(Optional) Set a path to archive files that are generated and uploaded.

Use IAM Role

  • Set to False by default.

  • Set to True to use an IAM role for connections instead of the Access Key/Secret Key approach.

For more information about using IAM roles, refer to ID Roles. Note that IAM roles work only for access between AWS services (such as when using S3 as a staging connection for bulk uploads to Redshift.)

IAM Role Parameter

(Optional - for use with Redshift only) The IAM role parameter should be configured as follows to access S3: arn:aws:iam:<aws-account-id>:role/<role-name>

The parameter information is required when using S3 in conjunction with Redshift in the COPY from statement to access files you imported from S3 to Redshift. For more information about using IAM roles with Redshift, refer to Authorizing Redshift Service.

S3 Access Key

The Access Key from your AWS login ID.

S3 Secret Key

The Secret Access Key from your AWS login ID.

Endpoint URL

(Optional) Use the connection URL to supply connection information to the data store that you are accessing.

S3 Bucket Name

The bucket must already exist. It can either be a bucket that you are already using, or one you have created for this purpose.

S3 Folder Name

(Optional) Specify a folder for the existing S3 bucket.

S3 Region Name

Specify a region endpoint.

AWS SDK S3 Assembly Path

The path name to AWSSDK.S3.dll installed with the AWS .NET SDK.

AWS SDK Core Assembly Path

The path name to the AWSSDK.Core.dll installed with the AWS .NET SDK.

Note

Complete the rest of the fields in the Add Target Connection wizard. Refer to Add Target Connection Wizard for more information.

Now you can set up replications from whichever source connection you have defined.

Connection Properties Dialog Box

The default insert mode for replicated data has been set as follows:

  • Refresh replications: Bulk Insert

  • Mirroring replications: Single Insert

To change these settings, and to complete setup for Amazon Redshift, you need to edit the connection properties for the connection created using the above Create Target Connection wizard.

  1. In the Metadata Explorer, select the Amazon Redshift target connection you have created.

  2. Right-click the menu and select Connection Properties.

  3. Scroll and access the Refresh Options section to change the Bulk Insert settings, as needed. Bulk Insert is the default and offers better performance.

  4. Scroll and access the Mirroring Options section to change the Bulk Insert settings, as needed. Bulk Insert is NOT set as the default but it offers better performance, so you may want to change the settings.

  5. Click OK to save your changes.

Data Type Mapping

When Amazon Redshift is used as a target system, the mapping of Syniti Replicate data types to the corresponding data type in Redshift is listed in the following table:

Syniti Replicate Mapping

Amazon Redshift

AutoDecimal

bigserial

AutoInteger

serial

AutoNumeric

bigserial

AutoSmallint

serial

AutoTinyint

serial

Bigint

bigint

Binary

bytea

Bit

Bit - bit varying

Blob

bytea

Char

char, character, ""char""

Clob

text

Date

date

Decimal

decimal

Double

float8, double precision

Float

real

GUID

char(36)

Integer

integer, oid, cid, xid

Json

json

LongVarbinary

bytea

LongVarchar

text

Money

decimal

NChar

char

NText

text

Number

numeric

Numeric

numeric

NVarchar

varchar

Real

real

Smalldatetime

timestamp

Smallint

smallint

Smallmoney

decimal(10,2)

Sysname

varchar(128), name

Time

timestamp

Timestamp

timestamp without time zone

TimestampZ

timestamp with time zone

Tinyint

decimal(3)

Varbinary

bytea

Varchar

varchar, character varying

Xml

xml