Overview
Syniti Replicate allows you to replicate data from relational database tables to Amazon Redshift using the following replication modes:
Refresh or Snapshot Replication: A one-time complete replication from any major relational database source to Amazon Redshift as a target, according to replication settings and scripts. You can schedule the replication, identify the columns to be replicated, and add scripts to transform data during replication. Source databases include Oracle, Microsoft SQL Server, IBM Db2 for i, IBM Db2 LUW, Sybase, Informix, MySQL, and so on.
Continous Refresh: A regularly scheduled refresh replication as described above. The schedule is defined in the replication settings.
One-Way Mirroring (Change Data Capture (CDC)): A continuous update of replicated tables based on changes to the source database that have been recorded in the database server log. Typically, this involves an initial refresh operation, as described above, to set up the target table. Then you can define the replication settings to check the transaction log on the source database at regular intervals. Any changes found in the log are applied to data on the Amazon Redshift platform.
Prerequisites
Before you begin, ensure that you complete the following actions:
Download and install the Amazon Redshift 64-bit ODBC v1.5.16 driver. Click here to download.
Install the AWS SDK for .NET toolkit. Refer to AWS SDK for .NET for more information.
Syniti Replicate will later need path names for two of the DLLs in the SDK. Make sure you have the path names available when you configure the target connection to Amazon Redshift.
Set up your AWS account with an Access Key and store both the Access Key and Secret Access Key where you will be able to find them when configuring your target connection to Amazon Redshift below. The Access Key is needed to access an S3 bucket for bulk insert of data using the COPY command.
Limitations
Mirroring (CDC) mode is not supported as source with Amazon Redshift.
Access via Proxy is not supported. Syniti Replicate needs to connect directly with Amazon Redshift.
Data Encryption is not supported for Amazon Redshift.
Data Masking is not supported for this database in Syniti Replicate.
User Permissions
Syniti Replicate require a grant READ user permission, for example:
Grant COPY
Grant INSERT
Grant DELETE
Grant UPDATE
Grant SELECT
Create SCHEMA
Grant CREATE TABLE
Grant ALTER TABLE
Grant DROP TABLEAdd Target Connection Wizard
The following screens require specific information for Amazon Redshift.
Select Provider Screen
Enter the following values accordingly:
Database: Select Amazon Redshift from the drop-down list.
Provider: The default value is Amazon Redshift ODBC Driver.
Assembly: For .NET Data Provider connections only. No value is needed.
Set Connection String Screen
Click the Connection String value field to open the Amazon Redshift ODBC Connection dialog box.
Values entered in this dialog box should reflect those needed to make a successful connection to an Amazon Redshift cluster via an ODBC connection string, with or without Syniti Replicate. There are no fields specific to Syniti Replicate.Enter the following values accordingly:
Server: Specify the Amazon Redshift cluster endpoint, as defined in the Amazon Redshift console.
Port: Type the port number for the database. The default is 5439.
Database: Type the name of the Amazon Redshift database.
Auth Type: Syniti Replicate has been tested using the Standard authentication type, with an AWS user and password.
However, Syniti Replicate uses the values entered here as part of an ODBC connection string passed to the server. Therefore, any set of values that can connect to the server are supported. Check the AWS documentation for all the connection options.
Click Next.
Set Staging Connection String
Amazon Redshift setup requires a staging connection in Amazon S3. Add values for properties using the following table:
Property | Description |
|---|---|
Output Folder | Local path to temporarily store replication output files. |
Output Folder Archive | (Optional) Set a path to archive files that are generated and uploaded. |
Use IAM Role |
For more information about using IAM roles, refer to ID Roles. Note that IAM roles work only for access between AWS services (such as when using S3 as a staging connection for bulk uploads to Redshift.) |
IAM Role Parameter | (Optional - for use with Redshift only) The IAM role parameter should be configured as follows to access S3: arn:aws:iam:<aws-account-id>:role/<role-name> The parameter information is required when using S3 in conjunction with Redshift in the COPY from statement to access files you imported from S3 to Redshift. For more information about using IAM roles with Redshift, refer to Authorizing Redshift Service. |
S3 Access Key | The Access Key from your AWS login ID. |
S3 Secret Key | The Secret Access Key from your AWS login ID. |
Endpoint URL | (Optional) Use the connection URL to supply connection information to the data store that you are accessing. |
S3 Bucket Name | The bucket must already exist. It can either be a bucket that you are already using, or one you have created for this purpose. |
S3 Folder Name | (Optional) Specify a folder for the existing S3 bucket. |
S3 Region Name | Specify a region endpoint. |
AWS SDK S3 Assembly Path | The path name to AWSSDK.S3.dll installed with the AWS .NET SDK. |
AWS SDK Core Assembly Path | The path name to the AWSSDK.Core.dll installed with the AWS .NET SDK. |
Note
Complete the rest of the fields in the Add Target Connection wizard. Refer to Add Target Connection Wizard for more information.
Now you can set up replications from whichever source connection you have defined.
Connection Properties Dialog Box
The default insert mode for replicated data has been set as follows:
Refresh replications: Bulk Insert
Mirroring replications: Single Insert
To change these settings, and to complete setup for Amazon Redshift, you need to edit the connection properties for the connection created using the above Create Target Connection wizard.
In the Metadata Explorer, select the Amazon Redshift target connection you have created.
Right-click the menu and select Connection Properties.
Scroll and access the Refresh Options section to change the Bulk Insert settings, as needed. Bulk Insert is the default and offers better performance.
Scroll and access the Mirroring Options section to change the Bulk Insert settings, as needed. Bulk Insert is NOT set as the default but it offers better performance, so you may want to change the settings.
Click OK to save your changes.
Data Type Mapping
When Amazon Redshift is used as a target system, the mapping of Syniti Replicate data types to the corresponding data type in Redshift is listed in the following table:
Syniti Replicate Mapping | Amazon Redshift |
|---|---|
AutoDecimal | bigserial |
AutoInteger | serial |
AutoNumeric | bigserial |
AutoSmallint | serial |
AutoTinyint | serial |
Bigint | bigint |
Binary | bytea |
Bit | Bit - bit varying |
Blob | bytea |
Char | char, character, ""char"" |
Clob | text |
Date | date |
Decimal | decimal |
Double | float8, double precision |
Float | real |
GUID | char(36) |
Integer | integer, oid, cid, xid |
Json | json |
LongVarbinary | bytea |
LongVarchar | text |
Money | decimal |
NChar | char |
NText | text |
Number | numeric |
Numeric | numeric |
NVarchar | varchar |
Real | real |
Smalldatetime | timestamp |
Smallint | smallint |
Smallmoney | decimal(10,2) |
Sysname | varchar(128), name |
Time | timestamp |
Timestamp | timestamp without time zone |
TimestampZ | timestamp with time zone |
Tinyint | decimal(3) |
Varbinary | bytea |
Varchar | varchar, character varying |
Xml | xml |
