Amazon S3

Prev Next

Overview

This page describes how to set up your Syniti Replicate environment to replicate data to an Amazon S3 bucket. The replicated data is stored in CSV or JSON format in Amazon S3. When replicating from a relational database to Amazon S3, Syniti Replicate allows you to replicate data in the following two methods:

Prerequisites

  • Install the AWS SDK for .NET toolkit. Refer to AWS SDK for .NET for more information.

  • Syniti Replicate will later need path names for two of the DLLs in the SDK. Make sure you have the path names available when you configure the target connection to Amazon S3.

  • Set up your AWS account with either an IAM role or an Access Key and store the access information (either IAM role details or both the Access Key and Secret Access Key) where you will be able to find them when configuring your target connection to Amazon S3 below. The information is needed to access an S3 bucket.

Limitations

  • Default Block Size for CSV and JSON files define the file size generated in Amazon S3. It is not possible to define files with a specific number of records. For example, if you set Refresh Block Size to 10,000, Syniti Replicate will create files with a size of 10 MB in your Amazon S3 Bucket.

  • Only Single Insert mode is supported for Amazon S3.

  • Mirroring or CDC mode is not supported as source with Amazon S3.

  • Access via Proxy is not supported. Syniti Replicate needs to connect directly with Amazon S3.

  • Data Encryption is not supported for Amazon S3.

  • Data Masking is not supported for this database in Syniti Replicate.

Set Up Target Connection to Amazon S3

This section assumes you have selected the Enable Transactional Replication Wizard option to open the Add Target Connection wizard. If not, open the wizard from the Management Center and complete the following steps:

  1. In the Metadata Explorer, click Targets.

  2. Right-click the menu and click Add New Connection.

  3. Set the Property values using the following table as reference:

    Property

    Description

    Output Folder

    Local path to temporarily store replication output files.

    Output folder Archive

    (Optional) Set a path to archive files that are generated and uploaded.

    Use IAM Role

    • Set to False by default.

    • Set to True to use an IAM role for connections instead of the Access Key and Secret Key approach.

    For more information about using IAM roles, see IAM Roles. Note that IAM roles work only for access between AWS services (such as when using S3 as a staging connection for bulk uploads to Redshift or when installing Syniti Replicate on a EC2 instance).

    IAM Role Parameter

    (Optional - for use with Redshift only)

    The IAM role parameter should be configured as follows to access S3: arn:aws:iam:<aws-account-id>:role/<role-name>The parameter information is required when using S3 in conjunction with Redshift in the COPY from statement to access files you imported from S3 to Redshift.

    For more information about using IAM roles with Redshift, refer to Amazon Redshift Service.

    S3 Access Key

    The Access Key from your AWS login ID.

    S3 Secret Key

    The Secret Access Key from your AWS login ID.

    Endpoint URL

    (Optional) Use the connection URL to supply connection information to the data store that you are accessing.

    S3 Bucket Name

    The bucket must already exist. It can either be a bucket that you are already using or one you have created for this purpose.

    S3 Folder Name

    (Optional) Specify a folder for the existing S3 bucket.

    Create Folder per Table

     

    File for each replicated table will be uploaded in S3 in a folder based on the option selected. The default value is No (no separate folder will be created).

    This property is not applicable for an Amazon Redshift staging connection.

    S3 Region Name

    Specify a region Endpoint

    AWS SDK S3 Assembly Path

    The path name to AWSSDK.S3.dll installed with the AWS .NET SDK.

    AWS SDK Core Assembly Path

    The path name to the AWSSDK.Core.dll installed with the AWS .NET SDK.

    Create Manifest

    (Optional) Default is False.

    If set to True, a manifest file is created for the data file output.

    File Format

    (Optional) Default is 1-CSV for an output file containing comma-separated values.

    Choose 0-JSON to generate a .JSON format output file.

    File Name Format

    (Optional) False by default. Output file format will be: <replication_name><date+time><serial_number>.<"ref" or "mir">

    Set the value to True to generate output files in the following format:

    <replication_name><date+time><serial_number>.<"ref" or "mir">.<csv or json>

    Use Column Names

    (Optional) Default is True.

    When True, output file contains a list of column names as the first row.

    Add Transactional Info

    (Optional) Default is Yes, first column. In this case, transaction details (operation type, transaction ID, transaction timestamp , and so on) are added to the output file in the first column.

    Alternative values are None (no transaction details reported) or Yes, last column (transaction details are included in the last column,)

    Transactional Column Prefix

     

    Prefix to add to transactional fields (if selected). This option will allow the use of a custom format for the transactional fields to distinguish them from the data fields.

    Column Separator

    (Optional) Default is ,.

    Column Escape

    (Optional) Default is \.

    Indicates the excape character used for the column separator.

    String Separator

    (Optional) Default is .

    String Escape

    (Optional) Default is \.

    Indicates the escape character used for the string separator.

    Use One File Per Group

    (Optional) Default is False. One output file per replication is generated.

    When set to True, one output file per replication group is generated, assuming that one or more Replication Groups have been established.

    Extended Properties

    No additional properties are currently supported.

CSV File Structure

Files have a .mir or .mir.csv extension when used for Mirroring, and a .ref or .ref.csv extension when used for Snapshot or Refresh replications.

Data files replicated to Amazon S3 using the Mirroring replications have the following structure, if the Yes, first column option is selected.

Column

Value

1

Sequence number

2

Time stamp of a transaction

3

User, if provided by the database

4

Table name

5

Operation code:

  • I: Insert

  • U: Update

  • D: Delete

6

Previous primary key value for the database column. This column is used for update operations to identify the correct record.

7

Next primary key value. This column is used for update operations to identify the correct record in cases where the primary key is changed during the update operation.

8

Database column value

Example CSV File

CSV File name, for example, corp_20150115T161123155064_1.mir.csv

10,2015-01-22 15:32:39.000000,,"glb","I",,2,"test insert"
11,2015-01-22 15:32:40.000000,," glb ","D",10,,
12,2015-01-22 15:32:42.000000,," glb ","U",104,104,"my updated"
13,2015-01-22 15:32:44.000000,," glb ","U",1,12,"sd"

Example JSON File

JSON File name, for example, corp_20150115T161123155064_1.mir.json

{"ID":0,"TransactionID":"00000083000000ec0003-1","TransactionTS":"2021-07-01T15:30:47.593Z","Table":"DOTNET","Operation":"I","After":{"ID":2,"NAME":"new name "}}

Data Type Mapping

When Amazon S3 is used as a target system, the mapping of Syniti Replicate data types to the corresponding data type in Amazon S3 is listed in the following table:

Syniti Replicate Mapping

Amazon S3 Mapping

AutoDecimal

decimal

AutoInteger

integer

AutoNumeric

decimal

AutoSmallint

smallint

AutoTinyint

tinyint

Bigint

bigint

Binary

binary

Bit

bit

Blob

binary

Char

string

Clob

string

Date

date

Decimal

decimal

Double

double

Float

float

GUID

string

Integer

integer

LongVarbinary

binary

LongVarchar

string

Money

decimal

NChar

string

NText

string

Number

decimal

Numeric

decimal

NVarchar

string

Real

real

Smalldatetime

datetime

Smallint

smallint

Smallmoney

decimal

Sysname

string

Time

time

Timestamp

timestamp

TimestampZ

timestamptz

Tinyint

tinyint

Varbinary

binary

Varchar

string

Xml

string

Further Resources

For a complete up-to-date list of supported sources and targets, and information on where to obtain a .NET provider, check the Supported .NET Providers article in the Help Center.