Azure Databricks

Overview

With Replicate Preview, you can build a replication pipeline to Azure Databricks in a few steps: configure the connections, select the tables, and deploy a fully managed pipeline. No deep expertise in cluster management or distributed systems is required. This shortens time-to-value while maintaining enterprise-grade performance and reliability.

Note
Azure Databricks is currently supported only as a target endpoint in Replicate Preview. Only Snapshot (full load) replication mode is available; Change Data Capture (CDC) is not yet supported.

Microsoft Azure Prerequisites

Replicate Preview requires appropriate authentication and permissions to interact with Azure Databricks resources. Ensure the following prerequisites are met:

Access to an active Azure Databricks workspace
A running SQL Warehouse (formerly SQL Endpoint) configured in the workspace
Permissions to create and write to target schemas and tables
Network access to the Databricks workspace endpoint

Refer to Create a Connection for Databricks for more information on creating Databricks connections.
For secure authentication, we recommend using OAuth 2.0 via Microsoft Entra ID.
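
Before creating the connection, it can help to confirm the network prerequisite from the machine that will run the replication. The following is a minimal sketch using only the Python standard library. The function name `can_reach` and the default port 443 (HTTPS) are illustrative assumptions, not part of Replicate Preview. A successful result shows only that a TCP connection can be opened; it says nothing about permissions or whether the SQL Warehouse is running.

```python
import socket


def can_reach(hostname: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to hostname:port can be opened.

    This is a reachability check only. It does not authenticate and does
    not verify that a SQL Warehouse is running behind the endpoint.
    """
    try:
        # create_connection resolves the host name and opens the socket;
        # the context manager closes it again immediately.
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_reach("adb-1234567890123456.7.azuredatabricks.net")` checks a workspace host name; the host name shown here is a made-up placeholder, so substitute the Server Hostname of your own workspace.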

Generate a Personal Access Token (PAT)

A Personal Access Token (PAT) can be used for authentication when connecting to Azure Databricks.

To generate a token:

Log in to your Azure Databricks workspace.
Access User Settings and select Developer.
Click Create New Token.
Copy and securely store the generated token.
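
One way to store the token securely is to keep it out of scripts and configuration files and read it from the environment at run time. The sketch below assumes the token has been placed in an environment variable; the variable name `DATABRICKS_TOKEN` and the helper names `load_token` and `mask` are illustrative choices, not something Replicate Preview requires.

```python
import os


def load_token(env_var: str = "DATABRICKS_TOKEN") -> str:
    """Read the PAT from an environment variable instead of hard-coding it."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Environment variable {env_var} is not set or is empty")
    return token


def mask(token: str, visible: int = 4) -> str:
    """Return a log-safe form of the token showing only its last characters."""
    if len(token) <= visible:
        # Too short to reveal anything safely: hide every character.
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]
```

Logging `mask(load_token())` rather than the raw value keeps the secret out of log files while still letting you tell which token is in use.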

Note: A Personal Access Token is not required when you authenticate with OAuth 2.0 (Microsoft Entra ID), which is the recommended approach for enterprise deployments.

Connection Considerations

Ensure the correct Databricks Server Hostname and HTTP Path (SQL Warehouse endpoint) are configured in the connection settings.
Verify that the SQL Warehouse is running before initiating replication jobs.
Confirm that the target schema exists or that the configured user has permissions to create it.
When replicating from SAP or legacy systems, review the source object names to ensure they are compatible with Databricks naming constraints.
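
The checks above on the connection settings and on object names can be sketched as a small pre-flight script. Everything here is an assumption to verify against the Databricks documentation and your own workspace: the HTTP Path pattern `/sql/1.0/warehouses/<id>`, the set of disallowed characters (periods, whitespace, forward slashes, control characters), the 255-character limit, and the lowercasing of names. The function names are hypothetical.

```python
import re

# Assumed shape of a SQL Warehouse HTTP Path; compare with the value shown
# in the warehouse's connection details.
HTTP_PATH_RE = re.compile(r"^/sql/1\.0/warehouses/[0-9a-zA-Z]+$")

# Characters assumed to be disallowed in target object names:
# period, whitespace, forward slash, and ASCII control characters.
DISALLOWED_RE = re.compile(r"[.\s/\x00-\x1f\x7f]")


def check_connection_settings(server_hostname: str, http_path: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if "://" in server_hostname or "/" in server_hostname:
        problems.append("Server Hostname should be a bare host name, without scheme or path")
    if not HTTP_PATH_RE.match(http_path):
        problems.append("HTTP Path does not look like /sql/1.0/warehouses/<id>")
    return problems


def to_target_name(source_name: str, max_length: int = 255) -> str:
    """Map a source object name to a candidate target name.

    Disallowed characters become underscores, leading and trailing
    underscores are dropped, and the result is lowercased and truncated.
    """
    name = DISALLOWED_RE.sub("_", source_name.strip()).strip("_").lower()
    if not name:
        raise ValueError(f"No usable characters in object name: {source_name!r}")
    return name[:max_length]
```

With these assumptions, an SAP-style name such as `/BIC/AZSALES00` maps to `bic_azsales00`. Because sanitizing can map two different source names to the same target name, it is worth checking the full list of mapped names for collisions before starting a replication.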

Limitations