Azure Data Factory Blog

Configuring a Private Endpoint to Snowflake Private Link using Managed Virtual Network

sgsshankar
Brass Contributor
Oct 13, 2022

Azure Data Factory supports a managed virtual network with the scalable Azure Integration Runtime. The Azure Integration Runtime is deployed and managed by Microsoft, eliminating the need for customers to maintain a self-hosted integration runtime. Within the managed virtual network, the Azure Integration Runtime uses managed private endpoints to connect securely to Snowflake over Azure Private Link for Snowflake.

 

NOTE: Snowflake Business Critical edition or higher is required to set up Azure Private Link with Snowflake.

 

Setting up Azure Managed Virtual Network Integration Runtime

 

It is recommended to deploy the Azure managed virtual network integration runtime in the same region as your Snowflake deployment. For details on how to set up a managed virtual network for Data Factory, please refer to this Microsoft guide.

 

Setting up managed endpoint for Snowflake Private Link

 

Azure Private Link provides private connectivity to Snowflake by ensuring that access to Snowflake is through a private IP address. Network traffic flows to the Snowflake virtual network over the Microsoft backbone, avoiding the public internet. This significantly simplifies the network configuration by keeping access rules private while providing secure and private communication.

 

1. Execute SYSTEM$WHITELIST and SYSTEM$WHITELIST_PRIVATELINK to obtain the SNOWFLAKE_DEPLOYMENT, SNOWFLAKE_DEPLOYMENT_REGIONLESS and OCSP_CACHE values for the public and private link hosts. You can also execute the following query with a warehouse to get them in a readable format:

use role accountadmin;
select t.value:type as type,
       t.value:host as host,
       pt.value:host as privatelink_host,
       t.value:port as port
from table(flatten(input=>parse_json(system$whitelist()))) t,
     table(flatten(input=>parse_json(system$whitelist_privatelink()))) pt
where t.value:type = pt.value:type;

 

 

Property | Public Host | Private Link Host
SNOWFLAKE_DEPLOYMENT | <account_name>.<region_id>.azure.snowflakecomputing.com | <account_name>.<region_id>.privatelink.snowflakecomputing.com
SNOWFLAKE_DEPLOYMENT_REGIONLESS | <org_name>-<account_name>.snowflakecomputing.com | <org_name>-<account_name>.privatelink.snowflakecomputing.com
OCSP_CACHE | ocsp.snowflakecomputing.com | ocsp.<account_name>.<region_id>.privatelink.snowflakecomputing.com

 

SNOWFLAKE_DEPLOYMENT_REGIONLESS is available only for accounts that are part of an organization. If the organization URL is not used to connect, it can be skipped.
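
If you are unsure whether your account can use the organization URL, one quick check, assuming the CURRENT_ORGANIZATION_NAME and CURRENT_ACCOUNT_NAME context functions are available on your account, is:

-- the organization URL is <org_name>-<account_name>.snowflakecomputing.com
select current_organization_name() as org_name,
       current_account_name() as account_name;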

 

2. Execute SYSTEM$GET_SNOWFLAKE_PLATFORM_INFO() as ACCOUNTADMIN to obtain the snowflake-vnet-subnet-ids values. The Subscription ID of Snowflake's Azure tenant that hosts the Private Link service is derived from these values. You can also execute the following query with a warehouse to get it in a readable format.

use role accountadmin;
select t.key, v.value 
from table(flatten(input=>parse_json(system$get_snowflake_platform_info()))) t, table(flatten(t.value)) v;

 

 

Property | Value
snowflake-vnet-subnet-ids | /subscriptions/ae0c1e4e-d49e-4115-b3ba-888d77ea97a3/resourceGroups/azure-prod/providers/Microsoft.Network/virtualNetworks/azure-prod/subnets/xp

 

The Subscription ID is the one containing the resource group azure-prod; if that resource group is not present, it would be deployment-infra-rg or something similar. The resource group and Subscription ID may change in the future, so please contact Snowflake support to confirm the actual Subscription ID. In this case, the Subscription ID is ae0c1e4e-d49e-4115-b3ba-888d77ea97a3.
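
Because the Subscription ID is simply the second segment of the subnet resource ID, it can also be extracted directly in SQL. A minimal sketch, assuming the key is named snowflake-vnet-subnet-ids as shown above:

use role accountadmin;
-- the subnet resource ID has the form /subscriptions/<subscription_id>/resourceGroups/...,
-- so the third element when splitting on '/' is the Subscription ID
select distinct split_part(v.value::string, '/', 3) as subscription_id
from table(flatten(input=>parse_json(system$get_snowflake_platform_info()))) t,
     table(flatten(t.value)) v
where t.key = 'snowflake-vnet-subnet-ids';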

 

3. Create a private endpoint for the private link under the Managed private endpoints section in the Manage menu of Data Factory Studio. For more information on Data Factory managed private endpoints, please refer to the Microsoft documentation.

4. Select Private link service to configure the managed endpoint for the Snowflake private link.

 

 

5. The resource ID for the target Snowflake private link is of the following format:

 

/subscriptions/<subscription_id>/resourcegroups/az<region>-privatelink/providers/microsoft.network/privatelinkservices/sf-pvlinksvc-az<region>

 

The Subscription ID is from Step 2, and the region is the Snowflake region obtained from the region_id in Step 1 with the hyphens (-) removed.

 

For example, if the Snowflake region_id is east-us-2, it becomes eastus2, and the resource ID would look like:

 

/subscriptions/ae0c1e4e-d49e-4115-b3ba-888d77ea97a3/resourcegroups/azeastus2-privatelink/providers/microsoft.network/privatelinkservices/sf-pvlinksvc-azeastus2
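
As an illustration, the resource ID can be assembled from these two inputs with a short query; the values below are the example Subscription ID from Step 2 and region_id from Step 1:

set subscription_id = 'ae0c1e4e-d49e-4115-b3ba-888d77ea97a3'; -- from Step 2
set region_id = 'east-us-2';                                  -- from Step 1
select '/subscriptions/' || $subscription_id
    || '/resourcegroups/az' || replace($region_id, '-', '') || '-privatelink'
    || '/providers/microsoft.network/privatelinkservices/sf-pvlinksvc-az'
    || replace($region_id, '-', '') as private_link_resource_id;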

 

6. For the fully qualified domain names, add the SNOWFLAKE_DEPLOYMENT, SNOWFLAKE_DEPLOYMENT_REGIONLESS and OCSP_CACHE hosts, both public and private link, from Step 1.
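
To gather the fully qualified names in one list, a variation of the Step 1 query can be used; this is a minimal sketch that assumes the type values match the property names shown earlier:

use role accountadmin;
-- collects the public and private link hostnames into a single column
select t.value:host::string as fqdn
from table(flatten(input=>parse_json(system$whitelist()))) t
where t.value:type::string in ('SNOWFLAKE_DEPLOYMENT', 'SNOWFLAKE_DEPLOYMENT_REGIONLESS', 'OCSP_CACHE')
union
select pt.value:host::string
from table(flatten(input=>parse_json(system$whitelist_privatelink()))) pt
where pt.value:type::string in ('SNOWFLAKE_DEPLOYMENT', 'SNOWFLAKE_DEPLOYMENT_REGIONLESS', 'OCSP_CACHE');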

 

 

7. The Provisioning state should be Succeeded, and the Approval state should be Pending. If the Provisioning state is Failed, check the values provided and repeat the process with the correct values.

8. Since the private link is in Snowflake's Azure subscription, a support ticket needs to be raised with Snowflake to approve the private endpoint connection. Provide the managed private endpoint resource ID and the managed approvals link from the Azure portal.

9. Once Snowflake support confirms the connection is approved, the managed private endpoint's Approval state changes to Approved and the endpoint is ready to be used.

10. Execute SYSTEM$GET_PRIVATELINK_AUTHORIZED_ENDPOINTS() to verify the connection in Snowflake. You can also execute the following query with a warehouse to get it in a readable format:

use role accountadmin;
select value: endpointId as endpoint
from table(flatten(input => parse_json(system$get_privatelink_authorized_endpoints())));

 

 

11. Configure the linked service with the Snowflake connection; it will use the managed private endpoint to connect to Snowflake via the private link.
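
While configuring the linked service, the private link hostnames that Snowflake expects can be double-checked with SYSTEM$GET_PRIVATELINK_CONFIG(); a minimal sketch to list them in a readable format:

use role accountadmin;
-- lists the private link account URL, OCSP URL and related hostnames
select t.key, t.value
from table(flatten(input => parse_json(system$get_privatelink_config()))) t;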

 

Setting up Snowflake Network policies

 

Snowflake network policies affect private link connections even though the communication travels over the private Azure backbone. If network policies are enabled at the account level, the private CIDR range 10.0.0.0/8 used by the Azure managed virtual network should be allowed. This is a private IP range, so it only allows applications that reach Snowflake through a private endpoint to the Snowflake private link.

 

For additional security, network policies allowing the 10.0.0.0/8 CIDR range can be enforced on the Snowflake user accounts configured in the Data Factory linked services, as shown in the sketch below.
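
Both options can be expressed in SQL. The following is a minimal sketch; the policy name and user name are illustrative only, and if an account-level policy already exists, the CIDR should instead be added to its existing ALLOWED_IP_LIST:

use role accountadmin;
-- allow the private IP range used by the Azure managed virtual network
create network policy adf_managed_vnet_policy allowed_ip_list = ('10.0.0.0/8');

-- option 1: enforce the policy at the account level
alter account set network_policy = adf_managed_vnet_policy;

-- option 2: enforce the policy only for the user configured in the Data Factory linked service
alter user adf_service_user set network_policy = adf_managed_vnet_policy;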

 

Hope this guide helps!

 

Updated Oct 13, 2022
Version 1.0
  • PardhaV
    Copper Contributor

    sgsshankar How will it function for disaster recovery? We've established a Multi-Region Environment (MPE) for Snowflake Private Link, which is operating smoothly in the eastus2 location. Now, we're considering adding a Private Link in the westus region. If we incorporate Private Links in both locations, which region will the Snowflake Linked Service (LS) attempt to connect to? Moreover, during a disaster recovery scenario, if the eastus2 region goes offline, how will the traffic be directed solely to the westus region? Does the Linked Service possess a mechanism to detect the operational region and direct traffic accordingly, or do we need to manually manage the addition and removal of Private Links for specific locations as necessary?

  • I wanted to share this helpful article from Snowflake. 

     

    How to set up a managed private endpoint from Azure Data Factory or Synapse to Snowflake

     

    Because the Private link is on the Snowflake side of the house (deployment wise) - you need to request that information from Snowflake.

     

    "the function SYSTEM$GET_PRIVATELINK_CONFIG() currently only returns the PrivateLink alias, you must contact Snowflake Technical Support to request Snowflake's endpoint service resource ID of the Azure region of your Snowflake account."

     

     

     

  • MorrisMZin
    Copper Contributor

    Hi Shankar,

     

    I have been doing some POC work for the copy activities. The following is what I have done:

     

    • I have successfully created an ADF managed private endpoint to Snowflake
    • I have successfully created an ADF managed private endpoint to my blob (my blob is not accessible from public, but I am able to run copy activities from ADF)
    • I have a Copy activity where
      • the source is linked to Snowflake and I am able to preview data
      • the sink is linked to my blob storage

    Once this is all set and done and I run the copy activity, the error I am getting is as follows:

     

    ErrorCode=UserErrorOdbcOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=ERROR [42501] Failed to access remote file: access denied. Please check your credentials,Source=Microsoft.DataTransfer.Runtime.GenericOdbcConnectors,''Type=System.Data.Odbc.OdbcException,Message=ERROR [42501] Failed to access remote file: access denied. Please check your credentials,Source=SnowflakeODBC_sb64.dll,

     

    So I went and checked Snowflake's activity/query history and found that it's failing to write to my blob, and I am wondering if Snowflake is writing to my blob via the public URL. 

     

    COPY INTO 'azure://<my-blob-name-xxxx>.blob.core.windows.net/<container-xxxx>/<id-xxxx>/SnowflakeExportCopyCommand/' FROM (<select-statement-xxxx>) CREDENTIALS = (AZURE_SAS_TOKEN = '☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺☺') FILE_FORMAT = (type = CSV COMPRESSION = NONE RECORD_DELIMITER = '
    ' FIELD_OPTIONALLY_ENCLOSED_BY = '"' ESCAPE = '\\' NULL_IF = '') HEADER = TRUE SINGLE = FALSE OVERWRITE = TRUE

     

     

    Am I missing anything here? Appreciate if you can share any step I might have missed. 

     

     

  • sgsshankar
    Brass Contributor

    Hi Morris

     

    • How would you copy the data out of Snowflake to your Azure storage? - A Copy activity would be able to do that.
    • What would your source & sink be in the copy activity? I assume the source would be Snowflake and for the sink, would you create a linked service to your ADLS (or blob) via managed private endpoint? Or, would you enable staging at the copy activity? Will both work? - To copy out of Snowflake, you would still need to configure a blob with a SAS URI as a stage. The source would be Snowflake and the sink would be ADLS, along with the blob as staging.
    • Would it be possible to use the Snowflake internal stage via the managed private endpoint connections - if so, would you know how the linked service would be authenticated? - Data Factory does not support using the internal stage. You would need to configure the blob with a SAS URI as the stage.
  • MorrisMZin
    Copper Contributor

    Hi, 

     

    Thank you for the article. This is really useful and exactly what we are trying to achieve in our current project. We have a third-party data provider using a Snowflake DW and we are trying to extract data out of it & do our ETL in ADF. I have a few questions if you don't mind.

     

    • How would you copy the data out of Snowflake to your Azure storage? 
    • What would your source & sink be in the copy activity? I assume the source would be Snowflake and for the sink, would you create a linked service to your ADLS (or blob) via managed private endpoint? Or, would you enable staging at the copy activity? Will both work? 
    • Would it be possible to use the Snowflake internal stage via the managed private endpoint connections - if so, would you know how the linked service would be authenticated? 

    Hope you can help. Thanks a lot!