My first introduction to Azure Bicep involved creating a large repository of Azure Bicep deployment templates. I was able to see firsthand the benefits of using Bicep to simplify Azure deployments. Recently I have utilised Azure Data Factory (ADF) for integrating a new enterprise solution. It is an effective data integration tool which streamlines data processing. Due to my experience with both technologies and the growing adoption rate of Bicep and ADF, I decided to document and share my experiences on both.
This article walks through a sample Bicep deployment of an Azure Data Factory, along with linked services and an integrated repository. Alongside the demonstration, it explains the basics of ADF, Git integration, using Bicep to deploy an ADF, and the restrictions that arise when using Bicep, or any other infrastructure as code (IaC) tool, to deploy an ADF with an existing repository.
The final deployment will resemble the diagram below:
The completed deployment in Azure will contain:
On the side is a repository for the ADF to integrate with – while it is possible to migrate and use the same repository to handle infrastructure deployments, I would recommend keeping it separate from your Bicep deployment scripts to prevent developers from modifying deployment templates and exposing secrets.
The templates provided can be separated into two sections: core infrastructure and additional configuration to give the ADF access to linked services.
The basic infrastructure will be deployed in parallel, without dependencies:
The configuration will then be deployed once the resources it depends on have been deployed:
An example deployment template is provided to show these dependencies.
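As a rough sketch of how such a template could wire the modules together (the module file paths, parameter names and output names here are my own assumptions for illustration, not fixed conventions), passing one module's outputs into another is enough for Bicep to infer the deployment order:

```bicep
// main.bicep: illustrative sketch only. Module paths and names are hypothetical.
param location string = resourceGroup().location
param storageAccountName string
param keyVaultName string
param identityName string
param dataFactoryName string
param gitAccountName string
param gitRepositoryName string

// Basic infrastructure: no inputs from other modules, so these deploy in parallel.
module storage 'modules/storage.bicep' = {
  name: 'storage'
  params: {
    location: location
    storageAccountName: storageAccountName
  }
}

module keyVault 'modules/keyvault.bicep' = {
  name: 'keyVault'
  params: {
    location: location
    keyVaultName: keyVaultName
  }
}

module identity 'modules/identity.bicep' = {
  name: 'identity'
  params: {
    location: location
    identityName: identityName
  }
}

// The Data Factory consumes the identity's resource ID, so it waits for the identity.
module dataFactory 'modules/datafactory.bicep' = {
  name: 'dataFactory'
  params: {
    location: location
    dataFactoryName: dataFactoryName
    identityId: identity.outputs.identityId
    gitAccountName: gitAccountName
    gitRepositoryName: gitRepositoryName
  }
}

// Configuration: each module references outputs, which creates implicit dependencies.
module blobKeySecret 'modules/keyvault-secret.bicep' = {
  name: 'blobKeySecret'
  params: {
    keyVaultName: keyVault.outputs.keyVaultName
    storageAccountName: storage.outputs.storageAccountName
  }
}

module secretReaderRole 'modules/keyvault-role.bicep' = {
  name: 'secretReaderRole'
  params: {
    keyVaultName: keyVault.outputs.keyVaultName
    principalId: identity.outputs.principalId
  }
}
```

Because the dependencies come from the output references, no explicit `dependsOn` is needed in this sketch.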
This is a standard deployment script for a Blob Storage Account.
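A minimal sketch of what such a module might look like is below. The parameter names, SKU and API version are assumptions on my part, so adjust them to your own standards:

```bicep
// storage.bicep: illustrative sketch. Names, SKU and API version are assumptions.
@description('Globally unique name: 3-24 lowercase letters and digits.')
@minLength(3)
@maxLength(24)
param storageAccountName string

param location string = resourceGroup().location

resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: storageAccountName
  location: location
  sku: {
    name: 'Standard_LRS' // locally redundant storage, chosen here only to keep the example cheap
  }
  kind: 'StorageV2'
  properties: {
    accessTier: 'Hot'
    supportsHttpsTrafficOnly: true
    minimumTlsVersion: 'TLS1_2'
    allowBlobPublicAccess: false
  }
}

// Output the name so later modules can reference the account.
output storageAccountName string = storageAccount.name
```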
This is a standard deployment script for a Key Vault.
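The sketch below shows one way this module could be written. It assumes the vault uses Azure role-based access control rather than access policies, which matches the role assignment used later in this article. The parameter names and retention value are illustrative:

```bicep
// keyvault.bicep: illustrative sketch. Names and settings are assumptions.
param keyVaultName string
param location string = resourceGroup().location

resource keyVault 'Microsoft.KeyVault/vaults@2023-07-01' = {
  name: keyVaultName
  location: location
  properties: {
    tenantId: subscription().tenantId
    sku: {
      family: 'A'
      name: 'standard'
    }
    // Access is granted through role assignments instead of access policies.
    enableRbacAuthorization: true
    accessPolicies: []
    enableSoftDelete: true
    softDeleteRetentionInDays: 7 // assumed value, the shortest retention Azure allows
  }
}

output keyVaultName string = keyVault.name
// The vault URI is the "base URL" needed when creating the ADF linked service.
output keyVaultUri string = keyVault.properties.vaultUri
```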
This deploys a simple User Assigned Identity. The Principal ID is output as a means of easily assigning the Key Vault secret reader role later on.
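A minimal version of this module might look like the following. The parameter and output names are my own and only need to match whatever your parent template expects:

```bicep
// identity.bicep: illustrative sketch. Parameter and output names are assumptions.
param identityName string
param location string = resourceGroup().location

resource identity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' = {
  name: identityName
  location: location
}

// The resource ID is used to attach the identity to the Data Factory.
output identityId string = identity.id
// The principal ID is used for the Key Vault role assignment.
output principalId string = identity.properties.principalId
```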
This is a modified deployment script for an Azure Data Factory with an integrated git repository and a User Assigned Identity. This module depends on the User Assigned Identity shown prior.
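As a hedged sketch, the module could look something like this. It assumes a GitHub repository (`FactoryGitHubConfiguration`). An Azure DevOps repository would use `FactoryVSTSConfiguration` and also need a project name. All parameter names and defaults are illustrative:

```bicep
// datafactory.bicep: illustrative sketch assuming a GitHub repository.
param dataFactoryName string
param location string = resourceGroup().location

@description('Resource ID of the User Assigned Identity deployed earlier.')
param identityId string

// Hypothetical repository settings. Replace with your own values.
param gitAccountName string
param gitRepositoryName string
param gitCollaborationBranch string = 'main'
param gitRootFolder string = '/'

resource dataFactory 'Microsoft.DataFactory/factories@2018-06-01' = {
  name: dataFactoryName
  location: location
  identity: {
    type: 'UserAssigned'
    userAssignedIdentities: {
      // The identity's resource ID is the key; the value is an empty object.
      '${identityId}': {}
    }
  }
  properties: {
    repoConfiguration: {
      type: 'FactoryGitHubConfiguration'
      accountName: gitAccountName
      repositoryName: gitRepositoryName
      collaborationBranch: gitCollaborationBranch
      rootFolder: gitRootFolder
    }
  }
}

output dataFactoryName string = dataFactory.name
```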
This module adds the Blob Storage access key to the Key Vault as a secret. It depends on the deployment of both the Blob Storage Account and the Key Vault, as it references both.
This module deploys a role assignment for the User Assigned Identity deployed earlier, granting it access to the Key Vault secrets. The specific role for reading Key Vault secrets is shown in the example deployment script. Apart from the identity's Principal ID, this module depends only on the deployment of the Key Vault, so it can happen in parallel with the ADF deployment.
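One way to sketch this module is to reference both resources with the `existing` keyword and read the key with `listKeys()`. The secret name and parameter names below are assumptions for illustration:

```bicep
// keyvault-secret.bicep: illustrative sketch. Secret and parameter names are assumptions.
param keyVaultName string
param storageAccountName string
param secretName string = 'blob-access-key'

// Both resources must already exist in this resource group.
resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' existing = {
  name: storageAccountName
}

resource keyVault 'Microsoft.KeyVault/vaults@2023-07-01' existing = {
  name: keyVaultName
}

resource blobKeySecret 'Microsoft.KeyVault/vaults/secrets@2023-07-01' = {
  parent: keyVault
  name: secretName
  properties: {
    // Reads the first access key at deployment time. The value is never output.
    value: storageAccount.listKeys().keys[0].value
  }
}
```

Keeping the key out of the module outputs avoids it appearing in the deployment history.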
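The role assignment could be sketched as follows. I am assuming here that the built-in "Key Vault Secrets User" role is the secret-reading role intended, and that the vault has RBAC authorization enabled. The parameter names are illustrative:

```bicep
// keyvault-role.bicep: illustrative sketch. Assumes RBAC authorization on the vault.
param keyVaultName string

@description('Principal ID output by the User Assigned Identity module.')
param principalId string

// Built-in "Key Vault Secrets User" role (assumed to be the intended role).
var keyVaultSecretsUserRoleId = '4633458b-17de-408a-b874-0445c86b69e6'

resource keyVault 'Microsoft.KeyVault/vaults@2023-07-01' existing = {
  name: keyVaultName
}

resource secretReaderAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  // A deterministic GUID keeps the assignment stable across redeployments.
  name: guid(keyVault.id, principalId, keyVaultSecretsUserRoleId)
  scope: keyVault
  properties: {
    roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', keyVaultSecretsUserRoleId)
    principalId: principalId
    principalType: 'ServicePrincipal'
  }
}
```

Scoping the assignment to the vault itself, rather than the resource group, limits the identity to reading secrets from this one vault.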
While it is possible to deploy ADF components via Bicep, I would not recommend it, as doing so conflicts with the Git integration and the deployed components will be overridden. In the next steps we will therefore set up the linked services for the Data Factory manually rather than through a Bicep deployment. These will be preserved in the Git repository. Consider reviewing how to set up a linked service using the GUI for a direct walkthrough, as this section only covers the configuration specific to the deployments above.
Your Key Vault linked service configuration should look like the following:
You can find the base URL under your Key Vault overview, as “Vault URI”:
The Authentication Method should be User Assigned Managed Identity, using the identity deployed previously.
You can create a new credential using the following configuration:
You can then add another connection for the Azure Blob storage using the access key retrieved from the Key Vault:
Your ADF is now complete and ready to use. You can redeploy the ADF and shared resources with ease, along with any pipelines and activities added to your ADF via the developer interface. To verify the deployment, try creating some pipelines in ADF, then delete all resources before rerunning the Bicep deployment. When redeploying, you may have to change the name of your Key Vault, because Key Vault soft-delete protection prevents a new Key Vault with the same name from being deployed for several days after its deletion. This name change will not carry through to the linked service saved in your repository, so the linked service will need to be updated manually.
Hopefully this has inspired you to learn more about Bicep or Azure Data Factory, as both are powerful tools. I would recommend using this exercise as an introduction to the vast topics of Bicep deployment and ADF development, which were only lightly touched on here. For more information or help with any integration dilemmas, reach out to our team or have a look through our other blogs.