Niall's Data Blog

A Data Engineer / Architect writing about Tech, Data and the Community

Introducing AzureDataPipelineTools

A few months ago my friend Richard Swinbank published a blog post, More Get Metadata in ADF, about the limitations of using the Get Metadata activity in ADF to get information about files in a data lake. This led to a Twitter conversation, as a bunch of other data engineers had been building the same tools for different companies. Due to "popular" demand I've released the definition of my #Azure #DataFactory pipeline to Get Metadata recursively https://t.

Azure Data Factory: Dev Mode vs Published Code

I’ve worked with quite a few people new to Azure Data Factory, and one thing that seems to confuse new users is the difference between the developer sandbox where we build pipelines and the published/deployed code. Understanding this is key to working with Git, using CI/CD pipelines to deploy your code, and getting other Azure services to integrate nicely to call your pipelines. Connecting to ADF: A good place to start is to understand the different ways we can interact with a data factory.

Azure Data Factory: Making Non-Dynamic Linked Services Dynamic

(Image: Linked Service Options Using the UI.) Note: The example here is the Salesforce linked service, but this technique also works for other linked services where the UI does not support adding parameterised properties. One of my clients has been adding data from multiple Salesforce instances to their data platform this week. One of their developers asked me if the Salesforce linked service could be made dynamic, as there is no place in the GUI to add parameters or dynamic values for the URL, user name or credentials.

Azure Data Factory Lookup: First Row Only & Empty Result Sets

When using the lookup activity in Azure Data Factory V2 (ADFv2), we have the option to retrieve either multiple rows into an array, or just the first row of the result set, by ticking a box in the UI. (Image: the 'First Row Only' checkbox at the bottom.) This allows us to either use the lookup as a source when using the foreach activity, or to look up some static or configuration data.
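To make the two modes concrete, here is a minimal Python sketch that models the shape of the lookup output in each case. It is an illustration only, not ADF code: the function names `lookup_output` and `first_row_or_default` are hypothetical, and the behaviour for an empty result set (no `firstRow` property at all) is an assumption based on the post's title rather than something stated in the excerpt.

```python
from __future__ import annotations

from typing import Any


def lookup_output(rows: list[dict[str, Any]], first_row_only: bool) -> dict[str, Any]:
    """Hypothetical model of the shape of a lookup activity's output."""
    if first_row_only:
        # Assumption: when the query returns no rows, the output has no
        # 'firstRow' property at all, rather than a null or empty object.
        return {"firstRow": rows[0]} if rows else {}
    # Without 'First Row Only', all rows come back as an array,
    # which is the shape a foreach activity can iterate over.
    return {"count": len(rows), "value": rows}


def first_row_or_default(output: dict[str, Any], default: Any = None) -> Any:
    """Read 'firstRow' defensively so an empty result set does not raise."""
    return output.get("firstRow", default)


if __name__ == "__main__":
    config_rows = [{"env": "dev"}, {"env": "prod"}]
    print(lookup_output(config_rows, first_row_only=True))
    print(lookup_output(config_rows, first_row_only=False))
    print(first_row_or_default(lookup_output([], first_row_only=True), {"env": None}))
```

The defensive read in `first_row_or_default` mirrors the kind of guard a pipeline expression would need before dereferencing the first row of a result set that might be empty.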