TenPoint7 | My First Experience on Azure Data Factory (ADF)

I have worked in data analysis, database design, and data integration systems for more than 10 years, spending most of my time with Microsoft Business Intelligence technologies such as MS SQL Server, SSIS, SSAS, and SSRS. As Hadoop in the cloud has grown popular for its convenience, capacity, and scalability with Big Data, I have been wondering how Microsoft would expose the benefits of these new technologies. In 2014, Microsoft released Azure Data Factory (ADF) with the full capabilities of an ETL platform in the cloud. It can work with an HDFS data store on the Azure cloud, Azure SQL Server, an on-premises DBMS, or a hybrid of them all.

I wanted to understand how ADF works, so I loaded a simple dataset from an HDFS data store into an Azure SQL Server database and then visualized the Azure SQL Server data in Microsoft Excel. The “How To” link at the end of this post provides step-by-step instructions for achieving that objective.

Before we get to the detailed instructions, I would like to share a few advantages of ADF and of Microsoft SSIS, relative to each other, that I observed.

Advantages of ADF:
• Extreme scalability
• Low startup cost
• Fast time to prototype (instant compute provisioning)
• Can build custom C# transformations
• Superior error handling
• Easy to access and high
• Built-in reprocess capability (can re-process any data slice)
• Automatic scheduling
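
To make the "data slice" idea in the list above more concrete, here is a minimal Python sketch. It is not ADF code and does not call any ADF API. It only assumes that a scheduled window is cut into fixed-interval slices, that each slice has a status, and that a failed slice can be run again on its own. The function names (`make_slices`, `run_slice`, `reprocess_failed`), the `flaky_copy` stand-in for a copy step, and the "Ready"/"Failed" labels are all hypothetical and chosen for illustration.

```python
from datetime import datetime, timedelta


def make_slices(start, end, interval):
    """Split [start, end) into consecutive, non-overlapping time slices."""
    slices = []
    cursor = start
    while cursor < end:
        slice_end = min(cursor + interval, end)
        slices.append((cursor, slice_end))
        cursor = slice_end
    return slices


def run_slice(window, status, process):
    """Run `process` for one slice and record whether it succeeded."""
    try:
        process(*window)
        status[window] = "Ready"    # illustrative label only
    except Exception:
        status[window] = "Failed"   # illustrative label only


def reprocess_failed(status, process):
    """Re-run only the slices currently marked as failed."""
    failed = [w for w, s in status.items() if s == "Failed"]
    for window in failed:
        run_slice(window, status, process)


# Hypothetical copy step that fails once for the 01:00 slice.
attempts = {}

def flaky_copy(slice_start, slice_end):
    attempts[slice_start] = attempts.get(slice_start, 0) + 1
    if slice_start.hour == 1 and attempts[slice_start] == 1:
        raise RuntimeError("simulated transient failure")


status = {}
slices = make_slices(datetime(2015, 1, 1, 0),
                     datetime(2015, 1, 1, 3),
                     timedelta(hours=1))
for window in slices:
    run_slice(window, status, flaky_copy)

failed_before = [w for w, s in status.items() if s == "Failed"]
reprocess_failed(status, flaky_copy)   # only the failed slice runs again
```

The point of the sketch is that a failure is tracked per slice, so recovering from it means re-running one hour of data rather than the whole load. In this toy model the slices that already succeeded are never touched again.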

Advantages of SSIS:
• Dozens of data source connectors supported
• Very good integration with Microsoft data stores
• Advanced development tools and strong GUI
• Library of pre-built transformations
• Better monitoring including data lineage
• Debugging features
• Scheduling with SQL Agent jobs or Windows services

My step-by-step experience with Azure Data Factory is detailed at the link below:

How To – Setup Azure Data Factory (ADF)

Linh Nguyen (Lead Data Consultant, linh@TenPoint7.com)

