Modern analytics relies on scalable data infrastructure. This course prepares participants to design and implement data solutions on Microsoft Azure, covering both batch and real-time processing. Participants will learn to work with key Azure services through structured modules and practical exercises.
Learning Outcomes:
Design data storage and transformation strategies
Implement ingestion and processing with Azure tools
Optimise data solutions for performance and cost
Secure and monitor end-to-end data workflows
Key Topics:
Azure Synapse Analytics and Data Lake Storage
Data ingestion with Azure Data Factory and Event Hubs
Stream processing using Azure Stream Analytics
Security and monitoring in data engineering
Integration with Power BI and external systems
Exam Details
This course is designed to build participants’ understanding of key concepts and domains covered in the DP-203: Data Engineering on Microsoft Azure certification.
While the course provides technical training aligned with certification objectives, the certification exam is not bundled and must be registered separately.
To maximise success, participants are strongly encouraged to complement the course with additional self-study, revision of course materials, and dedicated practice before attempting the exam.
1. Introduction to Azure Synapse Analytics
Learn about the features and capabilities of Azure Synapse Analytics - a cloud-based platform for big data processing and analysis.
2. Explore Azure Databricks
Azure Databricks is a cloud service that provides a scalable platform for data analytics using Apache Spark.
3. Introduction to Azure Data Lake Storage
Learn how Azure Data Lake Storage provides highly available, secure, durable, scalable, and redundant cloud storage, and how it brings new efficiencies to big data analytics workloads.
4. Work with data streams by using Azure Stream Analytics
Explore how Azure Stream Analytics integrates with your applications or Internet of Things (IoT) devices to gain insights with real-time streaming data. Learn how to consume and analyze data streams and derive actionable results.
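As a rough illustration of what analyzing a data stream involves, the sketch below counts events per device in fixed, non-overlapping time windows, the idea behind a tumbling window. It is plain Python rather than the SQL-like query language that Azure Stream Analytics uses; the function name, the sample readings, and the window boundary rule (start inclusive) are assumptions made for this example, and the service's own boundary rules may differ.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per device in fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_seconds, device_id) pairs.
    Returns a dict mapping (window_start, device_id) to a count.
    """
    counts = defaultdict(int)
    for timestamp, device_id in events:
        # Integer division assigns each event to exactly one window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, device_id)] += 1
    return dict(counts)


# Hypothetical sample readings: (seconds since start, device id).
sample_events = [
    (1, "sensor-a"),
    (4, "sensor-a"),
    (7, "sensor-b"),
    (12, "sensor-a"),
]
window_counts = tumbling_window_counts(sample_events, window_seconds=10)
```

With a 10-second window, the first three readings fall into the window starting at 0 and the last one into the window starting at 10.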
5. Use Azure Synapse serverless SQL pool to query files in a data lake
With Azure Synapse serverless SQL pool, you can leverage your SQL skills to explore and analyze data in files, without the need to load the data into a relational database.
6. Create a lake database in Azure Synapse Analytics
Why choose between working with files in a data lake or a relational database schema? With lake databases in Azure Synapse Analytics, you can combine the benefits of both.
7. Secure data and manage users in Azure Synapse serverless SQL pools
Learn how you can set up security when using Azure Synapse serverless SQL pools.
8. Use Apache Spark in Azure Databricks
Azure Databricks is built on Apache Spark and enables data engineers and analysts to run Spark jobs to transform, analyze and visualize data at scale.
9. Use Delta Lake in Azure Databricks
Delta Lake is an open-source storage layer for Spark that you can use to implement a data lakehouse architecture in Azure Databricks.
10. Analyze data with Apache Spark in Azure Synapse Analytics
Apache Spark is a core technology for large-scale data analytics. Learn how to use Spark in Azure Synapse Analytics to analyze and visualize data in a data lake.
11. Integrate SQL and Apache Spark pools in Azure Synapse Analytics
Learn how to integrate SQL and Apache Spark pools in Azure Synapse Analytics.
12. Use data loading best practices in Azure Synapse Analytics
Learn the best practices you need to adopt to load data into a data warehouse in Azure Synapse Analytics.
13. Petabyte-scale ingestion with Azure Data Factory or Azure Synapse Pipeline
In this module, you will learn the methods available for ingesting data from one data store into another using Azure Data Factory.
14. Integrate data with Azure Data Factory or Azure Synapse Pipeline
In this module, you will examine Azure Data Factory and the core components that enable you to create large-scale data ingestion solutions in the cloud.
15. Perform code-free transformation at scale with Azure Data Factory or Azure Synapse Pipeline
In this module, you will learn how to perform common data transformation and cleansing activities within Azure Data Factory without using code.
16. Orchestrate data movement and transformation in Azure Data Factory or Azure Synapse Pipeline
In this module, you will learn how Azure Data Factory can orchestrate large-scale data movement by using other Azure Data Platform and Machine Learning technologies.
17. Plan hybrid transactional and analytical processing using Azure Synapse Analytics
Learn how hybrid transactional / analytical processing (HTAP) can help you perform operational analytics with Azure Synapse Analytics.
18. Implement Azure Synapse Link with Azure Cosmos DB
Azure Synapse Link for Cosmos DB enables HTAP integration between operational data in Cosmos DB and Azure Synapse Analytics runtimes for Spark and SQL.
19. Secure a data warehouse in Azure Synapse Analytics
Learn how to approach and implement security to protect your data with Azure Synapse Analytics.
20. Configure and manage secrets in Azure Key Vault
Storing and handling secrets, encryption keys, and certificates directly is risky, and every usage introduces the possibility of unintentional data exposure. Azure Key Vault provides a secure storage area for managing all your app secrets so you can properly encrypt your data in transit or while it's being stored.
21. Implement compliance controls for sensitive data
Explore data classification capabilities and degrees of confidentiality. Implement security options to keep private data safe, including Azure SQL auditing, Microsoft Defender for SQL, row-level security, Dynamic Data Masking and Azure SQL Database Ledger.
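To give a feel for what data masking does, the sketch below hides most of an email address unless the caller is allowed to see the original value. It is a conceptual Python illustration only: in Azure SQL, masking is configured on the database column rather than written in application code, and the function names and the `caller_can_unmask` flag here are hypothetical.

```python
def mask_email(address):
    """Keep the first character of the address and hide the rest."""
    local_part, separator, _domain = address.partition("@")
    if not separator or not local_part:
        # Not a recognisable email address: hide everything.
        return "XXXX"
    return f"{local_part[0]}XXX@XXXX.com"


def read_customer_email(address, caller_can_unmask):
    """Return the real address only to callers allowed to see it."""
    return address if caller_can_unmask else mask_email(address)


# Hypothetical address used purely for illustration.
masked_value = read_customer_email("alice@contoso.com", caller_can_unmask=False)
clear_value = read_customer_email("alice@contoso.com", caller_can_unmask=True)
```

The stored value is never changed; only what an unprivileged reader sees is altered, which is the property that makes masking useful alongside auditing and row-level security.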
22. Enable reliable messaging for Big Data applications using Azure Event Hubs
Connect sending and receiving applications with Event Hubs so you can handle extremely high loads without losing data.