Description
We are seeking a Senior Azure Cloud Engineer to own mission-critical cloud infrastructure on Microsoft Azure. You will drive architecture decisions, enforce governance and security standards, lead Azure Landing Zone design, and own the Microsoft Fabric platform, with everything codified as infrastructure as code (IaC). This role requires deep hands-on platform engineering expertise combined with strong cross-functional collaboration to align cloud capabilities with business objectives.
Key Responsibilities
Landing Zone & Platform Engineering
- Design and build the Azure Landing Zone: management group hierarchy, subscription design, hub-and-spoke networking, and policy guardrails.
- Codify all infrastructure in Terraform; build reusable, versioned modules and templates — zero manual Azure setup.
- Automate environment promotion (NProd → Prod) with approval gates, rollback controls, and deployment stage validation.
- Enforce naming conventions, tagging standards, and resource group structure at scale via Azure Policy.
- Provision and govern Fabric platform components via IaC: workspaces, capacities (F SKUs), OneLake, RBAC, Git integration, and deployment pipelines.
Infrastructure, Network & Security
- Architect scalable compute using VMs, VMSS, Availability Zones, App Service, AKS, Azure Functions, and Container Apps.
- Design and maintain networking topologies: VNets, peering, hub-and-spoke, NSGs, UDRs, Bastion, Private Endpoints, Azure Firewall, Front Door, and DDoS Protection.
- Support hybrid connectivity via ExpressRoute, Site-to-Site VPN, and Azure Virtual WAN; design Fabric Private Link and Managed VNet configurations.
- Manage secrets and encryption through Azure Key Vault; conduct security posture reviews via Defender for Cloud and remediate findings.
Governance, Cost & Reliability
- Own Azure Policy, Management Groups, subscription governance, tagging, and naming conventions.
- Drive cost optimization through Reserved Instances, right-sizing, auto-scaling, and Cost Management dashboards; administer Fabric F SKU capacity and workload optimization.
- Define SLOs/SLAs; implement chaos engineering, backup, and DR solutions with defined RPO/RTO targets.
- Administer the Fabric admin portal and tenant settings to enforce governance, workspace isolation, and usage controls.
Automation, CI/CD & DevOps
- Build and maintain IaC pipelines using Terraform (primary) and ARM templates in Azure DevOps or GitHub Actions, with deployment stages, approvals, and Key Vault secrets integration.
- Develop operational automation using PowerShell, Azure CLI, Python, and the Fabric PowerShell SDK.
- Automate Fabric operations (workspace lifecycle, capacity scaling, deployment pipelines, and REST API workflows) through scripted and pipeline-driven tooling.
Microsoft Fabric Platform
- Provision and manage Fabric workspaces, capacities, Lakehouses, and OneLake via the Fabric Terraform provider and REST API.
- Design and maintain OneLake medallion architecture (Bronze / Silver / Gold) and enforce data tier governance.
- Manage Fabric Git integration, Deployment Pipelines (Dev → Test → Prod), workspace RBAC, and tenant governance.
- Configure Fabric monitoring and integrate telemetry into the broader observability platform; track F SKU capacity utilization and optimize capacity cost.
Monitoring, Observability & Incident Response
- Implement monitoring, alerting, and dashboards using Azure Monitor, Log Analytics, Application Insights, and Microsoft Sentinel (SIEM).
- Lead incident response for platform issues; conduct post-incident reviews and drive permanent remediation.
Leadership & Collaboration
- Translate business requirements into scalable, secure cloud architectures; engage and advise stakeholders at all levels.
- Maintain technical documentation: runbooks, architecture diagrams, Landing Zone blueprints, and operational procedures.
- Mentor junior engineers; evaluate emerging Azure and Fabric services and recommend adoption roadmaps.
Required Qualifications
Experience & Platform
- 8+ years of hands-on Azure cloud engineering across identity, networking, compute, storage, monitoring, and automation.
- Proven experience designing and building Azure Landing Zones in production; demonstrable record of full IaC adoption eliminating manual setup.
- Strong networking fundamentals (TCP/IP, DNS, BGP, and TLS) as applied to cloud environments.
Terraform (Mandatory)
- Expert-level Terraform: modules, variables, remote state, workspaces/environments, Azure Policy deployment, and reusable architecture patterns.
- Production experience with the Microsoft Fabric Terraform provider: workspace, capacity, Lakehouse, RBAC, and Git integration provisioning via IaC.
CI/CD
- Strong experience with Azure DevOps or GitHub Actions: pipeline automation, deployment stages, approval workflows, and secrets integration.
- Hands-on experience with NProd → Prod environment promotion and automated validation.
Operational Automation
- Proficiency in PowerShell, Azure CLI, Python, and the Fabric PowerShell SDK for automation and operational tasks.
Microsoft Fabric
- Working knowledge of the Fabric Terraform provider, REST API, Deployment Pipelines, OneLake, medallion architecture, and capacity administration.
- Familiarity with Fabric admin portal, tenant settings, workspace governance, monitoring, Private Link, Managed VNets, and F SKU cost implications.
Preferred Qualifications
- Microsoft Certified: Azure Solutions Architect Expert or DevOps Engineer Expert.
- Experience in regulated industries (healthcare, finance, government) with ISO 27001, SOC 2, HIPAA, or FedRAMP compliance frameworks.
- FinOps practices and tooling for cloud financial management.
- Multi-cloud exposure (AWS, GCP) and cross-cloud networking strategies.
- Experience with Microsoft Sentinel, Defender for Cloud, and the broader Microsoft security portfolio.
- Microsoft 365 administration (Exchange Online, Teams, SharePoint, OneDrive, Entra ID, Intune) and endpoint management via Autopilot.
- Experience with automated testing of IaC.
