Building a Modern Microservices Platform: From Zero to Production with Terraform, AKS, and Helm
A comprehensive guide to building a scalable, infrastructure-as-code microservices platform on Azure Kubernetes Service
Introduction: Why This Approach Matters
In today’s fast-paced software development landscape, organisations are increasingly adopting microservices architecture to build scalable, maintainable applications. However, setting up the underlying infrastructure to support microservices can be complex and time-consuming. This article demonstrates how to create a robust, automated microservices platform using modern DevOps tools and practices.
The Challenge: Traditional vs. Automated Infrastructure
Traditional application deployment often involves manual server configuration, inconsistent environments, and complex deployment processes. Let’s examine the stark difference between manual and automated approaches:
Traditional Approach (Manual):
Developer: "I need a new environment for testing"
↓
Ops Team: Manually provisions servers (2-3 days)
↓
Ops Team: Manually installs Kubernetes (1-2 days)
↓
Ops Team: Manually configures networking (1 day)
↓
Ops Team: Manually sets up monitoring (1-2 days)
↓
Ops Team: Manually configures SSL certificates (half day)
↓
Ready after 1-2 weeks (and prone to configuration drift)

Our Automated Approach:
# Single command creates entire microservices platform
terraform apply

# 15 minutes later: Complete platform ready with:
# ✅ Kubernetes cluster
# ✅ Load balancer with external IP
# ✅ SSL certificate management
# ✅ Monitoring stack (Prometheus + Grafana)
# ✅ Ingress controller for API gateway
# ✅ Auto-scaling capabilities

This dramatic difference in deployment time and consistency is just the beginning. As applications grow, teams using traditional approaches face several escalating challenges:
- Environment Inconsistency: “It works on my machine” syndrome multiplies across teams
- Manual Infrastructure Management: Time-consuming and error-prone processes that don’t scale
- Scaling Difficulties: Hard to scale individual components independently without automation
- Monitoring Gaps: Limited visibility into application performance across distributed services
- Security Concerns: Inconsistent security configurations across environments and services
- Developer Productivity: Developers spend more time on infrastructure than on business logic
- Time-to-Market: Weeks of infrastructure setup delay feature delivery
Our Solution
We’ll build a modern microservices platform that addresses these challenges using:
- Infrastructure as Code (IaC) with Terraform
- Container Orchestration with Azure Kubernetes Service (AKS)
- Package Management with Helm
- Automated Monitoring with Prometheus and Grafana
- SSL/TLS Management with Cert-Manager
- API Gateway with Nginx Ingress Controller
Architecture Overview
Our platform follows cloud-native principles and consists of several key components:
┌─────────────────────────────────────────────────────────────┐
│                         Azure Cloud                         │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │                       AKS Cluster                       │ │
│ │  ┌───────────────┐ ┌───────────────┐ ┌───────────────┐  │ │
│ │  │    Ingress    │ │  Monitoring   │ │     Apps      │  │ │
│ │  │  Controller   │ │     Stack     │ │   Namespace   │  │ │
│ │  │               │ │               │ │               │  │ │
│ │  │ • Nginx       │ │ • Prometheus  │ │ • Your Apps   │  │ │
│ │  │ • LoadBalancer│ │ • Grafana     │ │ • Services    │  │ │
│ │  │ • SSL Certs   │ │ • Alerting    │ │ • Ingress     │  │ │
│ │  └───────────────┘ └───────────────┘ └───────────────┘  │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘

Prerequisites
Before we begin, ensure you have:
- Azure Account with sufficient permissions
- Azure CLI installed and configured
- Terraform (v1.0+) installed
- kubectl installed
- Helm (v3.0+) installed
- Basic knowledge of Docker, Kubernetes, and cloud concepts
Step 1: Infrastructure as Code with Terraform
Why Terraform?
Terraform provides several advantages for infrastructure management:
- Version Control: Infrastructure changes can be tracked and reviewed
- Reproducibility: Identical environments across dev, staging, and production
- Automation: Reduces manual errors and deployment time
- Cost Management: Easy to destroy and recreate environments
- Documentation: Infrastructure is self-documenting through code
Setting Up the Foundation
First, create the main Terraform configuration:
# main.tf
terraform {
  required_version = ">= 1.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.0"
    }
  }
}

provider "azurerm" {
  features {}
}

# Data source for current Azure client configuration
data "azurerm_client_config" "current" {}

# Resource Group
resource "azurerm_resource_group" "aks" {
  name     = var.resource_group_name
  location = var.location

  tags = {
    Environment = var.environment
    Project     = "microservices-platform"
    ManagedBy   = "terraform"
  }
}
Key Benefits of This Approach:
- Tagging Strategy: Enables cost tracking and resource management
- Version Constraints: Ensures consistent provider behavior
- Data Sources: Leverages existing Azure configuration
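For the version-control and reproducibility benefits to hold across a team, Terraform state should also live remotely rather than on one developer's laptop. A minimal sketch using an Azure Storage backend — the resource group, storage account, and container names here are placeholders you would create beforehand, and the storage account name must be globally unique:

```hcl
# backend.tf (illustrative; create the storage account before running terraform init)
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-terraform-state"    # placeholder
    storage_account_name = "tfstatemicroservices"  # placeholder, globally unique
    container_name       = "tfstate"
    key                  = "microservices-platform.tfstate"
  }
}
```

Remote state also gives you state locking, so two engineers cannot apply conflicting changes at the same time.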
AKS Cluster Configuration
# aks.tf
resource "azurerm_kubernetes_cluster" "aks" {
  name                = var.cluster_name
  location            = azurerm_resource_group.aks.location
  resource_group_name = azurerm_resource_group.aks.name
  dns_prefix          = "${var.cluster_name}-dns"
  kubernetes_version  = var.kubernetes_version

  default_node_pool {
    name                = "default"
    node_count          = var.node_count
    vm_size             = var.node_vm_size
    type                = "VirtualMachineScaleSets"
    zones               = ["1", "2", "3"]
    enable_auto_scaling = true
    min_count           = 1
    max_count           = 10

    upgrade_settings {
      max_surge = "10%"
    }
  }

  # Enable system-assigned managed identity
  identity {
    type = "SystemAssigned"
  }

  network_profile {
    network_plugin    = "azure"
    load_balancer_sku = "standard"
    network_policy    = "azure"
  }

  # Enable Azure Policy and monitoring
  azure_policy_enabled = true

  oms_agent {
    log_analytics_workspace_id = azurerm_log_analytics_workspace.aks.id
  }

  tags = {
    Environment = var.environment
    Project     = "microservices-platform"
  }
}
Why These Configuration Choices:
- Auto-scaling: Automatically adjusts cluster size based on demand
- Availability Zones: Spreads nodes across zones for high availability within the region
- Network Policy: Provides microsegmentation for security
- Monitoring Integration: Built-in observability with Azure Monitor
- System-Assigned Identity: Secure authentication without managing credentials
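The `oms_agent` block above references a Log Analytics workspace that must be declared elsewhere in the configuration. A minimal definition might look like this — the workspace name and retention period are illustrative choices, not fixed requirements:

```hcl
# log-analytics.tf
resource "azurerm_log_analytics_workspace" "aks" {
  name                = "${var.cluster_name}-logs"
  location            = azurerm_resource_group.aks.location
  resource_group_name = azurerm_resource_group.aks.name
  sku                 = "PerGB2018"  # pay-per-GB pricing tier
  retention_in_days   = 30
}
```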
Variables Configuration
# variables.tf
variable "resource_group_name" {
  description = "Name of the Azure Resource Group"
  type        = string
  default     = "rg-aks-terraform"
}

variable "location" {
  description = "Azure region for resources"
  type        = string
  default     = "East US"
}

variable "cluster_name" {
  description = "Name of the AKS cluster"
  type        = string
  default     = "aks-terraform-cluster"
}

variable "kubernetes_version" {
  description = "Kubernetes version"
  type        = string
  default     = "1.28.3"
}

variable "node_count" {
  description = "Initial number of nodes in the default node pool"
  type        = number
  default     = 2
}

variable "node_vm_size" {
  description = "Size of the virtual machines in the default node pool"
  type        = string
  default     = "Standard_DS2_v2"
}

variable "environment" {
  description = "Environment name (dev, staging, prod)"
  type        = string
  default     = "dev"
}
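The monitoring stack in Step 3 also references `var.grafana_admin_password`, which needs a declaration here. Marking it `sensitive` keeps the value out of plan output; supply it via an environment variable rather than committing a default:

```hcl
variable "grafana_admin_password" {
  description = "Admin password for Grafana (set via TF_VAR_grafana_admin_password; never commit it)"
  type        = string
  sensitive   = true
}
```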
Step 2: Helm Integration for Application Management
Why Helm?
Helm provides several advantages for Kubernetes application management:
- Package Management: Pre-built charts for common applications
- Templating: Dynamic configuration based on environments
- Release Management: Easy rollbacks and upgrades
- Dependency Management: Handle complex application dependencies
- Community Support: Extensive library of maintained charts
Helm Provider Configuration
# helm.tf
provider "helm" {
  kubernetes {
    host                   = azurerm_kubernetes_cluster.aks.kube_config.0.host
    client_certificate     = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.client_certificate)
    client_key             = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.client_key)
    cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.aks.kube_config.0.cluster_ca_certificate)
  }
}

# Ingress Controller
resource "helm_release" "nginx_ingress" {
  name             = "nginx-ingress"
  repository       = "https://kubernetes.github.io/ingress-nginx"
  chart            = "ingress-nginx"
  namespace        = "ingress-nginx"
  version          = "4.8.3"
  create_namespace = true

  set {
    name  = "controller.service.type"
    value = "LoadBalancer"
  }

  set {
    name  = "controller.service.annotations.service\\.beta\\.kubernetes\\.io/azure-load-balancer-health-probe-request-path"
    value = "/healthz"
  }

  depends_on = [azurerm_kubernetes_cluster.aks]
}
Key Configuration Decisions:
- LoadBalancer Type: Provides external access through Azure Load Balancer
- Health Probes: Ensures reliable traffic routing
- Namespace Isolation: Separates ingress components from applications
- Version Pinning: Ensures consistent deployments
Step 3: Monitoring Stack with Prometheus and Grafana
Why Monitoring Matters
Effective monitoring is crucial for microservices because:
- Distributed Complexity: Multiple services require comprehensive observability
- Performance Optimization: Identify bottlenecks and optimize resource usage
- Incident Response: Quick detection and resolution of issues
- Capacity Planning: Data-driven decisions for scaling
- SLA Compliance: Track and ensure service level objectives
Prometheus and Grafana Setup
# monitoring.tf
resource "helm_release" "prometheus" {
  name             = "prometheus"
  repository       = "https://prometheus-community.github.io/helm-charts"
  chart            = "kube-prometheus-stack"
  namespace        = "monitoring"
  version          = "54.0.1"
  create_namespace = true

  # Grafana configuration
  set {
    name  = "grafana.adminPassword"
    value = var.grafana_admin_password
  }

  set {
    name  = "grafana.service.type"
    value = "ClusterIP"
  }

  set {
    name  = "grafana.ingress.enabled"
    value = "false"
  }

  # Prometheus configuration
  set {
    name  = "prometheus.prometheusSpec.retention"
    value = "15d"
  }

  set {
    name  = "prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage"
    value = "10Gi"
  }

  depends_on = [azurerm_kubernetes_cluster.aks]
}
Monitoring Stack Benefits:
- Metrics Collection: Automatic collection of cluster and application metrics
- Alerting: Proactive notification of issues
- Visualization: Rich dashboards for system insights
- Historical Data: Trend analysis and capacity planning
- Multi-tenancy: Separate monitoring per namespace/team
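kube-prometheus-stack discovers application metrics through `ServiceMonitor` resources. As a hedged sketch of how one of your own services would be scraped — the names are illustrative, and the `release: prometheus` label must match the Helm release name above for the operator's default selector to pick it up:

```yaml
# servicemonitor.yaml (illustrative)
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: test-app-metrics
  namespace: monitoring
  labels:
    release: prometheus     # matches helm_release.prometheus so the operator selects it
spec:
  namespaceSelector:
    matchNames: ["default"] # where the target Service lives
  selector:
    matchLabels:
      app: test-app
  endpoints:
    - port: metrics         # named Service port exposing /metrics
      interval: 30s
```

This assumes the Service defines a named `metrics` port and the application actually serves Prometheus-format metrics on it.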
Step 4: SSL/TLS Certificate Management
Why Cert-Manager?
SSL/TLS certificates are essential for:
- Security: Encrypted communication between clients and services
- Compliance: Meeting regulatory requirements
- Trust: Building user confidence in your applications
- SEO: Search engine ranking benefits
# cert-manager.tf
resource "helm_release" "cert_manager" {
  name             = "cert-manager"
  repository       = "https://charts.jetstack.io"
  chart            = "cert-manager"
  namespace        = "cert-manager"
  version          = "v1.13.2"
  create_namespace = true

  set {
    name  = "installCRDs"
    value = "true"
  }

  set {
    name  = "global.leaderElection.namespace"
    value = "cert-manager"
  }

  depends_on = [azurerm_kubernetes_cluster.aks]
}
Cert-Manager Advantages:
- Automation: Automatic certificate provisioning and renewal
- Let’s Encrypt Integration: Free SSL certificates
- Multiple Issuers: Support for different certificate authorities
- Kubernetes Native: Integrates seamlessly with ingress controllers
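Installing cert-manager alone issues nothing: you also need an issuer resource. A typical Let's Encrypt `ClusterIssuer` using the HTTP-01 challenge through the Nginx ingress looks like this — the e-mail address is a placeholder you must replace:

```yaml
# cluster-issuer.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: you@example.com            # placeholder: receives certificate expiry notices
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            class: nginx              # solve ACME challenges via the Nginx ingress controller
```

An Ingress then requests a certificate by adding the annotation `cert-manager.io/cluster-issuer: letsencrypt-prod` together with a `tls` section.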
Step 5: Deployment and Testing
Deploying the Infrastructure
# Initialize Terraform
terraform init

# Review the execution plan
terraform plan

# Deploy the infrastructure
terraform apply

# Get cluster credentials
az aks get-credentials --resource-group rg-aks-terraform --name aks-terraform-cluster
Testing the Platform
Create a test application to verify everything works:
# test-app.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-app
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: test-app
  template:
    metadata:
      labels:
        app: test-app
    spec:
      containers:
        - name: test-app
          image: nginx:latest
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 200m
              memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  name: test-app-service
  namespace: default
spec:
  selector:
    app: test-app
  ports:
    - port: 80
      targetPort: 80
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: test-app-ingress
  namespace: default
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx   # replaces the deprecated kubernetes.io/ingress.class annotation
  rules:
    - host: test-app.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: test-app-service
                port:
                  number: 80
Verification Commands
# Deploy test application
kubectl apply -f test-app.yaml

# Check cluster status
kubectl get nodes
kubectl get pods --all-namespaces
helm list --all-namespaces

# Get external IP
kubectl get svc -n ingress-nginx

# Test application access
kubectl port-forward svc/test-app-service 8081:80

# Access Grafana
kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80
Advantages of This Approach
1. Infrastructure as Code Benefits
- Version Control: All infrastructure changes are tracked
- Reproducibility: Identical environments across stages
- Documentation: Infrastructure is self-documenting
- Automation: Reduces manual errors and deployment time
- Cost Control: Easy to destroy and recreate environments
2. Microservices Platform Benefits
- Scalability: Independent scaling of services
- Technology Diversity: Different services can use different tech stacks
- Team Independence: Teams can deploy independently
- Fault Isolation: Issues in one service don’t affect others
- Continuous Deployment: Fast, frequent deployments
3. Operational Excellence
- Monitoring: Comprehensive observability out of the box
- Security: Built-in network policies and SSL/TLS
- High Availability: Multi-zone deployment for resilience
- Auto-scaling: Automatic resource adjustment based on demand
- Maintenance: Automated certificate management and updates
4. Developer Experience
- Consistent Environments: Same configuration across all stages
- Easy Onboarding: New team members can spin up environments quickly
- Local Development: Port forwarding for easy testing
- CI/CD Ready: Foundation for automated deployment pipelines
Production Readiness Considerations
While this setup provides a solid foundation, additional considerations for production include:
Security Enhancements
- Network Segmentation: Implement network policies between services
- RBAC: Role-based access control for team isolation
- Secret Management: Use Azure Key Vault for sensitive data
- Image Scanning: Container vulnerability scanning
- Pod Security Standards: Enforce security policies
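A common first step for network segmentation is a policy that denies all ingress traffic to a namespace except from the ingress controller. A minimal sketch, assuming the `ingress-nginx` namespace created earlier (the `kubernetes.io/metadata.name` label is set automatically by Kubernetes on every namespace):

```yaml
# networkpolicy.yaml (illustrative starting point)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-controller-only
  namespace: default
spec:
  podSelector: {}          # applies to every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
```

Because the AKS cluster was created with `network_policy = "azure"`, these rules are actually enforced; on a cluster without a policy engine they would be silently ignored.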
Reliability Improvements
- Multi-Region Deployment: Disaster recovery across regions
- Backup Strategy: Regular backups of persistent data
- Circuit Breakers: Implement resilience patterns
- Chaos Engineering: Regular failure testing
- SLA Monitoring: Track and alert on service level objectives
Performance Optimization
- Resource Limits: Proper CPU and memory allocation
- Horizontal Pod Autoscaling: Automatic scaling based on metrics
- Cluster Autoscaling: Node pool scaling for cost optimisation
- Caching Strategies: Implement appropriate caching layers
- CDN Integration: Content delivery for global users
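The Horizontal Pod Autoscaling mentioned above can be sketched against the test application from Step 5 — the CPU threshold and replica bounds here are illustrative, not recommendations:

```yaml
# hpa.yaml (illustrative)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: test-app-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70% of requests
```

Note that utilization is measured against the container's CPU *requests*, which is one reason the Deployment above sets them explicitly.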
Operational Maturity
- CI/CD Pipelines: Automated testing and deployment
- GitOps: Infrastructure and application configuration in Git
- Observability: Comprehensive logging, metrics, and tracing
- Incident Management: Runbooks and automated response
- Cost Optimisation: Regular review and optimisation of resources
Conclusion
Building a modern microservices platform requires careful consideration of multiple components working together. By using Infrastructure as Code with Terraform, container orchestration with AKS, and package management with Helm, we’ve created a foundation that addresses many common challenges in microservices deployment.
This approach provides:
- Automation: Reduced manual effort and errors
- Scalability: Ability to grow with your application needs
- Observability: Comprehensive monitoring and alerting
- Security: Built-in security best practices
- Developer Productivity: Faster development and deployment cycles
The platform we’ve built serves as a starting point for production workloads. With additional security, reliability, and operational considerations, this foundation can support enterprise-scale microservices applications.
Ready to build your own microservices platform? Start with this foundation and adapt it to your specific needs. Remember, the best architecture is one that evolves with your organisation’s requirements.
Code Repository
The complete code for this tutorial is available on GitHub: microservices-platform-terraform