Fortify Your Logs: A Guide to Secure, Cross-Zone Log Aggregation in Azure with Loki

Published: 5 February 2026 - 11 min. read


Your DMZ sends logs somewhere. You know it does—you’ve seen the disk space disappear. But where those logs actually end up, whether anyone can read them, and whether an auditor three months from now can prove what happened at 3 AM last Tuesday? That’s where things get interesting.

Log aggregation across network zones isn’t just an observability problem. It’s a security architecture challenge where every solution that makes your life easier also punches a hole through your firewall. Traditional push-based logging—where DMZ agents connect directly to your internal log server—violates the fundamental principle that DMZ systems should never initiate connections to your trusted network. Auditors notice this. Penetration testers definitely notice this.

Grafana Loki offers a way out. Combined with Azure Event Hubs as a secure buffer and Grafana Alloy as the collection agent, you can build a pull-based log pipeline that keeps your network zones separate while still centralizing every authentication failure, API call, and “unexpected null pointer” your DMZ produces. No inbound firewall rules to the internal zone. No connection strings hardcoded in config files. Just logs flowing where they need to go through Azure Private Link.

This guide walks through implementing that architecture—from the Event Hubs namespace that acts as your buffer to the Loki deployment that stores everything in Azure Blob Storage.

Prerequisites

Before you start configuring components, verify you have the necessary Azure resources and permissions in place.

You’ll need:

  • Azure subscription with permissions to create Event Hubs namespaces, storage accounts, and assign RBAC roles

  • Two Virtual Networks (VNets) representing your DMZ and Internal zones (or subnets within a single VNet with appropriate NSG rules)

  • Azure CLI installed locally with an active authenticated session

  • Managed Identity configured for your workload (AKS cluster, VM, or container instance where agents will run)

  • Helm 3 if deploying Loki to Kubernetes

Verify your Azure CLI authentication:

az account show

You should see your subscription details. If not, run az login first.


Pro Tip: Set up separate managed identities for DMZ and Internal zone workloads from the start. Trying to retrofit identity isolation after you’ve deployed everything is like trying to add seat belts to a moving car.


Understanding the Pull-Based Architecture

The core challenge: your DMZ can’t initiate connections to internal systems, but your internal systems need the logs. Traditional syslog forwarding fails here because it requires the DMZ to push directly to your internal log server—exactly what firewall rules prohibit.

The solution is a three-tier architecture:

| Zone | Component | Role | Connection Direction |
| --- | --- | --- | --- |
| DMZ | Grafana Alloy (Producer) | Scrapes local logs, pushes to Event Hubs | Outbound to Azure PaaS |
| Azure PaaS | Event Hubs (Buffer) | Stores logs temporarily, provides Kafka endpoint | N/A (managed service) |
| Internal | Grafana Alloy (Consumer) | Pulls from Event Hubs, forwards to Loki | Outbound to Azure PaaS |
| Internal | Loki | Stores logs in Azure Blob Storage | Outbound to Azure Storage |

This design ensures the Internal zone only makes outbound connections. The DMZ never touches internal infrastructure directly.

Creating the Event Hubs Buffer

Azure Event Hubs serves as your log buffer. It provides a Kafka-compatible endpoint, allowing Grafana Alloy to interact using standard Kafka protocols without Azure SDK dependencies.

Critical requirement: You need at minimum the Standard tier. The Basic tier does not support Kafka endpoints.

Create the Event Hubs namespace:

az eventhubs namespace create \
  --name logs-buffer-ns \
  --resource-group rg-logging \
  --location eastus \
  --sku Standard \
  --capacity 1

The --capacity parameter sets Throughput Units. One TU supports up to 1 MB/s ingress. Standard tier allows up to 40 TUs with auto-inflate enabled.
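If you expect bursty ingress, auto-inflate lets Azure scale TUs up automatically instead of throttling producers. A sketch of enabling it (the 10-TU ceiling here is an arbitrary example; higher ceilings cost more under load):

```shell
az eventhubs namespace update \
  --name logs-buffer-ns \
  --resource-group rg-logging \
  --enable-auto-inflate true \
  --maximum-throughput-units 10
```

Note that auto-inflate only scales up; you scale back down manually after the burst subsides.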

Create an Event Hub (Kafka topic equivalent) within the namespace:

az eventhubs eventhub create \
  --name dmz-logs \
  --namespace-name logs-buffer-ns \
  --resource-group rg-logging \
  --partition-count 4 \
  --message-retention 1

Partition count determines parallelism—match this to the number of Alloy consumer instances you’ll run in the Internal zone. Message retention is in days; 1 day is sufficient for a buffer since Loki will persist logs long-term.

Retrieve the Kafka endpoint for configuration later:

az eventhubs namespace show \
  --name logs-buffer-ns \
  --resource-group rg-logging \
  --query "serviceBusEndpoint" -o tsv

You’ll see something like https://logs-buffer-ns.servicebus.windows.net:443/. For Kafka connections, use port 9093: logs-buffer-ns.servicebus.windows.net:9093.

Securing Event Hubs with Private Link

Right now, your Event Hubs namespace is exposed to the public internet. Any client with credentials can connect. Azure Private Link restricts access to traffic originating from within your VNets.

Create a private endpoint in your DMZ VNet:

az network private-endpoint create \
  --name pe-eventhubs-dmz \
  --resource-group rg-logging \
  --vnet-name vnet-dmz \
  --subnet snet-dmz-private \
  --private-connection-resource-id $(az eventhubs namespace show --name logs-buffer-ns --resource-group rg-logging --query id -o tsv) \
  --group-id namespace \
  --connection-name dmz-to-eventhubs

Create a second private endpoint in your Internal VNet:

az network private-endpoint create \
  --name pe-eventhubs-internal \
  --resource-group rg-logging \
  --vnet-name vnet-internal \
  --subnet snet-internal-private \
  --private-connection-resource-id $(az eventhubs namespace show --name logs-buffer-ns --resource-group rg-logging --query id -o tsv) \
  --group-id namespace \
  --connection-name internal-to-eventhubs

Both zones can now reach Event Hubs via private IP addresses within their respective VNets. Configure your Event Hubs firewall to deny all public traffic:

az eventhubs namespace network-rule-set update \
  --namespace-name logs-buffer-ns \
  --resource-group rg-logging \
  --default-action Deny \
  --public-network Disabled

Reality Check: If you skip Private Link and leave Event Hubs publicly accessible, you’re just moving your security problem from the firewall to the identity layer. Both DMZ and Internal agents will authenticate via Managed Identity, but traffic crosses the public internet.


Verify the private endpoint resolved correctly from within a VM in the DMZ:

nslookup logs-buffer-ns.servicebus.windows.net

You should see a private IP address (e.g., 10.0.1.5) instead of a public Azure IP range.
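If the lookup still returns a public IP, the private endpoint is probably missing DNS integration. One common way to wire it up (resource names here are illustrative and match the earlier examples) is an Azure Private DNS zone linked to the DMZ VNet, with a DNS zone group on the endpoint:

```shell
# Create the Private DNS zone for Event Hubs / Service Bus endpoints
az network private-dns zone create \
  --resource-group rg-logging \
  --name privatelink.servicebus.windows.net

# Link the zone to the DMZ VNet so its resolvers see the private record
az network private-dns link vnet create \
  --resource-group rg-logging \
  --zone-name privatelink.servicebus.windows.net \
  --name link-dmz \
  --virtual-network vnet-dmz \
  --registration-enabled false

# Attach the zone to the private endpoint so the A record is managed for you
az network private-endpoint dns-zone-group create \
  --resource-group rg-logging \
  --endpoint-name pe-eventhubs-dmz \
  --name default \
  --private-dns-zone privatelink.servicebus.windows.net \
  --zone-name servicebus
```

Repeat the VNet link and zone group for the Internal zone (vnet-internal and pe-eventhubs-internal) so both sides resolve the namespace to their local private endpoint.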

Configuring Azure Blob Storage for Loki

Loki stores log data in object storage. Azure Blob Storage provides the necessary durability and cost efficiency—significantly cheaper than managed disks for cold log data.

Create a storage account:

az storage account create \
  --name stlokiprod001 \
  --resource-group rg-logging \
  --location eastus \
  --sku Standard_LRS \
  --kind StorageV2

Create two containers—one for log chunks (the actual log content) and one for the index metadata:

az storage container create \
  --name loki-chunks \
  --account-name stlokiprod001

az storage container create \
  --name loki-ruler \
  --account-name stlokiprod001

Apply a private endpoint for the storage account:

az network private-endpoint create \
  --name pe-storage-internal \
  --resource-group rg-logging \
  --vnet-name vnet-internal \
  --subnet snet-internal-private \
  --private-connection-resource-id $(az storage account show --name stlokiprod001 --resource-group rg-logging --query id -o tsv) \
  --group-id blob \
  --connection-name internal-to-storage

Disable public network access:

az storage account update \
  --name stlokiprod001 \
  --resource-group rg-logging \
  --default-action Deny \
  --public-network-access Disabled

Your Loki deployment will access storage via private IP from within the Internal zone.
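As with Event Hubs, private DNS resolution for the blob endpoint is not automatic. Assuming the same illustrative naming, a privatelink.blob.core.windows.net zone linked to the Internal VNet completes the picture:

```shell
az network private-dns zone create \
  --resource-group rg-logging \
  --name privatelink.blob.core.windows.net

az network private-dns link vnet create \
  --resource-group rg-logging \
  --zone-name privatelink.blob.core.windows.net \
  --name link-internal \
  --virtual-network vnet-internal \
  --registration-enabled false

az network private-endpoint dns-zone-group create \
  --resource-group rg-logging \
  --endpoint-name pe-storage-internal \
  --name default \
  --private-dns-zone privatelink.blob.core.windows.net \
  --zone-name blob
```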

Assigning Managed Identity Permissions

Managed Identities eliminate the “secret zero” problem—you don’t store connection strings or SAS tokens in configuration files.

Assign the Azure Event Hubs Data Sender role to the DMZ workload identity:

az role assignment create \
  --assignee <dmz-managed-identity-principal-id> \
  --role "Azure Event Hubs Data Sender" \
  --scope $(az eventhubs namespace show --name logs-buffer-ns --resource-group rg-logging --query id -o tsv)

Replace <dmz-managed-identity-principal-id> with the principal ID of your DMZ workload’s managed identity (retrieve via az identity show if using a User-Assigned Managed Identity).
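For a User-Assigned Managed Identity — named, say, id-dmz-logger (a placeholder) — you can capture the principal ID into a shell variable and use it directly in the role assignment:

```shell
DMZ_PRINCIPAL_ID=$(az identity show \
  --name id-dmz-logger \
  --resource-group rg-logging \
  --query principalId -o tsv)
```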

Assign the Azure Event Hubs Data Receiver role to the Internal workload identity:

az role assignment create \
  --assignee <internal-managed-identity-principal-id> \
  --role "Azure Event Hubs Data Receiver" \
  --scope $(az eventhubs namespace show --name logs-buffer-ns --resource-group rg-logging --query id -o tsv)

Assign Storage Blob Data Contributor to the Loki workload identity:

az role assignment create \
  --assignee <loki-managed-identity-principal-id> \
  --role "Storage Blob Data Contributor" \
  --scope $(az storage account show --name stlokiprod001 --resource-group rg-logging --query id -o tsv)

Verify role assignments:

az role assignment list \
  --assignee <managed-identity-principal-id> \
  --all -o table

You should see the appropriate Event Hubs and Storage roles listed.

Deploying Loki in the Internal Zone

Loki deployment modes range from monolithic (single binary, suitable for testing) to microservices (every component independent, massive scale). For most production environments, the Simple Scalable mode provides the right balance—separate read and write paths without microservices complexity.

| Deployment Mode | Use Case | Components |
| --- | --- | --- |
| Monolithic | Development, <100 GB/day | Single binary runs all services |
| Simple Scalable | Production, <1 TB/day | Read, Write, Backend targets |
| Microservices | Massive scale, >1 TB/day | Distributor, Ingester, Querier, etc. run separately |

Install Loki via Helm in your Internal zone AKS cluster:

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

helm install loki grafana/loki \
  --namespace logging \
  --create-namespace \
  --set deploymentMode=SimpleScalable \
  --set loki.storage.type=azure \
  --set loki.storage.azure.accountName=stlokiprod001 \
  --set loki.storage.azure.useManagedIdentity=true \
  --set loki.storage.azure.container=loki-chunks \
  --set loki.storage.azure.endpointSuffix=core.windows.net

This configuration tells Loki to authenticate to Azure Blob Storage using the AKS cluster’s managed identity (via Azure Workload Identity or Pod Identity, depending on your setup).
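With Azure Workload Identity specifically, the Loki service account needs the client-ID annotation and the pods need the workload-identity label. A sketch of the extra Helm values — exact keys vary by chart version, and the client ID is a placeholder:

```yaml
serviceAccount:
  annotations:
    azure.workload.identity/client-id: "<loki-managed-identity-client-id>"
loki:
  podLabels:
    azure.workload.identity/use: "true"
```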

Verify Loki pods are running:

kubectl get pods -n logging

You should see loki-read, loki-write, and loki-backend pods in a Running state.

Check Loki logs for storage connection issues:

kubectl logs -n logging deployment/loki-write --tail=50

If you see authentication errors, verify the managed identity has the Storage Blob Data Contributor role and that the private endpoint DNS resolution is working.

Configuring Grafana Alloy in the DMZ (Producer)

Grafana Alloy replaces Promtail as the recommended log collection agent. Promtail is feature-complete but in maintenance mode—Alloy is the actively developed successor with OpenTelemetry support.

| Agent | Status | Use Case |
| --- | --- | --- |
| Promtail | Long-Term Support (EOL March 2026) | Existing deployments, Loki-only environments |
| Grafana Alloy | Active development | New deployments, multi-signal observability (logs, metrics, traces) |

Create an Alloy configuration file for the DMZ producer:

// Scrape logs from local files
local.file_match "app_logs" {
  path_targets = [{
    __path__ = "/var/log/app/*.log",
  }]
}

loki.source.file "app_logs" {
  targets    = local.file_match.app_logs.targets
  forward_to = [loki.write.eventhubs.receiver]
}

// Write logs to Azure Event Hubs via Kafka protocol
loki.write "eventhubs" {
  endpoint {
    url = "kafka://logs-buffer-ns.servicebus.windows.net:9093/dmz-logs"

    kafka {
      brokers = ["logs-buffer-ns.servicebus.windows.net:9093"]

      authentication {
        mechanism = "oauth"
      }

      tls {
        enabled = true
      }
    }
  }
}

The mechanism = "oauth" setting tells Alloy to retrieve Azure AD tokens automatically using the Azure SDK for Go (leveraging the VM or container’s managed identity via Azure Instance Metadata Service).

Deploy Alloy as a DaemonSet in your DMZ Kubernetes cluster or as a systemd service on VMs. When the agent starts, it authenticates to Event Hubs using the managed identity assigned the Azure Event Hubs Data Sender role.

Configuring Grafana Alloy in the Internal Zone (Consumer)

The Internal zone Alloy instance pulls logs from Event Hubs and forwards them to Loki.

Create an Alloy configuration for the consumer:

// Pull logs from Azure Event Hubs
loki.source.azure_event_hubs "dmz_logs" {
  fully_qualified_namespace = "logs-buffer-ns.servicebus.windows.net:9093"
  event_hubs                = ["dmz-logs"]

  authentication {
    mechanism = "oauth"
  }

  forward_to = [loki.write.local.receiver]
}

// Forward to local Loki instance
loki.write "local" {
  endpoint {
    url = "http://loki-write.logging.svc.cluster.local:3100/loki/api/v1/push"
  }
}

The loki.source.azure_event_hubs component consumes from Event Hubs using the Kafka protocol. It automatically retrieves OAuth tokens via the managed identity assigned the Azure Event Hubs Data Receiver role.

Deploy this Alloy configuration in your Internal zone. Logs now flow: DMZ app → DMZ Alloy → Event Hubs → Internal Alloy → Loki → Azure Blob Storage.


Quick Win: Use Alloy’s built-in metrics endpoint (/metrics) to monitor lag between Event Hubs and Loki ingestion. If consumer lag grows, you’re not pulling fast enough—add more Alloy consumer instances or increase Event Hubs partitions.
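Assuming Alloy's default HTTP listen port (12345) and that your Alloy build exports the lag metric described later in this guide, a quick spot check from the consumer host might look like:

```shell
curl -s http://localhost:12345/metrics | grep -i event_hubs
```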


Validating the Pipeline

Verify logs are flowing through each stage of the pipeline.

Check Event Hubs metrics for incoming messages:

az monitor metrics list \
  --resource $(az eventhubs namespace show --name logs-buffer-ns --resource-group rg-logging --query id -o tsv) \
  --metric IncomingMessages \
  --interval PT1M

You should see non-zero message counts if the DMZ Alloy agent is successfully pushing logs.

Query Loki directly via its API from within the Internal zone:

curl -G -s "http://loki-read.logging.svc.cluster.local:3100/loki/api/v1/query" \
  --data-urlencode 'query={job="dmz-logs"}' \
  --data-urlencode 'limit=10' | jq .

If Loki returns log entries, your pipeline is operational.

Connect Grafana to Loki as a data source and query using LogQL:

{job="dmz-logs"} |= "error"

This retrieves all log lines from the DMZ containing "error"—useful for filtering authentication failures, API errors, or application exceptions.
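LogQL can also aggregate. To turn the same filter into an error rate suitable for a Grafana panel or an alert threshold, wrap it in rate() and sum():

```logql
sum(rate({job="dmz-logs"} |= "error" [5m]))
```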

Network Security Group Rules

Your NSG rules enforce zone isolation. The DMZ should only make outbound connections to Event Hubs. The Internal zone should make outbound connections to both Event Hubs and Azure Blob Storage.

Example DMZ NSG rule (outbound):

| Priority | Name | Source | Destination | Port | Action |
| --- | --- | --- | --- | --- | --- |
| 100 | Allow-EventHubs-Outbound | VirtualNetwork | 10.0.1.5/32 (Event Hubs private endpoint IP in the DMZ VNet) | 9093 | Allow |
| 200 | Deny-Internal-Outbound | VirtualNetwork | 10.2.0.0/16 (Internal VNet CIDR) | Any | Deny |

Example Internal NSG rule (outbound):

| Priority | Name | Source | Destination | Port | Action |
| --- | --- | --- | --- | --- | --- |
| 100 | Allow-EventHubs-Outbound | VirtualNetwork | 10.2.0.5/32 (Event Hubs private endpoint IP in the Internal VNet) | 9093 | Allow |
| 110 | Allow-Storage-Outbound | VirtualNetwork | 10.2.0.6/32 (Storage private endpoint IP) | 443 | Allow |

The Internal zone never receives inbound connections from the DMZ. All connections are outbound from Internal to Azure PaaS services.

Retention and Compliance

Loki manages retention via the Compactor component. Configure retention in your Loki Helm values:

loki:
  storage:
    type: azure
  limits_config:
    retention_period: 30d
  compactor:
    retention_enabled: true  # retention is only enforced when the Compactor has this flag set

The Compactor scans the index and marks chunks outside the retention window for deletion. This ensures logs older than 30 days are purged from Azure Blob Storage.

For compliance environments (PCI DSS Requirement 10, NIST SP 800-53 AU-2), you may need longer retention. Adjust retention_period accordingly. Azure Blob Storage Lifecycle Management policies can act as a fail-safe, but let Loki manage retention primarily to maintain index consistency.

If you need immutable logs (PCI DSS Requirement 10.5), enable Azure Blob Immutable Storage with a time-based retention policy. This prevents log tampering by making blobs write-once-read-many (WORM).
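A time-based immutability policy can be applied per container. This sketch locks blobs in loki-chunks for 365 days (adjust the period to your mandate) — but note that WORM prevents Loki's Compactor from deleting expired chunks, so reconcile the immutability period with your Loki retention_period before enabling it:

```shell
az storage container immutability-policy create \
  --account-name stlokiprod001 \
  --container-name loki-chunks \
  --resource-group rg-logging \
  --period 365
```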

Monitoring Alloy and Loki

Both Alloy and Loki expose Prometheus-compatible /metrics endpoints. Scrape these with your existing Prometheus instance or Azure Monitor.

Key Alloy metrics to monitor:

| Metric | Purpose |
| --- | --- |
| loki_write_sent_bytes_total | Total bytes sent to downstream (Event Hubs or Loki) |
| loki_source_azure_event_hubs_lag | Consumer lag in Event Hubs (messages waiting to be pulled) |
| loki_write_errors_total | Failed write attempts (indicates connectivity or auth issues) |

Key Loki metrics to monitor:

| Metric | Purpose |
| --- | --- |
| loki_ingester_chunks_flushed_total | Chunks successfully written to Azure Blob Storage |
| loki_request_duration_seconds | Query performance (p95, p99 latency) |

Set up alerts for consumer lag exceeding thresholds (e.g., >10,000 messages) and write errors exceeding zero.
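The lag alert can be expressed as a standard Prometheus alerting rule. A sketch using the metric name and threshold from the table above (verify the exact metric name against what your Alloy version exports):

```yaml
groups:
  - name: log-pipeline
    rules:
      - alert: EventHubsConsumerLagHigh
        # Fires when more than 10,000 messages sit unconsumed for 10 minutes
        expr: loki_source_azure_event_hubs_lag > 10000
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Alloy consumer is falling behind Event Hubs"
```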

Troubleshooting Common Issues

Event Hubs authentication failures:

  • Verify the managed identity has the correct role assignment (Azure Event Hubs Data Sender/Receiver)

  • Check that the private endpoint DNS resolution is working (nslookup logs-buffer-ns.servicebus.windows.net)

  • Ensure mechanism = "oauth" is set in the Alloy authentication block

Loki storage write errors:

  • Verify the managed identity has Storage Blob Data Contributor role

  • Check private endpoint connectivity to Azure Blob Storage

  • Review Loki pod logs for 403 Forbidden or connection timeout errors

Consumer lag increasing:

  • Scale Alloy consumer instances to match Event Hubs partition count

  • Increase Loki write component concurrency

  • Check if Loki ingesters are overwhelmed (review ingester metrics)

Logs missing from Loki queries:

  • Verify labels are consistent between producer and consumer Alloy configs

  • Check Event Hubs retention period—logs older than retention are lost if not consumed

  • Query Loki for label combinations ({job="dmz-logs"}) to confirm logs exist

Cost Optimization

This architecture incurs costs across Event Hubs, Azure Blob Storage, and compute (AKS or VMs running Alloy and Loki).

| Component | Cost Driver | Optimization |
| --- | --- | --- |
| Event Hubs | Throughput Units, ingress events | Use Standard tier with auto-inflate disabled; tune retention to 1 day |
| Azure Blob Storage | Data stored, API operations | Configure Loki retention; use Cool tier for long-term archived logs |
| Compute (AKS/VM) | VM hours | Right-size Alloy and Loki nodes; use Azure Spot VMs for non-critical workloads |

Loki’s architecture—indexing only metadata, not full log content—significantly reduces storage costs compared to full-text indexers like Elasticsearch. Expect 5-10x lower storage requirements for equivalent log volume.

What You’ve Built

Centralizing logs across network zones without compromising security requires architectural discipline. The pull-based pattern—DMZ pushes to buffer, Internal pulls from buffer—maintains firewall integrity while delivering the observability you need. Azure Event Hubs provides the Kafka-compatible buffer, Grafana Alloy handles collection with modern OAuth authentication, and Loki stores everything efficiently in Azure Blob Storage.

You’ve eliminated inbound firewall rules to the Internal zone. You’ve removed hardcoded credentials from configuration files. And when the auditor asks where your DMZ authentication logs from last quarter are, you’ll have an answer that doesn’t involve “I think they’re still on that server Dave used to manage.”
