Distributed Tracing using Correlation ID

Summary

Distributed tracing with a Correlation ID allows for seamless tracking of a request as it flows through different services. For instance, when an order processing issue occurs, the Correlation ID provides a single, consistent identifier that can be traced across all logs and telemetry data within Application Insights. Engineers can quickly filter logs, traces, and telemetry entries by this Correlation ID, which enables them to follow the order’s journey across Logic Apps, Functions, and Service Bus. This makes it easier to identify where the issue first occurred, whether it was a failure in a Function, a timeout in a Service Bus queue, or an error within a Logic App action.

Azure Application Insights

By tracing the Correlation ID back to the exact point of failure, it is possible to pinpoint the root cause efficiently, reducing the time and effort needed to resolve the issue and ensuring minimal disruption to the system.

Azure Service Bus transmits messages over AMQP 1.0 (Advanced Message Queuing Protocol). The Correlation ID is carried in the message’s application properties, so every service that receives the message can read it and continue the trace. A simplified view of such a message:

Message {
    Header {
        durable: true,
        priority: 4,
        ttl: 60000
    },
    Properties {
        message-id: "abc123",
        user-id: "user@example.com",
        to: "queue-name",
        subject: "Order Processing",
        reply-to: "response-queue",
        correlation-id: "order-456",
        content-type: "application/json",
        content-encoding: "utf-8",
        absolute-expiry-time: <timestamp>,
        creation-time: <timestamp>,
        group-id: "order-group",
        group-sequence: 1,
        reply-to-group-id: "response-group"
    },
    Application Properties {
        "CorrelationId": "order-456",
        "OrderNumber": "12345",
        "CustomerId": "cust-789"
    },
    Body {
        ... (JSON or binary message content) ...
    }
}

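How the Correlation ID Appears in an AMQP Message:

An AMQP message is composed of several sections, including the Header, Properties, Application Properties, and Body. In this design the Correlation ID is placed in the Application Properties section, which carries custom metadata about the message. AMQP also defines a standard correlation-id field in the Properties section, which the Service Bus SDK exposes as the CorrelationId property of the message; setting both to the same value, as in the example above, keeps the ID visible to tools that read only the standard field.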

Essential Elements of Distributed Tracing:

  • Correlation IDs – Unique identifiers propagated throughout services to correlate related events.
  • Telemetry/Logging Services – Use Azure Application Insights to collect and analyse logs and traces.
  • Trace Propagation – Ensure tracing information, such as correlation IDs and request IDs, is propagated through all services, achieving end-to-end traceability.

Implementation Steps:

1. Set Up Application Insights for Logging and Tracing

  • Enable Azure Application Insights for both Azure Logic Apps and Azure Functions. This can be done through the Azure portal by navigating to the respective service and linking it to an Application Insights resource.
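  • In Azure Functions, configure the Application Insights integration in the host.json file. Sampling reduces telemetry volume, so individual entries may be dropped under load; if every trace entry must be retained, consider turning sampling off or excluding the relevant telemetry types:
{
  "version": "2.0",
  "logging": {
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true
      }
    }
  }
}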

2. Generate and Propagate a Correlation ID

  • At the entry point, such as an HTTP-triggered Azure Function or a Logic App with an HTTP trigger, generate a Correlation ID:
    • For an HTTP-triggered Azure Function, check for an incoming Correlation ID in the headers (e.g., x-correlation-id). If not present, generate a new one:
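// Reuse the caller's ID when present and non-empty; otherwise start a new trace.
string correlationId = req.Headers.TryGetValue("x-correlation-id", out var value)
                       && !string.IsNullOrWhiteSpace(value)
                       ? value.ToString()
                       : Guid.NewGuid().ToString();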
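  • In Logic Apps, read the Correlation ID from the headers of the Request trigger (“When a HTTP request is received”), or generate one when it is missing, for example with an expression such as coalesce(triggerOutputs()?['headers']?['x-correlation-id'], guid()). Store it in a variable for further use.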

Ensure the Correlation ID is included in outgoing requests and Service Bus messages. For HTTP calls, add the Correlation ID to the headers:

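// Per-request header, so a shared HttpClient never leaks one request's ID into another.
// requestUri is a placeholder for the downstream endpoint.
using var request = new HttpRequestMessage(HttpMethod.Post, requestUri);
request.Headers.Add("x-correlation-id", correlationId);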
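
Adding the header by hand on every call is easy to forget. One possible alternative, sketched below, is a DelegatingHandler that stamps the header on each outgoing request. The names CorrelationContext and CorrelationIdHandler are hypothetical and not part of any Azure SDK; the sketch assumes the function sets CorrelationContext.CorrelationId once at its entry point, before making any calls.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical holder for the current Correlation ID.
// AsyncLocal flows the value down through awaited calls made after it is set.
public static class CorrelationContext
{
    private static readonly AsyncLocal<string?> Current = new();

    public static string? CorrelationId
    {
        get => Current.Value;
        set => Current.Value = value;
    }
}

// Adds the x-correlation-id header to every outgoing request
// that does not already carry one.
public sealed class CorrelationIdHandler : DelegatingHandler
{
    public const string HeaderName = "x-correlation-id";

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        string? id = CorrelationContext.CorrelationId;
        if (!string.IsNullOrWhiteSpace(id) && !request.Headers.Contains(HeaderName))
        {
            request.Headers.Add(HeaderName, id);
        }

        return base.SendAsync(request, cancellationToken);
    }
}
```

A client built as new HttpClient(new CorrelationIdHandler { InnerHandler = new HttpClientHandler() }) can then be reused across invocations, because the ID is read per request rather than stored on the client.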

When sending a message to Azure Service Bus from an Azure Function:

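var message = new ServiceBusMessage("Message Body")
{
    // Also set the built-in property so tools that read the standard field see the same ID.
    CorrelationId = correlationId,
    ApplicationProperties = { ["CorrelationId"] = correlationId }
};
await sender.SendMessageAsync(message);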

3. Service Bus Message Enrichment

  • Ensure that messages sent to Azure Service Bus contain the Correlation ID in their properties. This allows downstream services to extract and use this ID for logging and tracing.
  • In Azure Functions triggered by Service Bus, extract the Correlation ID from the received message:
// Fall back to a new ID so processing stays traceable when the sender omitted one.
string correlationId =
    message.ApplicationProperties.TryGetValue("CorrelationId", out var value) && value is not null
        ? value.ToString()
        : Guid.NewGuid().ToString();
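
Senders outside your control may set only the standard correlation-id field and not the custom application property. The sketch below is one way to handle that: it checks the application property first, then the broker-level field, and only then generates a new ID. CorrelationIdResolver is a hypothetical helper name, and the lookup order is an assumption rather than a Service Bus requirement.

```csharp
using System;
using System.Collections.Generic;

public static class CorrelationIdResolver
{
    // Returns the first usable ID: the custom application property, then the
    // broker-level correlation-id field, then a freshly generated GUID.
    public static string Resolve(
        IReadOnlyDictionary<string, object> applicationProperties,
        string? systemCorrelationId)
    {
        string? fromProperty =
            applicationProperties.TryGetValue("CorrelationId", out var raw)
                ? raw?.ToString()
                : null;

        if (!string.IsNullOrWhiteSpace(fromProperty))
        {
            return fromProperty;
        }

        if (!string.IsNullOrWhiteSpace(systemCorrelationId))
        {
            return systemCorrelationId;
        }

        return Guid.NewGuid().ToString();
    }
}
```

In a Service Bus triggered function this could be called as CorrelationIdResolver.Resolve(message.ApplicationProperties, message.CorrelationId).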

4. Logging and Tracing in Application Insights

  • Log the Correlation ID at each stage using Application Insights. This involves adding custom dimensions to your telemetry:
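// In Azure Functions, prefer a TelemetryClient supplied by dependency injection;
// constructing one without a configuration is discouraged in recent SDK versions.
var telemetryClient = new TelemetryClient(TelemetryConfiguration.CreateDefault());
var telemetry = new TraceTelemetry("Processing Service Bus Message");
telemetry.Properties.Add("CorrelationId", correlationId);
telemetryClient.TrackTrace(telemetry);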
  • In Logic Apps, use the “Tracked Properties” feature to include the Correlation ID in your telemetry for each action.
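
Adding the property by hand to each telemetry item leaves gaps whenever a call site forgets it. A minimal sketch of one alternative, assuming the Microsoft.ApplicationInsights package is referenced, is a telemetry initializer that stamps the ID on every item that supports custom properties. The class name and the Func<string?> used to look up the current ID are hypothetical; how the current ID is stored is left to the application.

```csharp
using System;
using Microsoft.ApplicationInsights.Channel;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.ApplicationInsights.Extensibility;

// Adds a CorrelationId custom dimension to traces, requests, dependencies,
// and exceptions, unless the item already carries one.
public sealed class CorrelationIdTelemetryInitializer : ITelemetryInitializer
{
    private readonly Func<string?> _getCorrelationId;

    // getCorrelationId is a hypothetical callback returning the ID of the
    // request currently being processed, or null when there is none.
    public CorrelationIdTelemetryInitializer(Func<string?> getCorrelationId)
    {
        _getCorrelationId = getCorrelationId;
    }

    public void Initialize(ITelemetry telemetry)
    {
        string? id = _getCorrelationId();
        if (!string.IsNullOrWhiteSpace(id)
            && telemetry is ISupportProperties item
            && !item.Properties.ContainsKey("CorrelationId"))
        {
            item.Properties["CorrelationId"] = id;
        }
    }
}
```

The initializer would typically be registered as an ITelemetryInitializer singleton in the application's dependency injection setup, so that it runs for every telemetry item before the item is sent.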

5. Monitor and Visualise Traces

  • Use Azure Application Insights to query and visualise traces. A typical Kusto Query Language (KQL) query might look like:
traces
| where tostring(customDimensions.CorrelationId) == "your-correlation-id"
| order by timestamp asc
  • Create dashboards and alerts in Application Insights based on these queries to monitor system performance and identify issues.

6. Error Handling and Retries

  • Ensure that your error handling logic includes the Correlation ID during retries or error flows, maintaining traceability across all retries and failure points.
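
As an illustration of the point above, the sketch below wraps an operation in a simple retry loop and logs every failed attempt with the same Correlation ID. CorrelatedRetry, the fixed delay, and the log callback are hypothetical choices for this example; in practice the callback could forward to ILogger or TelemetryClient, and Service Bus triggers also have their own redelivery behaviour.

```csharp
using System;
using System.Threading.Tasks;

public static class CorrelatedRetry
{
    // Runs the operation up to maxAttempts times. Every failure, including
    // the last one, is logged with the same Correlation ID before the final
    // exception is rethrown to the caller.
    public static async Task<T> ExecuteAsync<T>(
        string correlationId,
        Func<Task<T>> operation,
        Action<string> log,
        int maxAttempts = 3,
        TimeSpan? delay = null)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception ex)
            {
                log($"CorrelationId={correlationId} attempt={attempt}/{maxAttempts} failed: {ex.Message}");
                if (attempt >= maxAttempts)
                {
                    throw; // Retries exhausted; surface the original exception.
                }

                await Task.Delay(delay ?? TimeSpan.FromSeconds(1));
            }
        }
    }
}
```

Because each log line carries the ID, filtering Application Insights by that ID shows the original attempt and every retry in a single timeline.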