(Cross-posted on DevZone.ChannelAdam.com)
The other day I tracked down an issue my team was seeing with an Azure App Service Logic App. The Logic App was being triggered by an Azure Service Bus Connector that was configured to work with a Service Bus Topic. With a message published to the Service Bus Topic, an instance of the Logic App would execute successfully — as expected on the trigger's configured schedule. But then, unexpectedly, on the next scheduled time period, often a new instance of the Logic App would execute and reprocess the exact same message from the Service Bus Topic. It was as if the Service Bus Topic message was never successfully "completed" and thus removed from the Topic.
In order to understand why the same Service Bus Topic message was being processed twice, this is what you need to know.
Now that we understand how the Complete method is called as part of the Service Bus Connector's trigger mechanism, it is easy to understand the cause of the problem.
If the second call to the trigger fails (the call that would complete the message), for instance, because of a transient error on the gateway, then the Service Bus message will not be completed — until perhaps the next scheduled trigger.
However, after the default lock duration of 1 minute expires, the lock is lost and Service Bus makes the message available to be received again.
This is exactly what happened — the message was not completed and when the next scheduled trigger executes, the same message is processed again.
From the Trigger History of the Logic App, we can see the trigger that failed in the image below.
The call to this trigger returned a 502 "Bad Gateway" error.
502 - Web server received an invalid response while acting as a gateway or proxy server.
The full JSON of the output message for the failed trigger is:
{
"headers": {
"date": "Mon, 07 Sep 2015 23:28:32 GMT",
"server": "Microsoft-IIS/8.0"
},
"body": "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">\r\n<html xmlns=\"http://www.w3.org/1999/xhtml\">\r\n<head>\r\n<meta http-equiv=\"Content-Type\" content=\"text/html; charset=iso-8859-1\"/>\r\n<title>502 - Web server received an invalid response while acting as a gateway or proxy server.</title>\r\n<style type=\"text/css\">\r\n<!--\r\nbody{margin:0;font-size:.7em;font-family:Verdana, Arial, Helvetica, sans-serif;background:#EEEEEE;}\r\nfieldset{padding:0 15px 10px 15px;} \r\nh1{font-size:2.4em;margin:0;color:#FFF;}\r\nh2{font-size:1.7em;margin:0;color:#CC0000;} \r\nh3{font-size:1.2em;margin:10px 0 0 0;color:#000000;} \r\n#header{width:96%;margin:0 0 0 0;padding:6px 2% 6px 2%;font-family:\"trebuchet MS\", Verdana, sans-serif;color:#FFF;\r\nbackground-color:#555555;}\r\n#content{margin:0 0 0 2%;position:relative;}\r\n.content-container{background:#FFF;width:96%;margin-top:8px;padding:10px;position:relative;}\r\n-->\r\n</style>\r\n</head>\r\n<body>\r\n<div id=\"header\"><h1>Server Error</h1></div>\r\n<div id=\"content\">\r\n <div class=\"content-container\"><fieldset>\r\n <h2>502 - Web server received an invalid response while acting as a gateway or proxy server.</h2>\r\n <h3>There is a problem with the page you are looking for, and it cannot be displayed. When the Web server (while acting as a gateway or proxy) contacted the upstream content server, it received an invalid response from the content server.</h3>\r\n </fieldset></div>\r\n</div>\r\n</body>\r\n</html>\r\n"
}
This situation should not happen often, but unfortunately, it was occurring multiple times per day, and only ever on the second call to the trigger method. The only real solution is to follow Microsoft's advice and design your systems to be idempotent, because it can and occasionally will happen.
In addition to that, to reduce the frequency of the situation, hopefully Microsoft will release a change in the near future so that failed calls to the trigger method are automatically retried.