Monday, January 8, 2018

Exchange Unified Messaging - Does Anybody Really Know What Time it is? (Your Edge Servers Really Should...)

Greetings and Happy New Year! I hope everyone's 2018 is off to a great start.

I received an interesting problem recently that I wanted to share. Below is an email from a client (I'm paraphrasing):

"Josh,

Exchange Unified Messaging (UM) is no longer working internally. People can call in from the outside, reach UM, and leave messages just fine. However, our own internal users cannot call eachother and leave a voicemial. Calls internally either go to dead air or receive a busy signal. 


Is this something you can help us out with?"

Of course I can help! 😃

The client has a hybrid environment with their S4B severs on premise and Exchange environment in the cloud. Since UM must reside in the same place as the Exchange mailboxes, their UM environment resides in the cloud as well. The customer connects to their O365 tenant via Edge federation (this will be important later 😉).

I decided to take a trace via CLS logger on the FE server of a failed internal VM call and noticed the following error:


SIP 429 Provide Referrer Identity

ms-diagnostics: 1020;reason="Identity of the referrer could not be verified with the ms-identity parameter";ErrorType="Invalid signature";Referrer="user1@contoso.com";HRESULT="0xC3E93EE0(SIP_E_CRYPT_REFERRER_DATE_SKEWED)";cause="Invalid signature";signer="skypefe2.internal.contoso.com";source="sip.contoso.com"
ms-edge-proxy-message-trust: ms-source-type=EdgeProxyGenerated;ms-ep-fqdn=skypeedgepool1.internal.contoso.com;ms-source-verified-user=verified
$$end_record

 I've bolded the part of the error above that helped me fix this issue. I only half understood what the error was trying to tell me - the date on the Edge pool is somehow 'skewed' and incorrect. I hadn't seen this error before, so some further research was needed.

A quick Google search pointed me to this article mentioning the exact same error the client was having. I logged onto one of the two Edge servers in the pool and noticed the same error mentioned in the article linked above:





The clock on the second Edge Server in the pool was six hours behind the current time. The customer had rebooted to install updates approximately two weeks before the error occurred and the clock never re-synced. Judging from the error in the event logs above, my guess as to why UM was not working is that for 'crytographic verification' to occur (i.e. TLS traffic via the Edge Server to O365) the time of the request must match between the two systems.

After adjusting the clock manually on the offending Edge Server, I had the client re-test and everything was working again. Yay!

Before I sign off, I must pay homage to the band Chicago. They were the true 'inspiration' for the title of this post 😉. 


Heck yes that is a keytar