Biztalk 2004 Integration Experiences

Thursday, August 18, 2005

High Availability - Enterprise SSO Issue

I spent the last couple of weeks dealing with a lot of server-related issues and was too busy to blog as usual. Anyway, I wanted to write about a critical issue I ran into.

We have two BizTalk Servers load balanced in production in the same BizTalk Server group. One of the servers (say Server1) is the primary and hosts the Enterprise Single Sign-On (SSO) master secret server; the other (Server2) is the secondary. SQL Server runs on a separate machine.

This architecture was designed using the Microsoft High Availability white paper.

The SSO server was designed to be "available, not highly available". See the excerpt from the white paper below:

Available, but not highly available.

All the SSO servers have the master secret cached in memory, and run-time operations will
continue even if the master secret server fails. However, you will not be able to change the
configuration of ports or the SSO configuration. The BizTalk Server runtime will continue
working without problems, but you cannot make any design changes. You can create a
Microsoft Operations Manager (MOM) event to notify you when the master secret server
becomes unavailable, and you can then manually promote an SSO server to master secret
server and restore the master secret on this server.

Even if this configuration is not highly available, it can be satisfactory for most scenarios and
it is consistent with scaling out the receiving, sending, and processing hosts.

Unfortunately, production Server1 crashed last week. The secondary server (Server2) kept working fine, as the documentation promised. However, Server2 then had to be rebooted for a recent security patch install. Once it restarted, the SSO service did not start, and we got an "RPC Server is unavailable" error in the event log.

It took me some time to realize that the master secret server is needed whenever the SSO service on the secondary (Server2) is restarted. The secondary keeps running on the master secret it has cached in memory, but that cache does not survive a restart, so the service has to reach the master again when it starts. To avoid this, we have to promote the secondary (Server2) to master secret server. That kind of change is not easy to make in production, given the procedures involved in changing any configuration on production servers.

I think the white paper should have stressed the importance of not restarting the secondary server until the primary server is back up.
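A pre-reboot check along these lines might have saved us. This is only a rough sketch in Python, not anything from the white paper: it assumes the master secret server is called Server1 (as above) and treats a successful TCP connection to port 135, the Windows RPC endpoint mapper, as a stand-in for "the master is up". The function names are made up for illustration, and an open port does not prove the SSO service itself is healthy.

```python
import socket

# Port 135 is the Windows RPC endpoint mapper. Reaching it is only a rough
# proxy for "the master secret server can answer RPC calls" (assumption).
RPC_ENDPOINT_MAPPER_PORT = 135


def is_reachable(host, port=RPC_ENDPOINT_MAPPER_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def safe_to_restart_secondary(master_host, check=is_reachable):
    """The secondary should only be restarted while the master answers."""
    return check(master_host)


if __name__ == "__main__":
    master = "Server1"  # hypothetical host name, taken from the post above
    if safe_to_restart_secondary(master):
        print("Master secret server reachable: OK to restart the secondary.")
    else:
        print("Master secret server NOT reachable: do not restart the secondary.")
```

Run it on the secondary before any planned reboot. If it reports the master as unreachable, either bring the master back first or promote the secondary to master secret server before restarting.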

Anyway, it was a costly lesson to learn.

