Monday, June 25, 2012

SharePoint: High Availability

Recently, I was asked to develop a plan to make a SharePoint 2010 farm highly available.
This short post outlines the main phases the farm needs to go through in order to become highly available.

In my case, we have 2 datacenters: one is the Primary and the other is the Secondary. If the Primary goes down, we should provide a seamless switch to the Secondary without loss of functionality and, preferably, without loss of performance.


Personally, I prefer to use the extra resources for a performance boost when I have them, even if it means that end users will notice some performance degradation in case of failover.

Here is a schema to implement for High Availability:




Here are key notes:

1. Web Front Ends (WFEs) - When the Primary is up, all 4 WFEs (primary and secondary) serve requests. If the Primary is out, the Secondary WFEs get all requests.
Here I see that failover may impact performance, since in the usual scenario end users use all 4 WFEs (load-balanced via ISA).
One option to keep WFE performance steady is to keep only 2 WFEs in service while the Primary is up.
The seamless switch will be provided via ISA.
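
The performance trade-off in note 1 can be illustrated with simple arithmetic. This is only a sketch: the request rate of 400 per second is a made-up figure, the function name is hypothetical, and it assumes the load balancer spreads requests evenly across the active WFEs.

```python
def load_per_wfe(total_requests_per_sec: float, active_wfes: int) -> float:
    """Average load on each WFE, assuming requests are spread evenly."""
    if active_wfes <= 0:
        raise ValueError("at least one WFE must be active")
    return total_requests_per_sec / active_wfes


# Hypothetical traffic figure, for illustration only.
TOTAL_RPS = 400.0

normal = load_per_wfe(TOTAL_RPS, 4)    # Primary up: all 4 WFEs serve requests
failover = load_per_wfe(TOTAL_RPS, 2)  # Primary down: only the 2 Secondary WFEs

print(normal, failover)  # 100.0 200.0
```

With all 4 WFEs active, each surviving WFE sees roughly double the load after failover. Running on 2 WFEs all the time keeps the per-server load the same before and after the switch, at the cost of not using the spare capacity day to day.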

2. SharePoint application servers - both of them are engaged when the Primary is up. In case of failover, the second application server should have the same services running as the one in the Primary, to maintain the same functionality as before.
The weakest point is timer jobs. I can imagine that in some scenarios workflows that were being served by the primary server at the time of failover will never resume.
One more note on this: plan your search architecture: Search Service Application: Architecture in one page


3. SSRS servers - when the Primary is active, engage both. In case of failover, the second SSRS server will get all requests. The seamless switch will be provided via ISA.


4. Enterprise SharePoint solutions often interact with external systems via BDC (Trying to figure out what's the difference between 2007 BDC and 2010 BCS?)
We need to plan how to provide access to these external systems in case of failover.
That means extensive communication with the teams who support those systems.
In my example, we route all external system calls through web services developed on BizTalk Server. From our side, we need to configure ISA so that an additional BizTalk server is available in the Secondary datacenter.


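5. On the SQL side, we are implementing asynchronous mirroring. If the Primary goes down, we are ready to switch to the Secondary. Note that asynchronous mirroring can lose transactions that had not yet reached the mirror when the Primary failed; only synchronous mirroring guarantees no data loss.
I prefer to place the witness in the Secondary datacenter, on the assumption that we use the Secondary when the Primary is down, not vice versa.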

If the report database servers in the Primary fail, extra work is needed on the front-end side. Report connection files have connection information inside them. We need to make sure that all connection files use an alias name instead of the actual server name. In case of failover, we then just modify the alias on the Report server side. Keep in mind that SSAS connection files won't work with a SQL alias; for those, we can redirect via the hosts file instead.
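
A check like "every connection file uses an alias" is easy to automate. The sketch below is only an illustration: it works on connection strings you have already extracted from the connection files, it assumes they use the usual `Data Source=...` form, and the function names, alias name, and server names are all hypothetical.

```python
import re

# Matches the "Data Source" key of a connection string, case-insensitively.
_DATA_SOURCE = re.compile(r"(?:^|;)\s*data source\s*=\s*([^;]+)", re.IGNORECASE)


def data_source_of(connection_string):
    """Return the Data Source value of a connection string, or None if absent."""
    match = _DATA_SOURCE.search(connection_string)
    return match.group(1).strip() if match else None


def uses_alias(connection_string, aliases):
    """True if the connection string points at one of the approved alias names."""
    source = data_source_of(connection_string)
    allowed = {alias.lower() for alias in aliases}
    return source is not None and source.lower() in allowed


# Hypothetical alias and server names, for illustration only.
approved = ["REPORTSQL"]
good = "Data Source=REPORTSQL;Initial Catalog=Sales"
bad = "Data Source=SQLPROD01;Initial Catalog=Sales"

print(uses_alias(good, approved))  # True
print(uses_alias(bad, approved))   # False
```

Any connection string that fails the check still points at a physical server and would have to be edited by hand during a failover.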

We need to set up failover settings for the SharePoint databases.
Refer to this post for what can be mirrored in SharePoint 2010:
The 2010 SharePoint databases, purposes and mirroring supportability
You will find that some of the SharePoint databases do not require failover and are not designed for it, since they are not critical and are easy to re-create.
Based on your required recovery time, decide for yourself what needs to be configured with failover and what can be omitted. If a SharePoint database is not configured with failover, plan ahead what actions you will need to perform to bring the database back in case of failure, and what the impact will be if the database is not ready right away. As an example, most likely the StateService will not be in high demand right after switching to the reserve (Secondary) datacenter.
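
One way to make this decision explicit is to compare, for each database, how long it takes to bring it back against how long you can live without it. The following is a minimal sketch of that reasoning, not a sizing tool: the class, the function, the database names other than StateService, and every number in it are hypothetical placeholders that you would replace with your own estimates.

```python
from dataclasses import dataclass


@dataclass
class Database:
    name: str
    rebuild_minutes: int             # estimated time to re-create or restore it
    tolerable_downtime_minutes: int  # how long the farm can go without it


def needs_failover(db: Database) -> bool:
    """Configure failover when rebuilding would take longer than we can wait."""
    return db.rebuild_minutes > db.tolerable_downtime_minutes


# Hypothetical estimates, for illustration only.
databases = [
    Database("ContentDB", rebuild_minutes=240, tolerable_downtime_minutes=15),
    Database("StateService", rebuild_minutes=20, tolerable_downtime_minutes=120),
]

for db in databases:
    decision = "configure failover" if needs_failover(db) else "re-create after the switch"
    print(f"{db.name}: {decision}")
```

For every database that ends up in the "re-create after the switch" group, the rebuild steps still need to be written down in advance.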


Here is an outline of a plan for introducing HA into a SharePoint 2010 farm gradually:


1. Decide how many additional servers are needed, and their configuration. 
2. WFE. Test first without including it in ISA. Then include it in the prod farm.
3. App server. Build the server cautiously. I believe that once the app server is joined, it will be used by the prod WFEs. No ISA configuration is needed. All requests are routed by the SharePoint farm configuration directly.
4. Work on external systems HA
5. Work on SQL servers, failover settings on SharePoint side 
6. SSRS servers

Wishing you a happy HA.