Problem:
You have a bunch of databases on a production server and you decided to go with SQL Server mirroring as a DR solution. Your IT dept, in its infinite wisdom, thought that it is a waste of resources to have a witness server and you were asked to somehow adjust with a Principal and a Mirror server. Your databases are critical and though the option of automated failovers is ruled out due to lack of witness, you would like to put process in place such that you are always battle ready to deal with a possible disaster and recover quickly. The solution should be designed in such a way that even the most junior of your DBAs should be able to handle it.
Solution:
For the sake of this explanation, lets us assume my production server is YIN and Mirror server is YANG. So what happens when a disaster occurs? Assuming that you already have mirroring setup between YIN and YANG and it is functional, you might have to do the following after a disaster.
You have a bunch of databases on a production server and you decided to go with SQL Server mirroring as a DR solution. Your IT dept, in its infinite wisdom, thought that it is a waste of resources to have a witness server and you were asked to somehow adjust with a Principal and a Mirror server. Your databases are critical and though the option of automated failovers is ruled out due to lack of witness, you would like to put process in place such that you are always battle ready to deal with a possible disaster and recover quickly. The solution should be designed in such a way that even the most junior of your DBAs should be able to handle it.
Solution:
For the sake of this explanation, lets us assume my production server is YIN and Mirror server is YANG. So what happens when a disaster occurs? Assuming that you already have mirroring setup between YIN and YANG and it is functional, you might have to do the following after a disaster.
- When YIN goes down, databases on YANG server will be in restoring/disconnected state. You'll bring all those databases online by performing a forced failover with allow data loss.
- When the failover is done, you might want to fix orphaned users
- You now make necessary changes to whichever mechanism your application uses to connect to the databases to point to the new server. In our case, we store database names and connection strings in a SQL that sits on a well maintained cluster (lets call this MOTHERSHIP), so we update those connection strings with IP address of YANG.
- We have lot of business specific SQL Agent jobs running on our production servers apart from regular maint jobs. These jobs exist both on YIN and YANG. The jobs are kept in disabled state on mirror server. Now that we had a disaster and we failover to the evil YANG, we might have to enable these jobs.
- Send a notification that failover of some sort occurred.
What Next?
Ok, you did all of that and we were back in business with minimal downtime. Our former Principal server was resurrected and is back online. YANG is currently the Principal and you might want to add YIN back to mirroring setup as the new mirror. After a couple of days, you are sure that YIN is now stable and it should be primary because it probably has better resources compared to YANG. You might have to do a soft (planned) failover. So following are various scenarios that are possible in a setup like this:
- Planned Failover from Yin to Yang. Stop jobs on Yin, enable jobs on Yang, fix users, update connection strings
- Planned Failover from Yang to Yin etc.,
- Forced Failover on Yin
- Forced Failover on Yang
- Resume Mirroring on Yin
- Resume Mirroring on Yang
Enough of theory. Lets get our hands dirty. Creating batch scripts for each of these scenarios appeared like the easiest simplest (if not elegant) solution to me. The actual logic that performs a failover can be a stored procedures and the batch scripts can be stored on a server which is highly available.
Create these stored procedures on both Principal and Mirror Servers.
Example procedure below. Click links above for other sps.
USE [master] GO SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE PROCEDURE [dbo].[usp_ForcedFailover] AS /* stored procedure for doing a forced failover of all database to YANG example usage: EXEC master.dbo.usp_MirrorFailover */ DECLARE @sql NVARCHAR(300) ,@maxid INT ,@i INT ,@dbname SYSNAME DECLARE @MirrorDBs TABLE ( [dbid] INT identity(1, 1) NOT NULL PRIMARY KEY ,dbname SYSNAME NOT NULL ) SET NOCOUNT ON INSERT INTO @MirrorDBs (dbname) SELECT DB_NAME(database_id) FROM sys.database_mirroring WHERE mirroring_role = 2 -- Mirror AND mirroring_state IN (0,1) -- Disconnected or Suspended AND mirroring_safety_level = 2 --select * from @MirrorDBs SELECT @maxid = max([dbid]) ,@i = 1 FROM @MirrorDBs; WHILE @i <= @maxid BEGIN SELECT @dbname = dbname FROM @MirrorDBs WHERE [dbid] = @i SET @sql = 'ALTER DATABASE ' + QUOTENAME(@dbname) + ' SET PARTNER FORCE_SERVICE_ALLOW_DATA_LOSS;' PRINT @sql EXEC (@sql); SET @i = @i + 1 END SET NOCOUNT OFF
Next, create scripts for enabling/disabling jobs:
declare @sql nvarchar(max) = ''; select @sql += N'exec msdb.dbo.sp_update_job @job_name = ''' + name + N''', @enabled = 1; ' from msdb.dbo.sysjobs where enabled = 0 order by name; print @sql; exec (@sql);
Create scripts for fixing users
DECLARE @sql NVARCHAR(300) ,@maxid INT ,@i INT ,@dbname SYSNAME DECLARE @Mirrors TABLE ( [dbid] INT identity(1, 1) NOT NULL PRIMARY KEY ,dbname SYSNAME NOT NULL ) SET NOCOUNT ON INSERT INTO @mirrors (dbname) SELECT DB_NAME(database_id) FROM sys.database_mirroring WHERE mirroring_role = 1 -- Principal AND mirroring_state = 4 -- Synchronized AND mirroring_safety_level = 2 --select * from @mirrors SELECT @maxid = max([dbid]) ,@i = 1 FROM @mirrors; WHILE @i <= @maxid BEGIN SELECT @dbname = dbname FROM @mirrors WHERE [dbid] = @i SET @sql = 'USE ' + QUOTENAME(@dbname) + ' EXEC sp_change_users_login ''Auto_Fix'', ''myUser1''; EXEC sp_change_users_login ''Auto_Fix'', ''myUser2''; ' PRINT @sql EXEC (@sql); SET @i = @i + 1 ENDUpdate connection strings on MOTHERSHIP server
USE ConnectionDB GO UPDATE COMPANY SET ConnectionString='Data Source=10.10.2.10;User ID=fancyuser;Password=fancypass;Max Pool Size=200' WHERE ConnectionString='Data Source=10.10.2.11;User ID=fancyuser;Password=fancypass;Max Pool Size=200';Finally, the batch file that brings all these together:
@echo off goto :startbatch Description ============================================================= This batch file is used for 1. Performing a soft failover of YIN to YANG. YANG will become the primary 2. Fix Database users 3. Update connection strings in connectiondb database post failover 4. Disable jobs on YIN 5. Enable jobs on YANG 6. Send a notification email =============================================================== :startbatch set source_server=10.10.2.10 set mothership_server=10.10.2.23 set mirror_server=10.10.2.11 set serverlogin=fancyuser set serverpass=fancypass set connectiondb=connectiondb set home_dir=d:\mirrorme\Failover_YIN_TO_YANG set failover_script=%home_dir%\failover.sql set failover_out=%home_dir%\failover_out.txt set fixusers_script=%home_dir%\fixusers.sql set fixusers_out=%home_dir%\fixusers_out.txt set update_connectiondb=%home_dir%\update_connectiondb.sql set update_connectiondb_out=%home_dir%\update_connectiondb_out.txt set disablejobs_script=%home_dir%\disablejobs.sql set disablejobs_out=%home_dir%\disablejobs_out.txt set enablejobs_script=%home_dir%\enablejobs.sql set enablejobs_out=%home_dir%\enablejobs_out.txt REM Failover from YIN to YANG sqlcmd -S tcp:%source_server%,1433 -d master -U %serverlogin% -P %serverpass% -s "," -W -h -1 -i %failover_script% -o %failover_out% REM Run fix users script on YANG to deal with orphan users sqlcmd -S tcp:%mirror_server%,1433 -d master -U %serverlogin% -P %serverpass% -s "," -W -h -1 -i %fixusers_script% -o %fixusers_out% REM update connectiondb database with new connection string sqlcmd -S tcp:%connectiondb_server%,1433 -d %connectiondb% -U %serverlogin% -P %serverpass% -s "," -W -h -1 -i %update_connectiondb% -o %update_connectiondb_out% REM disable jobs on YIN because we failedover to YANG sqlcmd -S tcp:%source_server%,1433 -d master -U %serverlogin% -P %serverpass% -s "," -W -h -1 -i %disablejobs_script% -o %disablejobs_out% REM Enable jobs on YANG since it is the new primary sqlcmd -S tcp:%mirror_server%,1433 -d master -U %serverlogin% -P %serverpass% -s "," -W -h -1 -i %enablejobs_script% -o %enablejobs_out% REM send emails on completion C:\work\tool\sendemail -t allcoolsdudes@Nirvanainc.com -cc -u "Mirroring failover of YIN to YANG" -m "Database Mirroring Failover of YIN to YANG has been triggered and completed. SQL Agent jobs on YIN were disabled and jobs on YANG were enabled. Check logs attached. " -s 192.24.2.110:25 -f "LordOfServers@NirvanaInc.com" -a %failover_out% -a %fixusers_out% -a %update_connectiondb_out% -a %disablejobs_out% -a %enablejobs_out%
To do:
Update this post with scripts for all other possible failover scenarios:
No comments:
Post a Comment