Wednesday, April 25, 2018

He's Back.... and he has an Always On post

I had an AOAG issue yesterday, the 'fix' was to remove and re-add the DB's to the AG..... I tried simply removing and rejoining the 3rd replica but alas that failed.  My best guess is something in the metadata on the primary was hosed and not reading right.

Cause
Network outage due to changes made by another engineer caused a 'blip' in AOAG.  There were a slew of errors where the SQL Servers complained about not being able to talk to each other, and then the AGs were back online and syncing

Symptoms
All 4 AGs were online, with #4 having 12 of 15 databases syncing just fine, 3 would not.  There were no errors in the log when everything came back.

There were no errors in the log when I attempted steps 1 and 2 down below.

When I did step 4 I got the dreaded 35250 error.. BUT as I said 12 of 15 DB's were working in this AG, as well as 3 other AG's on the same primary and secondary replicas.

SO it wasn't the usual suspects of port not open, or permissions issues or anything of the like.

Things I tried
1) HADR Resume from secondary and primary
ALTER DATABASE database_name SET HADR RESUME

2) Suspend and Resume on Primary
ALTER DATABASE database_name SET HADR suspend

ALTER DATABASE database_name SET HADR RESUME

3)  Restarting Secondary Replica (and endpoints as well)
4) Removing and re-adding Replica
5) Removing DB from AG and re-adding DB to AG (which worked)

/*Check health of AOAG*/

SELECT
       ag.name
     , sb.name
     , ar.replica_server_name
     , ar.availability_mode_desc
     , drs.is_local
     , drs.is_primary_replica
     , drs.synchronization_state_desc
     , drs.synchronization_health_desc
FROM sys.dm_hadr_database_replica_states drs
     INNER JOIN sys.databases sb
          ON drs.database_id = sb.database_id
     INNER JOIN sys.availability_groups ag
          ON ag.group_id = drs.group_id
     INNER JOIN sys.availability_replicas ar
          ON ar.replica_id = drs.replica_id
ORDER BY
         ag.name
       , sb.name
       , ar.replica_server_name;

Labels: , , ,

0 Comments:

Post a Comment

<< Home