
ConfigMgr: Database Replication – Link Failed between CAS and Primary Site

Recently I was troubleshooting a case where the Database Replication link between the CAS and one of the Primary Sites was in a "Link Failed" state.

SiteHierarchy_Failed2.png

Case Summary

Problem: The main issue was with the Data Replication Service (DRS). The Primary Site’s database server was found to be running low on storage. However, after the database storage was increased, the link between the CAS and the Primary Site remained broken.

Findings and Resolution: A drive issue on the Primary Site’s database site system caused replication to break.

Actions taken upon investigation:

  • Restarted the SMS Executive and SMS Site Component Manager services on the Primary Site (PS3) server
  • Restarted the SQL services on the CAS and on the affected Primary Site’s database server
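The restart sequence above can be sketched as a dry run in Python that only builds the `sc.exe` command lines and executes nothing. The server names, the stop/start ordering, and the default SQL instance service name `MSSQLSERVER` are assumptions for illustration; the post does not state which tool or order was used.

```python
def restart_commands(server, services):
    """Build sc.exe stop/start command lines for one server (dry run only)."""
    target = "\\\\" + server  # sc.exe takes a UNC-style \\ServerName argument
    stops = [["sc", target, "stop", svc] for svc in services]
    # Assumption: start the services again in the reverse of the stop order.
    starts = [["sc", target, "start", svc] for svc in reversed(services)]
    return stops + starts


def build_plan(ps_server, ps_sql_server, cas_sql_server, sql_service="MSSQLSERVER"):
    """Order follows the post: site services on PS3 first, then SQL on CAS and PS3."""
    plan = restart_commands(
        ps_server, ["SMS_SITE_COMPONENT_MANAGER", "SMS_EXECUTIVE"]
    )
    plan += restart_commands(cas_sql_server, [sql_service])
    plan += restart_commands(ps_sql_server, [sql_service])
    return plan


if __name__ == "__main__":
    # Hypothetical host names; replace with the real site and database servers.
    for cmd in build_plan("PS3-SITE", "PS3-SQL", "CAS-SQL"):
        print(" ".join(cmd))
```

Note that `sc stop` returns before the service has fully stopped and does not stop dependent services, so a real script would need to wait between the stop and start steps.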

Back Story

Symptoms: Applications deployed to Windows 10 machines were reported as not appearing in Software Center, and only machines under the Primary Site (PS3) were affected.

During investigation: We confirmed that package content was not being distributed to the Distribution Points (DPs) under the Primary Site (PS3).

Checking the Database Replication: Link state = Link Failed

DB_Rep001

Checking System Status > Site Status: Every status was OK, but free space on the Primary Site’s database was running low.

DB_Rep002

Resolution 1: Worked with the server and database teams to increase the storage for the database.

Problem solved?: No, the link state was still Link Failed.

Further investigation and actions: The CAS was confirmed active and the Primary Site was in replication maintenance.

The following query was run against the Primary Site’s database:

select * from RCM_DrsInitializationTracking where InitializationStatus not in (6,7)

It did not return any output.

We then ran another query:

select * from RCM_ReplicationLinkStatus where SnapshotApplied <>1

It returned the following:

QueryResult01
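For repeat checks, the two queries above could be wrapped in a small Python helper. This is a minimal sketch, not the method used in this case: it accepts any DB-API connection (against the real site database that would typically be something like pyodbc, which is an assumption), and the demo uses an in-memory SQLite stand-in. In the stand-in tables, every column other than InitializationStatus and SnapshotApplied is hypothetical.

```python
import sqlite3

# The two checks from the post, unchanged apart from whitespace.
INIT_NOT_FINISHED = (
    "select * from RCM_DrsInitializationTracking "
    "where InitializationStatus not in (6,7)"
)
SNAPSHOT_NOT_APPLIED = (
    "select * from RCM_ReplicationLinkStatus where SnapshotApplied <> 1"
)


def run_drs_checks(conn):
    """Run both checks on a DB-API connection and return the rows as dicts."""
    results = {}
    checks = (("init_pending", INIT_NOT_FINISHED),
              ("snapshot_pending", SNAPSHOT_NOT_APPLIED))
    for name, sql in checks:
        cur = conn.cursor()
        cur.execute(sql)
        cols = [d[0] for d in (cur.description or [])]
        results[name] = [dict(zip(cols, row)) for row in cur.fetchall()]
    return results


def demo_connection():
    """In-memory stand-in for the site database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table RCM_DrsInitializationTracking "
                 "(ReplicationGroup text, InitializationStatus integer)")
    conn.execute("create table RCM_ReplicationLinkStatus "
                 "(ReplicationGroup text, SnapshotApplied integer)")
    conn.executemany("insert into RCM_DrsInitializationTracking values (?, ?)",
                     [("GroupA", 6), ("GroupB", 7)])
    conn.executemany("insert into RCM_ReplicationLinkStatus values (?, ?)",
                     [("GroupA", 1), ("GroupB", 0)])
    return conn


if __name__ == "__main__":
    for check, rows in run_drs_checks(demo_connection()).items():
        print(check, rows)
```

The demo data mirrors the pattern seen here: the first check returns nothing, while the second still lists a replication group whose snapshot has not been applied.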

In the SCCM console, the link state was Link Initializing.

DB_Rep003

The next step was to restart the SMS Executive and SMS Site Component Manager services on the Primary Site (PS3) server. After the service restart, the vLogs entries from the PS3 database matched the status shown in the console for the two replication groups below:

QueryResult02

Requests got stuck.

QueryResult03.png

We did the same on the CAS database server. There were no cab files or SQL table dumps for the Init request in rcm.box.

The final step was to restart the SQL services on the CAS and PS3 database servers. After the restart, it took some time for the PS3 link to become Active. A query then returned the current site status: ReplicationActive SMS_REPLICATION_CONFIGURATION_MONITOR
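Because the link took a while to recover after the SQL restart, a polling loop can save repeated manual checks. The sketch below is an illustration only: `get_status` is a hypothetical callable that returns the current status string (the post does not show the query that was used), and the timeout and interval values are arbitrary defaults.

```python
import time


def wait_for_status(get_status, wanted="ReplicationActive",
                    timeout_s=1800, interval_s=60,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns `wanted` or the timeout expires.

    Returns True if the wanted status was seen, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if get_status() == wanted:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated status sequence standing in for a real status query.
    statuses = iter(["Link Failed", "Link Initializing", "ReplicationActive"])
    ok = wait_for_status(lambda: next(statuses), interval_s=0)
    print("link active" if ok else "timed out")
```

Passing `sleep` and `clock` as parameters keeps the loop easy to exercise without waiting in real time.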

xxx End of troubleshooting xxx

Hope this post helps someone who’s encountering the same problem.

Have a nice day!

Reference: ConfigMgr 2012 Data Replication Service (DRS) Unleashed
