Pacemaker/DRBD: Auto-failback kills active DRBD Sync Primary to Secondary. How to prevent this?
Posted by Ushan_Destiny@reddit | linuxadmin | 11 comments
Hi everyone,
I am testing a 2-node Pacemaker/Corosync + DRBD cluster (Active/Passive). Node 1 is Primary; Node 2 is Secondary.
The resources have a location constraint that prefers node1 with a score of 50.
**The Scenario:**
1. I simulated a failure on Node 1. Resources successfully failed over to Node 2.
2. While running on Node 2, I started a large file transfer (SCP) to the DRBD mount point.
3. While the transfer was running, I brought Node 1 back online.
4. Pacemaker immediately moved the resources back to Node 1.
**The Result:** The SCP transfer on Node 2 was killed instantly, resulting in a partial/corrupted file on the disk.
**My Question:** I assumed Pacemaker or DRBD would wait for active write operations or the data sync to complete before switching back. Instead, it seems to have simply killed the processes on Node 2 to satisfy the location constraint for Node 1.
1. Is this expected behavior? (Does Pacemaker not care about active user sessions/jobs?)
2. How do I configure the cluster to stay on Node 2 until the sync is complete? My requirement is to always keep Node 1 as the master.
3. Is there a risk of filesystem corruption from doing this, or only interrupted transactions?
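To make "until the sync is complete" in question 2 concrete, here is a minimal sketch of the check I have in mind. It assumes DRBD 9-style `drbdadm status` text, where the local state appears as `disk:<state>` and each peer as `peer-disk:<state>`. The resource name `r0`, the sample outputs, and the function name `sync_complete` are made up for illustration; this is not something Pacemaker or DRBD runs for you.

```python
import re

# Hypothetical sample outputs, modelled on DRBD 9-style `drbdadm status` text.
SAMPLE_SYNCING = """r0 role:Primary
  disk:UpToDate
  node1 role:Secondary
    replication:SyncSource peer-disk:Inconsistent done:42.10
"""

SAMPLE_DONE = """r0 role:Primary
  disk:UpToDate
  node1 role:Secondary
    peer-disk:UpToDate
"""


def sync_complete(status_text: str) -> bool:
    """Return True only if every disk/peer-disk state in the text is UpToDate."""
    # "disk:(\w+)" matches both "disk:UpToDate" and "peer-disk:Inconsistent".
    states = re.findall(r"disk:(\w+)", status_text)
    # No disk states at all means we cannot claim the sync is complete.
    return bool(states) and all(state == "UpToDate" for state in states)


if __name__ == "__main__":
    print(sync_complete(SAMPLE_SYNCING))
    print(sync_complete(SAMPLE_DONE))
```

In other words, I would only want resources to move back to Node 1 once a check like this reports that both sides are `UpToDate`.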
**My Config:**
* stonith-enabled=false (I know this is bad, just testing for now)
* default-resource-stickiness=0
* Location Constraint: Resource prefers node1=50
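As far as I understand it, the settings above interact roughly like the sketch below. This is a simplified model, not Pacemaker's actual placement code: each online node's score is its location preference, plus the stickiness value if the resource is already running there, and the highest score wins. The function name `choose_node` and the tie rule (a tie stays on the current node) are assumptions of this sketch.

```python
from typing import Dict, Optional, Set


def choose_node(current: str,
                location_scores: Dict[str, int],
                stickiness: int,
                online: Set[str]) -> Optional[str]:
    """Pick the node with the highest score in this simplified model."""
    best: Optional[str] = None
    best_score: Optional[int] = None
    for node in sorted(online):
        # Location preference for this node (0 if no constraint names it).
        score = location_scores.get(node, 0)
        # Stickiness only counts for the node currently running the resource.
        if node == current:
            score += stickiness
        # Assumption of this sketch: on a tie, stay on the current node.
        if (best_score is None or score > best_score
                or (score == best_score and node == current)):
            best, best_score = node, score
    return best


if __name__ == "__main__":
    both = {"node1", "node2"}
    prefs = {"node1": 50}
    # My config: stickiness 0, so node1's preference of 50 wins immediately.
    print(choose_node("node2", prefs, 0, both))
    # A stickiness above 50 would outweigh the preference in this model.
    print(choose_node("node2", prefs, 100, both))
```

With stickiness 0, nothing counts in favor of staying on Node 2, so the preference of 50 for node1 wins as soon as Node 1 is back online, which matches the immediate move back in step 4.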
Thanks for the help!
*(used Gemini to enhance the grammar and readability)*
11 Comments
vdvelde_t@reddit
aieronpeters@reddit
srekkas@reddit
DerBootsMann@reddit
srekkas@reddit
aieronpeters@reddit
posixUncompliant@reddit
Fighter_M@reddit
posixUncompliant@reddit
LinuxLeafFan@reddit
aioeu@reddit