Monday, October 20, 2014

Network problems between MongoDB nodes

PROBLEM:
MongoDB replica sets provide high availability through replication and automated failover. We have a replica set of three nodes: the data-bearing members "mentos-a" and "mentos-b", plus an arbiter ("mentos-c"). The problem is that every X seconds the PRIMARY steps down and the cluster fails over to the other node.

SOLUTION:
A replica set member detects a downed node by the loss of heartbeats and heartbeat responses. Members send heartbeats to each other every two seconds. A heartbeat request times out after 10 seconds, and if no heartbeat has been received from that member within the past two seconds either, the member is marked as down. This is why it commonly takes about 10 seconds before an election starts.
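To see which members the current node considers down, the "health" field in the output of rs.status() can be inspected. The following is only a minimal sketch: downMembers is a hypothetical helper name, and it assumes the usual shape of the status document (a "members" array whose entries carry "name" and "health", with 0 meaning unreachable).

```javascript
// Hypothetical helper: list the members that this node currently sees as down.
// Assumes a status document shaped like the output of rs.status().
function downMembers(status) {
  return status.members
    .filter(function (m) { return m.health === 0; }) // 0 = no heartbeat response
    .map(function (m) { return m.name; });
}

// In the mongo shell this would be called as:
//   downMembers(rs.status())
```

Running it repeatedly while the failovers occur should show whether the same member keeps dropping out.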

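We can change the number of seconds that replica set members wait for a successful heartbeat from each other. If a member does not respond within that time, the other members mark the delinquent member as inaccessible. When the network between the nodes is unreliable, a longer timeout keeps a slow heartbeat from triggering a failover.
In the following example we raise the heartbeat timeout (heartbeatTimeoutSecs) from its default of 10 seconds to 30 seconds. The heartbeat interval itself stays at 2 seconds.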

rs0:PRIMARY> cfg = rs.conf();
{
 "_id" : "rs0",
 "version" : 2,
 "members" : [
  {
   "_id" : 0,
   "host" : "mentos-a:27017"
  },
  {
   "_id" : 1,
   "host" : "mentos-b:27017"
  },
  {
   "_id" : 2,
   "host" : "mentos-c:27017",
   "arbiterOnly" : true
  }
 ]
}
rs0:PRIMARY> cfg["settings"] = { heartbeatTimeoutSecs : 30 }
{ "heartbeatTimeoutSecs" : 30 }
rs0:PRIMARY> rs.reconfig(cfg);
{ "down" : [ "mentos-a:27017" ], "ok" : 1 }
rs0:PRIMARY> rs.conf()
{
 "_id" : "rs0",
 "version" : 3,
 "members" : [
  {
   "_id" : 0,
   "host" : "mentos-a:27017"
  },
  {
   "_id" : 1,
   "host" : "mentos-b:27017"
  },
  {
   "_id" : 2,
   "host" : "mentos-c:27017",
   "arbiterOnly" : true
  }
 ],
 "settings" : {
  "heartbeatTimeoutSecs" : 30
 }
}
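
Note that assigning a new object to cfg["settings"] replaces any settings that were already present in the configuration. As a rough sketch (withHeartbeatTimeout is a hypothetical helper name, not part of the mongo shell), the timeout can be set while keeping the other settings intact:

```javascript
// Hypothetical helper: return a copy of a replica set config with
// heartbeatTimeoutSecs set, preserving any other existing settings.
function withHeartbeatTimeout(cfg, seconds) {
  if (typeof seconds !== "number" || seconds <= 0 || Math.floor(seconds) !== seconds) {
    throw new Error("heartbeat timeout must be a positive whole number of seconds");
  }
  var out = {};
  var key;
  for (key in cfg) {
    if (Object.prototype.hasOwnProperty.call(cfg, key)) out[key] = cfg[key];
  }
  var settings = {};
  for (key in cfg.settings) { // for-in over undefined is a no-op
    if (Object.prototype.hasOwnProperty.call(cfg.settings, key)) settings[key] = cfg.settings[key];
  }
  settings.heartbeatTimeoutSecs = seconds;
  out.settings = settings;
  return out;
}

// In the mongo shell, on the PRIMARY:
//   rs.reconfig(withHeartbeatTimeout(rs.conf(), 30));
```

The helper copies the configuration instead of modifying it in place, so the original object returned by rs.conf() stays available for comparison.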

