Percona Monitoring for Cacti (for MongoDB) doesn't show slave lag info

Bug #1379706 reported by Nilnandan Joshi
This bug affects 1 person
Affects: Percona Monitoring Plugins
Status: Triaged
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

The "Percona MongoDB Slave Lag" graph in Cacti is populated by the following code, which parses the output of "db.serverStatus()". However, the "db.serverStatus()" output does not contain a "lagSeconds" metric, so the graph never shows any lag data. How can this be fixed? Thanks.

if (preg_match('/"lagSeconds" : ([0-9]+)/', $output, $matches) == 0) {
    $result["MONGODB_slave_lag"] = -1;
} else {
    $result["MONGODB_slave_lag"] = $matches[1];
}
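
To make the failure mode concrete, here is a minimal sketch of the same check in JavaScript (the mongo shell's language). The function name `parseSlaveLag` and both sample strings are assumptions for illustration only: the first string is a shortened excerpt of the "repl" section from this report, the second is a made-up line containing the field the plugin expects.

```javascript
// Mirrors the plugin's check: look for a "lagSeconds" field in the raw
// command output and fall back to -1 when it is absent.
function parseSlaveLag(output) {
  const match = /"lagSeconds" : ([0-9]+)/.exec(output);
  return match === null ? -1 : Number(match[1]);
}

// Shortened excerpt of the "repl" section from this report (no lagSeconds).
const reportedOutput =
  '"repl" : { "setName" : "replPROD", "ismaster" : false, "secondary" : true }';

// Hypothetical output that does contain the field the plugin looks for.
const hypotheticalOutput = '"lagSeconds" : 12';

console.log(parseSlaveLag(reportedOutput));     // -1
console.log(parseSlaveLag(hypotheticalOutput)); // 12
```

With output like the one in this report the pattern never matches, so the value stays at -1 on every poll.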

The MongoDB db.serverStatus() output is shown below.

admin> db.serverStatus()
{
"host" : "sjprmongodb02.ehealthinsurance.com",
"version" : "2.4.6",
"process" : "mongod",
"pid" : 9803,
"uptime" : 2396648,
"uptimeMillis" : NumberLong("2396648412"),
"uptimeEstimate" : 2372137,
"localTime" : ISODate("2014-10-10T02:21:15.616Z"),
"asserts" : {
"regular" : 0,
"warning" : 0,
"msg" : 0,
"user" : 915,
"rollovers" : 0
},
"backgroundFlushing" : {
"flushes" : 39944,
"total_ms" : 337394,
"average_ms" : 8.446675345483676,
"last_ms" : 4,
"last_finished" : ISODate("2014-10-10T02:21:11.101Z")
},
"connections" : {
"current" : 152,
"available" : 16232,
"totalCreated" : NumberLong(316571)
},
"cursors" : {
"totalOpen" : 0,
"clientCursors_size" : 0,
"timedOut" : 11
},
"dur" : {
"commits" : 30,
"journaledMB" : 0,
"writeToDataFilesMB" : 0,
"compression" : 0,
"commitsInWriteLock" : 0,
"earlyCommits" : 0,
"timeMs" : {
"dt" : 3070,
"prepLogBuffer" : 0,
"writeToJournal" : 0,
"writeToDataFiles" : 0,
"remapPrivateView" : 0
}
},
"extra_info" : {
"note" : "fields vary by platform",
"heap_usage_bytes" : 165209192,
"page_faults" : 0
},
"globalLock" : {
"totalTime" : NumberLong("2396648412000"),
"lockTime" : NumberLong("4454160971"),
"currentQueue" : {
"total" : 0,
"readers" : 0,
"writers" : 0
},
"activeClients" : {
"total" : 0,
"readers" : 0,
"writers" : 0
}
},
"indexCounters" : {
"accesses" : 24560534,
"hits" : 24561206,
"misses" : 0,
"resets" : 0,
"missRatio" : 0
},
"locks" : {
......
},
"network" : {
"bytesIn" : 1665717529,
"bytesOut" : 7183342824,
"numRequests" : 24601510
},
"opcounters" : {
"insert" : 1,
"query" : 394523,
"update" : 380,
"delete" : 0,
"getmore" : 6680,
"command" : 24419737
},
"opcountersRepl" : {
"insert" : 57766,
"query" : 0,
"update" : 35296,
"delete" : 905135,
"getmore" : 0,
"command" : 5
},
"recordStats" : {
"accessesNotInMemory" : 0,
"pageFaultExceptionsThrown" : 0
......
},
"repl" : {
"setName" : "replPROD",
"ismaster" : false,
"secondary" : true,
"hosts" : [
"sjprmongodb02.ehealthinsurance.com:27017",
"sjprmongodb01.ehealthinsurance.com:27017"
],
"primary" : "sjprmongodb01.ehealthinsurance.com:27017",
"me" : "sjprmongodb02.ehealthinsurance.com:27017"
},
"writeBacksQueued" : false,
"mem" : {
"bits" : 64,
"resident" : 777,
"virtual" : 7630,
"supported" : true,
"mapped" : 3263,
"mappedWithJournal" : 6526
},
"metrics" : {
"document" : {
"deleted" : NumberLong(0),
"inserted" : NumberLong(1),
"returned" : NumberLong(236354),
"updated" : NumberLong(380)
},
"getLastError" : {
"wtime" : {
"num" : 0,
"totalMillis" : 0
},
"wtimeouts" : NumberLong(0)
},
"operation" : {
"fastmod" : NumberLong(380),
"idhack" : NumberLong(29),
"scanAndOrder" : NumberLong(53)
},
"queryExecutor" : {
"scanned" : NumberLong(1265192)
},
"record" : {
"moves" : NumberLong(349)
},
"repl" : {
"apply" : {
"batches" : {
"num" : 129023,
"totalMillis" : 55910
},
"ops" : NumberLong(998202)
},
"buffer" : {
"count" : NumberLong(0),
"maxSizeBytes" : 268435456,
"sizeBytes" : NumberLong(0)
},
"network" : {
"bytes" : NumberLong(139342727),
"getmores" : {
"num" : 736325,
"totalMillis" : 2396186780
},
"ops" : NumberLong(998512),
"readersCreated" : NumberLong(31421)
},
"oplog" : {
"insert" : {
"num" : 998512,
"totalMillis" : 37366
},
"insertBytes" : NumberLong(112851278)
},
"preload" : {
"docs" : {
"num" : 35296,
"totalMillis" : 1
},
"indexes" : {
"num" : 3864628,
"totalMillis" : 671
}
}
},
"ttl" : {
"deletedDocuments" : NumberLong(0),
"passes" : NumberLong(39866)
}
},
"ok" : 1
}

Percona MongoDB Monitoring Template for Cacti
http://www.percona.com/doc/percona-monitoring-plugins/1.1/cacti/mongodb-templates.html

Tags: mongodb i46675
Nilnandan Joshi (nilnandan-joshi) wrote :
summary: - Percona Monitoring Tool for MongoDB doesn't show slave lag info
+ Percona Monitoring for Cacti (for MongoDB) doesn't show slave lag info
Changed in percona-monitoring-plugins:
status: New → Confirmed
tags: added: i46675
Changed in percona-monitoring-plugins:
importance: Undecided → Medium
Roman Vynar (roman-vynar) wrote :

Hi Nil,

I have reviewed the code and did not find the script running "db.serverStatus()". It runs "db._adminCommand({serverStatus:1, repl:2})"; see http://bazaar.launchpad.net/~percona-toolkit-dev/percona-monitoring-plugins/1.1/view/head:/cacti/scripts/ss_get_by_ssh.php#L1381
Can you please verify the output of that command?

Nilnandan Joshi (nilnandan-joshi) wrote :

Hi Roman,

Yes, that's true, but there is no "lagSeconds" metric in the "db._adminCommand({serverStatus:1, repl:2})" output either. Can you please check and tell us how we can get the "Percona MongoDB Slave Lag" info/graph data in Cacti?

Roman Vynar (roman-vynar) wrote :

It looks like the current command for getting the lag is outdated; it may have worked for MongoDB 1.x.

Getting the replication lag now turns out to require a more complex solution. How to measure the lag is described here: http://blog.mongolab.com/2013/03/replication-lag-the-facts-of-life/#How_do_I_measure_lag. Moreover, the lag has a different meaning; it is not like Seconds_behind_master in MySQL.
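
One common way to measure the lag is to compare each secondary's last applied operation time with the primary's, using a replica set status document. The sketch below is a hedged illustration of that idea, not the plugin's code: it assumes a document shaped like the result of the "replSetGetStatus" command, with a "members" array whose entries carry "name", "stateStr", and "optimeDate". The function name `secondaryLagSeconds`, the set name, and the host names are made up, and the field names should be verified against the MongoDB version in use.

```javascript
// Compute the lag in seconds for each secondary from a replSetGetStatus-style
// document by comparing its optimeDate with the primary's optimeDate.
// Assumption: members expose "name", "stateStr", and "optimeDate" (a Date).
function secondaryLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) {
    return null; // no primary visible, so lag cannot be measured
  }
  const lag = {};
  for (const m of status.members) {
    if (m.stateStr === "SECONDARY") {
      const diffMs = primary.optimeDate.getTime() - m.optimeDate.getTime();
      lag[m.name] = Math.max(0, diffMs / 1000); // clamp small negative skew
    }
  }
  return lag;
}

// Made-up sample document for illustration; not taken from a real server.
const sampleStatus = {
  set: "replExample",
  members: [
    { name: "db1.example.com:27017", stateStr: "PRIMARY",
      optimeDate: new Date("2014-10-10T02:21:15Z") },
    { name: "db2.example.com:27017", stateStr: "SECONDARY",
      optimeDate: new Date("2014-10-10T02:21:12Z") },
  ],
};

console.log(secondaryLagSeconds(sampleStatus));
```

Note that this value is the gap between the last operations applied on each member, not a wall-clock delay, which is one reason it does not map directly onto Seconds_behind_master in MySQL.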

tags: added: mongodb
Changed in percona-monitoring-plugins:
status: Confirmed → Triaged