mongodb reporting saslStart and isMaster as being slow

Bug #1696739 reported by John A Meinel
Affects: Canonical Juju
Status: Expired
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

During a slightly quieter period, I issued a 'juju status'. I don't know that it is exactly correlated (I had also recently run 'juju models', and there would have been background tasks running).

Inspecting /var/log/syslog I saw:
Jun 8 12:23:22 juju-ch2-g004 mongod.37017[1304]: [conn90982] command admin.$cmd command: isMaster { ismaster: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:429 locks:{} protocol:op_query 596ms
Jun 8 12:23:22 juju-ch2-g004 mongod.37017[1304]: [conn91776] command admin.$cmd command: isMaster { ismaster: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:429 locks:{} protocol:op_query 634ms
Jun 8 12:23:22 juju-ch2-g004 mongod.37017[1304]: [conn88977] command admin.$cmd command: saslStart { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: "xxx" } keyUpdates:0 writeConflicts:0 numYields:0 reslen:155 locks:{} protocol:op_query 131ms
Jun 8 12:23:22 juju-ch2-g004 mongod.37017[1304]: [conn88092] command admin.$cmd command: saslStart { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: "xxx" } keyUpdates:0 writeConflicts:0 numYields:0 reslen:155 locks:{} protocol:op_query 133ms

Note that there are nearly 1500 entries listing "isMaster", all tagged 12:23:22.
What could possibly need that many calls to isMaster?

In a tail of syslog spanning only a single minute (from 12:23:22 to 12:24:22), which is 20,000 lines long, 14,752 lines say "command: saslStart" and 4,333 say "isMaster".

The remaining lines are:

411 times: Jun 8 12:23:22 juju-ch2-g004 mongod.37017[1304]: [conn92039] command juju.controllers command: find { find: "controllers", filter: { _id: "apiHostPorts" }, skip: 0, limit: 1, batchSize: 1, singleBatch: true
243 times: Jun 8 12:23:25 juju-ch2-g004 mongod.37017[1304]: [conn85282] command juju.settings command: find { find: "settings", filter: { _id: "6bf26857-5f04-4a53-8970-b57969b3b2f8:e" }, skip: 0, limit: 1, batchSize: 1

108 times: Jun 8 12:24:17 juju-ch2-g004 mongod.37017[1304]: [conn88132] command presence.$cmd command: update { update: "presence.pings", updates: [ { q: { _id: "6bf26857-5f04-4a53-8970-b57969b3b2f8:1496924640" }, u: {

and a smaller number of "find machines" and saslContinue.

What is happening that we're initiating a saslStart 14,000 times in a single minute?
Are we connecting, failing, and reconnecting to the database over and over again?
Where are 1500 calls to 'isMaster' coming from?

This is running Juju 2.1.2.

Revision history for this message
Tim Penhey (thumper) wrote :

Is this related to session copying? Just thinking out loud.

Revision history for this message
John A Meinel (jameinel) wrote :

Yes, it is related to session copying.

The thing I just noticed is this code:
func (s *Session) acquireSocket(slaveOk bool) (*mongoSocket, error) {

 // Read-only lock to check for previously reserved socket.
 s.m.RLock()
 // If there is a slave socket reserved and its use is acceptable, take it as long
 // as there isn't a master socket which would be preferred by the read preference mode.
 if s.slaveSocket != nil && s.slaveOk && slaveOk && (s.masterSocket == nil || s.consistency != PrimaryPreferred && s.consistency != Monotonic) {
  socket := s.slaveSocket
  socket.Acquire()
  s.m.RUnlock()
  return socket, nil
 }
 if s.masterSocket != nil {
  socket := s.masterSocket
  socket.Acquire()
  s.m.RUnlock()
  return socket, nil
 }
 s.m.RUnlock()

 // No go. We may have to request a new socket and change the session,
 // so try again but with an exclusive lock now.
 s.m.Lock()
 defer s.m.Unlock()

 if s.slaveSocket != nil && s.slaveOk && slaveOk && (s.masterSocket == nil || s.consistency != PrimaryPreferred && s.consistency != Monotonic) {
  s.slaveSocket.Acquire()
  return s.slaveSocket, nil
 }
 if s.masterSocket != nil {
  s.masterSocket.Acquire()
  return s.masterSocket, nil
 }

 // Still not good. We need a new socket.
 sock, err := s.cluster().AcquireSocket(s.consistency, slaveOk && s.slaveOk, s.syncTimeout, s.sockTimeout, s.queryConfig.op.serverTags, s.poolLimit)
 if err != nil {
  return nil, err
 }

 // Authenticate the new socket.
 if err = s.socketLogin(sock); err != nil {
  sock.Release()
  return nil, err
 }

Assuming we're hitting s.cluster().AcquireSocket(), which ultimately calls server.AcquireSocket, which has this comment:
// AcquireSocket returns a socket for communicating with the server.
// This will attempt to reuse an old connection, if one is available. Otherwise,
// it will establish a new one. The returned socket is owned by the call site,
// and will return to the cache when the socket has its Release method called
// the same number of times as AcquireSocket + Acquire were called for it.
// If the poolLimit argument is greater than zero and the number of sockets in
// use in this server is greater than the provided limit, errPoolLimit is
// returned.

One thing to note is that we could be re-using an existing connection; however, if you look at the code above, we *unconditionally* call s.socketLogin(sock) if we had to grab from the pool and did not already have an s.masterSocket.

From what I can tell, when a Session.Copy() is started, we *don't* have a masterSocket yet; we only get s.masterSocket after we've done an s.cluster.acquireSocket().

So what is happening is that maybe we *are* re-using an existing connection, but we will call Session.socketLogin() unconditionally.
func (s *Session) socketLogin(socket *mongoSocket) error {
 for _, cred := range s.creds {
  if err := socket.Login(cred); err != nil {
   return err
  }
 }
 return nil
}

Now I do see a fast path:
func (socket *mongoSocket) Login(cred Credential) error {
 socket.Lock()
 if cred.Mechanism == "" && socket.serverInfo.MaxWireVersion >= 3 {
  cred.Mechanism = "SCRAM-SHA-1"
 }
 for _, sockCred := range socket.creds {
  if sockCred == cred {
   debugf("Socket %p to %s: login: db=%q user=%q (already logged in)", socket, socket.addr, cred.Source, cred.Username)
   socket.Unlock()
   return nil
  }
 }

Revision history for this message
John A Meinel (jameinel) wrote :

There are also oddities: Login takes the Credential by value, not by reference, yet it mutates that value (setting the Mechanism) and then compares it with other values.

Is it possible that we are somehow caching an unmutated value, then calling socket.Login with the unmutated value, and not getting an exact match somehow?
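
As a purely hypothetical sketch of that concern (simplified stand-in types, not the mgo code): if a credential is mutated inside Login before being cached, a later comparison against the caller's original, unmutated value no longer matches, which would force another authentication round.

package sketch

// Credential is a simplified stand-in for mgo's Credential (illustrative only).
type Credential struct {
	Username  string
	Mechanism string
}

type fakeSocket struct {
	creds []Credential // credentials this socket believes it has logged in with
}

// Login mirrors the mutate-then-cache pattern described above: the by-value
// argument is modified locally (Mechanism defaulted) and the *mutated* copy
// is what ends up in the cache.
func (s *fakeSocket) Login(cred Credential) {
	if cred.Mechanism == "" {
		cred.Mechanism = "SCRAM-SHA-1"
	}
	s.creds = append(s.creds, cred)
}

// alreadyLoggedIn compares an incoming credential against the cached ones.
func (s *fakeSocket) alreadyLoggedIn(cred Credential) bool {
	for _, c := range s.creds {
		if c == cred {
			return true
		}
	}
	return false
}

// mismatch shows the hazard: a caller that still holds the unmutated value
// (empty Mechanism) gets false here, so it would attempt another login
// (and hence another saslStart) even though the socket is already authed.
func mismatch() bool {
	sock := &fakeSocket{}
	original := Credential{Username: "machine-0"} // Mechanism left empty
	sock.Login(original)
	return sock.alreadyLoggedIn(original) // false
}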

Revision history for this message
John A Meinel (jameinel) wrote :

(That doesn't explain things like why restarting the controller would make it happy for longer, and other symptoms, but it would line up with something being wrong in how Session.Login() feeds Socket.Login(), and/or with us ending up with an array of Credentials where we expected only a single one...)

Revision history for this message
John A Meinel (jameinel) wrote :

Whenever we call State.Open() we ultimately call Database.Login (which calls Session.Login), but that only applies for the lifetime of that State object, and we never call mongodbLogin with a session that was passed to us. There are only 2 Login calls in normal code, both from mongodbLogin, and the only caller of mongodbLogin runs immediately after we've just Dialed the server, so it can't be a shared session.
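
For context, the generic mgo pattern being described (Dial, then log in on the admin database, so the credential is attached to the session and replayed on every socket that session acquires) looks roughly like the sketch below; the function name and signature are assumptions for illustration, not juju's actual mongodbLogin.

package sketch

import mgo "gopkg.in/mgo.v2"

// dialAndLogin is an illustrative stand-in for the "Dial then login" sequence
// described above, not the real juju code.
func dialAndLogin(addr, user, password string) (*mgo.Session, error) {
	session, err := mgo.Dial(addr)
	if err != nil {
		return nil, err
	}
	// Database.Login attaches the credential to the session; every socket this
	// session (or a Copy of it) later acquires gets authenticated with it,
	// which is where the per-socket saslStart traffic ultimately comes from.
	if err := session.DB("admin").Login(user, password); err != nil {
		session.Close()
		return nil, err
	}
	return session, nil
}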

Revision history for this message
Ian Booth (wallyworld) wrote :

Every login request issues an APIHostPorts() call because it includes that info in the login response. That, I think, is why there's a lot of activity on the controllers collection.

411 times: Jun 8 12:23:22 juju-ch2-g004 mongod.37017[1304]: [conn92039] command juju.controllers command: find { find: "controllers", filter: { _id: "apiHostPorts" }, skip: 0, limit: 1, batchSize: 1, singleBatch: true

Revision history for this message
Ian Booth (wallyworld) wrote :

I bootstrapped on lxd and did nothing else.
Within 15 minutes, we did approx 2500 socket logins.
The main culprits were things like what's below.
I need to do more digging to see how often we need to re-auth, etc.

presence.ping()
 pings := p.pings.With(session)
 _, err = pings.UpsertId(
  docIDInt64(p.modelUUID, slot),
  bson.D{
   {"$set", bson.D{{"slot", slot}}},
   {"$inc", bson.D{{"alive." + p.fieldKey, p.fieldBit}}},
  })
 return err

lease.Refresh()
 collection, closer := client.config.Mongo.GetCollection(client.config.Collection)
 query := bson.M{
  fieldType: typeLease,
  fieldNamespace: client.config.Namespace,
 }
 iter := collection.Find(query).Iter()

settings.Read()
 settings, closer := st.db().GetCollection(collection)
 defer closer()

 err := settings.FindId(key).One(out)

Revision history for this message
Ian Booth (wallyworld) wrote :

I added extra logging. I bootstrapped lxd, deployed peer-xplod, and then ran add-unit -n 10.
Once the system reached steady state:

number of socket logins ~ 230,000
number of SASL logins ~ 150

So the caching of authenticated sockets does appear to be working in this case.

Revision history for this message
Ian Booth (wallyworld) wrote :

So after approx 10 minutes of steady state:

the number of sasl logins increased by approx 130 (idle system)
in the same time, socket login calls increased by 2000
the socket login calls are pretty much all:
 - lease
 - presence
 - looking for machines with transient errors by provisioner

So of the 2000 socket logins, 130 drop through to require a sasl login.

The transient errors operation is driven by a retry watcher, which wakes up every minute. So every single minute we wake up the provisioner loop, which makes a backend api call to get machines with transient errors. This, though, doesn't grow with the number of agents, whereas the lease and presence calls will.
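
For illustration only (this is not the provisioner's actual code): a minute-granularity retry watcher is conceptually just a ticker-driven loop, which is why its query load stays constant regardless of agent count.

package sketch

import "time"

// watchTransientErrors sketches a fixed-interval retry watcher: the callback
// (a hypothetical stand-in for "find machines with transient errors") fires
// once a minute until stop is closed, independent of how many agents exist.
func watchTransientErrors(check func(), stop <-chan struct{}) {
	ticker := time.NewTicker(time.Minute)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			check()
		case <-stop:
			return
		}
	}
}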

Revision history for this message
Ian Booth (wallyworld) wrote :

With the presence pinger, it does this:

 session := p.pings.Database.Session.Copy()
 defer session.Close()

 pings := p.pings.With(session)
 _, err = pings.UpsertId(...)

That *appears* to be what regularly triggers a sasl login.

The lease client does this:

 collection, closer := client.config.Mongo.GetCollection(client.config.Collection)
 defer closer()
 entries, err := client.readEntries(collection)

which ultimately uses a state.database struct, and that doesn't appear to copy a session; it just does an mgo.Database.C("collname").

So if we optimise presence pings, that may help.
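
As a sketch of the sort of optimisation hinted at here (an assumption about one possible approach, not what juju actually implemented): copying the session once for the pinger's lifetime, instead of once per ping, means socket acquisition, and therefore any re-auth, happens once rather than on every UpsertId. The type and function names below are illustrative.

package sketch

import (
	"fmt"

	mgo "gopkg.in/mgo.v2"
	"gopkg.in/mgo.v2/bson"
)

// pinger is a hypothetical variant of the presence pinger that copies the
// session once up front and reuses it for every ping.
type pinger struct {
	pings     *mgo.Collection // bound to the long-lived copied session
	session   *mgo.Session
	modelUUID string
	fieldKey  string
	fieldBit  int64
}

func newPinger(pings *mgo.Collection, modelUUID, fieldKey string, fieldBit int64) *pinger {
	session := pings.Database.Session.Copy()
	return &pinger{
		pings:     pings.With(session),
		session:   session,
		modelUUID: modelUUID,
		fieldKey:  fieldKey,
		fieldBit:  fieldBit,
	}
}

// docIDInt64 mirrors the "<model-uuid>:<slot>" _id format seen in the logs above.
func docIDInt64(modelUUID string, slot int64) string {
	return fmt.Sprintf("%s:%d", modelUUID, slot)
}

// ping issues the same upsert as the original code, but over the reused
// session, so no fresh socket (and hence no saslStart) is needed per call.
func (p *pinger) ping(slot int64) error {
	_, err := p.pings.UpsertId(
		docIDInt64(p.modelUUID, slot),
		bson.D{
			{"$set", bson.D{{"slot", slot}}},
			{"$inc", bson.D{{"alive." + p.fieldKey, p.fieldBit}}},
		})
	return err
}

// Close releases the long-lived session when the pinger stops.
func (p *pinger) Close() { p.session.Close() }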

Revision history for this message
Ian Booth (wallyworld) wrote :

The above comments refer to steady state.
As units fire relation hooks etc., the main culprits for state calls that cause sasl logins on sessions are:

Uniter -> ReadSettings -> getRelationUnit -> state.KeyRelation -> FindId

Uniter -> GetPrincipal -> getRelationUnit -> state.Unit -> FindId

Uniter -> ConfigSettings -> state.Unit -> FindId

etc

Revision history for this message
Ian Booth (wallyworld) wrote :

So there are 2 ways we can log in to a socket and not have to re-auth:
1. already logged in
2. cached

 for _, sockCred := range socket.creds {
  if sockCred == cred {
   debugf("Socket %p to %s: login: db=%q user=%q (already logged in)", socket, socket.addr, cred.Source, cred.Username)
   logger.Criticalf("Socket %p to %s: login: db=%q user=%q (already logged in)", socket, socket.addr, cred.Source, cred.Username)
   socket.Unlock()
   return nil
  }
 }
 if socket.dropLogout(cred) {
  debugf("Socket %p to %s: login: db=%q user=%q (cached)", socket, socket.addr, cred.Source, cred.Username)
  logger.Criticalf("Socket %p to %s: login: db=%q user=%q (cached)", socket, socket.addr, cred.Source, cred.Username)
  socket.creds = append(socket.creds, cred)
  socket.Unlock()
  return nil
 }

From what I can see, we never use the "already logged in" path. We only see reuse of cached sockets. Cached sockets are reused if the "logout" slice contains cred. The logout slice only has cred added to it when Logout() is called on the socket, which happens when a socket that has been Acquired() from the pool is then Released().
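
To make the logout-slice mechanics concrete, here is a paraphrased sketch of that bookkeeping (stand-in types, reconstructed from memory of the mgo source rather than quoted verbatim): Logout moves a credential from creds to logout when the socket is released, and dropLogout is what later lets Login take the "(cached)" fast path without another saslStart.

package sketch

import "sync"

// Minimal stand-ins for mgo's internal types (illustrative only).
type credential struct {
	Source, Username, Password, Mechanism string
}

type mongoSocketSketch struct {
	sync.Mutex
	creds  []credential // credentials currently authenticated on this socket
	logout []credential // credentials logged out but reusable via the cached path
}

// dropAuth removes and returns the credential authenticated against db, if any.
func (s *mongoSocketSketch) dropAuth(db string) (cred credential, found bool) {
	for i, c := range s.creds {
		if c.Source == db {
			s.creds = append(s.creds[:i], s.creds[i+1:]...)
			return c, true
		}
	}
	return cred, false
}

// Logout runs when an acquired socket is Released back to the pool: the
// credential moves from creds to the logout slice instead of being discarded.
func (s *mongoSocketSketch) Logout(db string) {
	s.Lock()
	defer s.Unlock()
	if cred, found := s.dropAuth(db); found {
		s.logout = append(s.logout, cred)
	}
}

// dropLogout reports whether cred sits in the logout slice, removing it if so.
// A true result is what lets Login take the "(cached)" branch shown above.
func (s *mongoSocketSketch) dropLogout(cred credential) bool {
	for i, c := range s.logout {
		if c == cred {
			s.logout = append(s.logout[:i], s.logout[i+1:]...)
			return true
		}
	}
	return false
}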

Revision history for this message
Ian Booth (wallyworld) wrote :

Sockets are only reused once released. In a heavily loaded system with a large socket pool size, existing sockets are all in use when new requests come in, so a new unauthed socket is created, requiring a new login.

Reducing the socket pool size to 100 results in the same number of sasl logins during bootstrap and deploy of peer-xplod, but halves the number of overall socket logins, and during steady state it appears to reduce the sasl login count.
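
For reference, a sketch of how such a cap can be applied with mgo via Session.SetPoolLimit (the dial address is illustrative; mgo's default per-server pool limit is 4096 as far as I recall, so 100 is a substantial reduction):

package main

import (
	"log"

	mgo "gopkg.in/mgo.v2"
)

func main() {
	// Dial the controller's mongod (address is illustrative).
	session, err := mgo.Dial("localhost:37017")
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	// Cap the per-server socket pool. With a smaller pool, released sockets
	// (which keep their cached credentials) get reused more often, so fewer
	// brand-new sockets need a full SCRAM-SHA-1 saslStart exchange.
	session.SetPoolLimit(100)

	// Copied sessions share the underlying per-server pool, so the cap
	// applies to the workload as a whole; real code would run its queries here.
	copied := session.Copy()
	defer copied.Close()
}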

Changed in juju:
milestone: 2.2.1 → 2.2.2
Revision history for this message
Ian Booth (wallyworld) wrote :

With the improvements to batch log writes and pings, the throttling of connections, and other mongo-related tweaks, this issue appears to have gone away for now. Marking as incomplete; we can re-open if the symptoms are observed again.

Changed in juju:
status: Triaged → Incomplete
Changed in juju:
milestone: 2.2.2 → none
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for juju because there has been no activity for 60 days.]

Changed in juju:
status: Incomplete → Expired