Sentry errors in Okta

Bug #1715434 reported by Paul Everitt
This bug affects 1 person
Affects: KARL4
Status: Won't Fix
Importance: Medium
Assigned to: Chris Rossi
Milestone: 036

Bug Description

When we turned on Sentry, we found out the root logger hadn't been sending its output to Redis, so we're now seeing a number of exceptions we hadn't previously seen.

Okta is giving us a "saml2.response Missing Attribute Statement" exception:

  https://sentry.io/agendaless-consulting/karl/issues/342842028/events/

Also, this exception is accompanied by a tremendous number (40?) of "new transaction, abort" log entries per request. Are these ZODB transactions, or some saml2 concept? They span around 30 seconds.

Revision history for this message
Paul Everitt (paul-agendaless) wrote :

Chris wrote in email:

My impression so far is that the only thing that has changed is the reporting--the user experience remains unchanged. So we haven't broken the app; we're just seeing some errors we weren't seeing before. Is this correct? I hope so, because I really won't have any time to look at this any more closely until next week. I might be able to find some time if there is a true emergency, but I'm pretty swamped at the moment.

Revision history for this message
Paul Everitt (paul-agendaless) wrote :

Chris, is it worth looking at this month? Or are these two issues (the error and the slew of commits) just pathological cases?

Changed in karl4:
milestone: 035 → 036
Revision history for this message
Chris Rossi (chris-archimedeanco) wrote :

I don't really know until I look at them. I can look at it this week.

Revision history for this message
Chris Rossi (chris-archimedeanco) wrote :

My findings so far:

I think the transaction stuff is noise, per conversation in Slack. The "breadcrumbs" feature in Sentry just shows some other events that are temporally adjacent, in hopes that they give useful context. In this case, they don't. The slew of 'txn.*' logging statements simply reflects normal operation of the app. I see that the root logger is set to 'INFO'. We could set that to 'WARN' and get less noise overall, or we could be more surgical and set only the 'txn' logger to 'WARN' to quiet that noise alone, as sketched below.
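
For reference, a minimal sketch of both options using the stock Python logging module (this assumes ZODB's transaction machinery logs under child loggers of 'txn', e.g. 'txn.<thread id>', so raising the parent's level quiets all of them):

  import logging

  # Option 1: raise the root logger's threshold so INFO-level noise
  # is dropped everywhere, not just from the transaction machinery.
  logging.getLogger().setLevel(logging.WARNING)

  # Option 2 (more surgical): quiet only the transaction loggers.
  # The 'txn.*' loggers are children of 'txn', so setting the parent's
  # level silences those statements while leaving other INFO logs alone.
  logging.getLogger('txn').setLevel(logging.WARNING)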

That leaves the SAML error. I'm not sure why the subject here says 'Okta'--could be happening with Google or Okta, we don't really know. But that's neither here nor there.

The log.error statement that is triggering this is located in the 'get_identity' method of the SAML response object in the pysaml2 library. Reading the code, it's not obvious to me what it's trying to do. It's also not obvious to me why we call it. I call it because the documentation for pysaml2 showed it being called, but those docs are short on explanation. The return value is not used, so presumably it's called for its side effects, but, reading the code, there are no obvious side effects. So it's possible we could shut this thing up by just not calling that method. Obviously we'd need to test and see whether that actually works, which is not a cheap or easy proposition.
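
For context, a hedged sketch of what that call site presumably looks like, using pysaml2's documented client API (the handler function and variable names here are illustrative, not KARL's actual code):

  from saml2 import BINDING_HTTP_POST
  from saml2.client import Saml2Client

  def handle_saml_response(client: Saml2Client, saml_response, outstanding):
      # parse_authn_request_response already validates the assertion.
      authn_response = client.parse_authn_request_response(
          saml_response, BINDING_HTTP_POST, outstanding)

      # get_identity() re-reads the attribute statement; when the IdP
      # sends none, pysaml2's saml2.response logger emits the
      # "Missing Attribute Statement" error seen in Sentry. Its return
      # value (a dict of user attributes) is discarded here, so dropping
      # the call might silence the error -- pending testing, as noted.
      authn_response.get_identity()
      return authn_response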

So, it comes down to how much we're willing to spend to de-noise Sentry, I guess, whether we should pursue this further. I don't see any evidence of a degraded user experience associated with this ticket.

Revision history for this message
Paul Everitt (paul-agendaless) wrote :

We'll close this and teach Sentry to ignore these log.error messages. Thanks, Chris, for the deep dive into the nature of the error.
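
For the record, a minimal sketch of that suppression with the Python Sentry SDK (this assumes the modern sentry-sdk package; the original setup may have used the older raven client, and the DSN below is a placeholder):

  import sentry_sdk
  from sentry_sdk.integrations.logging import ignore_logger

  sentry_sdk.init(dsn="https://<key>@sentry.io/<project>")

  # Stop pysaml2's 'saml2.response' logger from creating Sentry events;
  # its "Missing Attribute Statement" errors are dropped while all
  # other loggers keep reporting as before.
  ignore_logger("saml2.response")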

Changed in karl4:
status: New → Won't Fix