JSONFormatter can log sensitive data

Bug #1571714 reported by Ken Thomas
266
This bug affects 1 person
Affects Status Importance Assigned to Milestone
oslo.log
Fix Released
High
Unassigned

Bug Description

The JSONFormatter logs the args to the log statement and in some cases that "args" is a dict of values where only some of them are intended to be logged.

For example, take a look at this code in the oslo impl_rabbit.py[1]. It's logging useful information like the hostname, port, and transport, which are just a few fields in the connection.info() dict. The JSONFormatter goes ahead and logs the full args[2], which in this case contains the username and password which was never intended to be in the log.

I can see the benefit of logging the args as it could contain useful data, but I think that if some is that useful, it should have been in the log message in the first place.

This appears in the old nova.common.openstack.log code as well as the current oslo.log.

[1] https://github.com/openstack/oslo.messaging/blob/stable/mitaka/oslo_messaging/_drivers/impl_rabbit.py#L564-L566
[2] https://github.com/openstack/oslo.log/blob/stable/mitaka/oslo_log/formatters.py#L96

Ben Nemec (bnemec)
Changed in oslo.log:
status: New → Confirmed
importance: Undecided → High
Revision history for this message
Ben Nemec (bnemec) wrote :

I know we'd been trying to get away from this, but maybe we should pass the args param through mask_dict_password before logging it?

https://github.com/openstack/oslo.utils/blob/da8d3c3bbcb640d91e2b014345387b532cc9978f/oslo_utils/strutils.py#L340

That seems like the param where people are most likely to accidentally pass private data. The rest are more explicit, in general.

I guess the alternative is to treat this as user error and change oslo.messaging to pass only the fields it's actually logging. Or we could do both. Mask the passwords we can, but also change oslo.messaging to not pass the sensitive data in the first place. We could also add a strong warning to the oslo.log documentation that anything in the args structure may be exposed by some formatters.

That way we fix what we can in oslo.log but also establish a new best practice for logging parameters.

As this is a rather old security bug, I probably need to check with the VMT to see how to proceed. Once the patches are posted this will have to become public knowledge. Any input from the rest of oslo-coresec is also welcome.

Revision history for this message
Ben Nemec (bnemec) wrote :

Okay, I think this should do it. If it looks good I'll submit a review to oslo.log to get this fixed.

diff --git a/oslo_log/formatters.py b/oslo_log/formatters.py
index 2545035..57e278e 100644
--- a/oslo_log/formatters.py
+++ b/oslo_log/formatters.py
@@ -28,6 +28,7 @@ from six import moves
 from oslo_context import context as context_utils
 from oslo_serialization import jsonutils
 from oslo_utils import encodeutils
+from oslo_utils import strutils

 if six.PY3:
     from functools import reduce
@@ -206,11 +207,16 @@ class JSONFormatter(logging.Formatter):
         return lines

     def format(self, record):
+ args = record.args
+ # If we were passed a dict object then we should attempt to mask any
+ # sensitive data.
+ if isinstance(args, dict):
+ args = strutils.mask_dict_password(dict(args))
         message = {'message': record.getMessage(),
                    'asctime': self.formatTime(record, self.datefmt),
                    'name': record.name,
                    'msg': record.msg,
- 'args': record.args,
+ 'args': args,
                    'levelname': record.levelname,
                    'levelno': record.levelno,
                    'pathname': record.pathname,
diff --git a/oslo_log/tests/unit/test_log.py b/oslo_log/tests/unit/test_log.py
index 7878525..b0ec22c 100644
--- a/oslo_log/tests/unit/test_log.py
+++ b/oslo_log/tests/unit/test_log.py
@@ -591,6 +591,11 @@ class JSONFormatterTestCase(LogTestBase):
         # convert it using repr() to prevent serialization error on logging.
         self.assertEqual(['repr'], data['args'])

+ def test_passwords_masked(self):
+ self.log.info('Test message', {'password': 'maskme'})
+ data = jsonutils.loads(self.stream.getvalue())
+ self.assertEqual('***', data['args']['password'])
+

 def get_fake_datetime(retval):
     class FakeDateTime(datetime.datetime):

Revision history for this message
Jeremy Stanley (fungi) wrote :

Now that the OpenStack VMT is subscribed, I'll note that oslo.log and oslo.messaging have never been explicitly tagged as within our oversight (nor do we even have automatic visibility into their private security bugs). That said, this could potentially impact a number of OpenStack deliverables for which we're responsible security-wise so I'm happy to weigh in and help with an advisory.

I'd be in agreement on the user error point except that in this case it sounds like the "user" is another popular OSLO library which is mistakenly using oslo.log in a vulnerable fashion? If so, we need to fix one of them and so should probably consider it a vulnerability on one end or the other. I'm inclined to agree that at least fixing it on the side where it presents less of a footgun for the next consumer of the same library is preferable.

If Ben's patch in comment #2 is acceptable to the oslo.log private security reviewers, I'm happy to propose an impact description for review.

Revision history for this message
Mohammed Naser (mnaser) wrote :

In my opinion, this is not a big with oslo.log. The library should not be in the business of stripping secrets out of logs. The consumer of the library should be responsible to make sure that they do not send secrets to the logging layer.

I know it’s a stretch but if a user wanted to log passwords for any reason or other, we’ve just removed that behaviour.. and the next thing out of this is us forgetting to strip some other thing out of the dictionary like “token” and we start going the rabbit hole of things not to log.

I think oslo.log shouldn’t be responsible to strip logs. It’s the responsibility of the consumer to make sure that it’s not passing secrets to the log layer. Therefore, I still believe it’s a oslo.messaging bug.

Revision history for this message
Ben Nemec (bnemec) wrote :

Shoot, apparently I wasn't subscribed to this bug so I didn't see the responses.

I am inclined to agree that this should be fixed in oslo.messaging, but I think we need to do _something_ different in oslo.log too. In this case the caller thought the call was okay because the password field wasn't referenced in the log message so it doesn't show up in any of the other formatters. The behavior here is significantly different.

Which makes me wonder why we're even logging args. One could argue that's internal data used to build the log message, not something that is supposed to be exposed to users. The logging docs even suggest this: https://docs.python.org/2/library/logging.html#logrecord-attributes "You shouldn’t need to format this yourself."

I think the JSONFormatter shouldn't be logging args or msg since those are just the building blocks for "message", which should contain all of the necessary information after formatting. While this could be considered a breaking change, realistically nobody should be relying on the content of args anyway. There's nothing about the keys passed in a log call that would be treated as an API so keys may be added/removed/renamed at any time with no warning. If data from args is needed, it should be included in the message explicitly.

Revision history for this message
Ben Nemec (bnemec) wrote :

Okay, here's a new patch that removes the content of args and msg but leaves the fields for compatibility purposes. However, they are deprecated and should not be used any more.

Revision history for this message
Doug Hellmann (doug-hellmann) wrote :

I think the idea with logging args and msg is that we want the JSON output to be parsable to get those individual values out. I guess we probably don't care about the unformatted message string, but the args could possibly be useful.

Revision history for this message
Doug Hellmann (doug-hellmann) wrote :

The patch Ben proposed seems good in principle (some of the formatting seems to have been lost so I would want to review it via gerrit or a patch file or something if we need a stronger vote of confidence to go ahead with it).

The strutuils code to mask secret values should be relatively easy to extend if we need to, and it has a long list of regexes now IIRC. If someone does want to log passwords, they can do so by avoiding those patterns, so I think securing this by default is a good idea.

Revision history for this message
Ken Giusti (kgiusti) wrote :

FWIW I've tested Ben's patch against master oslo.messaging and can confirm It Works For Me.

Revision history for this message
Ben Nemec (bnemec) wrote :

Here's an alternate fix that just filters unused keys out of args instead of removing it completely. It also leaves msg alone since that isn't a security concern. This one should be less disruptive while still addressing the behavior that led to this bug.

Revision history for this message
Doug Hellmann (doug-hellmann) wrote :

I think in _MSG_KEY_REGEX you want (%+) rather than (%*) -- it isn't a key if there is no '%', it's just a parenthetical statement at that point, right?

I feel like there's a builtin that will give you all of the names used for interpolation in a string, but I can't for the life of me find it. Maybe it was related to the new style formatting instructions.

Other than the regex thing the patch looks fine.

Revision history for this message
Ben Nemec (bnemec) wrote :

Whoops, you're right. It would work anyway because of the count check in the next line, but it wasn't what I intended. /me is bad at regexes

I did run across some builtin regexes related to string formatting, but they were all for the {} style of formatting instead of %(). Every interpolation-related one I could find had issues (including the one we were previously using in oslo.i18n), which is why I tried to keep this one as simple as possible.

I did forget one other case too though, which is when a dict is passed and only a %s is included. In that case the entire dict will get logged. I will update the patch for that and the s/*/+/ change.

Revision history for this message
Ben Nemec (bnemec) wrote :
Revision history for this message
Ken Giusti (kgiusti) wrote :

Patch works for oslo.messaging.
pep8 grumbles a bit:

./oslo_log/formatters.py:217:80: E501 line too long (82 > 79 characters)
            # tricky and error-prone to include in the regex. Much easier to just

Revision history for this message
Ben Nemec (bnemec) wrote :

Ah, I wrote that comment before I indented the block. I fixed that in my local branch so it shouldn't be a problem when I push it. Which I guess I will do today since it sounds like everyone is okay with the patch in general.

Ben Nemec (bnemec)
information type: Private Security → Public Security
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.log (master)

Reviewed: https://review.openstack.org/596850
Committed: https://git.openstack.org/cgit/openstack/oslo.log/commit/?id=a93c6ef98c8aeddc5a4ae87083689225fbc728bb
Submitter: Zuul
Branch: master

commit a93c6ef98c8aeddc5a4ae87083689225fbc728bb
Author: Ben Nemec <email address hidden>
Date: Mon Aug 27 17:44:40 2018 +0000

    Filter args dict in JSONFormatter

    In most formatters, any unused keys in the args dict will just be
    discarded. Because JSONFormatter logged the entire dict in addition
    to the message, some values were included in the output that may
    not have been intended. This could include sensitive data, so we
    should stop that.

    In the interest of maintaining compatibility with any tools that are
    reading the args dict, we leave the dict but filter out any unused
    keys.

    Change-Id: Ib64837c1ae93a27bef3d30a776320a373f18dd1c
    Closes-Bug: 1571714
    Closes-Bug: 1787214

Changed in oslo.log:
status: Confirmed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.log (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/598338

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.log (stable/rocky)

Reviewed: https://review.openstack.org/598338
Committed: https://git.openstack.org/cgit/openstack/oslo.log/commit/?id=871d2df290d829b9cafb0dc383dc3d65810c2dac
Submitter: Zuul
Branch: stable/rocky

commit 871d2df290d829b9cafb0dc383dc3d65810c2dac
Author: Ben Nemec <email address hidden>
Date: Mon Aug 27 17:44:40 2018 +0000

    Filter args dict in JSONFormatter

    In most formatters, any unused keys in the args dict will just be
    discarded. Because JSONFormatter logged the entire dict in addition
    to the message, some values were included in the output that may
    not have been intended. This could include sensitive data, so we
    should stop that.

    In the interest of maintaining compatibility with any tools that are
    reading the args dict, we leave the dict but filter out any unused
    keys.

    Change-Id: Ib64837c1ae93a27bef3d30a776320a373f18dd1c
    Closes-Bug: 1571714
    Closes-Bug: 1787214
    (cherry picked from commit a93c6ef98c8aeddc5a4ae87083689225fbc728bb)

tags: added: in-stable-rocky
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.log 3.39.1

This issue was fixed in the openstack/oslo.log 3.39.1 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.log 3.40.0

This issue was fixed in the openstack/oslo.log 3.40.0 release.

To post a comment you must log in.
This report contains Public Security information  
Everyone can see this security related information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.