Activity log for bug #1816927

Date Who What changed Old value New value Message
2019-02-20 23:19:25 Lance Bragstad bug added bug
2019-02-20 23:19:37 Lance Bragstad tags fernet
2019-02-20 23:20:33 Lance Bragstad summary Deployments with high churn as susceptible to false positives with token validation Deployments with high churn are susceptible to false positives with token validation
2019-02-21 03:15:21 Lance Bragstad description
Old value:
The implementation for fernet tokens relies on symmetric encryption. This underpinning requires that each keystone API node "share" the same key repository, specifically in deployments where keystone servers need to validate tokens issued by one another (e.g., a cluster of keystone servers behind an HA proxy).
With getting into too much detail, each key repository consists of a set of files on disk. The naming of each file is crucial because it denotes the type of key it is (documented extensively [0]). Each file name corresponds to an integer. The file name with the highest index is used to encrypt new tokens, which is called the primary key. The file name with the lowest index, or 0, is known as a staged key and it is always promoted to be the primary key on the next rotation. Every other key in the repository is a secondary key and is only used to decrypt tokens. Each key on disk goes through a lifecycle, starting as a staged key, promoted to a primary key, eventually being demoted to a secondary key.
Note that keystone does *not* handle token distribution between API servers. We recommend this be done using configuration management. The documentation suggests rsync as one possible utility to keep key repositories in sync.
I'm opening this bug because it was brought to our attention that keystone servers may respond with a 401 Invalid Fernet token, in deployments with high churn, or high token load, across a cluster of keystone nodes.
The issue is that in the process of key rotation, the staged key is promoted to be the primary key. As soon as this happens, any subsequent requests to create tokens will use the primary key to encrypt the token. It is assumed all other API servers have a copy of this key, because it's the staged key and also valid as a secondard key. A encrypted with the new primary key should be validatable on other API servers if they have a copy of the staged key, which has the same key contents as the new primary key on the API server that initiated the token rotation.
The rsync implementation deletes the contents of the key repository and rebuilds it, alphanumerically. This results in the staged key always being written by rsync first, because its file name is 0. The primary key is always written last, because its filename is the highest index of the key repository.
A unique timing event where:
- a token is created after key rotation, but before key distribution
- key distribution is invoked using a mechanism like rsync
- token validation is performed on the API server getting its key repository built by rsync
- the token is validated before the primary key is written to the key repository by rsync, and fails validation because the key repository doesn't contain the key used to encrypt the token
A subsequent request to validate the token should succeed if rsync completes successfully.
pas-ha brought this to the #openstack-keystone channel as an issue that was affecting an internal CI/CD deployment that has a lot of churn [1].
[0] https://docs.openstack.org/keystone/latest/admin/fernet-token-faq.html#what-are-the-different-types-of-keys
[1] http://eavesdrop.openstack.org/irclogs/%23openstack-keystone/%23openstack-keystone.2019-02-20.log.html#t2019-02-20T20:11:12
New value:
The implementation for fernet tokens relies on symmetric encryption. This underpinning requires that each keystone API node "share" the same key repository, specifically in deployments where keystone servers need to validate tokens issued by one another (e.g., a cluster of keystone servers behind an HA proxy).
Without getting into too much detail, each key repository consists of a set of files on disk. The naming of each file is crucial because it denotes the type of key it is (documented extensively [0]). Each file name corresponds to an integer. The file name with the highest index is used to encrypt new tokens, which is called the primary key. The file name with the lowest index, or 0, is known as a staged key and it is always promoted to be the primary key on the next rotation. Every other key in the repository is a secondary key and is only used to decrypt tokens. Each key on disk goes through a lifecycle, starting as a staged key, promoted to a primary key, eventually being demoted to a secondary key.
Note that keystone does *not* handle token distribution between API servers. We recommend this be done using configuration management. The documentation suggests rsync as one possible utility to keep key repositories in sync.
I'm opening this bug because it was brought to our attention that keystone servers may respond with a 401 Invalid Fernet token, in deployments with high churn, or high token load, across a cluster of keystone nodes.
The issue is that in the process of key rotation, the staged key is promoted to be the primary key. As soon as this happens, any subsequent requests to create tokens will use the primary key to encrypt the token. It is assumed all other API servers have a copy of this key, because it's the staged key and also valid as a secondard key. A encrypted with the new primary key should be validatable on other API servers if they have a copy of the staged key, which has the same key contents as the new primary key on the API server that initiated the token rotation.
The rsync implementation deletes the contents of the key repository and rebuilds it, alphanumerically. This results in the staged key always being written by rsync first, because its file name is 0. The primary key is always written last, because its filename is the highest index of the key repository.
A unique timing event where:
- a token is created after key rotation, but before key distribution
- key distribution is invoked using a mechanism like rsync
- token validation is performed on the API server getting its key repository built by rsync
- the token is validated before the primary key is written to the key repository by rsync, and fails validation because the key repository doesn't contain the key used to encrypt the token
A subsequent request to validate the token should succeed if rsync completes successfully.
pas-ha brought this to the #openstack-keystone channel as an issue that was affecting an internal CI/CD deployment that has a lot of churn [1].
[0] https://docs.openstack.org/keystone/latest/admin/fernet-token-faq.html#what-are-the-different-types-of-keys
[1] http://eavesdrop.openstack.org/irclogs/%23openstack-keystone/%23openstack-keystone.2019-02-20.log.html#t2019-02-20T20:11:12
2019-02-21 03:16:13 Lance Bragstad description
Old value:
The implementation for fernet tokens relies on symmetric encryption. This underpinning requires that each keystone API node "share" the same key repository, specifically in deployments where keystone servers need to validate tokens issued by one another (e.g., a cluster of keystone servers behind an HA proxy).
Without getting into too much detail, each key repository consists of a set of files on disk. The naming of each file is crucial because it denotes the type of key it is (documented extensively [0]). Each file name corresponds to an integer. The file name with the highest index is used to encrypt new tokens, which is called the primary key. The file name with the lowest index, or 0, is known as a staged key and it is always promoted to be the primary key on the next rotation. Every other key in the repository is a secondary key and is only used to decrypt tokens. Each key on disk goes through a lifecycle, starting as a staged key, promoted to a primary key, eventually being demoted to a secondary key.
Note that keystone does *not* handle token distribution between API servers. We recommend this be done using configuration management. The documentation suggests rsync as one possible utility to keep key repositories in sync.
I'm opening this bug because it was brought to our attention that keystone servers may respond with a 401 Invalid Fernet token, in deployments with high churn, or high token load, across a cluster of keystone nodes.
The issue is that in the process of key rotation, the staged key is promoted to be the primary key. As soon as this happens, any subsequent requests to create tokens will use the primary key to encrypt the token. It is assumed all other API servers have a copy of this key, because it's the staged key and also valid as a secondard key. A encrypted with the new primary key should be validatable on other API servers if they have a copy of the staged key, which has the same key contents as the new primary key on the API server that initiated the token rotation.
The rsync implementation deletes the contents of the key repository and rebuilds it, alphanumerically. This results in the staged key always being written by rsync first, because its file name is 0. The primary key is always written last, because its filename is the highest index of the key repository.
A unique timing event where:
- a token is created after key rotation, but before key distribution
- key distribution is invoked using a mechanism like rsync
- token validation is performed on the API server getting its key repository built by rsync
- the token is validated before the primary key is written to the key repository by rsync, and fails validation because the key repository doesn't contain the key used to encrypt the token
A subsequent request to validate the token should succeed if rsync completes successfully.
pas-ha brought this to the #openstack-keystone channel as an issue that was affecting an internal CI/CD deployment that has a lot of churn [1].
[0] https://docs.openstack.org/keystone/latest/admin/fernet-token-faq.html#what-are-the-different-types-of-keys
[1] http://eavesdrop.openstack.org/irclogs/%23openstack-keystone/%23openstack-keystone.2019-02-20.log.html#t2019-02-20T20:11:12
New value:
The implementation for fernet tokens relies on symmetric encryption. This underpinning requires that each keystone API node "share" the same key repository, specifically in deployments where keystone servers need to validate tokens issued by one another (e.g., a cluster of keystone servers behind an HA proxy).
Without getting into too much detail, each key repository consists of a set of files on disk. The naming of each file is crucial because it denotes the type of key it is (documented extensively [0]). Each file name corresponds to an integer. The file name with the highest index is used to encrypt new tokens, which is called the primary key. The file name with the lowest index, or 0, is known as a staged key and it is always promoted to be the primary key on the next rotation. Every other key in the repository is a secondary key and is only used to decrypt tokens. Each key on disk goes through a lifecycle, starting as a staged key, promoted to a primary key, eventually being demoted to a secondary key.
Note that keystone does *not* handle key distribution between API servers. We recommend this be done using configuration management. The documentation suggests rsync as one possible utility to keep key repositories in sync.
I'm opening this bug because it was brought to our attention that keystone servers may respond with a 401 Invalid Fernet token, in deployments with high churn, or high token load, across a cluster of keystone nodes.
The issue is that in the process of key rotation, the staged key is promoted to be the primary key. As soon as this happens, any subsequent requests to create tokens will use the primary key to encrypt the token. It is assumed all other API servers have a copy of this key, because it's the staged key and also valid as a secondard key. A encrypted with the new primary key should be validatable on other API servers if they have a copy of the staged key, which has the same key contents as the new primary key on the API server that initiated the token rotation.
The rsync implementation deletes the contents of the key repository and rebuilds it, alphanumerically. This results in the staged key always being written by rsync first, because its file name is 0. The primary key is always written last, because its filename is the highest index of the key repository.
A unique timing event where:
- a token is created after key rotation, but before key distribution
- key distribution is invoked using a mechanism like rsync
- token validation is performed on the API server getting its key repository built by rsync
- the token is validated before the primary key is written to the key repository by rsync, and fails validation because the key repository doesn't contain the key used to encrypt the token
A subsequent request to validate the token should succeed if rsync completes successfully.
pas-ha brought this to the #openstack-keystone channel as an issue that was affecting an internal CI/CD deployment that has a lot of churn [1].
[0] https://docs.openstack.org/keystone/latest/admin/fernet-token-faq.html#what-are-the-different-types-of-keys
[1] http://eavesdrop.openstack.org/irclogs/%23openstack-keystone/%23openstack-keystone.2019-02-20.log.html#t2019-02-20T20:11:12
2019-02-21 03:18:27 Lance Bragstad description
Old value:
The implementation for fernet tokens relies on symmetric encryption. This underpinning requires that each keystone API node "share" the same key repository, specifically in deployments where keystone servers need to validate tokens issued by one another (e.g., a cluster of keystone servers behind an HA proxy).
Without getting into too much detail, each key repository consists of a set of files on disk. The naming of each file is crucial because it denotes the type of key it is (documented extensively [0]). Each file name corresponds to an integer. The file name with the highest index is used to encrypt new tokens, which is called the primary key. The file name with the lowest index, or 0, is known as a staged key and it is always promoted to be the primary key on the next rotation. Every other key in the repository is a secondary key and is only used to decrypt tokens. Each key on disk goes through a lifecycle, starting as a staged key, promoted to a primary key, eventually being demoted to a secondary key.
Note that keystone does *not* handle key distribution between API servers. We recommend this be done using configuration management. The documentation suggests rsync as one possible utility to keep key repositories in sync.
I'm opening this bug because it was brought to our attention that keystone servers may respond with a 401 Invalid Fernet token, in deployments with high churn, or high token load, across a cluster of keystone nodes.
The issue is that in the process of key rotation, the staged key is promoted to be the primary key. As soon as this happens, any subsequent requests to create tokens will use the primary key to encrypt the token. It is assumed all other API servers have a copy of this key, because it's the staged key and also valid as a secondard key. A encrypted with the new primary key should be validatable on other API servers if they have a copy of the staged key, which has the same key contents as the new primary key on the API server that initiated the token rotation.
The rsync implementation deletes the contents of the key repository and rebuilds it, alphanumerically. This results in the staged key always being written by rsync first, because its file name is 0. The primary key is always written last, because its filename is the highest index of the key repository.
A unique timing event where:
- a token is created after key rotation, but before key distribution
- key distribution is invoked using a mechanism like rsync
- token validation is performed on the API server getting its key repository built by rsync
- the token is validated before the primary key is written to the key repository by rsync, and fails validation because the key repository doesn't contain the key used to encrypt the token
A subsequent request to validate the token should succeed if rsync completes successfully.
pas-ha brought this to the #openstack-keystone channel as an issue that was affecting an internal CI/CD deployment that has a lot of churn [1].
[0] https://docs.openstack.org/keystone/latest/admin/fernet-token-faq.html#what-are-the-different-types-of-keys
[1] http://eavesdrop.openstack.org/irclogs/%23openstack-keystone/%23openstack-keystone.2019-02-20.log.html#t2019-02-20T20:11:12
New value:
The implementation for fernet tokens relies on symmetric encryption. This underpinning requires that each keystone API node "share" the same key repository, specifically in deployments where keystone servers need to validate tokens issued by one another (e.g., a cluster of keystone servers behind an HA proxy).
Without getting into too much detail, each key repository consists of a set of files on disk. The naming of each file is crucial because it denotes the type of key it is (documented extensively [0]). Each file name corresponds to an integer. The file name with the highest index is used to encrypt new tokens, which is called the primary key. The file name with the lowest index, or 0, is known as a staged key and it is always promoted to be the primary key on the next rotation. Every other key in the repository is a secondary key and is only used to decrypt tokens. Each key on disk goes through a lifecycle, starting as a staged key, promoted to a primary key, eventually being demoted to a secondary key.
Note that keystone does *not* handle key distribution between API servers. We recommend this be done using configuration management. The documentation suggests rsync as one possible utility to keep key repositories in sync.
I'm opening this bug because it was brought to our attention that keystone servers may respond with a 401 Invalid Fernet token, in deployments with high churn, or high token load, across a cluster of keystone nodes.
The issue is that in the process of key rotation, the staged key is promoted to be the primary key. As soon as this happens, any subsequent requests to create tokens will use the primary key to encrypt the token. It is assumed all other API servers have a copy of this key, because it's the staged key and also valid as a secondard key. A token encrypted with the new primary key should be validatable on other API servers if they have a copy of the staged key, which has the same key contents as the new primary key on the API server that initiated the token rotation.
The rsync implementation deletes the contents of the key repository and rebuilds it, alphanumerically. This results in the staged key always being written by rsync first, because its file name is 0. The primary key is always written last, because its filename is the highest index of the key repository.
A unique timing event where:
- a token is created after key rotation, but before key distribution
- key distribution is invoked using a mechanism like rsync
- token validation is performed on the API server getting its key repository built by rsync
- the token is validated before the new primary key is written to the key repository by rsync, and fails validation because the key repository doesn't contain the key used to encrypt the token
A subsequent request to validate the token should succeed if rsync completes successfully.
pas-ha brought this to the #openstack-keystone channel as an issue that was affecting an internal CI/CD deployment that has a lot of churn [1].
[0] https://docs.openstack.org/keystone/latest/admin/fernet-token-faq.html#what-are-the-different-types-of-keys
[1] http://eavesdrop.openstack.org/irclogs/%23openstack-keystone/%23openstack-keystone.2019-02-20.log.html#t2019-02-20T20:11:12
2019-02-21 04:26:26 Lance Bragstad bug task added openstack-ansible
2019-02-21 14:36:43 Lance Bragstad keystone: status New Triaged
2019-02-21 14:36:48 Lance Bragstad keystone: importance Undecided Low
2019-02-21 14:36:53 Lance Bragstad tags fernet docu fernet
2019-02-21 14:36:59 Lance Bragstad tags docu fernet documentation fernet
2019-02-26 20:04:14 Lance Bragstad openstack-ansible: status New Fix Committed
2019-02-26 20:57:37 OpenStack Infra tags documentation fernet documentation fernet in-stable-queens
2019-02-26 21:11:00 OpenStack Infra tags documentation fernet in-stable-queens documentation fernet in-stable-ocata in-stable-queens
2019-02-26 21:12:40 OpenStack Infra tags documentation fernet in-stable-ocata in-stable-queens documentation fernet in-stable-ocata in-stable-pike in-stable-queens
2019-02-26 21:12:49 OpenStack Infra tags documentation fernet in-stable-ocata in-stable-pike in-stable-queens documentation fernet in-stable-ocata in-stable-pike in-stable-queens in-stable-rocky
2019-03-07 09:08:57 OpenStack Infra keystone: status Triaged In Progress
2019-03-07 09:08:57 OpenStack Infra keystone: assignee Pavlo Shchelokovskyy (pshchelo)
2019-03-12 14:33:34 Colleen Murphy keystone: milestone stein-rc1
2019-03-12 21:35:32 OpenStack Infra keystone: status In Progress Fix Released
2024-02-13 17:17:29 Jonathan Rosser openstack-ansible: status Fix Committed Fix Released
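
Illustration (not part of the activity log): the timing window described in the bug description can be sketched as a small simulation. This is a toy model under stated assumptions, not keystone's or rsync's actual code. A key repository is modelled as a dict mapping the integer file name to opaque key contents, "validation" simply checks whether the encrypting key is present, and the function names (classify_keys, rotate, sync_snapshots, can_validate) are hypothetical. Pruning of old keys during rotation is left out, and files are assumed to be rewritten in ascending integer order.

```python
def classify_keys(file_names):
    """Map each key file index to its role, following the naming rule in the description."""
    indexes = sorted(int(name) for name in file_names)
    roles = {}
    for index in indexes:
        if index == 0:
            roles[index] = "staged"      # lowest index, promoted on the next rotation
        elif index == indexes[-1]:
            roles[index] = "primary"     # highest index, used to encrypt new tokens
        else:
            roles[index] = "secondary"   # only used to decrypt tokens
    return roles


def rotate(repo, fresh_key):
    """Promote the staged key to primary and stage a new key (no pruning, by assumption)."""
    rotated = dict(repo)
    rotated[max(repo) + 1] = repo[0]     # staged key contents become the new primary
    rotated[0] = fresh_key               # a new staged key takes index 0
    return rotated


def sync_snapshots(source):
    """Yield the destination repository after each step of a delete-then-rebuild copy."""
    dest = {}
    yield dict(dest)                     # repository emptied, nothing written yet
    for index in sorted(source):         # staged key (0) first, primary key last
        dest[index] = source[index]
        yield dict(dest)


def can_validate(repo, token_key):
    """A token validates only if the key that encrypted it is somewhere in the repository."""
    return token_key in repo.values()


# Two nodes start with identical repositories.
node_a = {0: "k-staged", 1: "k-primary"}
node_b = dict(node_a)

# Node A rotates and issues a token with its new primary key.
node_a = rotate(node_a, "k-fresh")
token_key = node_a[max(node_a)]

# Before distribution, node B can still validate: it holds the same contents as its staged key.
before_sync = can_validate(node_b, token_key)

# During the rebuild, validation fails until the last (primary) file has been written.
during_sync = [can_validate(snapshot, token_key) for snapshot in sync_snapshots(node_a)]
print(before_sync, during_sync)
```

In this model the token validates on node B before the sync starts, fails at every intermediate step of the rebuild, and succeeds again once the highest-index file is in place, which matches the retry behaviour the description expects.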