Ironic

Bug #1472565
Activity log

Activity log for bug #1472565

Date	Who	What changed	Old value	New value	Message
2015-07-08 10:40:24	Lucas Alvares Gomes	bug			added bug
2015-07-08 10:40:31	Lucas Alvares Gomes	ironic: assignee		Lucas Alvares Gomes (lucasagomes)
2015-07-08 10:40:37	Lucas Alvares Gomes	ironic: importance	Undecided	Medium
2015-07-08 10:41:48	Lucas Alvares Gomes	summary	API: Not all Conflicts (409) should be retryable	API: Not all Conflicts (409) should be retr-yable
2015-07-08 10:41:54	Lucas Alvares Gomes	summary	API: Not all Conflicts (409) should be retr-yable	API: Not all Conflicts (409) should be retry-able
2015-07-08 10:42:10	Lucas Alvares Gomes	description	Currently the way the interaction between the python-ironicclient and the Ironic API service works is that for Conflict (409) returned by the API we should retry it a number of times. While this assumption is valid for some cases e.g node is currently locked and doing some operation so you have to try it later. It's not valid for others e.g when registering a port with a mac address that is already registered. So following example of registering a port with a duplicated which has a MAC address, the request will take ~30s before it actually returns the error, this is not something that the server eventually will fix and retrying would succeed. See paste: [1] This problem happens when * Registering a node with duplicated "name", "instance_uuid" or "uuid". * Registering a port with duplicated "uuid" or "address" * Registering a chassis with duplicated "uuid" * Trying to delete a node which is in a provision state where it can't be deleted (e.g ACTIVE, CLEANING, etc...) So some suggestions here: 1) We need a way to indicate when 409 is retryable or not. One way to do it is by using the "Retry-After"[2] HTTP header and having the client to respect it. If specified the client should wait the number of seconds specified in that header and try again (we can keep the option of "--max-retries" and "--retry-interval" for the client, but the "--retry-interval" by default will be None and if specified it overwrite the seconds passed by the server in the "Retry-After" header) 2) Use a different HTTP code for non retryable errors. I would suggest us to use Unprocessable Entity (422) for it. Since the syntax of the request is fine we can't use BadRequest (400) [1] http://paste.openstack.org/show/354956/ [2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.37 [3] https://tools.ietf.org/html/rfc4918#section-11.2	Currently the way the interaction between the python-ironicclient and the Ironic API service works is that for Conflict (409) returned by the API we should retry it a number of times. While this assumption is valid for some cases e.g node is currently locked and doing some operation so you have to try it later. It's not valid for others e.g when registering a port with a mac address that is already registered. So following example of registering a port with a duplicated which has a MAC address, the request will take ~30s before it actually returns the error, this is not something that the server eventually will fix and retrying would succeed. See paste: [1] This problem happens when * Registering a node with duplicated "name", "instance_uuid" or "uuid". * Registering a port with duplicated "uuid" or "address" * Registering a chassis with duplicated "uuid" * Trying to delete a node which is in a provision state where it can't be deleted (e.g ACTIVE, CLEANING, etc...) So some suggestions here: 1) We need a way to indicate when 409 is retry-able or not. One way to do it is by using the "Retry-After"[2] HTTP header and having the client to respect it. If specified the client should wait the number of seconds specified in that header and try again (we can keep the option of "--max-retries" and "--retry-interval" for the client, but the "--retry-interval" by default will be None and if specified it overwrite the seconds passed by the server in the "Retry-After" header) 2) Use a different HTTP code for non retryable errors. I would suggest us to use Unprocessable Entity (422) for it. Since the syntax of the request is fine we can't use BadRequest (400) [1] http://paste.openstack.org/show/354956/ [2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.37 [3] https://tools.ietf.org/html/rfc4918#section-11.2
2015-07-08 10:44:31	Lucas Alvares Gomes	description	Currently the way the interaction between the python-ironicclient and the Ironic API service works is that for Conflict (409) returned by the API we should retry it a number of times. While this assumption is valid for some cases e.g node is currently locked and doing some operation so you have to try it later. It's not valid for others e.g when registering a port with a mac address that is already registered. So following example of registering a port with a duplicated which has a MAC address, the request will take ~30s before it actually returns the error, this is not something that the server eventually will fix and retrying would succeed. See paste: [1] This problem happens when * Registering a node with duplicated "name", "instance_uuid" or "uuid". * Registering a port with duplicated "uuid" or "address" * Registering a chassis with duplicated "uuid" * Trying to delete a node which is in a provision state where it can't be deleted (e.g ACTIVE, CLEANING, etc...) So some suggestions here: 1) We need a way to indicate when 409 is retry-able or not. One way to do it is by using the "Retry-After"[2] HTTP header and having the client to respect it. If specified the client should wait the number of seconds specified in that header and try again (we can keep the option of "--max-retries" and "--retry-interval" for the client, but the "--retry-interval" by default will be None and if specified it overwrite the seconds passed by the server in the "Retry-After" header) 2) Use a different HTTP code for non retryable errors. I would suggest us to use Unprocessable Entity (422) for it. Since the syntax of the request is fine we can't use BadRequest (400) [1] http://paste.openstack.org/show/354956/ [2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.37 [3] https://tools.ietf.org/html/rfc4918#section-11.2	Currently the way the interaction between the python-ironicclient and the Ironic API service works is that for Conflict (409) returned by the API we should retry it a number of times. While this assumption is valid for some cases e.g node is currently locked and doing some operation so you have to try it later. It's not valid for others e.g when registering a port with a mac address that is already registered. So following the example of registering a port with a duplicated MAC address, the request will take ~30s before it actually returns the error. This is not something that the server will eventually fix so retrying doesn't make much sense. See paste: [1] Currently the non retry-able Conflict(409) in our API (IMHO) are: * Registering a node with duplicated "name", "instance_uuid" or "uuid". * Registering a port with duplicated "uuid" or "address" * Registering a chassis with duplicated "uuid" * Trying to delete a node which is in a provision state where it can't be deleted (e.g ACTIVE, CLEANING, etc...) So some suggestions here: 1) We need a way to indicate when 409 is retry-able or not. One way to do it is by using the "Retry-After"[2] HTTP header and having the client to respect it. If specified the client should wait the number of seconds specified in that header and try again (we can keep the option of "--max-retries" and "--retry-interval" for the client, but the "--retry-interval" by default will be None and if specified it overwrite the seconds passed by the server in the "Retry-After" header) 2) Use a different HTTP code for non retryable errors. I would suggest us to use Unprocessable Entity (422) for it. Since the syntax of the request is fine we can't use BadRequest (400) [1] http://paste.openstack.org/show/354956/ [2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.37 [3] https://tools.ietf.org/html/rfc4918#section-11.2
2015-07-08 10:44:59	Lucas Alvares Gomes	description	Currently the way the interaction between the python-ironicclient and the Ironic API service works is that for Conflict (409) returned by the API we should retry it a number of times. While this assumption is valid for some cases e.g node is currently locked and doing some operation so you have to try it later. It's not valid for others e.g when registering a port with a mac address that is already registered. So following the example of registering a port with a duplicated MAC address, the request will take ~30s before it actually returns the error. This is not something that the server will eventually fix so retrying doesn't make much sense. See paste: [1] Currently the non retry-able Conflict(409) in our API (IMHO) are: * Registering a node with duplicated "name", "instance_uuid" or "uuid". * Registering a port with duplicated "uuid" or "address" * Registering a chassis with duplicated "uuid" * Trying to delete a node which is in a provision state where it can't be deleted (e.g ACTIVE, CLEANING, etc...) So some suggestions here: 1) We need a way to indicate when 409 is retry-able or not. One way to do it is by using the "Retry-After"[2] HTTP header and having the client to respect it. If specified the client should wait the number of seconds specified in that header and try again (we can keep the option of "--max-retries" and "--retry-interval" for the client, but the "--retry-interval" by default will be None and if specified it overwrite the seconds passed by the server in the "Retry-After" header) 2) Use a different HTTP code for non retryable errors. I would suggest us to use Unprocessable Entity (422) for it. Since the syntax of the request is fine we can't use BadRequest (400) [1] http://paste.openstack.org/show/354956/ [2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.37 [3] https://tools.ietf.org/html/rfc4918#section-11.2	Currently the way the interaction between the python-ironicclient and the Ironic API service works is that for Conflict (409) returned by the API we should retry it a number of times. While this assumption is valid for some cases e.g node is currently locked and doing some operation so you have to try it later. It's not valid for others e.g when registering a port with a mac address that is already registered. So following the example of registering a port with a duplicated MAC address, the request will take ~30s before it actually returns the error. This is not something that the server will eventually fix so retrying doesn't make much sense. See paste: [1] Currently the non retry-able Conflict(409) in our API (IMHO) are: * Registering a node with duplicated "name", "instance_uuid" or "uuid". * Registering a port with duplicated "uuid" or "address" * Registering a chassis with duplicated "uuid" * Trying to delete a node which is in a provision state where it can't be deleted (e.g ACTIVE, CLEANING, etc...) Suggestions: 1) We need a way to indicate when 409 is retry-able or not. One way to do it is by using the "Retry-After"[2] HTTP header and having the client to respect it. If specified the client should wait the number of seconds specified in that header and try again (we can keep the option of "--max-retries" and "--retry-interval" for the client, but the "--retry-interval" by default will be None and if specified it overwrite the seconds passed by the server in the "Retry-After" header) 2) Use a different HTTP code for non retryable errors. I would suggest us to use Unprocessable Entity (422) for it. Since the syntax of the request is fine we can't use BadRequest (400) [1] http://paste.openstack.org/show/354956/ [2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.37 [3] https://tools.ietf.org/html/rfc4918#section-11.2
2015-07-08 10:46:14	Lucas Alvares Gomes	description	Currently the way the interaction between the python-ironicclient and the Ironic API service works is that for Conflict (409) returned by the API we should retry it a number of times. While this assumption is valid for some cases e.g node is currently locked and doing some operation so you have to try it later. It's not valid for others e.g when registering a port with a mac address that is already registered. So following the example of registering a port with a duplicated MAC address, the request will take ~30s before it actually returns the error. This is not something that the server will eventually fix so retrying doesn't make much sense. See paste: [1] Currently the non retry-able Conflict(409) in our API (IMHO) are: * Registering a node with duplicated "name", "instance_uuid" or "uuid". * Registering a port with duplicated "uuid" or "address" * Registering a chassis with duplicated "uuid" * Trying to delete a node which is in a provision state where it can't be deleted (e.g ACTIVE, CLEANING, etc...) Suggestions: 1) We need a way to indicate when 409 is retry-able or not. One way to do it is by using the "Retry-After"[2] HTTP header and having the client to respect it. If specified the client should wait the number of seconds specified in that header and try again (we can keep the option of "--max-retries" and "--retry-interval" for the client, but the "--retry-interval" by default will be None and if specified it overwrite the seconds passed by the server in the "Retry-After" header) 2) Use a different HTTP code for non retryable errors. I would suggest us to use Unprocessable Entity (422) for it. Since the syntax of the request is fine we can't use BadRequest (400) [1] http://paste.openstack.org/show/354956/ [2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.37 [3] https://tools.ietf.org/html/rfc4918#section-11.2	Currently the way the interaction between the python-ironicclient and the Ironic API service works is that for Conflict (409) returned by the API we should retry it a number of times. While this assumption is valid for some cases e.g node is currently locked and doing some operation so you have to try it later. It's not valid for others e.g when registering a port with a mac address that is already registered. So following the example of registering a port with a duplicated MAC address, the request will take ~30s before it actually returns the error. This is not something that the server will eventually fix so retrying doesn't make much sense. See paste: [1] Currently the non retry-able Conflict(409) in our API (IMHO) are: * Registering a node with duplicated "name", "instance_uuid" or "uuid". * Registering a port with duplicated "uuid" or "address" * Registering a chassis with duplicated "uuid" * Trying to delete a node which is in a provision state where it can't be deleted (e.g ACTIVE, CLEANING, etc...) Suggestions: 1) We need a way to indicate when 409 is retry-able or not. One way to do it is by using the "Retry-After"[2] HTTP header and having the client to respect it. If specified the client should wait the number of seconds specified in that header and try again (We could keep the options of "--max-retries" and "--retry-interval" for the client, but the "--retry-interval" by default will be None and if specified it overwrite the seconds passed by the server in the "Retry-After" header) If "Retry-After" is not specified in the header, the client should not retry the request. 2) Use a different HTTP code for non retryable errors. I would suggest us to use Unprocessable Entity (422) for it. Since the syntax of the request is fine we can't use BadRequest (400) [1] http://paste.openstack.org/show/354956/ [2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.37 [3] https://tools.ietf.org/html/rfc4918#section-11.2
2015-07-08 10:46:38	Lucas Alvares Gomes	description	Currently the way the interaction between the python-ironicclient and the Ironic API service works is that for Conflict (409) returned by the API we should retry it a number of times. While this assumption is valid for some cases e.g node is currently locked and doing some operation so you have to try it later. It's not valid for others e.g when registering a port with a mac address that is already registered. So following the example of registering a port with a duplicated MAC address, the request will take ~30s before it actually returns the error. This is not something that the server will eventually fix so retrying doesn't make much sense. See paste: [1] Currently the non retry-able Conflict(409) in our API (IMHO) are: * Registering a node with duplicated "name", "instance_uuid" or "uuid". * Registering a port with duplicated "uuid" or "address" * Registering a chassis with duplicated "uuid" * Trying to delete a node which is in a provision state where it can't be deleted (e.g ACTIVE, CLEANING, etc...) Suggestions: 1) We need a way to indicate when 409 is retry-able or not. One way to do it is by using the "Retry-After"[2] HTTP header and having the client to respect it. If specified the client should wait the number of seconds specified in that header and try again (We could keep the options of "--max-retries" and "--retry-interval" for the client, but the "--retry-interval" by default will be None and if specified it overwrite the seconds passed by the server in the "Retry-After" header) If "Retry-After" is not specified in the header, the client should not retry the request. 2) Use a different HTTP code for non retryable errors. I would suggest us to use Unprocessable Entity (422) for it. Since the syntax of the request is fine we can't use BadRequest (400) [1] http://paste.openstack.org/show/354956/ [2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.37 [3] https://tools.ietf.org/html/rfc4918#section-11.2	Currently the way the interaction between the python-ironicclient and the Ironic API service works is that for Conflict (409) returned by the API we should retry it a number of times. While this assumption is valid for some cases e.g node is currently locked and doing some operation so you have to try it later. It's not valid for others e.g when registering a port with a mac address that is already registered. So following the example of registering a port with a duplicated MAC address, the request will take ~30s before it actually returns the error. This is not something that the server will eventually fix so retrying doesn't make much sense. See paste: [1] Currently the non retry-able Conflict(409) in our API (IMHO) are: * Registering a node with duplicated "name", "instance_uuid" or "uuid". * Registering a port with duplicated "uuid" or "address" * Registering a chassis with duplicated "uuid" * Trying to delete a node which is in a provision state where it can't be deleted (e.g ACTIVE, CLEANING, etc...) Suggestions: 1) We need a way to indicate when 409 is retry-able or not. One way to do it is by using the "Retry-After"[2] HTTP header and having the client to respect it. If specified the client should wait the number of seconds specified in that header and try again (We could keep the options of "--max-retries" and "--retry-interval" for the client, but the "--retry-interval" by default will be None and if specified it overwrite the seconds passed by the server in the "Retry-After" header) If "Retry-After" is not specified in the header, the client should not retry the request. 2) Use a different HTTP code for non retryable errors. I would suggest us to use Unprocessable Entity (422) for it. Since the syntax of the request is fine we can't use BadRequest (400) This won't need any change in the client. [1] http://paste.openstack.org/show/354956/ [2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.37 [3] https://tools.ietf.org/html/rfc4918#section-11.2
2015-07-08 14:48:09	Dmitry Tantsur	ironic: status	New	Triaged
2015-07-08 15:15:03	Lucas Alvares Gomes	marked as duplicate		1461140