Activity log for bug #1548135

Date Who What changed Old value New value Message
2016-02-21 23:55:11 Eva Balycheva bug added bug
2016-02-21 23:55:26 Eva Balycheva summary Old value: (Redis) Unable to claim messages after some messages have expired New value: (Redis) Unable to claim messages after some messages expired
2016-02-22 00:17:45 Eva Balycheva description

Old value:

How to reproduce:
1. Post some messages.
2. Wait for them to expire.
3. (Optional) Post some more messages if you wish; it doesn't matter.
4. Try to claim messages.

You'll get error 503 on the client, and on the server you'll get:

ResponseError: Error running script (call to f_f8726c7ad2f323131fb1bbd395fff6b501b316c8): @user_script:59: user_script:59: attempt to compare nil with number

Why it happens:
Redis can automatically expire top-level keys. We ask Redis to expire our top-level keys, <claim id> or <message id>, after some time, and Redis cleans them wonderfully. However, there are also these top-level keys:
1. <project_id>.<queue_name>.messages (contains list of message ids)
2. <project_id>.<queue_name>.claims (contains list of claim ids)
And these are not cleaned. Neither Zaqar nor Redis removes expired ids from these lists.

So the claim_messages.lua script grabs message ids from the <project_id>.<queue_name>.messages top-level key list:
https://github.com/openstack/zaqar/blob/master/zaqar/storage/redis/scripts/claim_messages.lua#L40
Then it takes the first grabbed message id and tries to get two field values of the corresponding <message id> key:
https://github.com/openstack/zaqar/blob/master/zaqar/storage/redis/scripts/claim_messages.lua#L57
If the <message id> key has already expired and been cleaned by Redis, this function returns [nil, nil] instead of [claim id, expiration date number]. So this condition fails, because it can't be evaluated:
https://github.com/openstack/zaqar/blob/master/zaqar/storage/redis/scripts/claim_messages.lua#L59

Possible solution:
Because it's impossible to make Redis auto-expire a row in a list (see https://github.com/antirez/redis/issues/167#issuecomment-2559040), make Zaqar check each row's expiration date on every query against the lists <project_id>.<queue_name>.messages and <project_id>.<queue_name>.claims. If the date has passed, delete the row.

New value:

How to reproduce:
1. Post some messages.
2. Wait for them to expire.
3. (Optional) Post some more messages if you wish; it doesn't matter.
4. Try to claim messages.

You'll get error 503 on the client, and on the server you'll get:

ResponseError: Error running script (call to f_f8726c7ad2f323131fb1bbd395fff6b501b316c8): @user_script:59: user_script:59: attempt to compare nil with number

Why it happens:
Redis can automatically expire top-level keys. We ask Redis to expire our top-level keys, <claim id> or <message id>, after some time, and Redis cleans them wonderfully. However, there are also these top-level keys:
1. <project_id>.<queue_name>.messages (contains set of message ids)
2. <project_id>.<queue_name>.claims (contains set of claim ids)
And these are not cleaned. Neither Zaqar nor Redis removes expired ids from these sets.

So the claim_messages.lua script grabs message ids from the <project_id>.<queue_name>.messages top-level key set:
https://github.com/openstack/zaqar/blob/master/zaqar/storage/redis/scripts/claim_messages.lua#L40
Then it takes the first grabbed message id and tries to get two field values of the corresponding <message id> key:
https://github.com/openstack/zaqar/blob/master/zaqar/storage/redis/scripts/claim_messages.lua#L57
If the <message id> key has already expired and been cleaned by Redis, this function returns [nil, nil] instead of [claim id, expiration date number]. So this condition fails, because it can't be evaluated:
https://github.com/openstack/zaqar/blob/master/zaqar/storage/redis/scripts/claim_messages.lua#L59

Possible solution:
Because it's impossible to make Redis auto-expire a row in a set (see https://github.com/antirez/redis/issues/167#issuecomment-2559040), make Zaqar check each row's expiration date on every query against the sets <project_id>.<queue_name>.messages and <project_id>.<queue_name>.claims. If the date has passed, delete the row.
2016-02-22 00:38:54 Eva Balycheva description

Old value:

How to reproduce:
1. Post some messages.
2. Wait for them to expire.
3. (Optional) Post some more messages if you wish; it doesn't matter.
4. Try to claim messages.

You'll get error 503 on the client, and on the server you'll get:

ResponseError: Error running script (call to f_f8726c7ad2f323131fb1bbd395fff6b501b316c8): @user_script:59: user_script:59: attempt to compare nil with number

Why it happens:
Redis can automatically expire top-level keys. We ask Redis to expire our top-level keys, <claim id> or <message id>, after some time, and Redis cleans them wonderfully. However, there are also these top-level keys:
1. <project_id>.<queue_name>.messages (contains set of message ids)
2. <project_id>.<queue_name>.claims (contains set of claim ids)
And these are not cleaned. Neither Zaqar nor Redis removes expired ids from these sets.

So the claim_messages.lua script grabs message ids from the <project_id>.<queue_name>.messages top-level key set:
https://github.com/openstack/zaqar/blob/master/zaqar/storage/redis/scripts/claim_messages.lua#L40
Then it takes the first grabbed message id and tries to get two field values of the corresponding <message id> key:
https://github.com/openstack/zaqar/blob/master/zaqar/storage/redis/scripts/claim_messages.lua#L57
If the <message id> key has already expired and been cleaned by Redis, this function returns [nil, nil] instead of [claim id, expiration date number]. So this condition fails, because it can't be evaluated:
https://github.com/openstack/zaqar/blob/master/zaqar/storage/redis/scripts/claim_messages.lua#L59

Possible solution:
Because it's impossible to make Redis auto-expire a row in a set (see https://github.com/antirez/redis/issues/167#issuecomment-2559040), make Zaqar check each row's expiration date on every query against the sets <project_id>.<queue_name>.messages and <project_id>.<queue_name>.claims. If the date has passed, delete the row.

New value:

How to reproduce:
1. Post some messages.
2. Wait for them to expire.
3. (Optional) Post some more messages if you wish; it doesn't matter.
4. Try to claim messages.

You'll get error 503 on the client, and on the server you'll get:

ResponseError: Error running script (call to f_f8726c7ad2f323131fb1bbd395fff6b501b316c8): @user_script:59: user_script:59: attempt to compare nil with number

Why it happens:
Redis can automatically expire top-level keys. We ask Redis to expire our top-level keys, <claim id> or <message id>, after some time, and Redis cleans them wonderfully. However, there are also these top-level keys:
1. <project_id>.<queue_name>.messages (contains set of message ids)
2. <project_id>.<queue_name>.claims (contains set of claim ids)
And these are not cleaned. Neither Zaqar nor Redis removes expired ids from these sets.

So the claim_messages.lua script grabs message ids from the <project_id>.<queue_name>.messages top-level key set:
https://github.com/openstack/zaqar/blob/master/zaqar/storage/redis/scripts/claim_messages.lua#L40
Then it takes the first grabbed message id and tries to get two field values of the corresponding <message id> key:
https://github.com/openstack/zaqar/blob/master/zaqar/storage/redis/scripts/claim_messages.lua#L57
If the <message id> key has already expired and been cleaned by Redis, this function returns [nil, nil] instead of [claim id, expiration date number]. So this condition fails, because it can't be evaluated:
https://github.com/openstack/zaqar/blob/master/zaqar/storage/redis/scripts/claim_messages.lua#L59

Possible solution:
Because it's impossible to make Redis auto-expire a row in a set (see https://github.com/antirez/redis/issues/167#issuecomment-2559040), make Zaqar check each row's expiration date on every query against the sets <project_id>.<queue_name>.messages and <project_id>.<queue_name>.claims. If the date has passed, delete the row.

Note: each row of <project_id>.<queue_name>.messages should also include an expiration date. Currently this is only implemented in <project_id>.<queue_name>.claims.
2016-05-08 21:59:25 Feilong Wang zaqar: importance Undecided High
2016-05-08 22:02:17 Eva Balycheva zaqar: assignee Eva Balycheva (ubershy)
2016-05-09 20:48:07 OpenStack Infra zaqar: status New In Progress
2016-06-06 09:18:49 OpenStack Infra zaqar: status In Progress Fix Released
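
The failure mode and the proposed cleanup from the bug description can be illustrated with a small in-memory sketch. This is not Zaqar's code and not the released fix: the dict-based store, the helper names (hmget, claim_naive, claim_safe) and the field names (claim_id, claim_expires) are all hypothetical stand-ins, and the claimability rule is simplified to "claim expiry is at or before now". A missing dict key models a <message id> hash that Redis has already expired, and Python's TypeError plays the role of Lua's "attempt to compare nil with number".

```python
def hmget(store, key):
    """Mimic a two-field HMGET: a missing key yields (None, None)."""
    record = store.get(key)
    if record is None:
        return None, None
    return record["claim_id"], record["claim_expires"]


def claim_naive(index, store, now, limit):
    """Assumes every indexed id still has a hash; breaks on expired ids."""
    claimed = []
    for msg_id in index:
        if len(claimed) >= limit:
            break
        _claim_id, claim_expires = hmget(store, msg_id)
        # Raises TypeError when claim_expires is None (the nil comparison).
        if claim_expires <= now:
            claimed.append(msg_id)
    return claimed


def claim_safe(index, store, now, limit):
    """Skip ids whose hash is gone and prune them from the index."""
    claimed, stale = [], []
    for msg_id in index:
        if len(claimed) >= limit:
            break
        _claim_id, claim_expires = hmget(store, msg_id)
        if claim_expires is None:
            # The hash expired but its id was left behind in the index.
            stale.append(msg_id)
            continue
        if claim_expires <= now:
            claimed.append(msg_id)
    for msg_id in stale:
        index.remove(msg_id)
    return claimed


if __name__ == "__main__":
    # "m1" expired: its id is still indexed but its hash is gone.
    index = ["m1", "m2", "m3"]
    store = {
        "m2": {"claim_id": None, "claim_expires": 0},    # unclaimed
        "m3": {"claim_id": "c9", "claim_expires": 500},  # claim still active
    }
    print(claim_safe(index, store, now=100, limit=10))
    print(index)
```

Pruning the stale ids during the claim pass keeps the index from growing without bound, at the cost of doing cleanup work on the read path; whether the actual fix takes this route is not stated in this log.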