OpenStack Object Storage (swift)

Bug #1639691
Comment #0

Comment 0 for bug 1639691

Revision history for this message

Kota Tsuyuzaki (tsuyuzaki-kota) wrote on 2016-11-07:

This is what I have been fighting in the last week and here I'd like to summarize/describe for the relations and impact to Swift/PyECLib/liberasurecode.

I DO NOT think this is a security risk because this doesn't cause an information leak or significant resource attack from outside, but this includes significant impact for users because of the risk of data lost. This is why I set this as private-security temporary. We can make this as open after notmyname confirm the impact and according to his decision.

To Swift API:
Using isa_l_rs_vand EC policy, pyeclib <= 1.3.1, liberasurecode <=1.2.1 with >= 5 ec_n_parity setting, you can see corrupted data *RARELY* *WITHOUT ANY ERRORS* at downloading. If you are using 'swift' command to download, the error appears most likely,

ubuntu@saio:~$ swift download ectest vagrant/.scratch/swift.tar.gz -o delete.me
Error downloading object 'ectest/vagrant/.scratch/swift.tar.gz': u'Error downloading vagrant/.scratch/swift.tar.gz: md5sum != etag, 4f327eed4b9b9210d13a30c9441fefa1 != 5b60a22fcf65a356391cbdb08b9140d8'

And one more significant point is NOTHING QUARANTINED from the fact so swift is now regarding the corrupted one as a correct object.

The reason, I said "RARELY" above is that occurs due to the fragments combinations. You can see such buggy combinations with following script, that asserts isa_l_rs_vand ec_ndata=10, ec_n_parity=5 case:

from pyeclib.ec_iface import ECDriver
from itertools import combinations

if __name__ == '__main__':
    driver = ECDriver(ec_type='isa_l_rs_vand', k=10, m=5)
    orig_data = 'a' * 1024 * 1024
    encoded = driver.encode(orig_data)

    # make all fragment like (index, frag_data) format to feed to combinations
    frags = [(i, frag) for i, frag in enumerate(encoded)]
    bad_combinations = []
    check_frags_combinations = list(combinations(frags, 10))
    for check_frags in check_frags_combinations:
        check_frags_dict = dict(check_frags)
        decoded = driver.decode(check_frags_dict.values())
        # No No don't kidding me
        try:
            assert decoded == orig_data
        except AssertionError:
            bad_combinations.append(list(check_frags_dict))

    if bad_combinations:
        ratio = float(len(bad_combinations)) / len(check_frags_combinations)
        print 'Bad combinations found: %s/%s (ratio: %s)' % (
            len(bad_combinations), len(check_frags_combinations), ratio)
        print bad_combinations

The result of the script should be:

Bad combinations found: 10/3003 (ratio: 0.00333000333)
[[0, 1, 2, 3, 5, 6, 8, 10, 11, 14], [0, 1, 2, 3, 5, 7, 8, 10, 13, 14], [0, 1, 2, 4, 5, 7, 9, 10, 11, 14], [0, 1, 2, 4, 6, 7, 9, 10, 13, 14], [0, 1, 3, 4, 6, 8, 9, 10, 11, 14], [0, 1, 3, 5, 6, 8, 9, 10, 13, 14], [0, 2, 3, 5, 7, 8, 9, 10, 11, 14], [0, 2, 4, 5, 7, 8, 9, 10, 13, 14], [1, 2, 4, 6, 7, 8, 9, 10, 11, 14], [1, 3, 4, 6, 7, 8, 9, 10, 13, 14]]

So the bad combinations appears approximately in 1/300 of the time.

The bug reason:

The bug exists in liberasurecode isa-l backend implementation. I've already asked to the isa-l maintainer about the possiblity that isa-l can return such a corrupted data without errors[1]. He honestly point out the issue that gf_gen_rs_matrix (which is used in isa_l_rs_vand encode/decode/reconstruct process) could make non-invertable matrix that means "the matrix can have the combination to be unrecoverable". So the bad pattern should exist for the isa-l constraint. However, he points out another issue that "we can find the case if you asserts the return value of gf_invert_matrix" and liberasurecode doesn't. That is the main reason, Swift/PyECLib/liberasurecode can make such a garbage decoding result.

I made a patch[2] that fixes to enable liberasurecode to be able to return Error instead of corrupted data. With [2], Swift will be able to return Error (sort of 5xx) in the bad pattern case.

Still impact to existing clusters:

If you have existing cluster with >=5 parities isa-l policy, it is still problematic even if you patch the fix [2]. That is probably because you are running the *object-reconstructor* in the cluster. As well as Swift API impact, object-reconstructor is also impacted the issue. The following script is an extent from the previous script that asserts what happen if a fragment was replaced in a good pattern with the fragment reconstructed from a bad pattern.

from pyeclib.ec_iface import ECDriver
from itertools import combinations

def test_ec_driver(k, m):
    driver = ECDriver(ec_type='isa_l_rs_vand', k=k, m=m)
    orig_data = 'a' * 1000
    encoded = driver.encode(orig_data)

    # make all fragment like (index, frag_data) format to feed to combinations
    frags = [(i, frag) for i, frag in enumerate(encoded)]
    an_ok_pattern = None

    for check_frags in combinations(frags, k):
        check_frags_dict = dict(check_frags)
        decoded = driver.decode(check_frags_dict.values())
        # No No don't kidding me
        try:
            assert decoded == orig_data
            if not an_ok_pattern:
                # save this ok pattern
                an_ok_pattern = dict(check_frags_dict)
        except AssertionError:
            # find missing index
            for x in range(k + m):
                if x not in check_frags_dict:
                    break
            else:
                raise Exception('something wrong')
            print 'try reconstruct index %s with %s' % (
                x, check_frags_dict.keys())
            # try to reconstruct
            reconed = driver.reconstruct(check_frags_dict.values(), [x])[0]
            try:
                assert reconed == frags[x]
            except AssertionError:
                if x not in an_ok_pattern:
                    raise Exception('need to make this script better')
                # sanity check
                decoded = driver.decode(an_ok_pattern.values())
                assert decoded == orig_data
                print 'this is ok'
                # try to decode again including the garbage reconstructed one
                an_ok_pattern[x] = reconed
                decoded = driver.decode(an_ok_pattern.values())
                assert decoded == orig_data
                print 'this is ok if the assertion above passed'

if __name__ == '__main__':
    params = [(10, 5)]
    for k, m in params:
        test_ec_driver(k, m)

As you can see, the bad pattern can infect the corrupted frags to a good pattern. And then, right now, we have no way to find a good pattern, including the infected garbage as a bad pattern unless we calculate the original object etag with decoding whole fragments in the fragment archive. Even if once we can find the bad pattern, I don't think the way to detect which one or both is (are) bad one(s) exist so that it means, once infected, we cannot stop the pandemic anymore and the infected object will go to the garbage in all fragment archives step by step. This is the worst problem.

I think, to the new incoming objects and brand new cluster deployment, the fix [2] is enough if you can accept to see the errors in 1/300 of the time.

What we can do for this pandemic:

For the possible patches perspective, we should land [2] which can prevent *future pandemic* as soon as possible.

For existing cluster users, sorry, what I can do is just saying "Please plan to migrate to a safer policy as soon as possible before the pandemic goes to data lost". And I'm trying to add the greedy testing [5] to ensure the result consistency in decode/reconstruct that we can know which parameters can be safe. Otherwise, we may be able to leave same pandemic possiblity to another policy.

For isa-l perspective, we know isa-l is the most high-performance open sourced engine in the available backends so the constraint will be a problem to choose. To solve the bad combination, the isa-l maintainer suggested to use another matrix (gf_gen_caucy1_matrix) for the backend which can be invertable at 5>=parities case. I had worked at [3][4] as the draft. And in my experiment, it shows as similar high-perforemance as isa_r_rs_vand without the constraint we hit.

PyECLib/liberasurecode Version Consideration:

In my opinion, we should fix [2] as soon as possible and set it as the Swift requirement to prevent the pandemic in the future. However, right now we don't have the way to set the c lang binary version constraint. To enable the requirement, my idea is as following steps:

1. [2] land at liberasurecode master
2: bump liberasurecode version
3: make hard-coded constraint for PyECLib to require the liberasurecode >= (step2) (this can be painful becuase xenial or yakkety is =< 1.2.1)
4: bump pyeclib version
5: set the pyeclib requirement in Swift requirements.txt (<- this can have a tag #closes-bug for this bug report in the commit message)

Or any better idea is always welcome to me.

1: https://github.com/01org/isa-l/issues/10
2: https://review.openstack.org/#/c/393595/
3: https://review.openstack.org/#/c/393263/
4: https://review.openstack.org/#/c/393276/
5: https://review.openstack.org/#/c/393656/

This is what I have been fighting in the last week and here I'd like to summarize/describe for the relations and impact to Swift/PyECLib/liberasurecode.

ubuntu@saio:~$ swift download ectest vagrant/.scratch/swift.tar.gz -o delete.me
    Error downloading object 'ectest/vagrant/.scratch/swift.tar.gz': u'Error downloading vagrant/.scratch/swift.tar.gz: md5sum != etag, 4f327eed4b9b9210d13a30c9441fefa1 != 5b60a22fcf65a356391cbdb08b9140d8'

And one more significant point is NOTHING QUARANTINED from the fact so swift is now regarding the corrupted one as a correct object.

from pyeclib.ec_iface import ECDriver
from itertools import combinations

if __name__ == '__main__':
    driver = ECDriver(ec_type='isa_l_rs_vand', k=10, m=5)
    orig_data = 'a' * 1024 * 1024
    encoded = driver.encode(orig_data)

The result of the script should be:

So the bad combinations appears approximately in 1/300 of the time.

The bug reason:

I made a patch[2] that fixes to enable liberasurecode to be able to return Error instead of corrupted data. With [2], Swift will be able to return Error (sort of 5xx) in the bad pattern case.

Still impact to existing clusters:

from pyeclib.ec_iface import ECDriver
from itertools import combinations

def test_ec_driver(k, m):
    driver = ECDriver(ec_type='isa_l_rs_vand', k=k, m=m)
    orig_data = 'a' * 1000
    encoded = driver.encode(orig_data)

# make all fragment like (index, frag_data) format to feed to combinations
    frags = [(i, frag) for i, frag in enumerate(encoded)]
    an_ok_pattern = None

for check_frags in combinations(frags, k):
        check_frags_dict = dict(check_frags)
        decoded = driver.decode(check_frags_dict.values())
        # No No don't kidding me
        try:
            assert decoded == orig_data
            if not an_ok_pattern:
                # save this ok pattern
                an_ok_pattern = dict(check_frags_dict)
        except AssertionError:
            # find missing index 
            for x in range(k + m):
                if x not in check_frags_dict:
                    break
            else:
                raise Exception('something wrong')
            print 'try reconstruct index %s with %s' % (
                x, check_frags_dict.keys())
            # try to reconstruct
            reconed = driver.reconstruct(check_frags_dict.values(), [x])[0]
            try:
                assert reconed == frags[x]
            except AssertionError:
                if x not in an_ok_pattern:
                    raise Exception('need to make this script better')
                # sanity check
                decoded = driver.decode(an_ok_pattern.values())
                assert decoded == orig_data
                print 'this is ok'
                # try to decode again including the garbage reconstructed one
                an_ok_pattern[x] = reconed
                decoded = driver.decode(an_ok_pattern.values())
                assert decoded == orig_data
                print 'this is ok if the assertion above passed'

if __name__ == '__main__':
    params = [(10, 5)]
    for k, m in params:
        test_ec_driver(k, m)

I think, to the new incoming objects and brand new cluster deployment, the fix [2] is enough if you can accept to see the errors in 1/300 of the time.

What we can do for this pandemic:

For the possible patches perspective, we should land [2] which can prevent *future pandemic* as soon as possible.

For isa-l perspective, we know isa-l is the most high-performance open sourced engine in the available backends so the constraint will be a problem to choose. To solve the bad combination, the isa-l maintainer suggested to use another matrix (gf_gen_caucy1_matrix) for the backend which can be invertable at 5>=parities case. I had worked at [3][4] as the draft.  And in my experiment, it shows as similar high-perforemance as isa_r_rs_vand without the constraint we hit.

PyECLib/liberasurecode Version Consideration:

Or any better idea is always welcome to me.