------- Comment From <email address hidden> 2016-09-06 10:21 EDT-------
I think I found the bug in the genwqe code:
ddcb_cmd_fixups() -> genwqe_alloc_sync_sgl() (fails in f/lpage, but sgl->sgl != NULL and f/lpage may also be != NULL) -> ddcb_cmd_cleanup() -> genwqe_free_sync_sgl() (double free, because sgl->sgl != NULL and f/lpage may also be != NULL)
In this scenario we would have exactly the kind of double free that would explain the WARNING / Bad page state, and as expected it is caused by broken error handling (cleanup).
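To illustrate the pattern described above, here is a minimal user-space sketch in C. It is not the genwqe code and not the attached patch: all names (demo_sgl, demo_sgl_alloc, demo_sgl_free, demo_free, demo_free_calls) are hypothetical, and malloc/free stand in for the driver's real allocators. It only shows why an error path that frees buffers must also reset the pointers if a later cleanup call decides what to free by checking them against NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's SGL bookkeeping (not the real struct). */
struct demo_sgl {
	void *sgl;   /* the scatter-gather list itself */
	void *fpage; /* buffer for the first partial page */
	void *lpage; /* buffer for the last partial page */
};

/* Counts how many buffers were actually released, for demonstration only. */
int demo_free_calls;

/* Free *p if it is set and reset it to NULL so a second call is a no-op. */
void demo_free(void **p)
{
	if (*p != NULL) {
		free(*p);
		demo_free_calls++;
		*p = NULL; /* without this, a later cleanup would free it again */
	}
}

/* Cleanup: releases whatever is still non-NULL. Safe to call repeatedly. */
void demo_sgl_free(struct demo_sgl *s)
{
	demo_free(&s->lpage);
	demo_free(&s->fpage);
	demo_free(&s->sgl);
}

/*
 * Allocation with an error path. fail_lpage simulates the failure in the
 * f/lpage step. On failure, everything allocated so far is released AND
 * the pointers are reset, so the caller's cleanup cannot free them again.
 */
int demo_sgl_alloc(struct demo_sgl *s, int fail_lpage)
{
	assert(s != NULL);
	s->sgl = s->fpage = s->lpage = NULL;

	s->sgl = malloc(64);
	if (s->sgl == NULL)
		goto err_out;

	s->fpage = malloc(64);
	if (s->fpage == NULL)
		goto err_out;

	s->lpage = fail_lpage ? NULL : malloc(64);
	if (s->lpage == NULL)
		goto err_out;

	return 0;

err_out:
	demo_sgl_free(s); /* frees and NULLs in one place */
	return -1;
}
```

In this sketch a caller can run demo_sgl_alloc(), see it fail, and still call demo_sgl_free() from its own cleanup path without releasing any buffer a second time. If demo_free() did not reset the pointer, the same sequence would be a double free.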
Since I am not familiar with the genwqe code, it would be good if Frank could have a look at the patch.
Using the Ubuntu git source, tag Ubuntu-4.4.0-33.52, I can reproduce the "Bad page state" issue, and with the patch on top I cannot reproduce it any more.
Alex, I'll attach a Debian kernel package with the patch applied. Please verify whether it also solves the issue for you.
Frank, I'll attach the patch. Please comment.