Logs with added tracing show that driver does ring the endpoint doorbell, so
host controller should start processing the pending requests. Endpoint is
in stopped state as it should after endpoint reset, before we ring the doorbell.
So this part looks like hardware isn't doing its part.
when class driver starts cancelling transfer requests after some timeout time, we can see that the endpoint is in halted state. Host controller didn't issue any
event when endpoint turned into halted state. so driver is unaware of this state.
There is also a bug in the driver how the error is handled later. After the timeout, when class driver starts cancelling transfers, and xhci driver tries to stop the endpoint to cancel tranfers, it sohuld react to the context state error,
and check endpoint state, and handle the halted endpoint p
Driver should react to this, it should detect and handle the halted endpoint before attempting to set a new dequeue pointer. Now it just bluntly tries to set
a new dequeue pointer, and fails.
Details:
* We get a transaction error event, for transfer request (TRB) at 0xf61a0000
96.985254: xhci_handle_event: EVENT: TRB 00000000f61a0000 status 'USB Transaction Error' len 3860 slot 4 ep 3 type 'Transfer Event' flags e:C
96.985262: xhci_handle_transfer: BULK: Buffer 00000000ff32b04c length 3860 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
* We issue a Reset endpoint command to resolve the halted endpoint
(move endpoint from halted to stopped state)
96.985264: xhci_queue_trb: CMD: Reset Endpoint Command: ctx 0000000000000000 slot 4 ep 3 flags C
96.985265: xhci_inc_enq: CMD 0000000090dd7572: enq 0x00000000fff7e550(0x00000000fff7e000) deq 0x00000000fff7e540(0x00000000fff7e000) segs 1
96.985266: xhci_ring_host_doorbell: Ring doorbell for Command Ring 0
96.985268: xhci_inc_deq: EVENT 000000005715d3fc: enq 0x00000000fff7c000(0x00000000fff7c000) deq 0x00000000fff7c4a0(0x00000000fff7c000) segs 1
* Reset endpoint command successfully, endpoint state is now "stopped"
96.985395: xhci_handle_event: EVENT: TRB 00000000fff7e540 status 'Success' len 0 slot 4 ep 0 type 'Command Completion Event' flags e:C
96.985396: xhci_handle_command: CMD: Reset Endpoint Command: ctx 0000000000000000 slot 4 ep 3 flags C
96.985397: xhci_handle_cmd_reset_ep: State stopped mult 1 max P. Streams 0 interval 125 us max ESIT payload 0 CErr 3 Type Bulk IN burst 0 maxp 512 deq 00000000f61a0001 avg \
trb len 0
* We ring the doorbell, xHC hardware should start processing events on ring,
96.985402: xhci_ring_ep_doorbell: Ring doorbell for Slot 4 ep1in
* but nothing happends, this endpoint i silent until class driver starts cancelling Transfers ~25 seconds later
122.813121: xhci_urb_dequeue: ep1in-bulk: urb 00000000790ce3f7 pipe 3221259648 slot 4 length 0/3860 sgs 0/0 stream 0 flags 00010200
122.813134: xhci_dbg_cancel_urb: Cancel URB 00000000790ce3f7, dev 4, ep 0x81, starting at offset 0xf61a07f0
* stop the endpoint to cancel the pending transfers
122.813137: xhci_queue_trb: CMD: Stop Ring Command: slot 4 sp 0 ep 3 flags C
122.813137: xhci_inc_enq: CMD 0000000090dd7572: enq 0x00000000fff7e560(0x00000000fff7e000) deq 0x00000000fff7e550(0x00000000fff7e000) segs 1
122.813138: xhci_ring_host_doorbell: Ring doorbell for Command Ring 0
122.813141: xhci_urb_dequeue: ep1in-bulk: urb 000000009d35e579 pipe 3221259648 slot 4 length 0/3860 sgs 0/0 stream 0 flags 00010200
* Stop endpoit fails with "Context State Error"
Endpoint is halted, not stopped, but driver doesn't handle this
122.813605: xhci_handle_event: EVENT: TRB 00000000fff7e550 status 'Context State Error' len 0 slot 4 ep 0 type 'Command Completion Event' flags e:c
122.813606: xhci_handle_command: CMD: Stop Ring Command: slot 4 sp 0 ep 3 flags C
122.813606: xhci_handle_cmd_stop_ep: State halted mult 1 max P. Streams 0 interval 125 us max ESIT payload 0 CErr 3 Type Bulk IN burst 0 maxp 512 deq 00000000f61a0001
* Driver tries to queue new dequeue pointer anyway
122.813605: xhci_handle_event: EVENT: TRB 00000000fff7e550 status 'Context State Error' len 0 slot 4 ep 0 type 'Command Completion Event' flags e:c
122.813606: xhci_handle_command: CMD: Stop Ring Command: slot 4 sp 0 ep 3 flags C
122.813606: xhci_handle_cmd_stop_ep: State halted mult 1 max P. Streams 0 interval 125 us max ESIT payload 0 CErr 3 Type Bulk IN burst 0 maxp 512 deq 00000000f61a0001
* setting new dequeue pointer fails, endpoint is still halted
122.813820: xhci_handle_event: EVENT: TRB 00000000fff7e560 status 'Context State Error' len 0 slot 4 ep 0 type 'Command Completion Event' flags e:c
122.813820: xhci_handle_command: CMD: Set TR Dequeue Pointer Command: deq 00000000f61a0011 stream 0 slot 4 ep 3 flags C
122.813821: xhci_handle_cmd_set_deq: RS 00000 high-speed Ctx Entries 12 MEL 0 us Port# 8/0 [TT Slot 0 Port# 0 TTT 0 Intr 0] Addr 4 State configured
122.813821: xhci_handle_cmd_set_deq_ep: State halted mult 1 max P. Streams 0 interval 125 us max ESIT payload 0 CErr 3 Type Bulk IN burst 0 maxp 512 deq 00000000f61a0001
@Bernhard
Logs with added tracing show that driver does ring the endpoint doorbell, so
host controller should start processing the pending requests. Endpoint is
in stopped state as it should after endpoint reset, before we ring the doorbell.
So this part looks like hardware isn't doing its part.
when class driver starts cancelling transfer requests after some timeout time, we can see that the endpoint is in halted state. Host controller didn't issue any
event when endpoint turned into halted state. so driver is unaware of this state.
There is also a bug in the driver how the error is handled later. After the timeout, when class driver starts cancelling transfers, and xhci driver tries to stop the endpoint to cancel tranfers, it sohuld react to the context state error,
and check endpoint state, and handle the halted endpoint p
Driver should react to this, it should detect and handle the halted endpoint before attempting to set a new dequeue pointer. Now it just bluntly tries to set
a new dequeue pointer, and fails.
Details:
* We get a transaction error event, for transfer request (TRB) at 0xf61a0000
96.985254: xhci_handle_event: EVENT: TRB 00000000f61a0000 status 'USB Transaction Error' len 3860 slot 4 ep 3 type 'Transfer Event' flags e:C transfer: BULK: Buffer 00000000ff32b04c length 3860 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
96.985262: xhci_handle_
* We issue a Reset endpoint command to resolve the halted endpoint
(move endpoint from halted to stopped state)
96.985264: xhci_queue_trb: CMD: Reset Endpoint Command: ctx 0000000000000000 slot 4 ep 3 flags C 550(0x00000000f ff7e000) deq 0x00000000fff7e 540(0x00000000f ff7e000) segs 1 host_doorbell: Ring doorbell for Command Ring 0 000(0x00000000f ff7c000) deq 0x00000000fff7c 4a0(0x00000000f ff7c000) segs 1
96.985265: xhci_inc_enq: CMD 0000000090dd7572: enq 0x00000000fff7e
96.985266: xhci_ring_
96.985268: xhci_inc_deq: EVENT 000000005715d3fc: enq 0x00000000fff7c
* Reset endpoint command successfully, endpoint state is now "stopped"
96.985395: xhci_handle_event: EVENT: TRB 00000000fff7e540 status 'Success' len 0 slot 4 ep 0 type 'Command Completion Event' flags e:C command: CMD: Reset Endpoint Command: ctx 0000000000000000 slot 4 ep 3 flags C cmd_reset_ ep: State stopped mult 1 max P. Streams 0 interval 125 us max ESIT payload 0 CErr 3 Type Bulk IN burst 0 maxp 512 deq 00000000f61a0001 avg \
96.985396: xhci_handle_
96.985397: xhci_handle_
trb len 0
* We ring the doorbell, xHC hardware should start processing events on ring,
96.985402: xhci_ring_ ep_doorbell: Ring doorbell for Slot 4 ep1in
* but nothing happends, this endpoint i silent until class driver starts cancelling Transfers ~25 seconds later
122.813121: xhci_urb_dequeue: ep1in-bulk: urb 00000000790ce3f7 pipe 3221259648 slot 4 length 0/3860 sgs 0/0 stream 0 flags 00010200 cancel_ urb: Cancel URB 00000000790ce3f7, dev 4, ep 0x81, starting at offset 0xf61a07f0
122.813134: xhci_dbg_
* stop the endpoint to cancel the pending transfers
122.813137: xhci_queue_trb: CMD: Stop Ring Command: slot 4 sp 0 ep 3 flags C 560(0x00000000f ff7e000) deq 0x00000000fff7e 550(0x00000000f ff7e000) segs 1 host_doorbell: Ring doorbell for Command Ring 0
122.813137: xhci_inc_enq: CMD 0000000090dd7572: enq 0x00000000fff7e
122.813138: xhci_ring_
122.813141: xhci_urb_dequeue: ep1in-bulk: urb 000000009d35e579 pipe 3221259648 slot 4 length 0/3860 sgs 0/0 stream 0 flags 00010200
* Stop endpoit fails with "Context State Error"
Endpoint is halted, not stopped, but driver doesn't handle this
122.813605: xhci_handle_event: EVENT: TRB 00000000fff7e550 status 'Context State Error' len 0 slot 4 ep 0 type 'Command Completion Event' flags e:c command: CMD: Stop Ring Command: slot 4 sp 0 ep 3 flags C cmd_stop_ ep: State halted mult 1 max P. Streams 0 interval 125 us max ESIT payload 0 CErr 3 Type Bulk IN burst 0 maxp 512 deq 00000000f61a0001
122.813606: xhci_handle_
122.813606: xhci_handle_
* Driver tries to queue new dequeue pointer anyway
122.813605: xhci_handle_event: EVENT: TRB 00000000fff7e550 status 'Context State Error' len 0 slot 4 ep 0 type 'Command Completion Event' flags e:c command: CMD: Stop Ring Command: slot 4 sp 0 ep 3 flags C cmd_stop_ ep: State halted mult 1 max P. Streams 0 interval 125 us max ESIT payload 0 CErr 3 Type Bulk IN burst 0 maxp 512 deq 00000000f61a0001
122.813606: xhci_handle_
122.813606: xhci_handle_
* setting new dequeue pointer fails, endpoint is still halted
122.813820: xhci_handle_event: EVENT: TRB 00000000fff7e560 status 'Context State Error' len 0 slot 4 ep 0 type 'Command Completion Event' flags e:c command: CMD: Set TR Dequeue Pointer Command: deq 00000000f61a0011 stream 0 slot 4 ep 3 flags C cmd_set_ deq: RS 00000 high-speed Ctx Entries 12 MEL 0 us Port# 8/0 [TT Slot 0 Port# 0 TTT 0 Intr 0] Addr 4 State configured cmd_set_ deq_ep: State halted mult 1 max P. Streams 0 interval 125 us max ESIT payload 0 CErr 3 Type Bulk IN burst 0 maxp 512 deq 00000000f61a0001
122.813820: xhci_handle_
122.813821: xhci_handle_
122.813821: xhci_handle_