Given that VID 0 is, as you say, a special value that marks a priority-only tag (the frame carries 802.1p priority bits in the tag's PCP field but is otherwise treated as untagged), VID 0 is only meaningful to network switching infrastructure, not end nodes. From what I understand, networking hardware should be configured to strip an 802.1Q tag with VID 0 before sending the frame to the end node.
In other words, this sounds like a bug or misconfiguration on the network side.
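
To make concrete what "stripping" the tag means, here is a minimal Python sketch of the operation on a raw Ethernet frame. The function name `strip_priority_tag` is made up for illustration, and the sketch assumes the frame is given without its trailing FCS and carries at most a single 802.1Q tag (TPID 0x8100); real switches do this in hardware and recompute the FCS.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def strip_priority_tag(frame: bytes) -> bytes:
    """Return the frame without its 802.1Q tag if the tag's VID is 0.

    Hypothetical helper for illustration only; frames that are untagged
    or tagged with a non-zero VID are returned unchanged.
    """
    # 6 bytes dst MAC + 6 bytes src MAC + 4 bytes tag + 2 bytes EtherType
    if len(frame) < 18:
        return frame

    # Bytes 12..15 hold the TPID and the TCI (PCP, DEI, VID) if tagged.
    tpid, tci = struct.unpack_from("!HH", frame, 12)
    if tpid != TPID_8021Q:
        return frame  # not an 802.1Q-tagged frame

    vid = tci & 0x0FFF  # low 12 bits of the TCI are the VLAN ID
    if vid != 0:
        return frame  # a real VLAN tag, leave it alone

    # Drop the 4 tag bytes; the inner EtherType moves up to offset 12.
    return frame[:12] + frame[16:]
```

If the end node receives frames that still look like the input to this function, the tag was not removed upstream, which is the situation described above.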