Comment 6 for bug 1785816

Colin Ian King (colin-king) wrote :

With CONFIG_NETWORK_PHY_TIMESTAMPING enabled, the calls to skb_clone_tx_timestamp() and skb_defer_rx_timestamp() become real function calls (with the config disabled they are empty inlined no-op functions). From what I can see the overhead is very small. For example, for the tx path:

static unsigned int classify(const struct sk_buff *skb)
{
        if (likely(skb->dev && skb->dev->phydev &&
                   skb->dev->phydev->drv))
                return ptp_classify_raw(skb);
        else
                return PTP_CLASS_NONE;
}

void skb_clone_tx_timestamp(struct sk_buff *skb)
{
        struct phy_device *phydev;
        struct sk_buff *clone;
        unsigned int type;

        if (!skb->sk)
                return;

        type = classify(skb);
        if (type == PTP_CLASS_NONE)
                return;

        phydev = skb->dev->phydev;
        if (likely(phydev->drv->txtstamp)) {
                clone = skb_clone_sk(skb);
                if (!clone)
                        return;
                phydev->drv->txtstamp(phydev, clone, type);
        }
}

The classify() call is the added overhead: it runs a minimal BPF dissector over the network packet to determine its PTP class. For the default non-PTP case this returns PTP_CLASS_NONE. The BPF classifier is just 3-4 BPF branches (depending on the protocol), so it is a very small per-packet overhead in the default non-PTP case.
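
To give a feel for how few decisions are involved, here is a rough plain-C approximation of that kind of classification. It is not the kernel's BPF program: the function name, the SKETCH_* constants and the return values are made up for illustration, and it only handles untagged Ethernet carrying either PTP directly (EtherType 0x88F7) or IPv4/UDP to the PTP event port 319. VLAN tags and IPv6 are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes, not the kernel's PTP_CLASS_* values. */
enum { SKETCH_PTP_NONE = 0, SKETCH_PTP_L2 = 1, SKETCH_PTP_IPV4 = 2 };

#define SKETCH_ETH_HDR_LEN     14
#define SKETCH_ETHERTYPE_IPV4  0x0800
#define SKETCH_ETHERTYPE_PTP   0x88F7
#define SKETCH_IPPROTO_UDP     17
#define SKETCH_PTP_EVENT_PORT  319

/* Classify a raw, untagged Ethernet frame with a handful of branches. */
static unsigned int sketch_classify(const uint8_t *frame, size_t len)
{
        const uint8_t *ip;
        uint16_t ethertype, frag, dport;
        size_t ihl;

        assert(frame != NULL);

        if (len < SKETCH_ETH_HDR_LEN)
                return SKETCH_PTP_NONE;

        ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
        if (ethertype == SKETCH_ETHERTYPE_PTP)
                return SKETCH_PTP_L2;           /* PTP directly over Ethernet */
        if (ethertype != SKETCH_ETHERTYPE_IPV4)
                return SKETCH_PTP_NONE;         /* most non-IPv4 traffic exits here */

        if (len < SKETCH_ETH_HDR_LEN + 20)
                return SKETCH_PTP_NONE;
        ip = frame + SKETCH_ETH_HDR_LEN;
        if (ip[9] != SKETCH_IPPROTO_UDP)
                return SKETCH_PTP_NONE;         /* plain TCP traffic exits here */

        /* Non-first fragments carry no UDP header, so skip them. */
        frag = (uint16_t)(((ip[6] << 8) | ip[7]) & 0x1fff);
        ihl = (size_t)(ip[0] & 0x0f) * 4;
        if (frag != 0 || ihl < 20 || len < SKETCH_ETH_HDR_LEN + ihl + 4)
                return SKETCH_PTP_NONE;

        /* UDP destination port sits 2 bytes into the UDP header. */
        dport = (uint16_t)((ip[ihl + 2] << 8) | ip[ihl + 3]);
        return dport == SKETCH_PTP_EVENT_PORT ? SKETCH_PTP_IPV4 : SKETCH_PTP_NONE;
}
```

In this sketch an ordinary TCP segment is rejected after the length check, two EtherType comparisons and one protocol comparison, which is the same order of cost as described above.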

I ran some perf timings on TCP data being sent to and received from a host over 100 Mbit/s Ethernet between two 8-thread Xeon servers, and measured CPU cycles, instructions and branch activity with perf. 1 GB of raw data was transferred to/from the machines using netcat on otherwise idle systems. Each test was run 10 times and the average, standard deviation (population) and % standard deviation were computed.
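
For reference, the per-test statistics can be reproduced with something like the following sketch. This is not the spreadsheet's formulas, just an equivalent calculation: struct run_stats, compute_stats() and sqrt_newton() are names invented here, and the small Newton-Raphson square root is only there so the example builds without linking libm (sqrt() from math.h would be the usual choice).

```c
#include <assert.h>
#include <stddef.h>

struct run_stats {
        double mean;
        double stddev;          /* population standard deviation */
        double pct_stddev;      /* stddev as a percentage of the mean */
};

/* Newton-Raphson square root, used only to avoid a libm dependency. */
static double sqrt_newton(double x)
{
        double guess;
        int i;

        if (x <= 0.0)
                return 0.0;
        guess = x > 1.0 ? x : 1.0;
        for (i = 0; i < 100; i++)
                guess = 0.5 * (guess + x / guess);
        return guess;
}

/* Mean, population stddev (divide by n, not n - 1) and % stddev. */
static struct run_stats compute_stats(const double *samples, size_t n)
{
        struct run_stats s = { 0.0, 0.0, 0.0 };
        double sum = 0.0, sq_dev = 0.0;
        size_t i;

        assert(samples != NULL && n > 0);

        for (i = 0; i < n; i++)
                sum += samples[i];
        s.mean = sum / (double)n;

        for (i = 0; i < n; i++) {
                double d = samples[i] - s.mean;

                sq_dev += d * d;
        }
        s.stddev = sqrt_newton(sq_dev / (double)n);

        if (s.mean != 0.0)
                s.pct_stddev = 100.0 * s.stddev / s.mean;
        return s;
}
```

Feeding it the 10 cycle (or instruction, or branch) counts from one test gives the three figures used for the comparison. A difference between two kernels only means something if it is clearly larger than the % standard deviation of each set of runs.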

I compared a default 4.17.0-6-generic Ubuntu Cosmic kernel against the same kernel with CONFIG_NETWORK_PHY_TIMESTAMPING enabled. I could not observe any noticeable impact from the config, mainly because the noise in the perf measurements was larger than any detectable difference (see the % standard deviation rates).

Since I can't easily measure the performance impact any more accurately than instruction and branch counts, I conclude that the impact of this config is not easily measurable and too small to be a concern.

Data in a LibreOffice spreadsheet is attached.

I therefore deem this config OK to be enabled by default for our kernels.