With CONFIG_NETWORK_PHY_TIMESTAMPING enabled, the calls to skb_clone_tx_timestamp() and skb_defer_rx_timestamp() are compiled in (normally they are empty inline no-op functions). From what I can see the overhead is very small; for example, for the tx path:
static unsigned int classify(const struct sk_buff *skb)
{
    if (likely(skb->dev && skb->dev->phydev &&
               skb->dev->phydev->drv))
        return ptp_classify_raw(skb);
    else
        return PTP_CLASS_NONE;
}

void skb_clone_tx_timestamp(struct sk_buff *skb)
{
    struct phy_device *phydev;
    struct sk_buff *clone;
    unsigned int type;

    if (!skb->sk)
        return;

    type = classify(skb);
    if (type == PTP_CLASS_NONE)
        return;

    phydev = skb->dev->phydev;
    if (likely(phydev->drv->txtstamp)) {
        clone = skb_clone_sk(skb);
        if (!clone)
            return;
        phydev->drv->txtstamp(phydev, clone, type);
    }
}
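As an aside, the "empty inlined no-op" arrangement can be sketched in userspace roughly as below. This is only an illustration of the pattern: SKETCH_PHY_TIMESTAMPING, struct sketch_skb and the sketch_* functions are made-up names standing in for the config symbol, struct sk_buff and the real hooks, and the bodies are placeholders.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct sk_buff, just enough for the sketch. */
struct sketch_skb {
    int timestamp_requests;
};

#ifdef SKETCH_PHY_TIMESTAMPING
/* Option on: real functions that do some per-packet work (placeholder). */
void sketch_clone_tx_timestamp(struct sketch_skb *skb)
{
    skb->timestamp_requests++;
}

bool sketch_defer_rx_timestamp(struct sketch_skb *skb)
{
    skb->timestamp_requests++;
    return true;
}
#else
/* Option off: empty inline stubs, which the compiler can drop entirely. */
static inline void sketch_clone_tx_timestamp(struct sketch_skb *skb)
{
    (void)skb;
}

static inline bool sketch_defer_rx_timestamp(struct sketch_skb *skb)
{
    (void)skb;
    return false;
}
#endif

/* The transmit path calls the hook unconditionally in both builds. */
void sketch_xmit(struct sketch_skb *skb)
{
    sketch_clone_tx_timestamp(skb);
}
```

Built without -DSKETCH_PHY_TIMESTAMPING, sketch_xmit() does nothing to the packet; built with it, every call goes through the real function, which is the per-packet cost being assessed here.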
The classify() call is the overhead: it runs a minimal BPF dissector over the network packet to determine its PTP class. For the default non-PTP case this returns PTP_CLASS_NONE. The BPF classifier is just 3-4 BPF branches (depending on the protocol), so it's a very small overhead per packet in the default non-PTP cases.
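To give a feel for why a non-PTP packet is rejected after only a few comparisons, here is a rough userspace sketch of that kind of check. It is not the kernel's BPF program: sketch_classify() and the SKETCH_* codes are hypothetical names, and it only handles untagged Ethernet with PTP directly over Ethernet (EtherType 0x88F7) or over IPv4/UDP to the PTP event port 319, ignoring VLAN and IPv6.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch; not the kernel's PTP_CLASS_* values. */
enum sketch_class { SKETCH_NONE = 0, SKETCH_PTP_L2 = 1, SKETCH_PTP_UDP4 = 2 };

#define SKETCH_ETH_HLEN     14u
#define SKETCH_ETH_P_IPV4   0x0800u
#define SKETCH_ETH_P_PTP    0x88F7u /* IEEE 1588 directly over Ethernet */
#define SKETCH_IPPROTO_UDP  17u
#define SKETCH_PTP_EV_PORT  319u    /* PTP event messages over UDP */

/* Read a big-endian 16-bit field. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

enum sketch_class sketch_classify(const uint8_t *frame, size_t len)
{
    if (len < SKETCH_ETH_HLEN)
        return SKETCH_NONE;

    uint16_t ethertype = be16(frame + 12);
    if (ethertype == SKETCH_ETH_P_PTP)
        return SKETCH_PTP_L2;
    if (ethertype != SKETCH_ETH_P_IPV4)
        return SKETCH_NONE;

    const uint8_t *ip = frame + SKETCH_ETH_HLEN;
    size_t ip_len = len - SKETCH_ETH_HLEN;
    if (ip_len < 20)
        return SKETCH_NONE;
    if (ip[9] != SKETCH_IPPROTO_UDP)   /* ordinary TCP traffic exits here */
        return SKETCH_NONE;
    if (be16(ip + 6) & 0x1FFF)         /* non-first fragment: no UDP header */
        return SKETCH_NONE;

    size_t ihl = (size_t)(ip[0] & 0x0F) * 4;
    if (ihl < 20 || ip_len < ihl + 8)
        return SKETCH_NONE;
    if (be16(ip + ihl + 2) != SKETCH_PTP_EV_PORT)
        return SKETCH_NONE;

    return SKETCH_PTP_UDP4;
}
```

For a TCP-over-IPv4 frame this sketch does a length check, two EtherType comparisons and one protocol comparison before returning, which is the same order of work as the few branches described above.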
I ran some perf timings on TCP data being sent to and received from a host over 100 Mbit/s ethernet between two 8-thread Xeon servers, and measured CPU cycles, instructions and branch activity with perf. 1 GB of raw data was transferred to/from the machines using netcat on otherwise idle systems. Each test was run 10 times and the average, standard deviation (population) and % standard deviation were computed.
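For clarity on how those three figures relate, the per-counter summary can be sketched as below. This is an illustration rather than the spreadsheet's formulas: struct run_stats and summarize() are hypothetical names, and the small Newton-iteration square root is only there so the example builds without linking libm (sqrt() from math.h would be the usual choice).

```c
#include <stddef.h>

/* Summary of one perf counter across repeated runs (hypothetical type). */
struct run_stats {
    double mean;
    double stddev;      /* population standard deviation (divide by n) */
    double stddev_pct;  /* stddev as a percentage of the mean */
};

/* Square root by Newton's method, to avoid a libm dependency in this sketch. */
static double sketch_sqrt(double x)
{
    if (x <= 0.0)
        return 0.0;
    double r = x > 1.0 ? x : 1.0;
    for (int i = 0; i < 64; i++)
        r = 0.5 * (r + x / r);
    return r;
}

struct run_stats summarize(const double *samples, size_t n)
{
    struct run_stats s = { 0.0, 0.0, 0.0 };
    if (n == 0)
        return s;

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += samples[i];
    s.mean = sum / (double)n;

    double sq = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = samples[i] - s.mean;
        sq += d * d;
    }
    s.stddev = sketch_sqrt(sq / (double)n);

    if (s.mean != 0.0)
        s.stddev_pct = 100.0 * s.stddev / s.mean;
    return s;
}
```

Running summarize() over the 10 per-run values of one counter gives the average, standard deviation and % standard deviation for that counter; a difference between the two kernels that is smaller than the % standard deviation is indistinguishable from run-to-run noise.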
I compared a default 4.17.0-6-generic Ubuntu Cosmic kernel against the same kernel with CONFIG_NETWORK_PHY_TIMESTAMPING enabled. I could not observe any noticeable impact from the config, mainly because the noise in the perf measurements was larger than any detectable difference (see the % standard deviation figures).
Since I can't easily measure the performance impact any more accurately than instruction and branch counts, I conclude that the impact of this config is not easily measurable and too small to be a concern.
Data in a LibreOffice spreadsheet is attached.
I therefore deem this config OK to be enabled by default for our kernels.