Consider enabling CONFIG_NETWORK_PHY_TIMESTAMPING

Bug #1785816 reported by Mark Shuttleworth on 2018-08-07
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
High
Seth Forshee
Bionic
High
Seth Forshee

Bug Description

SRU Justification

Impact: CONFIG_NETWORK_PHY_TIMESTAMPING is disabled, preventing PTP time synchronization using the DP83640 PHYTER.

Fix: Enable this option and set CONFIG_DP83640_PHY=m to enable the DP83640 PHYTER driver.

Regression Potential: The kconfig help text warns of adding overhead to network tx and rx, but our analysis finds this to be negligible (see comment #6). Beyond this we are enabling a new driver which does little unless timestamping is explicitly activated, so regressions are not expected.

---

Hi folks, is there a reason we do not enable CONFIG_NETWORK_PHY_TIMESTAMPING ? I'm not sure of the tradeoffs but I think this config option is required for PTP time syncronization which would be useful in telco and financial (and OCD :)) environments.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1785816

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu):
importance: Undecided → High
status: Incomplete → Triaged
tags: added: kernel-key
Seth Forshee (sforshee) wrote :

We don't have any reason noted for having that option disabled. But based on the kconfig help text I'd guess that it's because it adds overhead to all network activity:

   This allows timestamping of network packets by PHYs with
   hardware timestamping capabilities. This option adds some
   overhead in the transmit and receive paths.

We need to do some testing to try and quantify this overhead.

Colin Ian King (colin-king) wrote :

I'll rig up some test scenarios and measure the overhead.

Changed in linux (Ubuntu):
assignee: nobody → Colin Ian King (colin-king)

Thanks folks, looking forward to the results. I would have thought it
would noop where the feature was not in use; as far as I know it
requires explicit activation.

Mark

Seth Forshee (sforshee) wrote :

There's some extra code that's added universally, and more code if a PHY which supports timestamping is present (currently only the DP83640 PHYTER device looks to be enabled by this option). What's done in the PHY-specific code depends on whether or not timestamping is enabled. So it's not completely a noop, but it's entirely possible that the real world impact in most cases will be negligible.

Colin Ian King (colin-king) wrote :

With config CONFIG_NETWORK_PHY_TIMESTAMPING enabled, the calls to skb_clone_tx_timestamp() and skb_defer_rx_timestamp() are enabled (these normally are empty inlined no-op functions). The overhead from what I can see is very small, for example for the tx path:

static unsigned int classify(const struct sk_buff *skb)
{
        if (likely(skb->dev && skb->dev->phydev &&
                   skb->dev->phydev->drv))
                return ptp_classify_raw(skb);
        else
                return PTP_CLASS_NONE;
}

void skb_clone_tx_timestamp(struct sk_buff *skb)
{
        struct phy_device *phydev;
        struct sk_buff *clone;
        unsigned int type;

        if (!skb->sk)
                return;

        type = classify(skb);
        if (type == PTP_CLASS_NONE)
                return;

        phydev = skb->dev->phydev;
        if (likely(phydev->drv->txtstamp)) {
                clone = skb_clone_sk(skb);
                if (!clone)
                        return;
                phydev->drv->txtstamp(phydev, clone, type);
        }
}

The classify() call is an overhead that runs a minimal BPF dissector to classify a network packet to
determine the PTP class. For the default non PTP case this returns PTP_CLASS_NONE. The BPF classifier is just 3-4 BPF branches (depending on the protocol), so it's a very small overhead per packet in the default non-PTP cases.

I ran some perf timings on TCP data being sent and received to a host over a 100 Mbit/s ethernet between to 8 thread Xeon servers and measured CPU cycles, instruction and branch activity with perf. 1 GB of raw data was transferred to/from the machines using netcat on otherwises idle systems. Each test was run 10 times and the average, standard deviation (population) and % standard deviation was computed.

I compared a default 4.17.0-6-generic Ubuntu Cosmic kernel against the same kernel with CONFIG_NETWORK_PHY_TIMESTAMPING. I could not observe any noticeable impact with the CONFIG_NETWORK_PHY_TIMESTAMPING config - mainly because the noise in the perf measurements was larger than any detectable difference (see the % standard deviation rates).

Since I can't easily measure the performance impact any more accurately than instruction and branch counts, I conclude that the impact of this config is not easily measurable and too small to be a concern.

Data in a libreoffice spread sheet is attached.

I therefore deem this config is OK to be enabled for by default for our kernels.

Colin Ian King (colin-king) wrote :

@Seth, would you be so kind to as to enable this given the evidence above?

Seth Forshee (sforshee) wrote :

Thanks Colin. Yes, I'll get our configs updated.

Changed in linux (Ubuntu):
assignee: Colin Ian King (colin-king) → Seth Forshee (sforshee)
Seth Forshee (sforshee) on 2018-08-08
Changed in linux (Ubuntu):
status: Triaged → In Progress
Mark Shuttleworth (sabdfl) wrote :

Colin, thanks for the analysis.

No rush on the change.

Mark

tags: added: kernel-da-key
removed: kernel-key
Seth Forshee (sforshee) wrote :

Committed to cosmic.

Mark, I'm assuming this should also go into 18.04. Is it wanted in the 16.04 GA kernel as well?

Changed in linux (Ubuntu):
status: In Progress → Fix Committed
Mark Shuttleworth (sabdfl) wrote :

Hi Seth, 18.04 LTS is plenty for my purposes and for users wanting to
kick the tires on the functionality. No need for a backport on the
change at this stage.

Thank you!

Mark

Seth Forshee (sforshee) on 2018-08-09
Changed in linux (Ubuntu Bionic):
assignee: nobody → Seth Forshee (sforshee)
importance: Undecided → High
status: New → In Progress
Seth Forshee (sforshee) on 2018-08-09
description: updated
Dany (s-dany) wrote :

@Colin, thank you for posting your findings in comment #6.

If I am reading the spreadsheet correctly, it seems like the majority of the time the CONFIG_NETWORK_PHY_TIMESTAMPING enabled kernel is on average using fewer instructions and branches for the server side (yellow cells H27-K36 and H65-K74) than the kernel without this setting (green cells B27-E36 and B65-E74). For 512 byte TCP buffers that's also the case for the client.

Am I reading the spreadsheet correctly?

Launchpad Janitor (janitor) wrote :
Download full text (12.2 KiB)

This bug was fixed in the package linux - 4.17.0-9.10

---------------
linux (4.17.0-9.10) cosmic; urgency=medium

  * linux: 4.17.0-9.10 -proposed tracker (LP: #1787988)

  * Cosmic update to 4.17.17 stable release (LP: #1787973)
    - x86/speculation/l1tf: Exempt zeroed PTEs from inversion
    - Linux 4.17.17

  * Cosmic update to 4.17.16 stable release (LP: #1787972)
    - x86/l1tf: Fix build error seen if CONFIG_KVM_INTEL is disabled
    - x86: i8259: Add missing include file
    - x86/platform/UV: Mark memblock related init code and data correctly
    - x86/mm/pti: Clear Global bit more aggressively
    - xen/pv: Call get_cpu_address_sizes to set x86_virt/phys_bits
    - x86/mm: Disable ioremap free page handling on x86-PAE
    - kbuild: verify that $DEPMOD is installed
    - crypto: ccree - fix finup
    - crypto: ccree - fix iv handling
    - crypto: ccp - Check for NULL PSP pointer at module unload
    - crypto: ccp - Fix command completion detection race
    - crypto: x86/sha256-mb - fix digest copy in sha256_mb_mgr_get_comp_job_avx2()
    - crypto: vmac - require a block cipher with 128-bit block size
    - crypto: vmac - separate tfm and request context
    - crypto: blkcipher - fix crash flushing dcache in error path
    - crypto: ablkcipher - fix crash flushing dcache in error path
    - crypto: skcipher - fix aligning block size in skcipher_copy_iv()
    - crypto: skcipher - fix crash flushing dcache in error path
    - ioremap: Update pgtable free interfaces with addr
    - x86/mm: Add TLB purge to free pmd/pte page interfaces
    - Linux 4.17.16

  * Cosmic update to 4.17.16 stable release (LP: #1787972) // CVE-2018-9363
    - Bluetooth: hidp: buffer overflow in hidp_process_report

  * linux-cloud-tools-common: Ensure hv-kvp-daemon.service starts before
    walinuxagent.service (LP: #1739107)
    - [Debian] hyper-v -- Ensure that hv-kvp-daemon.service starts before
      walinuxagent.service

  * Miscellaneous Ubuntu changes
    - [Packaging] retpoline -- fix temporary filenaming

linux (4.17.0-8.9) cosmic; urgency=medium

  * linux: 4.17.0-8.9 -proposed tracker (LP: #1787259)

  * Cosmic update to v4.17.15 stable release (LP: #1787257)
    - parisc: Enable CONFIG_MLONGCALLS by default
    - parisc: Define mb() and add memory barriers to assembler unlock sequences
    - Mark HI and TASKLET softirq synchronous
    - stop_machine: Disable preemption after queueing stopper threads
    - sched/deadline: Update rq_clock of later_rq when pushing a task
    - zram: remove BD_CAP_SYNCHRONOUS_IO with writeback feature
    - xen/netfront: don't cache skb_shinfo()
    - bpf, sockmap: fix leak in bpf_tcp_sendmsg wait for mem path
    - bpf, sockmap: fix bpf_tcp_sendmsg sock error handling
    - scsi: sr: Avoid that opening a CD-ROM hangs with runtime power management
      enabled
    - scsi: qla2xxx: Fix memory leak for allocating abort IOCB
    - init: rename and re-order boot_cpu_state_init()
    - root dentries need RCU-delayed freeing
    - make sure that __dentry_kill() always invalidates d_seq, unhashed or not
    - fix mntput/mntput race
    - fix __legitimize_mnt()/mntput() race
    - ARM: dts: imx6sx: fix irq for pcie bridge
  ...

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
tags: added: kernel-key
removed: kernel-da-key
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Terry Rudd (terrykrudd) wrote :

Mark, if you want to kick the tires on this capability before it goes out, we're planning to release early next week, just FYI.

Mark Shuttleworth (sabdfl) wrote :

Timestamping seems to be working. However there may be other networking
issues in -proposed that could be kernel related (captured in bug
#1818340
) that might warrant a hold on promotion.

Hi Mark,

According to comment https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1818340/comments/8, the systemd-networkd issue has been isolated to systemd and verified that it's not a regression introduced by kernel 4.15.0-46-generic.

So considering the assessment of bug #1818340 and the previous comment about the timestamping feature working as expected, I will mark the verification as done so we can go ahead with the kernel promotion.

Thank you!

tags: added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote :
Download full text (11.4 KiB)

This bug was fixed in the package linux - 4.15.0-46.49

---------------
linux (4.15.0-46.49) bionic; urgency=medium

  * linux: 4.15.0-46.49 -proposed tracker (LP: #1814726)

  * mprotect fails on ext4 with dax (LP: #1799237)
    - x86/speculation/l1tf: Exempt zeroed PTEs from inversion

  * kernel BUG at /build/linux-vxxS7y/linux-4.15.0/mm/slub.c:296! (LP: #1812086)
    - iscsi target: fix session creation failure handling
    - scsi: iscsi: target: Set conn->sess to NULL when iscsi_login_set_conn_values
      fails
    - scsi: iscsi: target: Fix conn_ops double free

  * user_copy in user from ubuntu_kernel_selftests failed on KVM kernel
    (LP: #1812198)
    - selftests: user: return Kselftest Skip code for skipped tests
    - selftests: kselftest: change KSFT_SKIP=4 instead of KSFT_PASS
    - selftests: kselftest: Remove outdated comment

  * RTL8822BE WiFi Disabled in Kernel 4.18.0-12 (LP: #1806472)
    - SAUCE: staging: rtlwifi: allow RTLWIFI_DEBUG_ST to be disabled
    - [Config] CONFIG_RTLWIFI_DEBUG_ST=n
    - SAUCE: Add r8822be to signature inclusion list

  * kernel oops in bcache module (LP: #1793901)
    - SAUCE: bcache: never writeback a discard operation

  * CVE-2018-18397
    - userfaultfd: use ENOENT instead of EFAULT if the atomic copy user fails
    - userfaultfd: shmem: allocate anonymous memory for MAP_PRIVATE shmem
    - userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas
    - userfaultfd: shmem: add i_size checks
    - userfaultfd: shmem: UFFDIO_COPY: set the page dirty if VM_WRITE is not set

  * Ignore "incomplete report" from Elan touchpanels (LP: #1813733)
    - HID: i2c-hid: Ignore input report if there's no data present on Elan
      touchpanels

  * Vsock connect fails with ENODEV for large CID (LP: #1813934)
    - vhost/vsock: fix vhost vsock cid hashing inconsistent

  * SRU: Fix thinkpad 11e 3rd boot hang (LP: #1804604)
    - ACPI / LPSS: Force LPSS quirks on boot

  * Bionic update: upstream stable patchset 2019-01-17 (LP: #1812229)
    - scsi: sd_zbc: Fix variable type and bogus comment
    - KVM/Eventfd: Avoid crash when assign and deassign specific eventfd in
      parallel.
    - x86/apm: Don't access __preempt_count with zeroed fs
    - x86/events/intel/ds: Fix bts_interrupt_threshold alignment
    - x86/MCE: Remove min interval polling limitation
    - fat: fix memory allocation failure handling of match_strdup()
    - ALSA: hda/realtek - Add Panasonic CF-SZ6 headset jack quirk
    - ARCv2: [plat-hsdk]: Save accl reg pair by default
    - ARC: Fix CONFIG_SWAP
    - ARC: configs: Remove CONFIG_INITRAMFS_SOURCE from defconfigs
    - ARC: mm: allow mprotect to make stack mappings executable
    - mm: memcg: fix use after free in mem_cgroup_iter()
    - mm/huge_memory.c: fix data loss when splitting a file pmd
    - cpufreq: intel_pstate: Register when ACPI PCCH is present
    - vfio/pci: Fix potential Spectre v1
    - stop_machine: Disable preemption when waking two stopper threads
    - drm/i915: Fix hotplug irq ack on i965/g4x
    - drm/nouveau: Use drm_connector_list_iter_* for iterating connectors
    - drm/nouveau: Avoid looping through fake MST connectors
    - gen_stats: Fix netl...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers