Comment 16 for bug 1567557

Colin Ian King (colin-king) wrote : Re: Performance degradation of "zfs clone" when under load

Running strace against zfs create, I see the following ioctl() taking most of the time (roughly 0.38 to 0.39 seconds per call):

1500475028.118005 ioctl(3, _IOC(0, 0x5a, 0x12, 0x00), 0x7ffc7c2184f0) = -1 ENOMEM (Cannot allocate memory) <0.390093>
1500475028.508153 mmap(NULL, 290816, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbfd487b000 <0.000017>
1500475028.508201 ioctl(3, _IOC(0, 0x5a, 0x12, 0x00), 0x7ffc7c2184f0) = 0 <0.382304>
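
One reading of the trace above is a retry: the first ioctl() fails with ENOMEM, a larger buffer is mapped, and the same ioctl() is issued again and succeeds, so the expensive work runs twice. This is an interpretation of the trace rather than something stated in the comment. The sketch below models that pattern in Python; fake_objset_stats and call_with_retry are hypothetical names, and the buffer sizes are illustrative only.

```python
import errno

def fake_objset_stats(buf_size, needed):
    """Hypothetical stand-in for the ioctl: ENOMEM when the buffer is too small."""
    if buf_size < needed:
        return errno.ENOMEM, needed   # report how much space is wanted
    return 0, needed

def call_with_retry(initial_size, needed):
    """Call once, grow the buffer on ENOMEM, and call again."""
    size = initial_size
    calls = 0
    while True:
        calls += 1
        err, wanted = fake_objset_stats(size, needed)
        if err == 0:
            return calls, size
        if err != errno.ENOMEM:
            raise OSError(err, "unexpected ioctl failure")
        size = wanted                 # retry with a buffer of the reported size

calls, final_size = call_with_retry(16384, 290816)
print(calls, final_size)
```

If this reading is right, each slow call in the trace is paid for twice per invocation, which matches the two similar durations shown.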

I believe this is an ioctl on /dev/zfs, namely ZFS_IOC_OBJSET_STATS, which gets stats on all the ZFS file systems. This ioctl takes longer as the number of clones increases, so I believe this is the API causing the bottleneck.
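
As a rough check on how the strace fields relate to the request number, the sketch below rebuilds the value from _IOC(0, 0x5a, 0x12, 0x00) using the usual Linux x86 bit layout (8 bits of command number, 8 bits of type, 14 bits of size, 2 bits of direction). The helper name ioc is made up for illustration; the mapping of command 0x12 to ZFS_IOC_OBJSET_STATS is taken from the comment, not derived here.

```python
# Bit positions of the ioctl request fields on Linux x86 (asm-generic layout).
IOC_NRSHIFT = 0
IOC_TYPESHIFT = 8
IOC_SIZESHIFT = 16
IOC_DIRSHIFT = 30

def ioc(direction, type_, nr, size):
    """Pack the four _IOC fields into a single request number."""
    return ((direction << IOC_DIRSHIFT) | (size << IOC_SIZESHIFT)
            | (type_ << IOC_TYPESHIFT) | nr)

request = ioc(0, 0x5a, 0x12, 0x00)
print(hex(request))   # 0x5a12
print(chr(0x5a))      # Z
```

The type byte 0x5a is the ASCII letter 'Z', and 0x12 is the command index within that group.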

perf shows that over 99% of the zfs clone time is indeed spent performing this ioctl:

- 99.39% 0.00% zfs [kernel.kallsyms] [k] sys_ioctl
     sys_ioctl
     do_vfs_ioctl
   - zfsdev_ioctl
      - 99.33% zfs_ioc_objset_stats
         - 99.30% zfs_ioc_objset_stats_impl.part.20
            - 99.21% dmu_objset_stats
               - dsl_dataset_stats
                  - 99.18% get_clones_stat
                     - 60.46% fnvlist_add_nvlist
                          nvlist_add_nvlist
                          nvlist_add_common.part.51
                          nvlist_copy_embedded.isra.54
                          nvlist_copy_pairs.isra.52
                        - nvlist_add_common.part.51
                           - 30.35% nvlist_copy_embedded.isra.54
                                nvlist_copy_pairs.isra.52
                              + nvlist_add_common.part.51
                           - 29.62% nvlist_remove_all.part.49
                                strcmp
                     - 31.23% fnvlist_add_boolean
                        - nvlist_add_boolean
                        - nvlist_add_common.part.51
                           - 30.20% nvlist_remove_all.part.49
                                strcmp
                             0.94% strcmp
                     - 6.37% dsl_dataset_hold_obj
                        - 6.28% dmu_bonus_hold
                           - 5.89% dnode_hold
                              - 5.83% dnode_hold_impl
                                 - 5.40% dbuf_read
                                    - dmu_zfetch
                                       - 5.19% dmu_zfetch_dofetch.isra.7
                                          - 4.76% dbuf_prefetch
                                             - 2.67% dbuf_find
                                                  mutex_lock
                                               0.91% mutex_unlock
                                               0.55% dnode_block_freed
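
The profile hints at why the cost grows with the number of clones: each fnvlist_add_* call under get_clones_stat ends up in nvlist_remove_all, which spends its time in strcmp. The following is a toy Python model of that pattern, not the ZFS code. It assumes every add first compares the new name against all pairs already in the list; add_unique and comparisons_for are hypothetical names.

```python
def add_unique(pairs, name, value, counter):
    """Add (name, value), first dropping any existing pair with the same name."""
    kept = []
    for existing_name, existing_value in pairs:
        counter[0] += 1                    # one name comparison per existing pair
        if existing_name != name:
            kept.append((existing_name, existing_value))
    kept.append((name, value))
    return kept

def comparisons_for(n_clones):
    """Count name comparisons needed to add n_clones distinct clone names."""
    counter = [0]
    pairs = []
    for i in range(n_clones):
        pairs = add_unique(pairs, f"pool/clone{i}", True, counter)
    return counter[0]

print(comparisons_for(1000))   # 499500
print(comparisons_for(2000))   # 1999000
```

Under this assumption, doubling the number of clones roughly quadruples the comparisons (n(n-1)/2 in total), which would be consistent with the ioctl slowing down as clones accumulate.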