Comment 55 for bug 1733662

tglx (tglx) wrote : RE: [REGRESSION][v4.14.y][v4.15] x86/intel_rdt/cqm: Improve limbo list processing

On Tue, 16 Jan 2018, Yu, Fenghua wrote:
> > From: Thomas Gleixner [mailto:<email address hidden>]
> Is this a Haswell specific issue?
>
> I run the following test forever without issue on Broadwell and 4.15.0-rc6 with rdt mounted:
> for ((;;)) do
> for ((i=1;i<88;i++)) do
> echo 0 >/sys/devices/system/cpu/cpu$i/online
> done
> echo "online cpus:"
> grep processor /proc/cpuinfo |wc
> for ((i=1;i<88;i++)) do
> echo 1 >/sys/devices/system/cpu/cpu$i/online
> done
> echo "online cpus:"
> grep processor /proc/cpuinfo|wc
> done
>
> I'm finding a Haswell to reproduce the issue.

Come on. This is crystal clear from the KASAN trace. And the fix is simple enough.

You simply do not run into it because on your machine

    is_llc_occupancy_enabled() is false...

Thanks,

 tglx

8<--------------------

diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 88dcf8479013..99442370de40 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -525,10 +525,6 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 		 */
 		if (static_branch_unlikely(&rdt_mon_enable_key))
 			rmdir_mondata_subdir_allrdtgrp(r, d->id);
-		kfree(d->ctrl_val);
-		kfree(d->rmid_busy_llc);
-		kfree(d->mbm_total);
-		kfree(d->mbm_local);
 		list_del(&d->list);
 		if (is_mbm_enabled())
 			cancel_delayed_work(&d->mbm_over);
@@ -545,6 +541,10 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 			cancel_delayed_work(&d->cqm_limbo);
 		}

+		kfree(d->ctrl_val);
+		kfree(d->rmid_busy_llc);
+		kfree(d->mbm_total);
+		kfree(d->mbm_local);
 		kfree(d);
 		return;
 	}
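
For readers following the patch, here is a minimal userspace sketch of the ordering it restores: member buffers are released only after the last teardown step that may still read them. It assumes the KASAN report stems from a step in the llc-occupancy path reading a buffer that had already been freed. `struct domain`, `domain_alloc()`, `domain_has_busy()` and `domain_teardown()` are hypothetical names made up for illustration, not kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-domain state; NOT the kernel's structure. */
struct domain {
	unsigned long *busy_map;	/* plays the role of d->rmid_busy_llc */
	size_t nwords;			/* number of words in busy_map */
};

/* Allocate a domain with a zeroed busy map; returns NULL on failure. */
static struct domain *domain_alloc(size_t nwords)
{
	struct domain *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->busy_map = calloc(nwords, sizeof(*d->busy_map));
	if (!d->busy_map) {
		free(d);
		return NULL;
	}
	d->nwords = nwords;
	return d;
}

/* A teardown step that still dereferences the member buffer. */
static bool domain_has_busy(const struct domain *d)
{
	for (size_t i = 0; i < d->nwords; i++)
		if (d->busy_map[i])
			return true;
	return false;
}

/* Tear the domain down; reports whether busy entries were still set. */
static bool domain_teardown(struct domain *d)
{
	/* 1. Run every step that reads the member buffers first. */
	bool was_busy = domain_has_busy(d);

	/* 2. Only then free the members, and the container last. */
	free(d->busy_map);
	free(d);
	return was_busy;
}
```

Swapping steps 1 and 2 in `domain_teardown()` gives the use-after-free pattern. The read only happens when the check is actually reached, which would fit the bug not showing up on a machine where `is_llc_occupancy_enabled()` is false.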