#991548 mesa: Since 20.3.5-1 AMD Vega GPU runs at max V and 90W in idle

#991548#5
Date:
2021-07-27 08:13:46 UTC
From:
To:
Dear Maintainer,

   * What led up to the situation?
	Update from 20.3.4-1 to 20.3.5-1
   * What exactly did you do (or not do) that was effective (or
     ineffective)?
	Update and then a reboot (PC shut down over night)
   * What was the outcome of this action?
	The GPU idles at >90 W and at the maximum voltage
	of 1.25 V, whereas before it idled at less than 10 W
	and about 1 V.
	Secondary effects are high temperatures and high fan
	RPM due to the heat.
   * What outcome did you expect instead?
	For it to behave as before.

The following chart comes from Netdata and shows
the day before and after the reboot:
https://seafile.merspieler.tk/f/70f444e3ffd248c68127/?dl=1

GPU is an AMD Vega Frontier Edition.

#991548#10
Date:
2021-07-27 08:34:39 UTC
From:
To:
Are you sure mesa was the only update? 20.3.5-1 was in unstable for
months before it got in testing.

#991548#15
Date:
2021-07-27 09:49:25 UTC
From:
To:
  console-setup console-setup-linux keyboard-configuration libdebconfclient0
  libegl-mesa0 libegl1-mesa-dev libgbm1 libgl1-mesa-dev libgl1-mesa-dri
  libglapi-mesa libgles2-mesa-dev libglx-mesa0 libxatracker2
  linux-compiler-gcc-10-x86 linux-config-5.10 linux-kbuild-5.10 linux-libc-dev
  linux-source linux-source-5.10 mesa-common-dev mesa-opencl-icd
  mesa-va-drivers mesa-vdpau-drivers mesa-vulkan-drivers os-prober

I'm new to Debian bug reporting, so if I'm missing something, please let me know.

#991548#20
Date:
2021-07-28 14:52:47 UTC
From:
To:
#991548#25
Date:
2021-11-24 16:49:56 UTC
From:
To:
Indeed, the symptoms sound very familiar.

According to the other bug report, it was a kernel issue which got fixed with
5.10.46-3, so the most likely case is that it's already resolved.
'merspieler', can you confirm that?
The link in the submission doesn't work for me (anymore?), but the following
command should make it clear whether it is the same issue as #991453:

$ cat /sys/class/drm/card0/device/gpu_busy_percent

Running 'sensors' also showed quite a dramatic change in power1, and the fans
got quiet again after 5.10.46-3.
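
For anyone who wants to check load, power, and voltage in one go, here is a minimal sketch. The function name gpu_idle_report is made up for illustration. The gpu_busy_percent path is the one given above. The hwmon file names power1_average (assumed to be in microwatts) and in0_input (assumed to be in millivolts) are assumptions about what amdgpu exposes and may differ on other setups.

```sh
# Print GPU load, power draw, and voltage from amdgpu sysfs files.
# gpu_idle_report is a hypothetical helper name, not an existing tool.
# The device directory is an argument so cards other than card0 can be checked.
gpu_idle_report() {
    dev="${1:-/sys/class/drm/card0/device}"

    # gpu_busy_percent is the file named in this thread.
    if [ -r "$dev/gpu_busy_percent" ]; then
        echo "busy: $(cat "$dev/gpu_busy_percent")%"
    else
        echo "busy: n/a"
    fi

    # Assumption: amdgpu exposes power1_average (microwatts) and
    # in0_input (millivolts) in a hwmon directory below the device.
    for hw in "$dev"/hwmon/hwmon*; do
        [ -d "$hw" ] || continue
        if [ -r "$hw/power1_average" ]; then
            uw=$(cat "$hw/power1_average")
            echo "power: $((uw / 1000000)) W"
        fi
        if [ -r "$hw/in0_input" ]; then
            echo "vddgfx: $(cat "$hw/in0_input") mV"
        fi
    done
}
```

Calling gpu_idle_report with no argument reads card0. Running it once on an idle desktop before and once after a kernel upgrade should show whether the power and voltage readings drop back down.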

#991548#33
Date:
2021-11-24 20:01:14 UTC
From:
To:
Sorry, I can't; I don't have that install around anymore.

On a fresh install (same GPU, otherwise a different system) with the same
kernel version as at that time, the problem no longer occurred.

From my side, the issue resolved itself.