
Possible memory leak noticed while using hardware decoder on Arc GPU

YuriyYuriy

Hi

We've encountered a possible memory leak on an Intel Arc A750 GPU.

We are using the Arc GPU as a DirectX 11 decoder for video streams (nine 1080p H.264 streams at 50 FPS in the test environment), and an NVIDIA 3050 GPU as the renderer. We receive decoded NV12 textures from the Arc, convert them to RGBA, and then transfer them to the renderer (Arc -> CPU -> 3050).
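
For readers unfamiliar with the NV12 layout mentioned above, here is a minimal CPU-side sketch of an NV12 to RGBA conversion. It is only an illustration, not the code used in the application described in this post: the function name is made up, and it assumes a tightly packed buffer (row pitch equal to width), even dimensions, and BT.709 limited-range color, none of which are stated in the post. A real pipeline would more likely do this step on the GPU.

```python
def nv12_to_rgba(nv12: bytes, width: int, height: int) -> bytearray:
    """Convert one tightly packed NV12 frame to 8-bit RGBA.

    Assumptions (not from the original post): even width and height,
    row pitch equal to width, BT.709 limited-range YCbCr.
    """
    if width % 2 or height % 2:
        raise ValueError("NV12 needs even dimensions")
    y_size = width * height
    if len(nv12) != y_size * 3 // 2:
        raise ValueError("unexpected NV12 buffer size")

    def clamp(value: float) -> int:
        return max(0, min(255, int(round(value))))

    rgba = bytearray(width * height * 4)
    for row in range(height):
        for col in range(width):
            y = nv12[row * width + col]
            # The chroma plane follows the luma plane; one interleaved
            # (U, V) pair is shared by a 2x2 block of luma samples.
            uv = y_size + (row // 2) * width + (col // 2) * 2
            u = nv12[uv] - 128
            v = nv12[uv + 1] - 128
            luma = 1.164383 * (y - 16)
            out = (row * width + col) * 4
            rgba[out] = clamp(luma + 1.792741 * v)
            rgba[out + 1] = clamp(luma - 0.213249 * u - 0.532909 * v)
            rgba[out + 2] = clamp(luma + 2.112402 * u)
            rgba[out + 3] = 255  # opaque alpha
    return rgba
```

The point of the sketch is the memory layout: a full-resolution Y plane followed by a half-height plane of interleaved U and V bytes, so an NV12 frame takes 1.5 bytes per pixel while the RGBA result takes 4.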

The Intel Arc is not connected to any monitor; it is used only for decoding. The NVIDIA 3050 is connected to a monitor.

We are using a pool of textures to convert, transfer, and render video frames. No new textures were created during the test.
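
To make the pooling claim concrete, this is roughly the pattern meant by "a pool of textures", sketched with plain byte buffers standing in for textures. The class and its names are hypothetical and are not taken from the application; the only property it is meant to show is that every buffer is created up front and then recycled, so steady-state frame processing performs no new allocations.

```python
from collections import deque


class TexturePool:
    """Fixed-size pool: buffers are created once and only recycled."""

    def __init__(self, count: int, size_bytes: int) -> None:
        self._free = deque(bytearray(size_bytes) for _ in range(count))
        self.created = count  # never changes after construction

    def acquire(self) -> bytearray:
        # Fail loudly instead of allocating, so pool exhaustion can
        # never show up as silent memory growth.
        if not self._free:
            raise RuntimeError("pool exhausted; refusing to allocate")
        return self._free.popleft()

    def release(self, buf: bytearray) -> None:
        self._free.append(buf)

    @property
    def available(self) -> int:
        return len(self._free)


pool = TexturePool(count=4, size_bytes=16)
for _ in range(1000):
    frame = pool.acquire()   # take a recycled buffer
    pool.release(frame)      # hand it back after use
```

With a pool like this, growth in "Private bytes" cannot come from the application's own frame buffers, which is why the heap comparison was needed.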

The "Private bytes" counter of the process grows constantly, so we used the UMDH tool (user-mode dump heap) to save checkpoints and compare heaps (the interval was approx. 30 minutes; see the attached "cmp1-6.txt" file, which is truncated to leave only the relevant allocations). All the allocations look like this:

+ 296111025 ( 296111025 -      0)     63 allocs	BackTrace8A3
+      63 (     63 -      0)	BackTrace8A3	allocations

	ntdll!RtlpAllocateHeap+1D04
	ntdll!RtlpAllocateHeapInternal+6C9
	igd10umt64xe!ctlTemperatureGetState+A476A4
	igd10umt64xe!ctlTemperatureGetState+A2C549
	igd10umt64xe!QueryDesiredMode1+2AD890
	igd10umt64xe!ctlTemperatureGetState+402337
	igd10umt64xe!ctlTemperatureGetState+2FA29D
	igd10umt64xe!ctlTemperatureGetState+2758CE
	igd10umt64xe!QueryDesiredMode1+116043
	igd10umt64xe!QueryDesiredMode1+115885
	igd10umt64xe!QueryDesiredMode1+115510
	igd10umt64xe!QueryDesiredMode1+BA4CA
	igd10umt64xe!QueryDesiredMode1+B2B92
	igd10umt64xe!QueryDesiredMode1+B8FB5
	igd10umt64xe!QueryDesiredMode1+9FF88
	igd10umt64xe!QueryDesiredMode1+7345B
	igd10umt64xe!QueryDesiredMode1+81F13
	igd10umt64xe!QueryDesiredMode1+82486
	igd10umt64xe!QueryDesiredMode1+169CEB
	KERNEL32!BaseThreadInitThunk+1D
	ntdll!RtlUserThreadStart+28

+ 276749865 ( 276749865 -      0)   2115 allocs	BackTrace238
+    2115 (   2115 -      0)	BackTrace238	allocations

	ntdll!RtlpAllocateHeap+1B46
	ntdll!RtlpAllocateHeapInternal+6C9
	igd10umt64xe!ctlTemperatureGetState+A476A4
	igd10umt64xe!ctlTemperatureGetState+A2C549
	igd10umt64xe!QueryDesiredMode1+309EF5
	igd10umt64xe!ctlTemperatureGetState+38B1BD
	igd10umt64xe!ctlTemperatureGetState+37FA12
	igd10umt64xe!ctlTemperatureGetState+33EC52
	igd10umt64xe!ctlTemperatureGetState+275684
	igd10umt64xe!QueryDesiredMode1+115A1D
	igd10umt64xe!QueryDesiredMode1+115862
	igd10umt64xe!QueryDesiredMode1+115510
	igd10umt64xe!QueryDesiredMode1+BA4CA
	igd10umt64xe!QueryDesiredMode1+B2B92
	igd10umt64xe!QueryDesiredMode1+B8FB5
	igd10umt64xe!QueryDesiredMode1+9FF88
	igd10umt64xe!QueryDesiredMode1+7345B
	igd10umt64xe!QueryDesiredMode1+81F13
	igd10umt64xe!QueryDesiredMode1+82486
	igd10umt64xe!QueryDesiredMode1+169CEB
	KERNEL32!BaseThreadInitThunk+1D
	ntdll!RtlUserThreadStart+28
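
To put the two entries above in proportion, the header lines of the comparison can be parsed and averaged. The snippet below is a small sketch for that; the regular expression is written only against the two header lines shown in this post and is an assumption about the format, not a complete UMDH parser.

```python
import re

# Matches comparison header lines such as:
# "+ 296111025 ( 296111025 -      0)     63 allocs\tBackTrace8A3"
HEADER = re.compile(
    r"^\+\s*(?P<delta>\d+)\s*\(\s*(?P<new>\d+)\s*-\s*(?P<old>\d+)\s*\)"
    r"\s*(?P<allocs>\d+)\s+allocs\s+(?P<trace>BackTrace\w+)"
)

SAMPLE = (
    "+ 296111025 ( 296111025 -      0)     63 allocs\tBackTrace8A3\n"
    "+      63 (     63 -      0)\tBackTrace8A3\tallocations\n"
    "+ 276749865 ( 276749865 -      0)   2115 allocs\tBackTrace238\n"
    "+    2115 (   2115 -      0)\tBackTrace238\tallocations\n"
)


def summarize(diff_text: str):
    """Return (trace, byte_delta, alloc_count, avg_bytes) per header line."""
    rows = []
    for line in diff_text.splitlines():
        match = HEADER.match(line.strip())
        if not match:
            continue  # stack frames and the "allocations" count lines
        delta = int(match["delta"])
        allocs = int(match["allocs"])
        average = delta / allocs if allocs else 0.0
        rows.append((match["trace"], delta, allocs, average))
    return rows


for trace, delta, allocs, average in summarize(SAMPLE):
    print(f"{trace}: {delta} bytes in {allocs} allocs, avg {average:.0f}")
```

For the two entries shown, this works out to an average of 4,700,175 bytes per allocation for BackTrace8A3 and 130,851 bytes per allocation for BackTrace238, about 573 MB of growth in total between the compared checkpoints.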

I can provide user heap checkpoint files if necessary.

Important note: this leak does not occur when using an Intel UHD 770 as the decoder.

See the attached "Environment.txt" file for the versions of the installed drivers, etc.

 

Thank you

 

1 Reply
AlphaTop89

The UMDH output shows that allocations traced back to igd10umt64xe.dll, which the stack traces attribute to offsets from the ctlTemperatureGetState and QueryDesiredMode1 symbols, grow over time when the Arc GPU is used strictly as a hardware decoder in a headless configuration. This “headless decoder” setup can expose edge-case memory handling in the user-mode driver, particularly with intensive multi-stream decoding and no rendering on the same GPU.

Decoding on the UHD 770 (as you've done) can serve as a temporary workaround. Monitor memory with Perfmon or a similar profiler to confirm whether the growth is in the non-paged pool or in the user-mode heap. If your application polls thermal telemetry, disable it and check whether that changes the rate of heap allocations attributed to ctlTemperatureGetState.
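
If it helps with the monitoring step, here is a rough sketch for turning logged memory samples into a growth rate, so that runs on the A750 and the UHD 770 can be compared with a single number. The function name and the sample values are made up for illustration; the samples could come, for example, from the Process "Private Bytes" counter exported from Perfmon.

```python
def growth_rate(samples):
    """Least-squares slope of (seconds, bytes) samples, in bytes per second."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_b = sum(b for _, b in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("timestamps must differ")
    cov = sum((t - mean_t) * (b - mean_b) for t, b in samples)
    return cov / var_t


# Hypothetical samples: (elapsed seconds, private bytes).
samples = [(0, 1_000_000), (60, 1_600_000), (120, 2_200_000)]
print(f"{growth_rate(samples):.0f} bytes/s")
```

A slope that stays clearly positive over a long run points to a leak, while a slope near zero after warm-up suggests the process has reached a steady state. Fitting a line over many samples is less sensitive to momentary spikes than comparing only the first and last readings.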

 
