Error: data file is corrupted

Andrew_B_Intel
Employee

I am running VTune 2022.2 on a Fedora 35 machine, profiling a WebAssembly runtime emitting JIT-compiled code. When I run the hotspots collection, the produced data file is corrupted. This is not the case for other collection mechanisms.

Some background information:

$ uname -a
Linux ... 5.16.14-200.fc35.x86_64 #1 SMP PREEMPT Fri Mar 11 20:31:18 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

$ vtune --version
Intel(R) VTune(TM) Profiler 2022.2.0 (build 623516) Command Line Tool
Copyright (C) 2009 Intel Corporation. All rights reserved.

Here is the error:

$ vtune -v -collect hotspots wasmtime --vtune fib.wasm
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/fib/r003hs -command stop.
fib(45) = 1134903170
vtune: Collection stopped.
vtune: Using result path `/tmp/fib/r003hs'
vtune: Executing actions  7 % Clearing the database
vtune: The database has been cleared, elapsed time is 0.167 seconds.
vtune: Executing actions 12 % Loading '/tmp/fib/r003hs/data.0/emon.0.emon' file
vtune: Error: Cannot load data file `/tmp/fib/r003hs/data.0/23227-23234.0.trace' (Data file is corrupted).
vtune: Executing actions 14 % Updating precomputed scalar metrics
vtune: Raw data has been loaded to the database, elapsed time is 0.021 seconds.
...
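
When the log is longer than this excerpt, a small filter can pull out the paths of the trace files that failed to load. This is only a sketch: `corrupt_traces` is a hypothetical helper name, and it assumes the error line keeps the wording shown above, with the path between a backtick and an apostrophe.

```shell
# Hypothetical helper (not part of VTune): read VTune log text on stdin and
# print the path of every trace file reported as failing to load.
corrupt_traces() {
    # Keep only the error lines, then take the text after the first backtick
    # and cut it off at the closing apostrophe.
    grep 'Cannot load data file' | cut -s -d '`' -f 2 | cut -d "'" -f 1
}
```

For example, `vtune -collect hotspots wasmtime --vtune fib.wasm 2>&1 | corrupt_traces`. The `2>&1` is an assumption, since the excerpt does not show which stream VTune writes these messages to.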

I can upload any other information if that would be helpful.

VaradJ_Intel
Moderator

Hi,


Thank you for posting in Intel Communities.


VTune sometimes shows this error when the target application stops before VTune is done collecting the data it needs. This can happen because the application crashes before the collection stops.
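

One way to rule out a crashing target is to run it once outside VTune and look at how it exits. A minimal sketch, assuming a POSIX shell such as bash, where a status above 128 conventionally means the process was killed by a signal; `check_clean_exit` is a hypothetical helper, not a VTune command.

```shell
# Hypothetical helper (not part of VTune): run a command and describe how it
# exited, so a crash can be told apart from a normal exit.
check_clean_exit() {
    status=0
    "$@" || status=$?
    if [ "$status" -eq 0 ]; then
        echo "clean exit"
    elif [ "$status" -gt 128 ]; then
        # Shells such as bash report death by signal N as status 128 + N.
        echo "killed by signal $((status - 128))"
    else
        echo "exit status $status"
    fi
}
```

For the case in this thread that would be `check_clean_exit wasmtime fib.wasm`. A "clean exit" line after the program's own output would suggest the crash explanation does not apply.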


Could you please try the solution given at the link below:


Error Message: Cannot Load Data File (intel.com)


If that doesn't help, could you please answer the following questions:


1. Could you please try profiling the matrix multiplication sample and check whether you get this issue?


2. Are you using the ITT API for pause/resume or to mark up frames or tasks?


3. Could you please share which sampling method you are choosing for the hotspots analysis?


4. Could you please share the exact steps you followed, a sample reproducer, the self-checker logs, and the summary of loaded drivers?


Thank You.


Andrew_B_Intel
Employee

I tried setting an alternate temporary directory and had the same result:

$ vtune -v -collect hotspots -target-tmp-dir=/home/abrown/tmp wasmtime --vtune fib.wasm
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/fib/r005hs -command stop.
fib(45) = 1134903170
vtune: Collection stopped.
vtune: Using result path `/tmp/fib/r005hs'
vtune: Executing actions  7 % Clearing the database                            
vtune: The database has been cleared, elapsed time is 0.179 seconds.
vtune: Executing actions 12 % Loading '/tmp/fib/r005hs/data.0/emon.0.emon' file
vtune: Error: Cannot load data file `/tmp/fib/r005hs/data.0/564912-564925.0.trace' (Data file is corrupted).
...

> 1. Could you please try profiling the matrix multiplication sample and check whether you get this issue?

Not sure where that is; could you point me to it?

 

> 2. Are you using the ITT API for pause/resume or to mark up frames or tasks?

No

 

> 3. Could you please share which sampling method you are choosing for the hotspots analysis?

Whatever the default is for the VTune CLI?

 

> 4. Could you please share the exact steps you followed, a sample reproducer, the self-checker logs, and the summary of loaded drivers?

The exact steps are in the original post. I don't know if this is helpful, but the machine has an Alder Lake (ADL) processor, and running <path to vtune>/bin64/vtune-self-checker.sh crashes the machine:

 

$ bin64/vtune-self-checker.sh 
Intel(R) VTune(TM) Profiler Self Check Utility
Copyright (C) 2009 Intel Corporation. All rights reserved.
Build Number: 623516

HW event-based analysis (counting mode) (Intel driver)   
Example of analysis types: Performance Snapshot
    Collection: Ok
    Finalization: Ok...
    Report: Ok

Instrumentation based analysis check   
Example of analysis types: Hotspots and Threading with user-mode sampling
    Collection: Ok
    Finalization: Ok...
    Report: Ok

HW event-based analysis check (Intel driver)   
Example of analysis types: Hotspots with HW event-based sampling, HPC Performance Characterization, etc.
    Collection: Ok
vtune: Warning: To enable hardware event-based sampling, VTune Profiler has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.
    Finalization: Ok...
vtune: Warning: Cannot locate debugging information for the Linux kernel. Source-level analysis will not be possible. Function-level analysis will be limited to kernel symbol tables. See the Enabling Linux Kernel Analysis topic in the product online help for instructions.

    Report: Ok

HW event-based analysis check (Intel driver)   
Example of analysis types: Microarchitecture Exploration
    Collection: Ok
vtune: Warning: To enable hardware event-based sampling, VTune Profiler has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.
    Finalization: Ok...
vtune: Warning: Cannot locate debugging information for the Linux kernel. Source-level analysis will not be possible. Function-level analysis will be limited to kernel symbol tables. See the Enabling Linux Kernel Analysis topic in the product online help for instructions.

    Report: Ok

HW event-based analysis with uncore events (Intel driver)   
Example of analysis types: Memory Access
    Collection: Ok
vtune: Warning: To enable hardware event-based sampling, VTune Profiler has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.
    Finalization: Ok...
vtune: Warning: Cannot locate debugging information for the Linux kernel. Source-level analysis will not be possible. Function-level analysis will be limited to kernel symbol tables. See the Enabling Linux Kernel Analysis topic in the product online help for instructions.

    Report: Ok

HW event-based analysis with stacks...
[machine crashes here... SSH disconnects, etc.]

 

Could you please escalate this to someone who can investigate?

Andrew_B_Intel
Employee

I should clarify: when I say that the machine crashes upon running bin64/vtune-self-checker.sh, it actually just freezes. Over SSH the connection drops, but when run directly on the machine the display manager freezes, all I/O devices are unresponsive, and logs stop flowing to `journalctl -f`, which I was monitoring while the self checker was running.

VaradJ_Intel
Moderator

Hi,

 

Thank you for providing the details.

 

You can find the matrix sample application in the following location: <vtune-installation-directory>/samples/en/C++/matrix

 

User-mode sampling is the default sampling mode for the hotspots analysis.

 

We tried profiling the matrix sample using the hotspots analysis (user-mode sampling). We used VTune 2022.2.0 on Fedora 35 with an ADL processor, and everything worked fine for us.

 

Could you please share a sample reproducer (a sample application similar to the one you are trying to profile) for your WebAssembly runtime emitting JIT-compiled code, so that we can try to reproduce the issue you are facing?

 

Also, could you please share the summary of the loaded drivers? To get it, go to <vtune-installation-directory>/sepdk/src and run the following command: ./insmod-sep -q

 

Thank You.

Andrew_B_Intel
Employee

Ok, here are the results of building and running the matrix sample--everything looks good to me:

vtune/latest/samples/en/C++/matrix/linux$ sudo make
[sudo] password for abrown:
Domain Controller unreachable, using cached credentials instead. Network resources may be unavailable
gcc -g -O3 -fno-asm -DUSE_THR    -c ../src/util.c -D_LINUX
gcc -g -O3 -fno-asm -DUSE_THR    -c ../src/thrmodel.c -D_LINUX
gcc -g -O3 -fno-asm -DUSE_THR    -c ../src/multiply.c -D_LINUX
gcc -g -O3 -fno-asm -DUSE_THR    -c ../src/matrix.c -D_LINUX
../src/matrix.c: In function ‘main’:
../src/matrix.c:157:9: warning: implicit declaration of function ‘gettimeofday’ [-Wimplicit-function-declaration]
  157 |         gettimeofday(&before, NULL);
      |         ^~~~~~~~~~~~
gcc -g -O3 -fno-asm -DUSE_THR    -g util.o thrmodel.o multiply.o matrix.o  -o ../matrix -lpthread -lm

...

vtune/latest/samples/en/C++/matrix/results$ vtune -collect hotspots ../matrix
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /opt/intel/oneapi/vtune/2022.2.0/samples/en/C++/matrix/results/r000hs -command stop.
Addr of buf1 = 0x7f966d064010
Offs of buf1 = 0x7f966d064180
Addr of buf2 = 0x7f966b063010
Offs of buf2 = 0x7f966b0631c0
Addr of buf3 = 0x7f9669062010
Offs of buf3 = 0x7f9669062100
Addr of buf4 = 0x7f9667061010
Offs of buf4 = 0x7f9667061140
Threads #: 16 Pthreads
Matrix size: 2048
Using multiply kernel: multiply1
Execution time = 4.220 seconds
vtune: Collection stopped.
vtune: Using result path `/opt/intel/oneapi/vtune/2022.2.0/samples/en/C++/matrix/results/r000hs'
vtune: Executing actions 19 % Resolving information for `libc.so.6'
vtune: Warning: Cannot locate debugging information for file `/lib64/libc.so.6'.
vtune: Executing actions 21 % Resolving information for `libtpsstool.so'
vtune: Warning: Cannot locate debugging information for file `/opt/intel/oneapi/vtune/2022.2.0/lib64/libtpsstool.so'.
vtune: Executing actions 75 % Generating a report                              Elapsed Time: 4.237s
    CPU Time: 59.509s
        Effective Time: 59.509s
        Spin Time: 0s
        Overhead Time: 0s
    Total Thread Count: 17
    Paused Time: 0s

Top Hotspots
Function   Module     CPU Time  % of CPU Time(%)
---------  ---------  --------  ----------------
multiply1  matrix      59.489s            100.0%
init_arr   matrix       0.010s              0.0%
cfree      libc.so.6    0.010s              0.0%
Effective Physical Core Utilization: nan% (nan out of 16)
    Effective Logical Core Utilization: 59.3% (14.236 out of 24)
Collection and Platform Info
    Application Command Line: ../matrix
    Operating System: 5.16.19-200.fc35.x86_64
    Computer Name: abrown-desk2.amr.corp.intel.com
    Result Size: 4.7 MB
    Collection start time: 17:42:55 18/04/2022 UTC
    Collection stop time: 17:42:59 18/04/2022 UTC
    Collector Type: Event-based counting driver,User-mode sampling and tracing
    CPU
        Name: Intel(R) microarchitecture code named Alderlake-S
        Frequency: 3.187 GHz
        Logical CPU Count: 24
        Cache Allocation Technology
            Level 2 capability: available
            Level 3 capability: not detected

If you want to skip descriptions of detected performance issues in the report,
enter: vtune -report summary -report-knob show-issues=false -r <my_result_dir>.
Alternatively, you may view the report in the csv format: vtune -report
<report_name> -format=csv.
vtune: Executing actions 100 % done

Here is the output for the kernel module installation:

vtune/latest/sepdk/src$ sudo ./insmod-sep -q
pax driver is loaded and owned by group "vtune" with file permissions "660".
socperf3 driver is loaded and owned by group "vtune" with file permissions "660".
sep5 driver is loaded and owned by group "vtune" with file permissions "660".

socwatch driver is loaded and owned by group "vtune" with file permissions "660".

vtsspp driver is loaded and owned by group "vtune" with file permissions "660".
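
To check a query like this mechanically, the driver names can be extracted from the output and compared against an expected list. A sketch, assuming every loaded driver is reported on a line of the form "<name> driver is loaded ..." as above; `loaded_drivers` and `missing_drivers` are hypothetical helper names. The wording of a "not loaded" line does not appear in this thread, so the sketch only recognizes the loaded form.

```shell
# Hypothetical helpers (not part of VTune or the sepdk scripts).

# Read the query output on stdin and print the name of each loaded driver.
loaded_drivers() {
    awk '$2 == "driver" && $3 == "is" && $4 == "loaded" { print $1 }'
}

# $1: captured query output; remaining arguments: expected driver names.
# Prints each expected driver that is not reported as loaded.
missing_drivers() {
    output=$1
    shift
    for name in "$@"; do
        printf '%s\n' "$output" | loaded_drivers | grep -qxF "$name" || echo "$name"
    done
}
```

For example, `missing_drivers "$(sudo ./insmod-sep -q 2>&1)" pax socperf3 sep5 socwatch vtsspp` should print nothing when all five drivers listed above are loaded. The `2>&1` is an assumption about where the script writes its summary.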

Here is how to reproduce the error I'm seeing in more detail:

  1. Build Wasmtime with the latest Cargo and Rust installed:
    $ git clone https://github.com/bytecodealliance/wasmtime --depth 1
    $ cd wasmtime
    $ git submodule update --init --depth 1
    $ cargo build
  2. Build the Wasm example from here:
    # Copy Rust example code to fib.rs, then...
    $ rustc --target wasm32-wasi fib.rs -C opt-level=z -C lto=yes
  3. Run the example in Wasmtime:
    $ vtune -collect hotspots target/debug/wasmtime --vtune fib.wasm
    vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /tmp/reproduction/wasmtime/r000hs -command stop.
    fib(45) = 1134903170
    vtune: Collection stopped.
    vtune: Using result path `/tmp/reproduction/wasmtime/r000hs'
    vtune: Executing actions 12 % Loading '/tmp/reproduction/wasmtime/r000hs/data.0
    vtune: Error: Cannot load data file `/tmp/reproduction/wasmtime/r000hs/data.0/398967-398974.0.trace' (Data file is corrupted).
    ...

Since the matrix example works fine, it could be that Wasmtime's JIT integration with ittnotify is broken in some way. But it is not clear how or why. The applicable code that interacts with ittnotify is available here.

VaradJ_Intel
Moderator

Hi,


Thank you for providing the details.


Thanks for reporting this issue. We were able to reproduce it and we have informed the development team about it.


Meanwhile, you can profile your application using hardware event-based sampling with the command below.


vtune -run-pass-thru=--no-altstack -v -collect hotspots -knob sampling-mode=hw -knob sampling-interval=0.5 target/debug/wasmtime --vtune fib.wasm


Thank You.


Andrew_B_Intel
Employee

Varad, thanks; I'll continue with HW sampling in the meantime. (I received your private message by e-mail, but I don't know how to respond to it in this forum tool.) Could we leave this bug open until someone can take a look and fix the issue?

Andrew_B_Intel
Employee

Varad, I tried with the sampling mode that you suggested and it fails with the following error:

 

$ vtune -run-pass-thru=--no-altstack -v -collect hotspots -knob sampling-mode=hw -knob sampling-interval=0.5 wasmtime --vtune ../fib.wasm
vtune: Error: This analysis type is not applicable to the system because VTune Profiler cannot recognize the processor. If this is a new Intel processor, please check for an updated version of VTune Profiler. If this is an unreleased Intel processor, please contact Online Service Center for an NDA product package.

 

For reference, here is what I ran that command on:

 

$ lscpu | grep 'Model name'
Model name:                      12th Gen Intel(R) Core(TM) i9-12900KF
$ uname -r
5.18.5-100.fc35.x86_64
$ vtune --version
Intel(R) VTune(TM) Profiler 2022.3.0 (build 624050) Command Line Tool
Copyright (C) 2009 Intel Corporation. All rights reserved.

 

Is there an ETA on a solution for this?

VaradJ_Intel
Moderator

Hi,


We have reported this issue to the development team, and we are working on it internally.


Thank You.



VaradJ_Intel
Moderator

Hi


Good day to you!


Per our internal discussion, you informed us that you fixed this issue by altering the Rust bindings for ittapi: https://github.com/intel/ittapi/pull/105. Hence, we are closing this thread.


If you need any further assistance, please post a new question, as this thread will no longer be monitored by Intel.


Thank You!


Regards

Varad

