RGard2
Novice
501 Views

Issue with eMMC accesses on Arria 10 SOC board?

Hi,

 

I'm troubleshooting an issue on an Arria 10 SOC board running the Linux 4.20 kernel where we randomly see corruption reported on filesystems located on the eMMC. It doesn't happen frequently and is quite indeterminate but after many reboots, one or more of the filesystems that are mounted report errors (even on filesystems mounted as read-only). As part of tracking down the issue (and having not narrowed it down to the eMMC at the time), we tried other filesystem types (ext3->ext4 and ext3->cramfs) but the problem persisted.

 

An interesting data point is that previously we were using the Linux 3.10 kernel and did NOT see this issue. We went back and confirmed that the issue does not occur when using the 3.10 kernel by testing multiple boards over the course of 10 days. Given this and other debugging performed, the issue appears related to MMC driver changes in the 4.20 kernel (which appear to be substantial based on a cursory comparison of the two versions).

 

I'm currently performing an mmc controller register comparison and added CMD logging to the MMC kernel driver for the purposes of comparing the CMDs being sent during configuration.

 

Has anyone encountered this issue? Any suggestions on things to try?

 

If you have any questions for me, please feel free to ask.

 

Thanks,

Ren

10 Replies
257 Views

Hi Ren,

 

In my experience, I have not come across file corruption like this. Which Quartus and SoC EDS versions are you currently using?

 

Thanks for the clarification regarding the kernel versions. First, I recommend trying a kernel from our GitHub repository; one of the available versions you could try is 4.14:

https://github.com/altera-opensource/linux-socfpga

 

Please let me know if the above still doesn't fix the issue.

RGard2
Novice

Hi el.ign,

 

We are currently using Quartus180pro

 

The full version is :

Quartus Prime Version 18.0.1 Build 261 06/28/2018 Patches 1.20,1.34,1.45 SJ Pro Edition

 

We have tried using the 4.14-lt release but it had the same issue.

 

Thanks,

Ren


Hi Ren,

 

Was there any particular error seen during any bootup?

 

Did you spot any differences in the MMC controller registers?

 

I will check with our internal team whether any driver is missing or whether there are alternatives. Can you share the part number of the eMMC device?

RGard2
Novice

Hi el.ign,

 

No errors from the MMC kernel driver. The only errors are related to the filesystem(s) when they are mounted.

 

I do see differences in the MMC controller registers and I'm testing with changes now.

 

The part number of the eMMC device is MTFC4GACAANA-4M IT.

 

Thanks,

Ren


Hi Ren,

 

From what I can find, this part number has not been tested, so we cannot confirm full compatibility with the latest kernel. I will check with our internal team for more information, but it will take some time.

 

If you could share any information about the MMC controller registers, that would help.

 

Also, could you share what sort of file corruption you are seeing? A screenshot is fine if that is easier for you.

RGard2
Novice

Hi el.ign,

 

Okay, thank you for checking into the eMMC device compatibility.

 

Regarding the MMC controller registers, the differences between Linux 3.10 and Linux 4.20 are:

 

  1. The wait_priv_data bit in the CMD register seems to be normally set in Linux 3.10 but not in Linux 4.20.
  2. The msize field and rx and tx watermark fields in the FIFOTH register are set differently between Linux 3.10 and Linux 4.20. For example, nominally I see a value of 0x21ff0200 for Linux 3.10 and a value of 0x607f0200 for Linux 4.20.
  3. Interestingly, the PWREN register has a value of 0 for Linux 3.10 but a value of 1 (which I would expect) for Linux 4.20.
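
To make the FIFOTH difference in item 2 easier to read, here is a minimal Python sketch that splits the two values into their fields. It assumes the field layout used by the Linux dw_mmc driver's SDMMC_SET_FIFOTH macro (MSIZE in bits 30:28, RX watermark in bits 27:16, TX watermark in bits 11:0) and that driver's burst-size table; the function name `decode_fifoth` is made up for this example.

```python
# Burst sizes (in FIFO transfers) selected by the 3-bit MSIZE code,
# as listed in the Linux dw_mmc driver (assumed to apply to this controller).
MSIZE_BURST = (1, 4, 8, 16, 32, 64, 128, 256)


def decode_fifoth(value):
    """Split a 32-bit FIFOTH value into MSIZE and the RX/TX watermarks."""
    msize = (value >> 28) & 0x7       # bits 30:28
    rx_wmark = (value >> 16) & 0xFFF  # bits 27:16
    tx_wmark = value & 0xFFF          # bits 11:0
    return {
        "msize": msize,
        "burst": MSIZE_BURST[msize],
        "rx_wmark": rx_wmark,
        "tx_wmark": tx_wmark,
    }


# The two values quoted in item 2 above.
for label, raw in (("Linux 3.10", 0x21FF0200), ("Linux 4.20", 0x607F0200)):
    print(label, hex(raw), decode_fifoth(raw))
```

Under that layout, the 3.10 value decodes to MSIZE code 2 (burst of 8) with an RX watermark of 511, and the 4.20 value to MSIZE code 6 (burst of 128) with an RX watermark of 127. The TX watermark is 512 in both.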

 

As a test, I tried modifying these registers for the Linux 4.20 based image to match those for Linux 3.10 but it did not help.

 

In addition, I noticed differences in the EXT_CSD settings that each kernel programs into the eMMC itself. Under Linux 3.10, HPI_MGMT is enabled, but under Linux 4.20 it is not. Conversely, CACHE_CTRL and POWER_OFF_NOTIFICATION are both enabled under Linux 4.20 but not under Linux 3.10.
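
For anyone wanting to repeat this comparison, the sketch below diffs those three EXT_CSD bytes between two dumps. It assumes each dump is the 512-byte EXT_CSD as one hex string (for example from the MMC debugfs `ext_csd` file, if your kernel exposes it) and uses the byte offsets from the Linux header include/linux/mmc/mmc.h. The helper names and the two dumps are synthetic stand-ins, not data captured from the board.

```python
# Byte offsets within the 512-byte EXT_CSD, per include/linux/mmc/mmc.h.
EXT_CSD_FIELDS = {
    "CACHE_CTRL": 33,
    "POWER_OFF_NOTIFICATION": 34,
    "HPI_MGMT": 161,
}


def decode_ext_csd(hex_dump):
    """Return the fields of interest from a hex-encoded EXT_CSD dump."""
    raw = bytes.fromhex(hex_dump.strip())
    if len(raw) != 512:
        raise ValueError(f"expected 512 bytes of EXT_CSD, got {len(raw)}")
    return {name: raw[offset] for name, offset in EXT_CSD_FIELDS.items()}


def diff_ext_csd(old_dump, new_dump):
    """Map each differing field name to its (old, new) byte values."""
    old, new = decode_ext_csd(old_dump), decode_ext_csd(new_dump)
    return {name: (old[name], new[name]) for name in old if old[name] != new[name]}


def synthetic_dump(**fields):
    """Build an all-zero EXT_CSD with the given fields set (test data only)."""
    raw = bytearray(512)
    for name, value in fields.items():
        raw[EXT_CSD_FIELDS[name]] = value
    return raw.hex()


# Synthetic dumps mirroring the settings described above, not real captures.
dump_3_10 = synthetic_dump(HPI_MGMT=1)
dump_4_20 = synthetic_dump(CACHE_CTRL=1, POWER_OFF_NOTIFICATION=1)
print(diff_ext_csd(dump_3_10, dump_4_20))
```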

 

Also as a test, I tried modifying the switch commands used to configure these mode settings in the ext CSD for the eMMC device to match Linux 3.10 but it did not help.

 

Some examples of file corruption that are seen (note that the occurrences are random):

 

[  1.130239] Waiting for root device /dev/mmcblk0p5...

[  1.136248] mmc_host mmc0: Bus speed (slot 0) = 50000000Hz (slot req 52000000Hz, actual 50000000HZ div = 0)

[  1.150205] mmc0: new high speed MMC card at address 0001

[  1.156915] mmcblk0: mmc0:0001 P1XXXX 3.60 GiB 

[  1.163152] mmcblk0boot0: mmc0:0001 P1XXXX partition 1 16.0 MiB

[  1.170272] mmcblk0boot1: mmc0:0001 P1XXXX partition 2 16.0 MiB

[  1.278405] mmcblk0: p1 p2 p3 p4 < >

[  1.312344] VFS: Cannot open root device "mmcblk0p5" or unknown-block(179,5): error -6

[  1.320237] Please append a correct "root=" boot option; here are the available partitions:

[  1.328661] 0100      8192 ram0 

[  1.328663] (driver?)

[  1.334759] 0101      8192 ram1 

[  1.334761] (driver?)

[  1.340838] b300     3776512 mmcblk0 

[  1.340840] driver: mmcblk

[  1.347618]  b301     143360 mmcblk0p1 55af3d06-01

[  1.347620] 

[  1.354398]  b302      1024 mmcblk0p2 55af3d06-02

[  1.354399] 

[  1.361168]  b303     143360 mmcblk0p3 55af3d06-03

[  1.361169] 

[  1.367945]  b304        1 mmcblk0p4 

[  1.367946] 

[  1.373775] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(179,5)

 

 

[  46.336168] EXT4-fs error (device mmcblk0p5): htree_dirblock_to_tree:1007: inode #1999: block 10117: comm charon: bad entry in directory: directory entry overrun - offset=0, inode=3892174576, rec_len=57072, name_len=253, size=1024

 

 

[ 305.111761] EXT4-fs (mmcblk0p5): error count since last fsck: 1

[ 305.117683] EXT4-fs (mmcblk0p5): initial error at time 46: htree_dirblock_to_tree:1007: inode 1999: block 10117

[ 305.127746] EXT4-fs (mmcblk0p5): last error at time 46: htree_dirblock_to_tree:1007: inode 1999: block 10117

 

 

[  1.257384] mmc0: new high speed MMC card at address 0001

[  1.264015] mmcblk0: mmc0:0001 P1XXXX 3.60 GiB

[  1.269703] mmcblk0boot0: mmc0:0001 P1XXXX partition 1 16.0 MiB

[  1.276776] mmcblk0boot1: mmc0:0001 P1XXXX partition 2 16.0 MiB

[  1.385343] mmcblk0: p1 p2 p3 p4 < p5 >

[  1.402715] EXT4-fs (mmcblk0p5): mounting ext3 file system using the ext4 subsystem

[  1.511161] EXT4-fs (mmcblk0p5): ext4_check_descriptors: Block bitmap for group 0 not in group (block 0)!

[  1.520713] EXT4-fs (mmcblk0p5): group descriptors corrupted!

[  1.526557] VFS: Cannot open root device "mmcblk0p5" or unknown-block(179,5): error -117

[  1.534634] Please append a correct "root=" boot option; here are the available partitions:

 

 

[  87.942916] EXT4-fs error (device mmcblk0p5): ext4_mb_generate_buddy:747: group 1, block bitmap and bg descriptor inconsistent: 81 vs 82 free clusters

[  92.464179] JBD2: Spotted dirty metadata buffer (dev = mmcblk0p5, blocknr = 1). There's a risk of filesystem corruption in case of system crash.

[  92.488648] JBD2: Spotted dirty metadata buffer (dev = mmcblk0p5, blocknr = 1). There's a risk of filesystem corruption in case of system crash.
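
The two root-mount failures above end in different error codes. In the first log the partition scan lists no logical partitions (`p4 < >`) and the mount fails with -6; in the third log p5 is found but its group descriptors are corrupted and the mount fails with -117. As a reading aid, this small Python sketch pulls the device number and error code out of such a line. The errno names are the standard Linux ones for 6 and 117; the function name and the regular expression are made up for this example.

```python
import re

# Linux errno values seen in the logs above.
ERRNO_NAMES = {6: "ENXIO", 117: "EUCLEAN"}

# Matches the tail of the kernel's "VFS: Cannot open root device" message.
ROOT_FAIL = re.compile(r"unknown-block\((\d+),(\d+)\): error -(\d+)")


def explain_root_failure(line):
    """Return the hex device number and errno name, or None if no match."""
    match = ROOT_FAIL.search(line)
    if match is None:
        return None
    major, minor, err = (int(group) for group in match.groups())
    return {
        # Same major/minor hex form as the partition list (b300, b301, ...).
        "devt": f"{major:02x}{minor:02x}",
        "errno": ERRNO_NAMES.get(err, f"errno {err}"),
    }


print(explain_root_failure(
    'VFS: Cannot open root device "mmcblk0p5" or unknown-block(179,5): error -117'
))
```

ENXIO means no such device or address, and EUCLEAN means the structure needs cleaning. The device number for (179,5) comes out as b305, which is the entry missing from the partition list printed in the first log.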

 

Let me know if you need more information.

 

Thanks,

Ren

 


Hi Ren,

 

Thanks, I will look further into the logs you have provided.

 

I am also checking with our internal team whether they have faced similar error logs before.

 

It will take some time; I will come back with my findings.

 

Thanks.

RGard2
Novice

Hi el.ign,

 

Thanks very much for your help.

 

Could you provide a list of eMMC devices that were tested with the Linux 4.x kernels? It would be an interesting test for us to try one of the known supported devices.

 

Thanks,

Ren


Hi Ren,

 

We do have a list of supported flash devices for Arria 10 SoC here (eMMC is at the bottom of the page):

https://www.intel.com/content/www/us/en/programmable/support/support-resources/supported-flash-devic...

 

I am still checking whether anyone on our internal team has tested the device you are using.


Hi Ren,

 

The device you are using has not been tested, and our internal team recommends starting with one of the supported flash devices in the list above.

 

Best Regards.
