Could you help me with this driver issue?
We are using the PCIe AVMM DMA reference design and its driver on an A10 card. The system is CentOS 7.8 with kernel 3.10.0-1127.13.1.el7.x86_64.
The reference design and its driver are provided by Intel; see the attached file.
I changed the reference design's pin locations to match this A10 card. PCIe is Gen3 x8.
The driver installs and read/write operations work fine, but when we unload the driver the system crashes. We captured the log, which is also attached.
The crash happens only with this driver; running rmmod on other drivers works fine.
Could you give us some advice? Thanks in advance!
[ 363.767890] ------------[ cut here ]------------
[ 363.767905] WARNING: CPU: 4 PID: 17120 at kernel/irq/manage.c:1348 __free_irq+0xb3/0x280
[ 363.767910] Trying to free already-free IRQ 11
[ 363.767913] Modules linked in: altera_dma(OE-) tcp_lp fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables devlink ip6table_filter ip6_tables iptable_filter sunrpc vfat fat iTCO_wdt iTCO_vendor_support intel_pmc_core intel_powerclamp coretemp intel_rapl kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd snd_hda_codec_hdmi snd_hda_codec_realtek pcspkr snd_hda_codec_generic sg snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm pinctrl_cannonlake snd_timer pinctrl_intel snd acpi_pad soundcore wmi i2c_i801 ip_tables xfs libcrc32c sd_mod
[ 363.768014] crc_t10dif crct10dif_generic i915 crct10dif_pclmul crct10dif_common crc32c_intel serio_raw r8169 mii i2c_algo_bit video iosf_mbi ahci drm_kms_helper i2c_hid libahci syscopyarea sysfillrect sysimgblt fb_sys_fops libata drm drm_panel_orientation_quirks dm_mirror dm_region_hash dm_log dm_mod
[ 363.768058] CPU: 4 PID: 17120 Comm: rmmod Kdump: loaded Tainted: G OE ------------ 3.10.0-957.el7.x86_64 #1
[ 363.768062] Hardware name: Gigabyte Technology Co., Ltd. B360M POWER/B360M POWER, BIOS F4 07/13/2018
[ 363.768066] Call Trace:
[ 363.768081] [<ffffffff8b761dc1>] dump_stack+0x19/0x1b
[ 363.768090] [<ffffffff8b097648>] __warn+0xd8/0x100
[ 363.768099] [<ffffffff8b0976cf>] warn_slowpath_fmt+0x5f/0x80
[ 363.768108] [<ffffffff8b14a953>] __free_irq+0xb3/0x280
[ 363.768115] [<ffffffff8b14aba9>] free_irq+0x39/0x90
[ 363.768125] [<ffffffffc0939593>] altera_pci_remove+0xac/0x17a [altera_dma]
[ 363.768134] [<ffffffff8b3c607e>] pci_device_remove+0x3e/0xc0
[ 363.768144] [<ffffffff8b4a7cd2>] __devic
May I know why you want to unload the driver? For normal applications, we usually do not need to rmmod it.
I am sorry to say that the driver in the reference design is provided on an as-is basis and Intel can't support it, but I can give some suggestions. It looks like the crash occurs because the host suddenly can no longer detect the PCIe device after you rmmod the driver. You could try masking AER (Advanced Error Reporting) on the host side.
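One more thing that may be worth checking: the trace in the original post shows the warning "Trying to free already-free IRQ 11" raised from free_irq, called by altera_pci_remove. The kernel prints that message when free_irq finds no registered handler matching the given IRQ number and dev_id, for example when the earlier request_irq failed or used a different IRQ or dev_id. Below is a minimal userspace sketch of the usual bookkeeping pattern, not the altera_dma code itself. Everything in it is hypothetical: fake_request_irq and fake_free_irq are stand-ins that only model the kernel's behavior, and model_dev, model_probe and model_remove are illustrative names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_IRQ 16

/* Stand-in for the kernel's handler table: one owner (dev_id) per IRQ line. */
static void *irq_owner[MAX_IRQ];
static int warnings; /* number of "already-free" warnings printed */

/* Hypothetical stub modelling request_irq(): fails if the line is taken. */
static int fake_request_irq(int irq, void *dev_id)
{
    if (irq < 0 || irq >= MAX_IRQ || irq_owner[irq] != NULL)
        return -1;
    irq_owner[irq] = dev_id;
    return 0;
}

/* Hypothetical stub modelling free_irq(): warns when no handler matches. */
static void fake_free_irq(int irq, void *dev_id)
{
    if (irq < 0 || irq >= MAX_IRQ ||
        irq_owner[irq] == NULL || irq_owner[irq] != dev_id) {
        printf("Trying to free already-free IRQ %d\n", irq);
        warnings++;
        return;
    }
    irq_owner[irq] = NULL;
}

/* Illustrative per-device state: remember whether the request succeeded. */
struct model_dev {
    int irq;
    bool irq_requested;
};

/* Probe path: set the flag only after the request has succeeded. */
static int model_probe(struct model_dev *dev, int irq)
{
    dev->irq = irq;
    dev->irq_requested = false;
    if (fake_request_irq(irq, dev) != 0)
        return -1;
    dev->irq_requested = true;
    return 0;
}

/* Remove path: free only what was requested, with the same irq and dev_id. */
static void model_remove(struct model_dev *dev)
{
    assert(dev != NULL);
    if (dev->irq_requested) {
        fake_free_irq(dev->irq, dev);
        dev->irq_requested = false;
    }
}
```

In this model, calling model_remove on a device whose probe failed, or calling it twice, prints nothing, while calling fake_free_irq directly on an IRQ that is no longer registered prints the same message as in the log. In a real driver the equivalent check would sit around the free_irq call in the remove function, and with MSI the free_irq call normally comes before pci_disable_msi.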
This thread will be transitioned to community support. If you have a new question, feel free to open a new thread to get the support from Intel experts. Otherwise, the community users will continue to help you on this thread. Thank you.