nvme_tcp: nvme connect failed after execute stress-ng: unshare
Sagi Grimberg
sagi at grimberg.me
Mon Sep 14 19:50:12 EDT 2020
> Hello
>
> Recently I found that nvme-tcp connect always fails[1] after running stress-ng: unshare[2]. By bisecting, I found the failure was introduced by commit[3]; connecting works fine after reverting it.
> I'm not sure whether this is a test case issue or a kernel issue; could anyone help check it?
Is this failure persistent or transient?
>
> [1]
> # sh test.sh
> + ./stress-ng/stress-ng --unshare 0 --timeout 5 --log-file unshare.log
> stress-ng: info: [355534] dispatching hogs: 32 unshare
> stress-ng: info: [355534] successful run completed in 5.04s
> + modprobe null-blk nr-devices=1
> + modprobe nvmet-tcp
> + modprobe nvme-tcp
> + nvmetcli restore tcp.json
> + nvme connect -t tcp -n nqn.2014-08.org.nvmexpress.discovery -a 127.0.0.1 -s 4420
> Failed to write to /dev/nvme-fabrics: Input/output error
>
> # dmesg | tail -9
> [ 700.012299] null_blk: module loaded
> [ 700.073415] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
> [ 700.073923] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
> [ 715.291020] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
> [ 715.297031] nvmet: ctrl 1 fatal error occurred!
> [ 749.939898] nvmet: creating controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:e405e6bb-8e28-4a73-b338-3fddb5746b8c.
> [ 763.417376] nvme nvme0: queue 0: timeout request 0x0 type 4
> [ 763.422979] nvme nvme0: Connect command failed, error wo/DNR bit: 881
> [ 763.429419] nvme nvme0: failed to connect queue: 0 ret=881
>
> # uname -r
> 5.9.0-rc4
>
>
> [2] stress-ng: unshare case
> https://github.com/ColinIanKing/stress-ng.git
> https://github.com/ColinIanKing/stress-ng/blob/master/stress-unshare.c
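For context, the unshare stressor forks short-lived children that call unshare(2) with various namespace flags, so IPC namespaces are created and torn down at a high rate. A minimal standalone sketch of that pattern (hypothetical, not the stress-ng source; the flag choice and iteration count are illustrative):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
	/* fork children that each enter a fresh IPC namespace and exit
	 * at once, dropping the namespace's last reference; needs
	 * CAP_SYS_ADMIN for CLONE_NEWIPC */
	for (int i = 0; i < 1024; i++) {
		pid_t pid = fork();
		if (pid == 0) {
			if (unshare(CLONE_NEWIPC) != 0)
				perror("unshare");
			_exit(0);
		}
		if (pid > 0)
			waitpid(pid, NULL, 0);
	}
	return 0;
}

Each child exit sends the dying namespace through free_ipc_ns(), which is the path commit [3] below moved onto a work queue.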
>
>
> [3]
> commit e1eb26fa62d04ec0955432be1aa8722a97cb52e7
> Author: Giuseppe Scrivano <gscrivan at redhat.com>
> Date: Sun Jun 7 21:40:10 2020 -0700
>
> ipc/namespace.c: use a work queue to free_ipc
>
> the reason is to avoid a delay caused by the synchronize_rcu() call in
> kern_umount() when the mqueue mount is freed.
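In case it helps frame the discussion, the gist of that commit is to defer the actual namespace free from the final put to a work item, batching dying namespaces on a lock-free list. A condensed paraphrase of the pattern (abridged, not the verbatim kernel source; defer_free_ipc is a hypothetical helper name, the commit inlines this in put_ipc_ns()):

#include <linux/llist.h>
#include <linux/workqueue.h>

static LLIST_HEAD(free_ipc_list);

static void free_ipc(struct work_struct *unused)
{
	struct llist_node *node = llist_del_all(&free_ipc_list);
	struct ipc_namespace *n, *t;

	/* free from process context, where sleeping in
	 * kern_umount()/synchronize_rcu() is harmless */
	llist_for_each_entry_safe(n, t, node, mnt_llist)
		free_ipc_ns(n);
}

static DECLARE_WORK(free_ipc_work, free_ipc);

/* hypothetical helper: on the last put, queue the namespace
 * and kick the worker */
static void defer_free_ipc(struct ipc_namespace *ns)
{
	if (llist_add(&ns->mnt_llist, &free_ipc_list))
		schedule_work(&free_ipc_work);
}

llist_add() returns true only when the list was previously empty, so the work item is scheduled once per batch rather than once per namespace.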
>
>
> [4]
> # cat tcp.json
> {
>   "hosts": [],
>   "ports": [
>     {
>       "addr": {
>         "adrfam": "ipv4",
>         "traddr": "127.0.0.1",
>         "treq": "not specified",
>         "trsvcid": "4420",
>         "trtype": "tcp"
>       },
>       "portid": 0,
>       "referrals": [],
>       "subsystems": [
>         "blktests-subsystem-1"
>       ]
>     }
>   ],
>   "subsystems": [
>     {
>       "allowed_hosts": [],
>       "attr": {
>         "allow_any_host": "1",
>         "cntlid_max": "65519",
>         "cntlid_min": "1",
>         "model": "Linux",
>         "pi_enable": "0",
>         "serial": "7d833f5501f6b240",
>         "version": "1.3"
>       },
>       "namespaces": [
>         {
>           "device": {
>             "nguid": "00000000-0000-0000-0000-000000000000",
>             "path": "/dev/nullb0",
>             "uuid": "b07c7eef-8428-47bf-8e79-26ec8c30f334"
>           },
>           "enable": 1,
>           "nsid": 1
>         }
>       ],
>       "nqn": "blktests-subsystem-1"
>     }
>   ]
> }
>
>
> Best Regards,
> Yi Zhang
>
>