nvme_tcp: nvme connect failed after execute stress-ng: unshare
Yi Zhang
yi.zhang at redhat.com
Mon Sep 14 21:40:51 EDT 2020
Hi Sagi
On 9/15/20 7:50 AM, Sagi Grimberg wrote:
>
>> Hello
>>
>> Recently I found that nvme-tcp connect always fails[1] after executing
>> stress-ng: unshare[2]. By bisecting, I found that it was introduced
>> by commit[3]; connecting works well after reverting it.
>> I'm not sure whether it's a test case issue or a kernel issue; could
>> anyone help check it?
>
> Is this failure persistent or transient?
>
It's persistent, and most of the recent CKI jobs with the 5.8 stable kernel
also showed this failure.
>>
>> [1]
>> # sh test.sh
>> + ./stress-ng/stress-ng --unshare 0 --timeout 5 --log-file unshare.log
>> stress-ng: info: [355534] dispatching hogs: 32 unshare
>> stress-ng: info: [355534] successful run completed in 5.04s
>> + modprobe null-blk nr-devices=1
>> + modprobe nvmet-tcp
>> + modprobe nvme-tcp
>> + nvmetcli restore tcp.json
>> + nvme connect -t tcp -n nqn.2014-08.org.nvmexpress.discovery -a 127.0.0.1 -s 4420
>> Failed to write to /dev/nvme-fabrics: Input/output error
>>
>> # dmesg | tail -9
>> [ 700.012299] null_blk: module loaded
>> [ 700.073415] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
>> [ 700.073923] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
>> [ 715.291020] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
>> [ 715.297031] nvmet: ctrl 1 fatal error occurred!
>> [ 749.939898] nvmet: creating controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:e405e6bb-8e28-4a73-b338-3fddb5746b8c.
>> [ 763.417376] nvme nvme0: queue 0: timeout request 0x0 type 4
>> [ 763.422979] nvme nvme0: Connect command failed, error wo/DNR bit: 881
>> [ 763.429419] nvme nvme0: failed to connect queue: 0 ret=881
>>
>> # uname -r
>> 5.9.0-rc4
>>
>>
>> [2] stress-ng: unshare case
>> https://github.com/ColinIanKing/stress-ng.git
>> https://github.com/ColinIanKing/stress-ng/blob/master/stress-unshare.c
>>
>>
>> [3]
>> commit e1eb26fa62d04ec0955432be1aa8722a97cb52e7
>> Author: Giuseppe Scrivano <gscrivan at redhat.com>
>> Date: Sun Jun 7 21:40:10 2020 -0700
>>
>>     ipc/namespace.c: use a work queue to free_ipc
>>
>>     the reason is to avoid a delay caused by the synchronize_rcu() call in
>>     kern_umount() when the mqueue mount is freed.
>>
>>
>> [4]
>> # cat tcp.json
>> {
>>   "hosts": [],
>>   "ports": [
>>     {
>>       "addr": {
>>         "adrfam": "ipv4",
>>         "traddr": "127.0.0.1",
>>         "treq": "not specified",
>>         "trsvcid": "4420",
>>         "trtype": "tcp"
>>       },
>>       "portid": 0,
>>       "referrals": [],
>>       "subsystems": [
>>         "blktests-subsystem-1"
>>       ]
>>     }
>>   ],
>>   "subsystems": [
>>     {
>>       "allowed_hosts": [],
>>       "attr": {
>>         "allow_any_host": "1",
>>         "cntlid_max": "65519",
>>         "cntlid_min": "1",
>>         "model": "Linux",
>>         "pi_enable": "0",
>>         "serial": "7d833f5501f6b240",
>>         "version": "1.3"
>>       },
>>       "namespaces": [
>>         {
>>           "device": {
>>             "nguid": "00000000-0000-0000-0000-000000000000",
>>             "path": "/dev/nullb0",
>>             "uuid": "b07c7eef-8428-47bf-8e79-26ec8c30f334"
>>           },
>>           "enable": 1,
>>           "nsid": 1
>>         }
>>       ],
>>       "nqn": "blktests-subsystem-1"
>>     }
>>   ]
>> }
>>
>>
>> Best Regards,
>> Yi Zhang
>>
>>
>
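For anyone who wants to retry this, the steps in the [1] trace can be sketched as a small script. This is reconstructed from the "+" trace lines only, so it may not match the original test.sh exactly. The DRY_RUN switch and the run() and reproduce() helpers are illustrative additions and are not part of the original report; by default the sketch only prints the commands, since a real run needs root and loads kernel modules.

```shell
#!/bin/sh
# Sketch of the reproducer, reconstructed from the "+" trace lines in [1].
# Assumptions (not in the original report): DRY_RUN, run(), reproduce().
set -e

# Default to printing only; set DRY_RUN=0 to actually execute (needs root).
DRY_RUN="${DRY_RUN:-1}"

# Print the command in "sh -x" style, then execute it unless in dry-run mode.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

reproduce() {
    # Step 1: run the unshare stressor for 5 seconds (0 = one hog per CPU).
    run ./stress-ng/stress-ng --unshare 0 --timeout 5 --log-file unshare.log
    # Step 2: set up a null_blk backed nvmet-tcp target from tcp.json [4].
    run modprobe null-blk nr-devices=1
    run modprobe nvmet-tcp
    run modprobe nvme-tcp
    run nvmetcli restore tcp.json
    # Step 3: connect to the discovery controller; this is the step that fails.
    run nvme connect -t tcp -n nqn.2014-08.org.nvmexpress.discovery -a 127.0.0.1 -s 4420
}

reproduce
```

With the defaults it only lists the six commands; something like `DRY_RUN=0 sh test.sh` from a directory containing the stress-ng checkout and tcp.json would execute them.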