NFS mount hangs whole system

PPLdude

Please tell me there are some greybeards on this forum...

When my node loses its connection to an NFS mount and I run df, the whole system hangs until the share is unmounted.

Is there an option in fstab or exports that can safeguard me from this?

fstab:

defaults,_netdev,soft,intr,nolock,noacl,noatime,sync,proto=tcp,mountproto=udp,port=(removed) 0 0

NFS server exports:

(rw,async,no_root_squash,no_subtree_check,no_wdelay,fsid=16)

I was thinking of changing retrans to a very low value?

Any help is appreciated!
 
Not afaik. You need to open another shell and force unmount. Or fix the server.
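
A minimal sketch of the force-unmount step, assuming you are root and the stuck share is mounted at /mnt/data (a placeholder path). The helper name force_unmount and the UMOUNT override are made up for illustration; umount -f (force) and umount -l (lazy) are the standard util-linux flags:

```shell
# Hypothetical helper: try a forced unmount first, then fall back to a
# lazy unmount if that fails.
# UMOUNT can be overridden (e.g. UMOUNT=echo) to preview the command.
force_unmount() {
    target=$1
    "${UMOUNT:-umount}" -f "$target" || "${UMOUNT:-umount}" -l "$target"
}
```

Usage from a second shell would look like: force_unmount /mnt/data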
 
Well, the first question is why you lose the connection to the mount point. I suspect fixing that will resolve the issue, since the hang should clear once the connection is re-established.
 
I've found some options that make it fail after a specified time.
 
What exactly is hosted on this mount?

Maybe that’s what is breaking the system as it requires access to that?
 
Is it hanging the whole system (i.e. you can no longer ssh to it)? Or just the df command, and by extension your session? I'll assume the latter.

This is not an NFS problem. df by default includes local and remote filesystems. You want to add -l to df. Note the difference:

Code:
root@u16lab:~# df -h
Filesystem        Size  Used Avail Use% Mounted on
udev              232M     0  232M   0% /dev
tmpfs              49M  5.5M   43M  12% /run
/dev/xvda1         20G  1.6G   18G   8% /
tmpfs             242M     0  242M   0% /dev/shm
tmpfs             5.0M     0  5.0M   0% /run/lock
tmpfs             242M     0  242M   0% /sys/fs/cgroup
tmpfs              49M     0   49M   0% /run/user/0
192.168.3.2:/srv   20G  2.1G   17G  12% /mnt/data

root@u16lab:~# df -lh
Filesystem      Size  Used Avail Use% Mounted on
udev            232M     0  232M   0% /dev
tmpfs            49M  5.5M   43M  12% /run
/dev/xvda1       20G  1.6G   18G   8% /
tmpfs           242M     0  242M   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           242M     0  242M   0% /sys/fs/cgroup
tmpfs            49M     0   49M   0% /run/user/0

root@u16lab:~# man df
-l, --local
limit listing to local file systems
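
If a script has to call plain df anyway, one way to keep a dead mount from blocking the session is to put a deadline on the command. A rough sketch, assuming GNU coreutils timeout is installed; run_with_deadline is a made-up name, and whether a process stuck on NFS I/O can actually be killed depends on the kernel and the mount options:

```shell
# Hypothetical wrapper: run a command, but give up after a deadline.
# SIGKILL is used because the command cannot catch or ignore it.
run_with_deadline() {
    seconds=$1
    shift
    timeout -s KILL "$seconds" "$@"
}
```

Usage would look like: run_with_deadline 5 df -h || echo "df did not finish within 5s"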

With soft, it should time out eventually. Check what the timeout is set to:

Code:
root@u16lab:~# grep nfs /proc/mounts
none /proc/xen xenfs rw,relatime 0 0
192.168.3.2:/srv /mnt/data nfs rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.3.2,mountvers=3,mountport=892,mountproto=udp,local_lock=none,addr=192.168.3.2 0 0
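
Per nfs(5), timeo is given in tenths of a second, so timeo=600 above means the client waits 60 seconds before the first retry. A small sketch for pulling single options out of a /proc/mounts line; mount_opt is a hypothetical helper name, not an existing tool:

```shell
# Hypothetical helper: read /proc/mounts-style lines on stdin and print the
# value of one key=value mount option (field 4 holds the option list).
mount_opt() {
    key=$1
    awk '{ print $4 }' | tr ',' '\n' | sed -n "s/^${key}=//p"
}

# Example with a shortened copy of the line shown above.
line='192.168.3.2:/srv /mnt/data nfs rw,relatime,vers=3,soft,proto=tcp,timeo=600,retrans=2 0 0'
timeo=$(printf '%s\n' "$line" | mount_opt timeo)
echo "timeo=${timeo} ($((timeo / 10)) seconds)"
```

On a live system the input would come from something like: grep ' nfs' /proc/mounts | mount_opt timeo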

Now mount with a shorter timeout to check:

Code:
root@u16lab:~# mount -t nfs -o vers=3,soft,timeo=5 192.168.3.2:/srv /mnt/data

Firewall off the NFS server and test this:

Code:
root@u16lab:~# time ls /mnt/data
ls: cannot open directory '/mnt/data': Stale file handle

real	0m7.018s
user	0m0.000s
sys	0m0.000s
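
One possible way to do the "firewall off" step from the client side, assuming iptables is in use and 192.168.3.2 is the NFS server as in the example above. The function names and the IPTABLES override are illustrative only; the post does not say how the server was actually blocked:

```shell
# Hypothetical helpers: drop, then restore, outbound traffic to the NFS server.
# IPTABLES can be overridden (e.g. IPTABLES=echo) to preview the command.
block_server() {
    "${IPTABLES:-iptables}" -I OUTPUT -d "$1" -j DROP
}

unblock_server() {
    "${IPTABLES:-iptables}" -D OUTPUT -d "$1" -j DROP
}
```

A test run would then be: block_server 192.168.3.2, time ls /mnt/data, and finally unblock_server 192.168.3.2 to remove the rule again.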
 
Thanks for the replies. I changed it to soft and adjusted some timeout values. The mount points to a backup server, and the host as well as all the VMs hung when it lost the mount (because it was mounted hard), but now it fails gracefully.
 
no_root_squash... Lovely privilege escalation ;)
 