Nfs-ganesha



NFS-Ganesha with Gluster

Goal

Compile and run nfs-ganesha in a docker container:


Docker container C1 can mount any NFS volume exported by docker container C2. C2 accesses any node (e.g., container C3) in the trusted pool of glusterfs nodes.

The whole system map looks like this:

[C1: NFS client] --mounts--> [C2: NFS-GANESHA] --access--> [C3: NODE IN GLUSTERFS TRUSTED POOL]

C1, C2, C3 are 3 different docker containers running on 3 different cloud instances.

Motivation

I could mount the glusterfs volume directly in C1 using the glusterfs mount type, but that would incur a high overhead at C1 because gluster replication is executed on the client side (C1).

For my setup, I use a number of the cheapest Digital Ocean instances ($5/month) to form a large storage pool with distributed replication via glusterfs, because that is the most economical option. Some of my NFS clients (all running in docker containers) might also be low-powered or have little memory, so the glusterfs client overhead may not be acceptable.

I could also just use the built-in glusterfs-nfs server, but I went the nfs-ganesha route because

  1. I was unsuccessful starting the glusterfs-nfs server in a docker container, and
  2. the built-in glusterfs-nfs server lacks auto-failover.

I have read good things about nfs-ganesha: it is fast and it supports many different filesystems. If I add more storage clusters such as ZFS in the future, I can continue to use the user-space nfs-ganesha as the server.

Failover

Note: This section is still under research and is full of errors and hypotheses.

If I use glusterfs-nfs and the gluster node running glusterfs-nfs goes down, all my NFS clients will be stuck.

With nfs-ganesha, on the other hand, if a gluster node goes down, I could supposedly configure nfs-ganesha to connect to another gluster node in the trusted pool. This needs more R&D.
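As a starting point for that research, the following is a minimal sketch of how a replacement node could be picked. It assumes that a node answering on TCP port 24007 (one of the Gluster ports published further down) is usable; the helper names and the example IPs are hypothetical, and whether ganesha can be repointed without a restart is exactly the open question.

```shell
# Succeeds if host $1 accepts a TCP connection on port $2 within 2 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash must be installed.
probe_tcp() {
    timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Print the first candidate host that passes the probe.
# $1 = probe command, remaining args = candidate hosts (hypothetical list).
first_reachable() {
    probe=$1
    shift
    for host in "$@"; do
        if "$probe" "$host" 24007; then
            echo "$host"
            return 0
        fi
    done
    return 1
}

# Example with made-up addresses:
# first_reachable probe_tcp 10.0.0.11 10.0.0.12 10.0.0.13
```

The printed address could then be written into the Hostname field of the export configuration before ganesha is (re)started.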

Compiling nfs-ganesha for Debian

--Zhangguiyu (talk) 14:26, 6 September 2015 (SGT)

Install dependencies

Check that you have the backports repository in your /etc/apt/sources.list:

#/etc/apt/sources.list
deb http://http.debian.net/debian jessie-backports main
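If this step is scripted (for example in a container build), the line can be added only when it is missing. This is a small sketch; the function name ensure_line is made up for illustration.

```shell
# Append a line to a file only if that exact line is not already present.
# $1 = file, $2 = line to ensure
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage (as root), followed by a package index refresh:
# ensure_line /etc/apt/sources.list \
#     "deb http://http.debian.net/debian jessie-backports main"
# apt-get update
```

Running it twice leaves a single copy of the line in the file.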


Required debian packages:

apt-get install \
bison \
cmake \
debhelper \
dh-python \
flex \
libattr1-dev \
libdbus-1-dev \
libjemalloc-dev \
libkrb5-dev \
libncurses5-dev \
libnfsidmap-dev \
libwbclient-dev \
python-qt4 \
pyqt4-dev-tools \
quilt \
uuid-dev

OPTIONAL debian packages (not needed for gluster):

apt-get install \
libcephfs-dev \
libcephfs1 \
libcap-dev

gluster runtimes:

wget -O - http://download.gluster.org/pub/gluster/glusterfs/3.7/3.7.3/Debian/jessie/apt/pub.key | apt-key add -

echo deb http://download.gluster.org/pub/gluster/glusterfs/3.7/3.7.3/Debian/jessie/apt jessie main > /etc/apt/sources.list.d/gluster.list
apt-get update
apt-get install glusterfs-client
# gluster server is NOT needed

Clone the repository

Clone main repo:

git clone https://github.com/nfs-ganesha/nfs-ganesha.git ~/nfs-ganesha.git

Update submodules

cd ~/nfs-ganesha.git
# check out the existing V2.2-stable branch first, then fetch the matching submodules
git checkout V2.2-stable
git submodule update --init

Pre Compile

Problem: Compiling nfs-ganesha on Debian can fail to find the Debian gluster libraries, which live in /usr/lib/x86_64-linux-gnu/

Solution:

mkdir ~/build
cd ~/build
cmake \
-DUSE_FSAL_GLUSTER=ON \
-DCMAKE_BUILD_TYPE=Maintainer ~/nfs-ganesha.git/src/
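If cmake still reports that it cannot find Gluster, it may help to confirm where the library actually sits. The sketch below assumes the Gluster FSAL links against a library named libgfapi and that the header is installed under /usr/include; both are assumptions to verify on your system.

```shell
# Print the first directory (from the arguments) that contains libgfapi.so*.
find_gfapi_dir() {
    for dir in "$@"; do
        for lib in "$dir"/libgfapi.so*; do
            if [ -e "$lib" ]; then
                echo "$dir"
                return 0
            fi
        done
    done
    return 1
}

# Usage:
# find_gfapi_dir /usr/lib/x86_64-linux-gnu /usr/lib64 /usr/lib \
#     || echo "libgfapi not found"
# test -f /usr/include/glusterfs/api/glfs.h || echo "glfs.h not found"
```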

Check your pre-compile settings:


-- cmake version 3.0
-- The C compiler identification is GNU 4.9.2
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- was set CMAKE_INSTALL_PREFIX = /usr/local
-- Detected a non Red hat based
-- toolchain options processed
-- Compilation from within a git repository. Using git rev-parse HEAD
-- Looking for include file stdbool.h
-- Looking for include file stdbool.h - found
-- Looking for include file strings.h
-- Looking for include file strings.h - found
-- Looking for include file string.h
-- Looking for include file string.h - found
-- Check if the system is big endian
-- Searching 16 bit integer
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of unsigned short
-- Check size of unsigned short - done
-- Using unsigned short
-- Check if the system is big endian - little endian
-- Looking for include file pthread.h
-- Looking for include file pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- found krb5-config here /usr/bin/krb5-config
-- Found kerberos 5 headers: /include
-- Found kerberos 5 libs:    /usr/lib/x86_64-linux-gnu/mit-krb5/libkrb5.so;/usr/lib/x86_64-linux-gnu/mit-krb5/libk5crypto.so;/usr/lib/x86_64-linux-gnu/libcom_err.so;/usr/lib/x86_64-linux-gnu/mit-krb5/libgssapi_krb5.so
-- Looking for include file gssapi.h
-- Looking for include file gssapi.h - found
-- Looking for include file glusterfs/api/glfs.h
-- Looking for include file glusterfs/api/glfs.h - found
-- Looking for include files unistd.h, attr/xattr.h
-- Looking for include files unistd.h, attr/xattr.h - found
-- Looking for include files unistd.h, lustre/lustre_user.h
-- Looking for include files unistd.h, lustre/lustre_user.h - not found
CMake Warning at CMakeLists.txt:544 (message):
  Cannot find LUSTRE runtime.  Disabling LUSTRE fsal build


-- Looking for include file lustre/lustreapi.h
-- Looking for include file lustre/lustreapi.h - not found
-- Looking for include file lustre/liblustreapi.h
-- Looking for include file lustre/liblustreapi.h - not found
ERRORCannot find lustre header files, aborting
-- Looking for open_by_handle in handle
-- Looking for open_by_handle in handle - not found
-- Looking for include file xfs/xfs.h
-- Looking for include file xfs/xfs.h - not found
CMake Warning at CMakeLists.txt:610 (message):
  Cannot find XFS runtime.  Disabling XFS build


-- Looking for libzfswrap_init in zfswrap
-- Looking for libzfswrap_init in zfswrap - not found
-- Looking for include files unistd.h, libzfswrap.h
-- Looking for include files unistd.h, libzfswrap.h - not found
CMake Warning at CMakeLists.txt:628 (message):
  Cannot find ZFS runtime.  Disabling ZFS build


-- Found JeMalloc: /usr/lib/x86_64-linux-gnu/libjemalloc.so  
-- Found nfs idmap library: /usr/lib/x86_64-linux-gnu/libnfsidmap.so
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.28") 
-- Looking for wbcLookupSids in wbclient
-- Looking for wbcLookupSids in wbclient - found
-- Looking for 3 include files stdint.h, ..., wbclient.h
-- Looking for 3 include files stdint.h, ..., wbclient.h - found
-- Performing Test WBCLIENT4_H
-- Performing Test WBCLIENT4_H - Success
-- Found Winbind4 client: 
-- Looking for cap_set_proc in cap
-- Looking for cap_set_proc in cap - not found
-- Could not find capabilities library, disabling USE_CAPS
-- Looking for include file blkid/blkid.h
-- Looking for include file blkid/blkid.h - not found
-- Looking for blkid_devno_to_devname in blkid
-- Looking for blkid_devno_to_devname in blkid - not found
-- Looking for include file uuid/uuid.h
-- Looking for include file uuid/uuid.h - not found
-- Looking for uuid_parse in uuid
-- Looking for uuid_parse in uuid - not found
-- Could not find blkid library, disabling USE_BLKID
-- Looking for daemon in c
-- Looking for daemon in c - found
-- Found BISON: /usr/bin/bison (found version "3.0.2") 
-- Found FLEX: /usr/bin/flex (found version "2.5.39") 
-- 
-- -------------------------------------------------------
-- PLATFORM = LINUX
-- VERSION = 2.2.0
-- BUILD HOST = base.local
-- -------------------------------------------------------
-- USE_FSAL_PROXY = ON
-- USE_FSAL_VFS = ON
-- USE_FSAL_CEPH = ON
-- USE_FSAL_HPSS = OFF
-- USE_FSAL_XFS = OFF
-- USE_FSAL_PANFS = ON
-- USE_FSAL_GPFS = ON
-- USE_FSAL_PT = OFF
-- USE_FSAL_ZFS = OFF
-- USE_FSAL_LUSTRE = OFF
-- USE_FSAL_SHOOK = OFF
-- USE_FSAL_LUSTRE_UP = OFF
-- USE_FSAL_GLUSTER = ON
-- USE_DBUS = OFF
-- USE_CB_SIMULATOR = OFF
-- USE_NFSIDMAP = ON
-- ENABLE_ERROR_INJECTION = OFF
-- USE_CAPS = OFF
-- USE_BLKID = OFF
-- STRICT_PACKAGE = OFF
-- DISTNAME_HAS_GIT_DATA = OFF
-- _MSPAC_SUPPORT = ON
-- USE_EFENCE = OFF
-- _NO_TCP_REGISTER = ON
-- _NO_PORTMAPPER = ON
-- _NO_XATTRD = ON
-- DEBUG_SAL = OFF
-- _VALGRIND_MEMCHECK = OFF
-- PROXY_HANDLE_MAPPING = OFF
-- DEBUG_SYMS = OFF
-- COVERAGE = OFF
-- ENFORCE_GCC = OFF
-- PROFILING = OFF
-- USE_GSS = ON
-- TIRPC_EPOLL = ON
-- USE_9P = ON
-- _USE_9P = ON
-- _USE_9P_RDMA = OFF
-- KRB5_PREFIX = 
-- CEPH_PREFIX = /usr
-- HPSS_PREFIX = 
-- GLUSTER_PREFIX = /usr
-- ZFS_PREFIX = /usr
-- LUSTRE_PREFIX = /usr
-- CMAKE_PREFIX_PATH = 
-- _GIT_HEAD_COMMIT = 26b7a1d03c0bc8fd3ac24a6eed3e3f547bc1a6e5
-- _GIT_HEAD_COMMIT_ABBREV = 26b7a1d
-- _GIT_DESCRIBE = V2.2.0-2-g26b7a1d
-- ALLOCATOR = jemalloc
-- GOLD_LINKER = 
-- CMAKE_INSTALL_PREFIX = /usr/local
-- FSAL_DESTINATION = lib/ganesha
-- USE_ADMIN_TOOLS = OFF
-- USE_GUI_ADMIN_TOOLS = ON
-- MODULES_PATH = /usr/local
-- USE_TSAN = OFF
-- USE_LTTNG = OFF
-- Could NOT find Doxygen (missing:  DOXYGEN_EXECUTABLE) 
-- Configuring done
-- Generating done
-- Build files have been written to: /host/build
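Instead of scanning the summary by eye, a single setting such as USE_FSAL_GLUSTER can be read back from the CMakeCache.txt that cmake writes into the build directory. This is only a sketch with a made-up function name; the cache shows what was configured, so the printed summary remains the better view of what cmake finally enabled.

```shell
# Print the value of one CMake cache variable.
# $1 = variable name, $2 = path to CMakeCache.txt
cmake_cache_value() {
    # Cache lines look like NAME:TYPE=VALUE; strip everything up to "=".
    sed -n "s/^$1:[^=]*=//p" "$2"
}

# Usage, with the build directory created earlier:
# [ "$(cmake_cache_value USE_FSAL_GLUSTER ~/build/CMakeCache.txt)" = "ON" ] \
#     || echo "Gluster FSAL is not enabled"
```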

Notes:

Warnings for missing REQUIRED capabilities must be NOTED:

REQUIRED

The following parameters must be set as follows in order for the compiled package to work under Debian:

  1. USE_FSAL_GLUSTER=ON

NOT NEEDED

These capabilities are not needed for GlusterFS:

  1. LUSTRE
  2. XFS build
  3. ZFS build
  4. USE_CAPS
  5. USE_BLKID
  6. DOXYGEN_EXECUTABLE

You need not set the following parameters:

  1. GLUSTER_PREFIX = /usr
  2. FSAL_DESTINATION = lib/ganesha

Compile

After you have confirmed that the GLUSTER library was found in the previous step, you can go ahead with the compile:

make

which will generate a bunch of files for you to install into the target system

or

make package


which will generate a tgz file

nfs-ganesha-2.2.0-0.1.1-Linux.tar.gz

Install

make install

-- Install configuration: "Maintainer"
-- Installing: /usr/etc/ganesha/ganesha.conf
-- Installing: /usr/share/doc/ganesha/config_samples
-- Installing: /usr/share/doc/ganesha/config_samples/README
-- Installing: /usr/share/doc/ganesha/config_samples/ceph.conf
-- Installing: /usr/share/doc/ganesha/config_samples/config.txt
-- Installing: /usr/share/doc/ganesha/config_samples/ds.conf
-- Installing: /usr/share/doc/ganesha/config_samples/export.txt
-- Installing: /usr/share/doc/ganesha/config_samples/gluster.conf
-- Installing: /usr/share/doc/ganesha/config_samples/gpfs.conf
-- Installing: /usr/share/doc/ganesha/config_samples/gpfs.ganesha.exports.conf
-- Installing: /usr/share/doc/ganesha/config_samples/gpfs.ganesha.log.conf
-- Installing: /usr/share/doc/ganesha/config_samples/gpfs.ganesha.main.conf
-- Installing: /usr/share/doc/ganesha/config_samples/gpfs.ganesha.nfsd.conf
-- Installing: /usr/share/doc/ganesha/config_samples/logging..txt
-- Installing: /usr/share/doc/ganesha/config_samples/logrotate_ganesha
-- Installing: /usr/share/doc/ganesha/config_samples/lustre.conf
-- Installing: /usr/share/doc/ganesha/config_samples/pt.conf
-- Installing: /usr/share/doc/ganesha/config_samples/vfs.conf
-- Installing: /usr/share/doc/ganesha/config_samples/xfs.conf
-- Installing: /usr/share/doc/ganesha/config_samples/zfs.conf
-- Installing: /usr/var/run/ganesha
-- Installing: /usr/bin/libntirpc.a
-- Installing: /usr/lib64/ganesha/libfsalnull.so.4.2.0
-- Installing: /usr/lib64/ganesha/libfsalnull.so.4
-- Installing: /usr/lib64/ganesha/libfsalnull.so
-- Installing: /usr/lib64/ganesha/libfsalproxy.so.4.2.0
-- Installing: /usr/lib64/ganesha/libfsalproxy.so.4
-- Installing: /usr/lib64/ganesha/libfsalproxy.so
-- Removed runtime path from "/usr/lib64/ganesha/libfsalproxy.so.4.2.0"
-- Installing: /usr/lib64/ganesha/libfsalceph.so.4.2.0
-- Installing: /usr/lib64/ganesha/libfsalceph.so.4
-- Installing: /usr/lib64/ganesha/libfsalceph.so
-- Removed runtime path from "/usr/lib64/ganesha/libfsalceph.so.4.2.0"
-- Installing: /usr/lib64/ganesha/libfsalgpfs.so.4.2.0
-- Installing: /usr/lib64/ganesha/libfsalgpfs.so.4
-- Installing: /usr/lib64/ganesha/libfsalgpfs.so
-- Removed runtime path from "/usr/lib64/ganesha/libfsalgpfs.so.4.2.0"
-- Installing: /usr/lib64/ganesha/libfsalvfs.so.4.2.0
-- Installing: /usr/lib64/ganesha/libfsalvfs.so.4
-- Installing: /usr/lib64/ganesha/libfsalvfs.so
-- Removed runtime path from "/usr/lib64/ganesha/libfsalvfs.so.4.2.0"
-- Installing: /usr/lib64/ganesha/libfsalpanfs.so.4.2.0
-- Installing: /usr/lib64/ganesha/libfsalpanfs.so.4
-- Installing: /usr/lib64/ganesha/libfsalpanfs.so
-- Removed runtime path from "/usr/lib64/ganesha/libfsalpanfs.so.4.2.0"
-- Installing: /usr/lib64/ganesha/libfsalgluster.so.4.2.0
-- Installing: /usr/lib64/ganesha/libfsalgluster.so.4
-- Installing: /usr/lib64/ganesha/libfsalgluster.so
-- Removed runtime path from "/usr/lib64/ganesha/libfsalgluster.so.4.2.0"
-- Installing: /usr/bin/ganesha.nfsd
-- Removed runtime path from "/usr/bin/ganesha.nfsd"

Build Debian Packages (TODO)

This is NOT working yet:

cd ~/nfs-ganesha.git/src
dpkg-buildpackage -uc -us

Running nfs-ganesha in Docker

Configuring Ganesha

The 2.2 version of ganesha is very sensitive to settings; please start from exactly the setup below.

EXPORT {
    Export_Id = 77 ;   # Export ID unique to each export
    Path = "/home";  # DOCKER BUG: only "/" is exported if some of the commented-out options below are uncommented

    FSAL {
        Name = "GLUSTER";
        Hostname = "123.123.123.123";  # IP of a node in the GLUSTER trusted pool
        Volume = "home";  # Volume name. Eg: "test_volume"
    }

    Access_type = RW;    # Access permissions
    Squash = No_Root_Squash; # To enable/disable root squashing
    Disable_ACL = TRUE;  # To enable/disable ACL
    Pseudo = "/home_pseudo";  # NFSv4 pseudo path for this export. Eg: "/test_volume_pseudo"
    #Protocols = "3,4" ;  # NFS protocols supported
    #Transports = "UDP,TCP" ; # Transport protocols supported
    SecType = "sys";     # Security flavors supported

    CLIENT { # first matching CLIENT block is used
        Clients = 127.0.0.1, IPADDRESS1, IP_PATTERNBLOCKS;
    }

    CLIENT {        # MUST HAVE THIS GUARD, OTHERWISE anyone can mount
        Clients = *;
        Access_Type = NONE;
    }
}

Start from the above by modifying the essential parts, check that it is working, then modify iteratively!
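To keep the known-good settings fixed while changing only the essential parts, the block can be generated from a few parameters. This is a sketch only: the function name is made up, and it assumes the export path and pseudo path are derived from the volume name in the same way as in the sample above.

```shell
# Print an EXPORT block for one Gluster volume, using only the options
# from the working sample.
# $1 = IP of a node in the trusted pool, $2 = volume name, $3 = allowed clients
render_export() {
    cat <<EOF
EXPORT {
    Export_Id = 77;
    Path = "/$2";

    FSAL {
        Name = "GLUSTER";
        Hostname = "$1";
        Volume = "$2";
    }

    Access_type = RW;
    Squash = No_Root_Squash;
    Disable_ACL = TRUE;
    Pseudo = "/${2}_pseudo";
    SecType = "sys";

    CLIENT {
        Clients = $3;
    }

    CLIENT {
        Clients = *;
        Access_Type = NONE;
    }
}
EOF
}

# Usage:
# render_export 123.123.123.123 home 127.0.0.1 > /etc/ganesha/nfs-ganesha.conf
```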

I had uncommented some of the above options, and that resulted in various errors:

  • export path ALWAYS stuck at '/'
  • NFS3 mount not working

NFS-SERVER Container running nfs-ganesha

The container running nfs-ganesha MUST

1. open up at least the following ports (both UDP and TCP) for NFS, and all the regular Gluster ports. The following are arguments passed to docker:

PORTS_GLUSTER="\
-p 111:111                  \
-p 111:111/udp              \
-p 2049:2049                \
-p 2049:2049/udp            \
-p 24007:24007              \
-p 49152-49155:49152-49155  \
-p 38465-38467:38465-38467  
"

2. run as a privileged container, e.g.,

--privileged=true
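The port list and the privileged flag can be combined into one set of docker run arguments. The sketch below only prints the flags; the function name and the image name in the usage line are placeholders.

```shell
# Print the docker run flags described above: published ports plus
# privileged mode.
ganesha_docker_flags() {
    # rpcbind (111) and NFS (2049) are published on both TCP and UDP.
    for port in 111 2049; do
        printf '%s ' "-p $port:$port" "-p $port:$port/udp"
    done
    # Gluster ports, same ranges as in PORTS_GLUSTER.
    printf '%s ' "-p 24007:24007" "-p 49152-49155:49152-49155" "-p 38465-38467:38465-38467"
    printf '%s\n' "--privileged=true"
}

# Usage (my-ganesha-image is a placeholder):
# docker run -d $(ganesha_docker_flags) my-ganesha-image
```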

3. have the following software packages installed

apt-get install \
rpcbind \
nfs-common

4. start rpcbind before starting ganesha.nfsd

mkdir -p /run/sendsigs.omit.d/
service rpcbind start
/usr/bin/ganesha.nfsd -f /etc/ganesha/nfs-ganesha.conf -L \
/var/log/ganesha/ganesha.log -N NIV_DEBUG -p /var/run/ganesha.pid

Test that export is working correctly first on the nfs-ganesha container itself:

rpcinfo -p localhost
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100003    3   udp   2049  nfs
    100003    3   tcp   2049  nfs
    100003    4   udp   2049  nfs
    100003    4   tcp   2049  nfs
    100005    1   udp  43741  mountd
    100005    1   tcp  48153  mountd
    100005    3   udp  43741  mountd
    100005    3   tcp  48153  mountd
    100021    4   udp  44806  nlockmgr
    100021    4   tcp  51907  nlockmgr
    100011    1   udp    875  rquotad
    100011    1   tcp    875  rquotad
    100011    2   udp    875  rquotad
    100011    2   tcp    875  rquotad
    100024    1   udp  47751  status
    100024    1   tcp  37514  status
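When this check runs from a startup script, the listing can be tested for the services you expect rather than read by eye. This is a sketch under the assumption that the fifth column of rpcinfo -p output is the service name, as in the listing above; the function name is made up.

```shell
# Read `rpcinfo -p` output on stdin and print every expected service
# (given as arguments) that is not registered.
# The exit status is 0 only when nothing is missing.
missing_rpc_services() {
    # Skip the header line and collect the service column.
    registered=$(awk 'NR > 1 { print $5 }' | sort -u)
    status=0
    for svc in "$@"; do
        if ! printf '%s\n' "$registered" | grep -qx "$svc"; then
            echo "$svc"
            status=1
        fi
    done
    return $status
}

# Usage:
# rpcinfo -p localhost | missing_rpc_services nfs mountd nlockmgr
```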

If the above works, then you can try the next test:

showmount -e localhost
Export list for localhost:
/home (everyone)

Try mounting it on the server container itself:

mount -t nfs4 localhost:/home /home

NFS-CLIENT: Container mounting a remote NFS share

The container that mounts a remote NFS volume MUST

1. have the following software packages installed

apt-get install \
rpcbind         \
nfs-common


2. Test that you can access the remote nfs server

rpcinfo -p IP_OF_NFS_SERVER_RUNNING_GANESHA
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100003    3   udp   2049  nfs
    100003    3   tcp   2049  nfs
    100003    4   udp   2049  nfs
    100003    4   tcp   2049  nfs
    100005    1   udp  43741  mountd
    100005    1   tcp  48153  mountd
    100005    3   udp  43741  mountd
    100005    3   tcp  48153  mountd
    100021    4   udp  44806  nlockmgr
    100021    4   tcp  51907  nlockmgr
    100011    1   udp    875  rquotad
    100011    1   tcp    875  rquotad
    100011    2   udp    875  rquotad
    100011    2   tcp    875  rquotad
    100024    1   udp  47751  status
    100024    1   tcp  37514  status

3. Mount the volume, e.g., at /mnt/home

mkdir -p /mnt/home
mount -t nfs4 GANESHA:/home_pseudo /mnt/home  # this or the one below works
mount -t nfs4 GANESHA:/home /mnt/home         # try this if above does not work

where GANESHA is the IP address of your GANESHA server
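Since either the pseudo path or the real path may be the one that works, the two attempts can be wrapped in a small fallback loop. This is a sketch; the function name is made up, and the mount command is passed in as the first argument so the logic can be tried without root.

```shell
# Try each export path in turn until one mounts.
# $1 = mount command (normally "mount"), $2 = server, $3 = mount point,
# remaining args = candidate export paths
mount_first_working() {
    mount_cmd=$1
    server=$2
    target=$3
    shift 3
    for export_path in "$@"; do
        if "$mount_cmd" -t nfs4 "$server:$export_path" "$target"; then
            echo "mounted $server:$export_path on $target"
            return 0
        fi
    done
    echo "no export path could be mounted" >&2
    return 1
}

# Usage (as root; replace GANESHA with the server IP):
# mkdir -p /mnt/home
# mount_first_working mount GANESHA /mnt/home /home_pseudo /home
```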

nfs-ganesha log

If you see the following lines in your log, the Gluster FSAL module could not be loaded. You need to either

  1. install/copy ganesha into the right directories, or
  2. export the right directories via docker.

05/09/2015 14:43:45 : epoch 55ea8f21 : g00 : ganesha.nfsd-545[main] load_fsal :NFS STARTUP :DEBUG :Loading FSAL GLUSTER with /usr/lib64/ganesha/libfsalgluster.so
05/09/2015 14:43:45 : epoch 55ea8f21 : g00 : ganesha.nfsd-545[main] load_fsal :NFS STARTUP :CRIT :Could not dlopen module:/usr/lib64/ganesha/libfsalgluster.so Error:/usr/lib64/ganesha/libfsalgluster.so: cannot open shared object file: No such file or directory
05/09/2015 14:43:45 : epoch 55ea8f21 : g00 : ganesha.nfsd-545[main] load_fsal :NFS STARTUP :MAJ :Failed to load module (/usr/lib64/ganesha/libfsalgluster.so) because: Can not access a needed shared library
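To narrow down which of the two causes applies, the module path from the log can be checked inside the container. The sketch below uses a made-up function name and relies on ldd reporting unresolved libraries with the words "not found".

```shell
# Report why a shared object might fail to load: either the file itself
# is missing, or one of the libraries it depends on cannot be resolved.
check_shared_object() {
    if [ ! -f "$1" ]; then
        echo "missing: $1"
        return 1
    fi
    unresolved=$(ldd "$1" 2>/dev/null | grep 'not found') || true
    if [ -n "$unresolved" ]; then
        echo "unresolved dependencies of $1:"
        echo "$unresolved"
        return 1
    fi
    echo "ok: $1"
}

# Usage:
# check_shared_object /usr/lib64/ganesha/libfsalgluster.so
```

A "missing" result points at the install location or the docker volume mapping; unresolved dependencies suggest the gluster runtime packages are not installed in the container.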

References