The current use case of NFS is 400G-1T 'stashes' shared from an NFS server to hundreds of Linux/Unix clients in an academic setting. In some cases a stash is accessed by a single user on a single machine; in others, dozens of users access it across dozens of machines.
Drawbacks of the current situation are the same as in any setup involving NFS:
- Security is a joke: with AUTH_SYS, the server trusts whatever UID the client claims
- A single über-powerful NFS filer presents a single point of failure (SPOF)
- Bigger and bigger filers get more and more expensive
- We are forced to use proprietary and expensive ZFS on Solaris
- Backing up is becoming a problem as the total dataset grows beyond what a tape backup system can really hold
- No tiering of storage: the whole dataset goes either on the fast disks or on the slow disks
On the other hand, the current setup has real strengths:
- NFS is old faithful
- Every operating system supports it, and usually pretty well
- NFS handles IPv6 like a champ
- It's already working
- It integrates well with PAM, autofs, and LDAP
- The vendor, while expensive, is really good at fixing it
- ZFS allows 'thin provisioning', so we can oversubscribe storage
- ZFS allows full NFSv4 ACLs to be used (this could also go in the drawbacks section, because extended ACLs cause much pain)
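For context on the thin-provisioning point above, here is a minimal sketch of what oversubscription looks like on the ZFS side. The pool and dataset names are hypothetical, and the exact share options will differ per site:

```shell
# Hypothetical pool "tank"; each stash gets a 1T quota.
# Quotas are limits, not reservations, so many 1T stashes can
# oversubscribe a smaller pool -- space is only consumed as data
# is actually written.
zfs create -o quota=1T tank/stash01
zfs create -o quota=1T tank/stash02

# Check how much is actually in use versus promised:
zfs get quota,used,available tank/stash01

# ZFS can export the dataset over NFS directly (option syntax
# varies between Solaris and other platforms):
zfs set sharenfs=on tank/stash01
```

The catch, of course, is that oversubscription only works until everyone tries to collect on the promise at once.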
Some key advantages we hope to achieve with Ceph:
- Replication of data at the Ceph layer instead of with RAID
- Tiering of disks/storage
- Different replication levels for different storage sets
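To make the last two points concrete, here is a rough sketch of per-pool replication and device-class tiering on a reasonably recent Ceph release. Pool names, rule names, and PG counts are hypothetical and would need tuning for a real cluster:

```shell
# A 3x-replicated pool for important data:
ceph osd pool create fast-stash 128 128
ceph osd pool set fast-stash size 3

# A cheaper 2x-replicated pool for bulk data:
ceph osd pool create bulk-stash 128 128
ceph osd pool set bulk-stash size 2

# Tiering: a CRUSH rule that places data only on SSD-class OSDs,
# then assign it to the fast pool.
ceph osd crush rule create-replicated fast-rule default host ssd
ceph osd pool set fast-stash crush_rule fast-rule
```

This is exactly the kind of knob RAID plus a single filer never gave us: replication level and disk class become per-pool policy instead of a property of the whole array.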
The CephFS remote filesystem has capabilities roughly analogous to NFS: there is a single 'volume', it can be mounted simultaneously by multiple clients, and it respects Unix groups.
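From the client's point of view the analogy to NFS is close. A rough sketch of mounting CephFS, with a hypothetical monitor address and key path:

```shell
# Kernel CephFS client (compare: mount -t nfs filer:/stash /mnt/stash):
sudo mount -t ceph 10.0.0.1:6789:/ /mnt/stash \
    -o name=admin,secretfile=/etc/ceph/admin.secret

# Or the userspace FUSE client:
sudo ceph-fuse -m 10.0.0.1:6789 /mnt/stash
```

Unlike NFS, the client talks to the whole cluster rather than a single filer, which is where the SPOF improvement comes from.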
In the follow-up posts to this one we will build out a test Ceph cluster, create filesystems on it, mount them, and generally attempt to reach feature parity with an NFS system.