Set up a distributed filesystem
The cluster has three nodes configured specifically for storage: MDS, OSS-0 and OSS-1. Their names correspond to the Metadata Server and Object Storage Server roles that Lustre requires.
According to the documentation, each OSS node has 4 disks of 2 TB, for a total of 16 TB of storage across both nodes, none of which is currently in use. The current setup relies on a single 1 TB disk in the login node, exported via NFS to the compute nodes, and that disk is almost full. The storage is also served over the Ethernet port (1 Gbit/s); using the OmniPath network would likely be a better option.
Lustre and Ceph seem to be appropriate candidates. However, Lustre seems to be incompatible with the latest kernel version.
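As a rough sanity check of the numbers above, here is a minimal back-of-the-envelope sketch. The node, disk and Ethernet figures come from this issue. The replica count of 2 (one copy per OSS node) and the 100 Gbit/s OmniPath rate are assumptions for illustration only, not measured or confirmed values for this cluster.

```python
# Back-of-the-envelope capacity and transfer-time estimate.
# From the issue: 2 OSS nodes, 4 disks each, 2 TB per disk, 1 Gbit/s Ethernet.
# Assumptions (not from the issue): 2 replicas, 100 Gbit/s OmniPath link.

OSS_NODES = 2
DISKS_PER_NODE = 4
DISK_TB = 2
REPLICAS = 2              # assumption: one copy per OSS node
ETHERNET_GBIT_S = 1
OMNIPATH_GBIT_S = 100     # assumption: nominal link rate


def raw_capacity_tb(nodes, disks_per_node, disk_tb):
    """Total raw disk space across all storage nodes, in TB."""
    return nodes * disks_per_node * disk_tb


def usable_capacity_tb(raw_tb, replicas):
    """Space left for data when every object is stored `replicas` times."""
    return raw_tb / replicas


def transfer_seconds(size_gb, link_gbit_s):
    """Ideal time to move `size_gb` gigabytes, ignoring protocol overhead."""
    return size_gb * 8 / link_gbit_s


raw = raw_capacity_tb(OSS_NODES, DISKS_PER_NODE, DISK_TB)
usable = usable_capacity_tb(raw, REPLICAS)
print(f"raw: {raw} TB, usable with {REPLICAS} replicas: {usable} TB")
print(f"100 GB over Ethernet: {transfer_seconds(100, ETHERNET_GBIT_S)} s")
print(f"100 GB over OmniPath: {transfer_seconds(100, OMNIPATH_GBIT_S)} s")
```

Even with replication halving the usable space, the OSS disks would offer several times the capacity of the current 1 TB NFS disk. The transfer times are ideal upper bounds on throughput; real numbers depend on the disks and the filesystem.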
- Contact Ramón Nou to erase the disks in the MDS, OSS1 and OSS2 nodes (currently used by their Lustre installation).
- Take control over mds01.
- Install NixOS on one of the disks.
- Test Ceph.
- Mount the Ceph FS on the other nodes.
Edited by Rodrigo Arias Mallo