6.3.1.1 Improving the performance of the rm -rf command
If the rm -rf command is issued from a single client node to a large directory tree populated with hundreds
of thousands of files, the command can sometimes take a long time (in the order of an hour) to complete
the operation. The primary reason for this is that each file is unlinked (using the unlink() operation)
individually and the transactions must be committed to disk at the server. Lustre directory trees are not sorted
by inode number, but files with adjacent inode numbers are typically adjacent on disk, so that successive
unlink operations cause excessive unnecessary disk-seeking at the server.
The speed of such operations can be increased by pre-sorting the directory entries by inode number. The
HP SFS software includes a library and a script that you can use to perform this pre-sorting. You can do this
in either of two ways:
• Edit your script to prefix existing rm -rf commands with the LD_PRELOAD library, as in the following
example:
LD_PRELOAD=/usr/opt/hpls/lib/fast_readdir.so /bin/rm -rf /mnt/lustre/mydirectories
• Change your script to replace invocations of the rm -rf command with the wrapper script supplied
with the HP SFS software, as shown in either of the following examples:
/bin/sfs_rm -rf /mnt/lustre/mydirectories
Or:
RM=/bin/sfs_rm
...
${RM} -rf /mnt/lustre/mydirectories
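
To illustrate the mechanism, the following is a minimal, hypothetical sketch in C of how an LD_PRELOAD
interposer such as fast_readdir.so might work; the actual library shipped with the HP SFS software may be
implemented differently. On the first readdir() call for a directory stream, the sketch drains the stream using
the real readdir(), sorts the copied entries by inode number, and then replays them in sorted order, so that
the unlink() calls issued by rm reach the server in inode order:

/* fast_readdir_sketch.c: a hypothetical illustration only; the actual
 * fast_readdir.so shipped with HP SFS may be implemented differently. */
#define _GNU_SOURCE
#include <dirent.h>
#include <dlfcn.h>
#include <stdlib.h>
#include <string.h>

struct cache {
    DIR *dir;             /* directory stream this cache belongs to */
    struct dirent *ents;  /* copies of all entries, sorted by d_ino */
    size_t n, next;       /* entry count and replay cursor */
    struct cache *link;   /* next stream's cache */
};

static struct cache *caches;                   /* one node per open stream */
static struct dirent *(*real_readdir)(DIR *);  /* libc's readdir() */

static int by_ino(const void *a, const void *b)
{
    ino_t ia = ((const struct dirent *)a)->d_ino;
    ino_t ib = ((const struct dirent *)b)->d_ino;
    return (ia > ib) - (ia < ib);
}

struct dirent *readdir(DIR *dir)
{
    struct cache *c;

    if (!real_readdir)
        real_readdir = (struct dirent *(*)(DIR *))dlsym(RTLD_NEXT, "readdir");

    for (c = caches; c; c = c->link)      /* find this stream's cache */
        if (c->dir == dir)
            break;

    if (!c) {                             /* first call: drain and sort */
        struct dirent *e;
        size_t cap = 256;

        c = calloc(1, sizeof(*c));
        c->dir = dir;
        c->ents = malloc(cap * sizeof(*c->ents));
        while ((e = real_readdir(dir)) != NULL) {
            if (c->n == cap) {
                cap *= 2;
                c->ents = realloc(c->ents, cap * sizeof(*c->ents));
            }
            memcpy(&c->ents[c->n++], e, sizeof(*e));
        }
        qsort(c->ents, c->n, sizeof(*c->ents), by_ino);
        c->link = caches;
        caches = c;
    }

    return c->next < c->n ? &c->ents[c->next++] : NULL;
}

A sketch like this could be built with gcc -shared -fPIC -o fast_readdir_sketch.so fast_readdir_sketch.c -ldl
and used in the same LD_PRELOAD fashion shown above. A production version would also need to interpose
readdir64() and release its cache when the stream is closed.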
In tests, using the library as described above reduced the execution time for removing large directories by
up to a factor of ten.
Although the library can be used with other Linux commands, testing showed no performance improvement
with commands such as ls or find. HP recommends that you use the library only for rm operations on large
directories.
6.3.2 Large sequential I/O operations
When large sequential I/O operations are being performed (that is, when large files that are striped across
multiple OST services are being read or written in their entirety), there are some general rules of thumb that
you can apply, and some Lustre tuning parameters that can be modified to improve overall performance.
These factors are described here.
I/O chunk size
In HP SFS Version 2.2, the maximum transfer unit (MTU) of the I/O subsystem is 4MB per operation. For
optimum performance, all I/O chunks must be at least this size. Basing the I/O chunk size on the following
formula ensures that a client can perform I/O operations in parallel to all available Object Storage Servers:
chunk_size = stripe_size * ost_count
where stripe_size is the default stripe size of the file system and ost_count is the number of OST
services that the file system is striped across.
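For example, with a hypothetical default stripe size of 1MB and a file system striped across 16 OST services,
the formula gives a chunk size of 1MB * 16 = 16MB per operation, which also satisfies the 4MB minimum
described above.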
Large sequential write operations
If you are writing large sequential files, you can achieve the best performance by ensuring that each file is
exclusively written by one process.
If all processes must write to the same file, the best performance is generally obtained by having each client
process write to distinct, non-overlapping sections of that file, as in the sketch that follows.
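
The following is a minimal sketch, not taken from the HP SFS software, of this non-overlapping pattern using
pwrite(). Each of nprocs cooperating client processes writes its own interleaved 4MB slices of one shared file;
the file name, the rank/nprocs arguments, and the iteration count are illustrative assumptions:

/* write_slices.c: hypothetical example; each process writes distinct,
 * non-overlapping 4MB sections of one shared file. */
#define _XOPEN_SOURCE 500
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s rank nprocs\n", argv[0]);
        return 1;
    }
    int rank   = atoi(argv[1]);          /* this process's index (0-based)  */
    int nprocs = atoi(argv[2]);          /* total number of writer processes */
    size_t chunk = 4 * 1024 * 1024;      /* 4MB chunks, matching the I/O MTU */
    off_t offset = (off_t)rank * chunk;  /* start of this process's first slice */

    char *buf = malloc(chunk);
    memset(buf, 'x', chunk);

    /* All processes open the same file; pwrite() takes an explicit offset
     * and needs no shared file pointer, so the writes cannot interfere. */
    int fd = open("/mnt/lustre/shared_file", O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return 1;

    /* Stride by nprocs * chunk so the slices interleave without overlap. */
    for (int i = 0; i < 8; i++) {
        pwrite(fd, buf, chunk, offset);
        offset += (off_t)nprocs * chunk;
    }

    close(fd);
    free(buf);
    return 0;
}

In a real application the rank and process count would normally come from the job launcher (for example,
an MPI rank), and the chunk size would be chosen using the formula given above.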