Perth, Western Australia - 6th to 10th January 2014
From the tiniest microSD card to the largest storage array, for all workloads, there is always a way to make sure that you don't get the performance that your storage system ought to deliver. This talk goes through some of the simple ways that you can squander bandwidth, IOPS, and throughput for tape, solid state, and spinning disk storage. These storage antipatterns have been well proven in the author's experience, and can hopefully save the fresh and optimistic from having to reinvent the coarsely polygonal wheel yet again. For those less adventurous, these same gotchas *could* instead show how, with a good grasp of how your storage works and the workloads that best suit it, you can avoid some costly pitfalls when specifying and building your storage systems.
Jason Ozolins worked at the Australian National University Supercomputer Facility (now the National Facility for the National Computational Infrastructure, nf.nci.org.au) from 2004 until 2013 as a systems administrator.
He has been fairly obsessed with computer architecture and storage technology fundamentals and developments since the early 1980s, and got to use that knowledge to address a range of storage performance problems during his time at the Facility, across (and sometimes between) large HPC and data-oriented systems. These systems used flash SSDs, capacity-oriented and high-IOPS disks, direct-attach JBODs and Fibre Channel intelligent arrays, large-scale robotic tape libraries, Linux and Solaris operating systems, LVM and MD volume managers, and Ext2/3/4, SAM-FS, Lustre and CXFS filesystems, across Ethernet and InfiniBand networks that ranged from 1 Gbit/sec to 40 Gbit/sec. He knows all too well that none of these are made of magic.
His last large storage project was addressing performance issues with the migration of 1.3 petabytes of tape-resident data from the legacy Sun SAM-FS Hierarchical Storage Management system to the Facility's new GNU/Linux-based SGI DMF HSM. The work done to mitigate those issues ranged from hacking SAM-FS utility binaries through to using tied Perl hashes with more than 40 million entries, held in tmpfs, to help schedule optimal tape recall and transfer of file sets that ranged from millions of 4K files to individual files in the hundreds of gigabytes. That work brought together a lot of common threads of performance analysis and optimization that he had seen across all the different storage systems at the Facility.