All-Flash NAS Project
I am currently in the midst of building an all-flash NAS and want to write down some of my thoughts – maybe they are useful for someone.
Motivation
So first thing: Why do I want to do this? I am currently running a very complicated storage setup for my vSphere hosts and I want to simplify that to get away from nursing the solution all the time.
Currently I use local NVMes in the hosts and provide them via OSNexus Quantastor appliances back to the same host. That's a chicken-and-egg problem when you need to reboot or something goes wrong.
For the new project I want to achieve the following:
- NFS or iSCSI datastore for my vSphere hosts with reasonable latency and throughput
- 10G throughput to all my 10G connected systems (vSphere hosts & workstation)
- Put all my production & lab data on the SSDs (~10 TB) – I want to have some headroom, so the target is 12 TB
- Low power consumption where possible
Build or buy?
That's probably one of the most-asked questions. I love building stuff myself, but on the other hand I love it when it just works – which my current setup does not.
I thought about buying the QNAP QuTS hero TS-h973AX – about 1000 EUR for a ZFS-based system with 4 GB of RAM (or ~1200 EUR for the 32 GB version). The system already provides 10G ethernet and has a fast enough processor to actually deliver that speed. You can also put in up to 9 disks – 2 of them can even be U.2 NVMes (which I happen to have, but didn't want to use because of their high power consumption).
I put down a list of components for building a NAS myself, which would cost me about 800 EUR based on an EPYC embedded processor.
In the end I decided to build it myself, but with hardware I already have. I have a server system I can use for this: a Xeon processor, 64 GB of memory, 10G networking, and plenty of drive bays (8x 3.5″, 8x 2.5″).
The downside is power consumption. The system draws about 70-80 W at idle with the SAS controller installed and all the backplanes and server components (BMC) powered. However, an additional 800-1200 EUR of investment buys a lot of kWh, and I still might not get the flexibility and performance I get from my full-fledged server.
Software?
Since I want an easy-to-use system, I dropped the option of a plain Linux or Windows box with everything built by hand. For me the best purpose-built NAS OS is TrueNAS. I also thought about Quantastor, but their Community Edition has a 40 TB limit – and I happened to hit that recently when I experimented with Chia. I don't want such a limit, so it has to be a free (ideally open source) solution.
TrueNAS gives you the power of ZFS with snapshots, data integrity checks, etc. – so a good choice, and I might be able to use RAID-Z to get good storage efficiency while still having some redundancy.
SAS, SATA or NVMe?
This is the decision matrix I went through – your conclusions might differ. Prices are as of early June 2021.
I dropped SAS from the list quickly when I saw per-TB costs of at least ~200 EUR. That's just too expensive, and I don't really need enterprise-class SSDs (petabytes of TBW…).
So the first option I considered: 8x 2 TB SATA drives (RAID-Z would give me 14 TB usable, RAID-Z2 with double redundancy 12 TB). I still have an MX500 2TB lying around, so I'd only have to buy 7 drives.
The cheapest per TB was the Samsung 870 QVO 2TB (75 EUR/TB), and the specs looked OK to me. The ~78 GB of SLC cache per drive adds up in an array of 8 drives, so you'd be able to write about 500 GB in a single burst into the SLC cache (distributed over the drives).
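To sanity-check those numbers, here is a quick back-of-the-envelope calculation – just a sketch using the advertised capacities and the ~78 GB per-drive SLC cache figure from above; ZFS metadata and slop overhead are ignored, so real usable capacity will be a bit lower.

```python
# Rough capacity / SLC-cache estimate for the 8x 2 TB SATA option.
# The 2 TB drive size and ~78 GB per-drive SLC cache come from the text above;
# ZFS metadata/slop overhead is ignored, so real numbers will be a bit lower.

drives = 8
drive_tb = 2          # advertised capacity per drive in TB
slc_cache_gb = 78     # dynamic SLC cache per QVO drive (roughly, when mostly empty)

raidz1_usable = (drives - 1) * drive_tb   # 1 parity drive -> 14 TB
raidz2_usable = (drives - 2) * drive_tb   # 2 parity drives -> 12 TB

# Aggregate SLC cache: parity also lands in the cache, so only the data share
# of each stripe counts toward user-visible burst writes.
burst_write_gb = drives * slc_cache_gb * (drives - 1) / drives   # ~546 GB

print(f"RAID-Z1 usable: {raidz1_usable} TB, RAID-Z2 usable: {raidz2_usable} TB")
print(f"Approx. burst write into SLC cache (RAID-Z1): {burst_write_gb:.0f} GB")
```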
Alternatively I thought about getting 4x 4TB M.2 NVMes. The cheapest was the Adata XPG Spectrix 4TB at 110 EUR/TB. Additionally I'd need a PCIe card to split my PCIe x16 slot into 4x x4 slots, which is about 250 EUR. In return I'd get higher speeds from the NVMes (higher throughput and IOPS).
I decided to go for SATA:
- Cheaper (per TB price and also I have the spare 2TB drive)
- More flexible (the M.2 option is more complex with the additional card, might be incompatible with my system, might not be movable to a QNAP NAS in the future, etc.)
- The maximum speed I can get out of the NAS over ethernet is ~1 GB/s – so faster disks are nice but not that important. The same goes for IOPS – these will also be limited by ethernet (see the quick calculation after this list).
- More disks (8 vs 4) give me more possibilities for building my zpool later (e.g. mirrors, RAID-Z2, 2x RAID-Z, …)
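A minimal sketch of the ethernet math behind that bullet point – the ~10% protocol overhead is an assumed value for TCP plus NFS/iSCSI framing, not a measurement:

```python
# Why ~1 GB/s is the practical ceiling of a single 10GbE link.
# The 10% protocol overhead is an assumption (Ethernet/IP/TCP plus NFS or
# iSCSI framing); the real overhead depends on MTU and workload.

link_gbit = 10                    # 10GbE line rate in Gbit/s
line_rate_gbs = link_gbit / 8     # 1.25 GB/s raw
overhead = 0.10                   # assumed protocol overhead

practical_gbs = line_rate_gbs * (1 - overhead)
print(f"Practical throughput over 10GbE: ~{practical_gbs:.2f} GB/s")

# Even a single fast NVMe drive can exceed this, so the extra SATA-vs-NVMe
# throughput would mostly go unused over the network.
```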
Terabytes Written (TBW)
That's an important number for estimating the lifetime of the SSDs. The 2 TB QVOs are rated at 720 TBW. But since I use the drives in a RAID, writes are split across the drives (writing a 1 MB file puts parts of it on several disks). I also don't write that much after initially putting all my stuff on the disks. So I figured 720 TBW should be enough. There are other drives around that only give you 300 TBW for 2 TB of capacity – that would definitely be too low for me.
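To put the 720 TBW rating into perspective, here is a rough endurance estimate – the daily write volume and the write-amplification factor are purely assumed example values for illustration, not measurements:

```python
# Rough endurance estimate for the 720 TBW rating of a 2 TB drive.
# daily_writes_gb and write_amplification are assumed example values.

tbw_rating_tb = 720          # rated endurance per drive in TB written
data_shares = 7              # RAID-Z1 over 8 drives spreads data over ~7 shares
daily_writes_gb = 50         # assumed pool-wide writes per day after the initial migration
write_amplification = 2.0    # assumed factor for parity + ZFS copy-on-write overhead

per_drive_gb_per_day = daily_writes_gb * write_amplification / data_shares
years_to_tbw = tbw_rating_tb * 1000 / per_drive_gb_per_day / 365

print(f"~{per_drive_gb_per_day:.1f} GB/day per drive "
      f"-> ~{years_to_tbw:.0f} years to reach {tbw_rating_tb} TBW")
```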
Summary & Lessons Learned
So I decided to build the NAS myself and go for the QVO SATA drives. I had the drives in my hands and did some benchmarking, and it initially looked quite good – until I moved some VMs to a vSphere datastore on these drives.
Independent of the zpool layout (mirror, RAID-Z, …) I saw very high latency (~5-20 ms), which I just didn't expect from an all-flash system. Throughput and IOPS of the drives looked very good, but latency under the VM workload was awful – so I do not recommend the QVO drives for VM storage!
I decided to return the drives and get 5x 4TB Samsung 870 EVO drives instead, as I got them cheap. This gives me 16 TB of usable capacity in a RAID-Z, and writes can be split across 4 data drives + parity without the need to pad data stripes.
The EVOs show much better speeds. The maximum latency I saw in my tests was 5-6 ms, and transfer tests via SMB are much more stable compared to the QVO drives.
In my tests I also did not see much difference between mirrors, RAID-Z1, RAID-Z2 or 2x RAID-Z in one zpool. That's why I chose RAID-Z1 – it gives me redundancy with good storage efficiency (with 5 drives, 80% of the raw capacity is usable).
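For reference, the quick usable-capacity comparison behind that choice – a simplified calculation using advertised TB and ignoring ZFS overhead; the mirror layout assumes 2-way mirrors, which only uses 4 of the 5 drives:

```python
# Usable capacity of 5x 4 TB drives under different layouts.
# Simplified: advertised TB only, no ZFS metadata/slop overhead.

drives, drive_tb = 5, 4
raw_tb = drives * drive_tb

layouts = {
    "RAID-Z1 (5-wide)": (drives - 1) * drive_tb,   # 16 TB
    "RAID-Z2 (5-wide)": (drives - 2) * drive_tb,   # 12 TB
    "2x 2-way mirror":  (4 // 2) * drive_tb,       # 8 TB, uses only 4 of the 5 drives
}

for name, usable in layouts.items():
    print(f"{name:18s} {usable:2d} TB usable ({usable / raw_tb:.0%} of raw)")
```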