TBW readings for VSAN disks (Update #2)
As I just removed two of my VSAN capacity disks from my cluster nodes, I took the chance to have a look at their TBW data.
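In case you want to pull the same numbers from your own disks: the TBW values come from the drives' SMART data. Below is a minimal Python sketch of how reading them could be scripted, assuming smartmontools with JSON output (`smartctl -j`, version 7.0 or newer) is installed; the device names are placeholders, and the attributes used are what Samsung SATA SSDs and typical NVMe drives report, so other models may differ.

```python
import json
import subprocess

def tb_written(device: str) -> float:
    """Return total TB written as reported by the drive's SMART data."""
    out = subprocess.run(["smartctl", "-j", "-a", device],
                         capture_output=True, text=True)
    data = json.loads(out.stdout)

    # NVMe drives report "Data Units Written"; one unit is 512,000 bytes.
    if "nvme_smart_health_information_log" in data:
        units = data["nvme_smart_health_information_log"]["data_units_written"]
        return units * 512_000 / 1e12

    # Samsung SATA SSDs expose attribute 241 (Total_LBAs_Written) in 512-byte sectors.
    for attr in data["ata_smart_attributes"]["table"]:
        if attr["id"] == 241:
            return attr["raw"]["value"] * 512 / 1e12

    raise ValueError(f"no write counter found for {device}")

for dev in ("/dev/sda", "/dev/nvme0n1"):  # adjust to your devices
    print(dev, round(tb_written(dev), 2), "TB written")
```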
Funnily enough, I removed the two newest disks (unintentionally). One of them I bought when I built the VSAN cluster (#2), and the other came as a replacement for a failed disk (#1). This means both disks have only ever been used in the VSAN environment and not in the RAID5 environment I wrote about before.
Disk | Power On Time | Total TB Written | Average writes/day |
---|---|---|---|
Samsung Evo 870 4TB #1 (…87W) | 79 days | 5,95 TB | 75 GB |
Samsung Evo 870 4TB #2 (…63H) | 113 days | 7,54 TB | 67 GB |
These roughly 70 GB/day are just half of the 150 GB/day I saw in my previous RAID5 setup with the same workload.
Summing this up again for 5 years of usage, I would be fine with 125 TBW of endurance on each of the capacity SSDs. With the Samsung EVO 870 having a 2.4 petabyte TBW rating for the 4TB model, the disks could theoretically last about 98 years. Well, I hope I’ll see that happening 🙂
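For reference, the back-of-the-envelope math behind these numbers: the average column is simply total TB written divided by power-on days, and the lifetime estimate is the rated TBW divided by the daily write rate. A quick sketch with the values from the table above (the `endurance_years` helper is just for illustration):

```python
# Average writes/day = total TB written / power-on days (in GB)
print(5.95 * 1000 / 79)    # disk #1: ~75 GB/day
print(7.54 * 1000 / 113)   # disk #2: ~67 GB/day

def endurance_years(tbw_rating_tb: float, writes_gb_per_day: float) -> float:
    """Years until the rated TBW is reached at a constant daily write rate."""
    return tbw_rating_tb * 1000 / writes_gb_per_day / 365

print(70 * 365 * 5 / 1000)        # ~128 TB written after 5 years at 70 GB/day
print(endurance_years(2400, 67))  # ~98 years for a 2,400 TBW EVO 870 at 67 GB/day
```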
While this is specific to my workload, it shows that TBW should not be overrated for home lab usage. I might not have the heaviest workload of all, but I do run a decent lab and even some (lab) backups on the VSAN datastore.
Update 22.07.2022 (+3 days)
I exchanged some hardware in my node #3 on 22.07.2022, so I took the chance to look at the written TB again after the VSAN migration. The values in brackets are the difference to the readings taken before I added the node to the VSAN cluster. I also added the boot device (which was active in another system before) and the cache disk (brand new, with all values at 0 before) to the table.
Disk | Power On Time | Total TB Written | Average writes/day |
---|---|---|---|
Samsung Evo 870 4TB #1 (…87W) | 82 days (+3) | 9,5 TB (+2,70) | 116 GB (+41) |
Samsung Evo 870 4TB #2 (…63H) | 116 days (+3) | 7,95 TB (+0,41) | 69 GB (+2) |
PNY XLR8 CS3030 M.2 250GB (ESXi Boot) | 93 days | 1,34 TB | 15 GB |
Gigabyte AORUS Gen 4 M.2 1TB (Cache) | 3 days | 7 TB | 2389 GB |
As you can see from the cache disk, there was a lot of data movement when I added the system to the VSAN cluster. I was a little shocked by how differently the two capacity disks changed: the additional writes differ by more than a factor of six.
I don’t put too much weight on these values though, as they only cover the first 3 days, when all the VSAN migration work happened. I’ll try to get some more TBW values from this system in the future to paint a more realistic picture.
Update 13.08.2022 (+26 days)
Apologies: when I checked this table, I noticed that I mixed up TB and TiB in the last update. I reconfirmed everything and converted all values to TB instead of TiB, as this is what HDD and SSD vendors use when talking about TBW and capacity.
Values in brackets show the difference to the very first measurement from 19th July 2022, which was 26 days ago. The NVMe drives were only added to the table after 3 days, on 22.07.2022, so their values compare to the readings from the last update.
Disk | Power On Time | Total TB Written | Average writes/day |
---|---|---|---|
Samsung Evo 870 4TB #1 (…87W) | 105 days (+26) | 9,57 TB (+2,77) | 91 GB (+16) |
Samsung Evo 870 4TB #2 (…63H) | 139 days (+26) | 11,30 TB (+3,76) | 81 GB (+14) |
PNY XLR8 CS3030 M.2 250GB (ESXi Boot) | 116 days (+23) | 1,45 TB | 12,5 GB (-2,5) |
Gigabyte AORUS Gen 4 M.2 1TB (Cache) | 26 days (+23) | 22,4 TB | 862 GB (-1527) |
There are some nice takeaways from this last table. Firstly, compared to the second table with readings from 3 days after adding the VSAN node, the values leveled out and the average GB written per day dropped a lot for most of the disks. I think this will become even more visible with a longer runtime.
The strange behavior of the capacity disks, where only one of them was being written to, has also changed: both show additional multi-TB writes over the last month.
Lastly, the cache disk is really being put to use in this setup: it sees almost 1 drive write per day (DWPD).
With these updated values the TBW limit of the cache disk will be hit after 5,72 years, while the capacity disks will last for over 70 years. I am very happy with my choice of the AORUS NVMe with its 1.8 PB TBW rating; the warranty is 5 years or reaching the TBW, whichever comes first. Most other 1TB NVMe disks are in the range of 300 to 600 TBW, so in the worst case (300 TBW) they would hit the limit after barely a year.
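The same arithmetic applied to the cache disk, as a quick check of these numbers (862 GB/day is the measured average from the table above, 1.8 PB the AORUS rating, and 600/300 TBW the typical range of other 1 TB NVMe drives mentioned above):

```python
daily_writes_gb = 862   # measured average for the cache disk
capacity_gb = 1000

print(daily_writes_gb / capacity_gb)      # ~0.86 drive writes per day (DWPD)

for tbw_rating_tb in (1800, 600, 300):    # AORUS rating vs. typical 1 TB NVMe ratings
    years = tbw_rating_tb * 1000 / daily_writes_gb / 365
    print(f"{tbw_rating_tb} TBW -> {years:.2f} years")   # ~5.72 / ~1.91 / ~0.95
```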