Last TBW Readings for VSAN Disks
As described in another post, I migrated from vSphere with VSAN to a Ceph-based Proxmox cluster.
I had tracked the TBW readings of my SSDs before, and since then I took two more readings.
I added “Wear Leveling” as a new column. For the Samsung 4TB SSDs it’s called “Wear Leveling Count” and the Gigabyte AORUS Cache NVMe calls it “Percentage Used”.
The average writes/day are calculated from the whole power-on time and the total TB written.
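For reference, here is a minimal sketch of how these numbers can be derived from the raw SMART values. The attribute names, the 512-byte sector size for the SATA drives, and the example figures are assumptions based on how my drives report them (e.g. via `smartctl -A`); other vendors may use different attributes.

```python
#!/usr/bin/env python3
"""Rough sketch: derive TB written and average GB/day from SMART readings.

Assumptions (illustrative, not authoritative):
 - The SATA SSDs report attribute 241 "Total_LBAs_Written" in 512-byte
   sectors and attribute 9 "Power_On_Hours".
 - The NVMe drives report "Data Units Written" in units of 512,000 bytes.
The raw values can be read with `smartctl -A /dev/sdX` and plugged in manually.
"""

def sata_tb_written(total_lbas_written: int, sector_bytes: int = 512) -> float:
    """Convert the Total_LBAs_Written raw value to decimal terabytes."""
    return total_lbas_written * sector_bytes / 1000**4

def nvme_tb_written(data_units_written: int) -> float:
    """NVMe 'Data Units Written' counts units of 1000 * 512 bytes."""
    return data_units_written * 512_000 / 1000**4

def avg_gb_per_day(tb_written: float, power_on_hours: int) -> float:
    """Average writes per day over the whole power-on time."""
    days = power_on_hours / 24
    return tb_written * 1000 / days

# Example with round numbers similar to the first table:
# 20.34 TB written over 339 power-on days -> ~60 GB/day
print(f"{avg_gb_per_day(20.34, 339 * 24):.0f} GB/day")
```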
Values from 04.04.2023 (+234 days)
Disk | PowerOn Time | Total TB Written | Wear Leveling | Average writes/day |
---|---|---|---|---|
Samsung Evo 870 4TB #1 (…87W) | 339 days (+234) | 20,34 TB (+10,77) | 5 | 60 GB (-31) |
Samsung Evo 870 4TB #2 (…63H) | 373 days (+234) | 21,85 TB (+10,55) | 6 | 59 GB (-22) |
PNY XLR8 CS3030 M.2 250GB (ESXi Boot) | 350 days (+234) | 2,58 TB (+1,13) | 2% | 7 GB (-5,5) |
Gigabyte AORUS Gen 4 M.2 1TB (Cache) | 260 days (+234) | 470 TB (+447,6) | 16% | 1851 GB (+989) |
The values for the 4TB SSDs are very consistent, and the average daily writes went down a lot over the longer usage period. However, you can see the intense load on the cache SSD, which runs at almost 2 DWPD.
The EVO 4TB SSDs have a TBW endurance of 2,4 Petabytes, so at 60 GB/day I could use them for over 109 years – way to go.
The AORUS has a TBW endurance of 1,8 Petabytes, and at 1851 GB/day it won't even last 3 years.
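The lifetime estimates are just the rated TBW endurance divided by the current daily write rate; here is a quick sketch of the arithmetic (endurance ratings as quoted above, DWPD relative to the nominal drive capacity):

```python
def years_until_tbw_exhausted(endurance_tb: float, gb_per_day: float) -> float:
    """Rated TBW endurance divided by the current daily write rate."""
    return endurance_tb * 1000 / gb_per_day / 365

def dwpd(gb_per_day: float, capacity_gb: float) -> float:
    """Drive writes per day: daily writes relative to the drive capacity."""
    return gb_per_day / capacity_gb

# Samsung 870 Evo 4TB: 2400 TB endurance at ~60 GB/day -> ~109.6 years
print(f"{years_until_tbw_exhausted(2400, 60):.1f} years")

# AORUS Gen4 1TB cache: 1800 TB endurance at ~1851 GB/day -> ~2.7 years, ~1.85 DWPD
print(f"{years_until_tbw_exhausted(1800, 1851):.1f} years, "
      f"{dwpd(1851, 1000):.2f} DWPD")
```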
Values from 15.11.2023 (+225 days)
This was the last day for this VSAN node, so it is the final reading under the VSAN cluster load profile, taken another 225 days after the previous one.
Disk | PowerOn Time | Total TB Written | Wear Leveling | Average writes/day |
---|---|---|---|---|
Samsung Evo 870 4TB #1 (…87W) | 564 days (+225) | 29,56 TB (+9,22) | 8 (+3) | 52 GB (-8) |
Samsung Evo 870 4TB #2 (…63H) | 598 days (+225) | 31,02 TB (+9,17) | 9 (+3) | 52 GB (-7) |
PNY XLR8 CS3030 M.2 250GB (ESXi Boot) | 574 days (+224) | 6,77 TB (+4,19) | 5% (+3) | 12 GB (+5) |
Gigabyte AORUS Gen 4 M.2 1TB (Cache) | 485 days (+225) | 684 TB (+214) | 25% (+9) | 1410 GB (-441) |
Again, the values for both capacity SSDs are very consistent. There were fewer writes overall, so the average went down quite a bit; also, no big migrations, rebuilds, etc. were required.
The average writes on the cache NVMe decreased by 441 GB/day, which brings the projected lifetime of the drive up to about 3,5 years.
Final thoughts
In these blog posts I’ve only written about the disks in a single node, but I was tracking the numbers for all three nodes. While the other capacity disks had previously been used in another configuration, these two were added specifically for the VSAN setup, so they show the TBW of my load profile alone.
For the cache disks I was also tracking the newest one rather than the older ones. The older ones show about 500 GB/day less usage, so the load is not consistent across the nodes.
I hope this series of articles helped to shed some light on the TBW requirements of VSAN. I’ll also keep an eye on the TBW values of my Proxmox Ceph cluster in the future.