datahoarder
Who are we?
We are digital librarians. Among us are represented the various reasons to keep data -- legal requirements, competitive requirements, uncertainty of permanence of cloud services, distaste for transmitting your data externally (e.g. government or corporate espionage), cultural and familial archivists, internet collapse preppers, and people who do it themselves so they're sure it's done right. Everyone has their reasons for curating the data they have decided to keep (either forever or For A Damn Long Time). Along the way we have sought out like-minded individuals to exchange strategies, war stories, and cautionary tales of failures.
We are one. We are legion. And we're trying really hard not to forget.
-- 5-4-3-2-1-bang from this thread
Thank you! By the way, I've heard that ZFS has some issues with growing a RAID array. Is that true?
Yes, very much so. You'll need to figure out your strategy for array growth before you start building, because ZFS is very inflexible when growing. I normally just use mirrors, because you can add two disks at a time with no hassles or gotchas. If you use a RAIDZ variant, you will basically have to destroy and rebuild the pool/vdev if you want to grow it (or buy a whole bunch of new disks to start a separate RAIDZ array). The only other option is to replace every single disk in the RAIDZ with a larger-capacity one, which also works. RAIDZ expansion as a feature has been promised for many years (the code is even already written!), but so far it has not been mainlined. The current plan is to expect it sometime in the next year or so, but that has also been the plan for the past 3-5 years. Feel free to count on this feature landing by the time you want to grow, but it's not something I would rely on until it's actually there.
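To make those growth options concrete, here's a rough sketch of the commands involved. Pool and device names (`tank`, `/dev/sdX`) are placeholders, and exact behavior varies by ZFS version, so check `man zpool` on your system before running anything:

```shell
# Growing a pool of mirrors: just add another two-disk mirror vdev.
zpool add tank mirror /dev/sde /dev/sdf

# Growing a RAIDZ vdev the only supported way today: replace every
# disk with a larger one, resilvering after each swap.
zpool set autoexpand=on tank
zpool replace tank /dev/sda /dev/sdg   # repeat once per disk
zpool status tank                      # wait for each resilver to finish

# Once every disk in the vdev has been replaced, the extra capacity
# becomes available (because autoexpand=on was set).
zpool list tank
```

Note that the mirror route costs 50% of raw capacity, but it's the only layout where growing two disks at a time is painless.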
If you want to be able to grow, check out mergerfs and SnapRAID. If you're planning to use a Pi and USB drives, they're probably more what you want than ZFS and RAID arrays. It's what I'm using, and I've been really happy with it.
Thank you! Gonna check it out.
I've been using Linux for a long time, and I have a background in this kind of stuff, but it's not my career and I don't keep as current as I would if it were, so I'm going to give my point of view on this.
A ZFS array is probably the legit way to go, but there's a huge caveat. If you're not working with this technology all the time, it's really not more robust or reliable for you. If you have a failure several years from now, you don't want to rely on having set it up appropriately years ago, and you don't want to have to relearn it all just to recover your data.
Mergerfs is basically just files on a bunch of disks. Each disk has the same directory structure, each file lives in one of those directories on a single disk, and your mergerfs volume shows you all the files on all the disks merged into one view. There are finer points of administration, but the bottom line is you don't need to know a lot, or interact with mergerfs at all, to move those files somewhere else. Just copy from each disk to a new drive and you have it all.
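As an illustration of how little magic there is, a typical mergerfs setup is a single fstab line pooling per-disk mounts. The paths and options below are one common configuration, not the only one; see the mergerfs docs for the full list of create policies:

```shell
# /etc/fstab -- pool /mnt/disk1..3 into a single view at /mnt/storage.
# category.create=epmfs: new files land on the disk that already has the
# parent path and the most free space, keeping whole directories together.
/mnt/disk1:/mnt/disk2:/mnt/disk3  /mnt/storage  fuse.mergerfs  defaults,allow_other,category.create=epmfs,moveonenospc=true,minfreespace=20G  0 0
```

If you ever abandon mergerfs, each `/mnt/diskN` is still a plain filesystem you can read anywhere.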
SnapRAID is basically just a parity snapshot of your disks. You can use it to recover your data if a drive fails. The commands are pretty simple, and relearning them won't be too hard several years down the road.
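For reference, a minimal SnapRAID setup is a short config file plus a couple of commands you run by hand or from cron. Disk labels and paths here are placeholders; consult the SnapRAID manual for your version:

```shell
# /etc/snapraid.conf (sketch)
#   parity  /mnt/parity/snapraid.parity
#   content /var/snapraid/snapraid.content
#   content /mnt/disk1/snapraid.content
#   data d1 /mnt/disk1/
#   data d2 /mnt/disk2/

snapraid sync        # update parity to match the current files
snapraid scrub       # periodically verify data against parity
snapraid -d d1 fix   # after replacing a failed disk, rebuild its files
```

Because parity is only as fresh as your last `sync`, files changed since then aren't protected, which is why SnapRAID suits mostly-static media collections.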
The best way isn't always the best if you know you're not going to keep current with the technology.