You know, ZFS, ButterFS (btrfs…its actually “better” right?), and I’m sure more.

I think I have ext4 on my home computer I installed ubuntu on 5 years ago. How does the choice of file system play a role? Is that old hat now? Surely something like ext4 has its place.

I see a lot of talk around filesystems but Ive never found a great resource that distiguishes them at a level that assumes I dont know much. Can anyone give some insight on how file systems work and why these new filesystems, that appear to be highlights and selling points in most distros, are better than older ones?

Edit: and since we are talking about filesystems, it might be nice to describe or mention how concepts like RAID or LUKS are related.

  • d3Xt3rM
    link
    fedilink
    arrow-up
    6
    ·
    edit-2
    7 months ago

    Not OP, but yes, that’s pretty much how it works. (ZFS scrubs do not defrgment data however).

    Fragmentation isn’t really a problem for several reasons.

    • Some (most?) COW filesystems have mechanisms to mitigate fragmentation. ZFS, for instance, uses a special allocation strategy to minimize fragmentation and can reallocate data during certain operations like resilvering or rebalancing.

    • ZFS doesn’t even have a traditional defrag command. Because of its design and the way it handles file storage, a typical defrag process is not applicable or even necessary in the same way it is with other traditional filesystems

    • Btrfs too handles chunk allocation effeciently and generally doesn’t require defragmentation, and although it does have a defrag command, it’s almost never used by anyone, unless you have a special reason to (eg: maybe you have a program that is reading raw sectors of a file, and needs the data to be contiguous).

    • Fragmentation is only really an issue for spinning disks, however, that is no longer a concern for most spinning disk users because:

      • Most home users who still have spinning disks use it for archival/long term storage/media that rarely changes (eg: photos, movies, other infrequently accessed data), so fragmentation rarely occurs here and even if it does, it’s not a concern.
      • Power users typically have a DAS or NAS setup where spinning disks are in a RAID config with striping, so the spread of data across multiple sectors actually has an advantage for averaging out read times (so no file is completely stuck in the slow regions of a disk), but also, any performance loss is also generally negated because a single file can typically be read from two or more drives simultaneously, depending on the redundancy config.
    • Enterprise users also almost always use a RAID (or similar) setup, so the same as above applies. They also use filesystems like ZFS which employs heavy caching mechanisms, typically backed by SSDs/NVMes, so again, fragmentation isn’t really an issue.

    • teawrecks@sopuli.xyz
      link
      fedilink
      arrow-up
      3
      ·
      7 months ago

      Cool, good to know. I’d be interested to learn how they mitigate fragmentation, though. It’s not clear to me how COW could mitigate the copy cost without fragmentation, but I’m certain people smarter than me have been thinking about the problem for my whole life. I know spinning disks have their own set of limitations, but even SSDs perform better on sequential reads over random reads, so it seems like the preference would still be to not split a file up too much.