SergDL reports that mounts seem broken for the 7.0 kernel:
https://github.com/linux-apfs/linux-apfs-rw/issues/119
I never encountered this problem during testing, probably because my
version of the mount tool is older. It seems that mount options are now
set one by one using the fsconfig() syscall, and that's not possible if
we parse them all at once with ->parse_monolithic().
Implement ->parse_param() instead, at least for the latest kernel. I do
wonder if this issue won't pop up with older kernels though, since the
fsconfig() syscall has been around since 5.2. I guess I'll wait to see
if somebody complains about it.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
Instead of setting the default mount options right before parsing the
actual options, set them on the superblock as soon as it gets allocated.
This will allow the parsing code to be called more than once by the new
mount api.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
The new mount api forces me to make one function call for each option to
parse. This change means that the default options will need to get set
elsewhere, otherwise each new call would reset to default.
As a first step, move that code from parse_options() into a new
apfs_set_default_opts().
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
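To illustrate the reasoning in the message above, here is a minimal userspace sketch of why the defaults have to live outside the per-option parser. It is not the driver's code: the struct, the flag values, the option names and every demo_* function are made up for the example.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical option flags, for illustration only */
#define DEMO_OPT_READWRITE 0x1u
#define DEMO_OPT_CKNODES   0x2u

/* Hypothetical stand-in for the per-superblock info */
struct demo_sb_info {
	unsigned int mount_opt;
};

/* Called exactly once, as soon as the info struct is allocated */
static void demo_set_default_opts(struct demo_sb_info *sbi)
{
	sbi->mount_opt = DEMO_OPT_CKNODES;
}

/*
 * Called once per option, like a ->parse_param() hook would be. It only
 * touches the option it was given; resetting to defaults here would undo
 * the work of every earlier call.
 */
static bool demo_parse_param(struct demo_sb_info *sbi, const char *key)
{
	if (strcmp(key, "readwrite") == 0)
		sbi->mount_opt |= DEMO_OPT_READWRITE;
	else if (strcmp(key, "nocknodes") == 0)
		sbi->mount_opt &= ~DEMO_OPT_CKNODES;
	else
		return false; /* unknown option */
	return true;
}
```

If demo_parse_param() called demo_set_default_opts() itself, the second option would silently discard the first one, which is the problem described above.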
Reorganize the version check in the out_free_sbi label to better match
the simpler one at out_unmap_super.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
Following the previous patch, which made parse_options() always run
right after preparse_options(), merge both functions into one.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
Since the mount api conversion, kernel version 7.0 and above parse all
the options at once before moving on to the rest of the mount process.
There is no reason that other kernel versions can't do the same; in fact
that was originally my intention, as explained in the commit message for
af2ee526c0 ("Avoid double parsing of mount options"). So always call
parse_options() right after preparse_options(). The plan is to unify
these functions next.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
Inside parse_options(), some kernel versions retrieve the superblock
info from the vfs superblock, while others retrieve it from the
filesystem context. Instead just pass the info as an argument and
simplify the version checks.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
I seem to have missed a few whitespace issues during my review of the
recent mount api conversion. Fix them now.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
At this point, the nx_flags variable only exists to get copied into
s_mount_opt of the superblock. Just work with s_mount_opt directly and
get rid of the goto label.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
I'm trying to unify mount option processing across kernel versions as
much as possible. Instead of calling parse_options_set_flags() directly
from parse_options() whenever possible, always set the new s_mount_opt
and use it to call parse_options_set_flags() later.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
Move s_uid/s_gid initialization inside parse_options() for all kernel
versions, getting rid of a version check for kernels 7.0 and above.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
As all the in-tree modules have been ported to the new fs mount API,
Kernel 7 has removed the old API.
This patch adds support for the new set of APIs, while
keeping previous kernel support intact.
The code is also in review on the Canonical Launchpad bug:
https://launchpad.net/bugs/2142837
Signed-off-by: Alessio Faina <alessio.faina@canonical.com>
Once more, the build is still broken for the 6.17 release candidate.
This time the problem is that we are no longer allowed to set s_d_op
directly. Use the new helper instead.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
The statfs code was one of the first things I implemented (most likely
before I could even mount the filesystem in any meaningful sense), and I
never tested it much. It was bound to have bugs.
Now that we handle ENOSPC properly (or so I hope), I'm starting to try
my luck with the enospc group from xfstests. The first issue I've
noticed is that generic/275 complains that it "could not sufficiently
fill filesystem". The container is actually very much full, but statfs
reports used space incorrectly: it adds up the blocks allocated by each
volume, neglecting all the blocks that don't belong to any of them.
At the time I didn't know anything about the space manager, so I guess
this seemed like a good enough solution. Fix it now.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
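The difference between the two ways of counting used space described above can be sketched roughly as follows. This is only an illustration with made-up names and numbers, assuming the space manager can report a free block count for the whole container; it is not the actual statfs code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Old approach: add up the blocks allocated by each volume. Blocks that
 * belong to the container but to no volume are never counted.
 */
static uint64_t demo_used_by_volumes(const uint64_t *vol_alloc, size_t nvols)
{
	uint64_t used = 0;
	size_t i;

	for (i = 0; i < nvols; i++)
		used += vol_alloc[i];
	return used;
}

/*
 * New approach: anything the space manager does not report as free is
 * used, no matter who owns it.
 */
static uint64_t demo_used_by_spaceman(uint64_t total_blocks, uint64_t free_blocks)
{
	return total_blocks - free_blocks;
}
```

With a 1000-block container that has 100 free blocks and two volumes using 300 and 200 blocks, the first function reports 500 used blocks while the second reports 900, so only the second one would make a full container look full.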
If a function returns ENOSPC halfway through an operation, the
transaction will get aborted so that the filesystem isn't left in an
inconsistent state. This means that the mount will go read-only, and
that the user will lose any uncommitted changes.
So, it's better if we detect ENOSPC before starting an operation. I
tried this before with commit a52b73ed97 ("Check free space before
starting a transaction"), but the code was messy and it would have
required constant updates, which I neglected of course. The problem was
that I wanted to be as accurate as possible with my calculations of
required space, but I'm not sure there is any point.
Instead, require a large fixed amount of space (512 KiB) for most
operations, and a much smaller but still almost certainly sufficient
amount of space (80 KiB) for deletions. Hopefully this will be enough
for users to maneuver around ENOSPC situations. It's not perfect, but no
CoW filesystem can be truly perfect in this regard.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
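A rough sketch of the fixed reservation check described above, under the assumption that the check runs against a free block count before the operation starts. The 512 KiB and 80 KiB figures come from the message; the function and macro names are hypothetical and the real driver may do this differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reservation sizes in bytes, as given in the commit message */
#define DEMO_RESERVE_DEFAULT (512u * 1024u)
#define DEMO_RESERVE_DELETE  (80u * 1024u)

/*
 * Decide up front whether an operation may start. Deletions get a much
 * smaller requirement so that users can still free space when the
 * container is nearly full.
 */
static bool demo_has_room(uint64_t free_blocks, uint32_t block_size,
			  bool is_deletion)
{
	uint64_t needed = is_deletion ? DEMO_RESERVE_DELETE : DEMO_RESERVE_DEFAULT;
	/* Round up to whole blocks */
	uint64_t needed_blocks = (needed + block_size - 1) / block_size;

	return free_blocks >= needed_blocks;
}
```

With 4096-byte blocks this works out to 128 free blocks for most operations and 20 for deletions. A caller would fail with ENOSPC before opening the transaction instead of aborting it halfway.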
The build is still broken for redhat kernels:
https://github.com/linux-apfs/linux-apfs-rw/issues/89#issuecomment-2691955523
Fix the remaining issues. For the previous patch I was just looking at
the source, but this time I set up a rhel vm and tried a build and a
mount for each of the 9.x kernels, so hopefully it's over. I didn't know
you could get rhel for free as a developer, I guess it makes sense.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
The build is broken for redhat kernels:
https://github.com/linux-apfs/linux-apfs-rw/issues/89
This seems to have been the case for more than a year, but nobody had
reported it so far, so I guess the driver isn't getting much use over
there.
We had encountered a similar problem before in commit 1a0b9fb8af ("Fix
build for RHEL 9 kernels"), but back then I thought it was just a
one-off thing. I was wrong: redhat kernels often backport patches that
break internal apis. I don't know how other out-of-tree drivers handle
this, but for now I will have to rely on user reports.
So, go through all RHEL 9 kernels one by one, find out which api changes
were applied to each, and add the corresponding macro version checks to
the driver.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
If the user mixes up the main and tier 2 devices, the mount fails with
confusing errors. We can do better here: the fusion uuid has a bit to
tell which is the main device, so check that early and fail with a good
explanation. While we are at it, also add checks to make sure what we
are mounting is indeed a fusion drive.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
While testing the new "tier2" mount option, I realized that the driver
gives no feedback to the user when a mount option is invalid. I have my
doubts about logging an error here because, while the apfs driver is
inserted, the message may show up for unrelated mounts as well. I think
this sort of issue is unavoidable though, and XFS does log a warning
for invalid options, so do the same thing.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
So far I haven't been able to figure out fusion drives to an extent that
would allow me to implement writes. I'm tired of it, and I think the
effort is probably pointless because apfs fusion drives aren't all that
popular. I might reconsider if people request the feature, but for now
just add read-only support, which is almost trivial.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
Make the block device info field of the container superblock into a
pointer. This will make it easier to tell if we are working with a
fusion drive.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
For kernels before 6.5, blkdev_put() needs to be told the mode of the
block device. This information is stored in the vfs superblock, but I
can't access that from apfs_free_main_super() because that function may
be needed early on mount.
My somewhat hacky solution so far has been to guess the mode from the
setting of the APFS_READWRITE flag. This isn't accurate because the mode
is actually determined by SB_RDONLY, but it still seems to work - I
really haven't checked what happens to the mode inside blkdev_put().
For fusion drives, I will soon need to reuse apfs_blkdev_cleanup()
inside apfs_attach_nxi(). In this case the mode is available, so working
with the different definitions of it for each kernel version would make
a huge mess.
Since we now have apfs_blkdev_info for portability, just keep the mode in
there and let apfs_blkdev_cleanup() retrieve it.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
Put all information about the container's block device into a single
struct, and move all related code into wrapper functions that work with
that struct. Hopefully this will make it easier to add a second block
device for the tier 2 of a fusion drive.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
A user has encountered a volume with the dataless snapshots flag:
https://github.com/linux-apfs/linux-apfs-rw/issues/86
I'm not really sure what "dataless snapshots" are, and so far I have
failed to trigger them under macos. I can set the
SNAP_META_PENDING_DATALESS flag in a snapshot myself, and macos will
call it "dataless", but their fsck doesn't complain no matter what I do
with the other fields, so I can't tell what is expected.
I can't imagine any risk in mounting such a volume read-only, so allow
that much from now on. Keep writes banned just in case, and avoid
mounting the dataless snapshots themselves, since I have no idea what
that entails.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
While testing eda21f33e9 ("Check the superblock magic before the block
size"), I noticed that early log messages such as those don't include
device information, just the filesystem type. The problem is that the
s_id field of the vfs superblock is only getting set after the call to
apfs_read_main_super(); there is no real reason for this, so invert it.
While paying attention to this field for the first time, I realized that
I'm making a huge mess of it. Most filesystems use a "%pg" format
specifier: this prints the name of the block_device pointer passed as an
argument, so the logs are very clear. Meanwhile I was just printing the
dev_t... and putting a "g" afterwards? I guess I somehow got confused
while trying to copy the others. Do this right.
That said, an apfs superblock is not uniquely identified by its device:
we have volumes and snapshots as well. The comments in the super_block
structure claim that s_id is just an "Informational name", so I guess I
can set it to whatever I find useful. Provide the volume number and
snapshot name (if any) as well.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
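As an illustration of the kind of identifier described above, here is a small userspace sketch that builds a name from the device name, the volume number and an optional snapshot name. The "dev:vol:snap" layout and the function name are assumptions made for the example, not necessarily the format the driver uses, and plain snprintf() stands in for the kernel's "%pg" specifier.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build an informational name for a mount. The snapshot name is optional:
 * pass NULL when mounting the volume itself.
 */
static void demo_format_id(char *buf, size_t len, const char *dev,
			   unsigned int vol, const char *snap)
{
	if (snap)
		snprintf(buf, len, "%s:%u:%s", dev, vol, snap);
	else
		snprintf(buf, len, "%s:%u", dev, vol);
}
```

snprintf() truncates to the buffer size, which matters for a fixed-size field like s_id.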