The build is broken for the 6.19 release candidate:
https://github.com/linux-apfs/linux-apfs-rw/issues/109
The problem is that the inode i_state field can no longer be accessed
directly. Switch to the new helpers, using a wrapper to keep the code
consistent for all kernel versions.
Note that I always use the *_raw version of the helpers, to avoid a lockdep
check that will not pass. I can't tell for sure whether that failing check
points to a real bug.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
The build is broken for the 6.18 release candidate:
https://github.com/linux-apfs/linux-apfs-rw/issues/107
The problem is that ->d_name has become const, and that qualifier gets
dropped when we pass it around in the driver. Just treat it as const for
all kernel versions.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
The ->mkdir() method now needs to return a dentry struct instead of an
int, but this is apparently meant for clustered filesystems that may need
to instantiate a different dentry, so just return NULL on success, like
most filesystems do.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
If a function fails with ENOSPC halfway through an operation, the
transaction will get aborted so that the filesystem isn't left in an
inconsistent state. This means that the mount will go read-only, and
that the user will lose any uncommitted changes.
So, it's better if we detect ENOSPC before starting an operation. I
tried this before with commit a52b73ed97 ("Check free space before
starting a transaction"), but the code was messy and it would have
required constant updates, which I neglected of course. The problem was
that I wanted to be as accurate as possible with my calculations of
required space, but I'm not sure there is any point.
Instead, require a large fixed amount of space (512 KiB) for most
operations, and a much smaller but still almost certainly sufficient
amount of space (80 KiB) for deletions. Hopefully this will be enough
for users to maneuver around ENOSPC situations. It's not perfect, but no
CoW filesystem can be truly perfect in this regard.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
Gilgamesh reports via email that the following iozone benchmark is
failing:
./iozone -a -i 0 -r 1M -n 1g -g 1g -f /mnt/apfs/test
I've tried this myself on a small (3 GiB) filesystem, and it is the
commit after the unlink that fails, complaining about the main free
queue being full:
APFS (800010g): free queue has too many nodes (348 > 347) (apfs_free_queue_insert_nocache:849)
The problem is that random writes to a big file create a lot of extents,
and when the file is deleted they all get dumped to the main free queue
as separate records. The queue gets full and the commit fails.
Stop forcing file deletion to happen in a single transaction. Instead,
queue a work item whenever an orphan inode gets evicted, and leave the
cleanup for another time. Also allow the cleanup to execute only
partially (deleting the first extents, like the official driver does)
and to reschedule itself (again via inode eviction).
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
The previous patch introduced a regression in generic/013, seen when the
test is allowed to run for 20 minutes or so. The result is a corrupted
filesystem where one inode appears to have too many dentries, and
another inode has too few to balance.
The problem is that query refreshes just rerun the query once and expect
the correct result, but dentry queries for an existing record actually
need to be called repeatedly to deal with hash collisions. For
normalization-sensitive filesystems (which I use for most of my tests)
"collisions" are constant because there is no hash. So if a split
happens during a dentry deletion, chances are very high that the refresh
will end up pointing to the wrong record.
Rather than add awkward code to apfs_query_refresh() to deal with this,
transform the multiple query from apfs_dentry_lookup() into a regular
exact query before returning, setting the key to the unnormalized name.
The refresh code will have no trouble working with that.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
I've been working on making some very needed changes to how node splits
work. The plan is to always rerun queries after a split, to keep the
query chain valid. One problem I've encountered while implementing this
is that the search key is not really preserved by apfs_update_inode().
We have a pointer to it, but it was allocated on the stack of some other
function, so by then the data is just noise.
I've always thought it was silly to allocate both a query and a key
before a lookup. Instead, turn the search key into a field within the
query structure, and copy it to the children as the query gets run. This
will ensure that the key remains available for future reruns.
One thing to note is that, while this simplifies most callers a little
bit, some of them still allocate a key on the stack and later copy it to
the query. I didn't want to make any big changes and risk introducing
bugs at this point, since I'm busy with other stuff.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
The build appears to be broken for some configurations on the Linux 6.5
release candidate:
https://github.com/linux-apfs/linux-apfs-rw/issues/49
Some sort of compile-time overflow check is going on for strcpy(), and
several of my calls are triggering false positives. Just be safe and
switch them all to strscpy(). The qstr handling in particular looked a
bit too weird anyway.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
The driver is much closer to being usable, so I might start getting
subtler bug reports soon. To make them easier to handle, put error
messages all over the place. I should have done this from the beginning,
but I guess I didn't fully understand the need back then.
From now on, my general policy will be to use apfs_warn() for user errors
or unsupported features; apfs_err() for things that are probably corruption
or io errors; and apfs_alert() for things that are most likely bugs.
These last two should be rare, so the same error/alert will be thrown by
several layers in the call stack, to provide as much information as
possible. Be careful not to flood the console in normal situations.
Also, make messages with a log level lower than warning output their
function name and line number, which I think will help debugging more
than the actual messages.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
Make a few changes to orphan link behaviour to better match the official
implementation. It's not clear to me that this is mandatory, but I want
my fsck to work properly with official images: I've already modified it
to that end, so these changes need to be applied here as well to avoid
potential reports of corruption.
Anyway, there are four changes to orphan creation:
- Report the inode's link count as 0, that is, don't count the orphan
link itself.
- Preserve the original primary name for the inode.
- Preserve the inode's parent id.
- Update the volume's file count immediately, instead of waiting until
the orphan gets purged.
In essence, stop treating orphaned inodes as real files.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
The official apfs implementation uses the "0x%llx-dead" template to name
its orphan files. It's not really mandatory that all implementations
follow the same scheme, but try to do it anyway.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
I'm trying to build with "-W" for the first time, and it reports that
apfs_free_query() doesn't use the 'sb' parameter. Get rid of it.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
Add a 'data' field to in-memory objects and point it to the buffer
head's 'b_data'; from now on, always access the data through this field
instead of the bh. To make sure I remembered to clean up all instances,
rename the buffer head in the object from 'bh' to 'o_bh'.
The point is to allow nodes to work with a data block that exists only
in memory, and the next patch will take advantage of this to improve
apfs_node_split(). Since objects that are not nodes will not need this
feature, only adjust the cleanup done in apfs_node_release() for now,
though maybe object cleanups should all be done in one place.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
The 'hashed' argument is never used by apfs_init_drec_key(), since we
can figure that out from the superblock. Get rid of it.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
Profiling has shown that the module spends too much time inside
apfs_lookup(). I had never really checked if the dentry cache was
working, so naturally that turned out to be the problem: I had always
assumed that the strings inside qstr structures were null-terminated,
but they may actually be a single component in a pathname, terminated by
a forward slash. The result was that we always searched the cache for
full pathnames, which were naturally never found.
To fix this, always pass the filename length from the qstr to the unicode
handlers, and make them work without assuming null termination.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
It seems that d_instantiate_new() was introduced in 4.17 to fix a
problem with lockdep. I won't use lockdep myself in such an old kernel,
and I don't expect users to care. So, to get the module to build with
those kernels, just revert the code to unlock_new_inode/d_instantiate,
as was done before.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
I'm running generic/257 of xfstests, and it reports that getdents64()
returns repeated values for d_off. It's been years since I implemented
this stuff, but it would seem that I thought only the final value of the
context position would matter, and I neglected to update it after each
call to dir_emit(). Fix this.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
Implement the methods needed to support the sync family of syscalls, such
as sync() and fsync(). Do this by just committing the whole transaction,
which is not very performant but should be technically valid. I would
like to come back to this eventually and get it right, but it will
probably be tricky and I would need to understand how btrfs and others
handle it.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>
Don't write inode metadata (including the cached extent) to the
memory-mapped blocks until the transaction is about to be committed.
This should allow several consecutive metadata changes to be done in a
single operation. Most importantly, a new extent for appended content
will only be written once, in the end. Right now it's rewritten after
each block.
The performance improvement from this is not as big as I had hoped: it's
under 20%. I'm also sure this isn't the right way to approach this,
since I could be using the dirty inode lists from the vfs. So I'll need
to revise this patch at some point, but I'm leaving it for now.
Signed-off-by: Ernesto A. Fernández <ernesto@corellium.com>