Merge tag 'xfs-6.9-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs updates from Chandan Babu:

 - Online repair updates:
    - More ondisk structures being repaired:
        - Inode's mode field by trying to obtain file type value from
          the a directory entry
        - Quota counters
        - Link counts of inodes
        - FS summary counters
        - Support for in-memory btrees has been added to support repair
          of rmap btrees
    - Misc changes:
        - Report corruption of metadata to the health tracking subsystem
        - Enable indirect health reporting when resources are scarce
        - Reduce memory usage while repairing refcount btree
        - Extend "Bmap update" intent item to support atomic extent
          swapping on the realtime device
        - Extend "Bmap update" intent item to support extended attribute
          fork and unwritten extents
    - Code cleanups:
        - Bmap log intent
        - Btree block pointer checking
        - Btree readahead
        - Buffer target
        - Symbolic link code

 - Remove mrlock wrapper around the rwsem

 - Convert all the GFP_NOFS flag usages to use the scoped
   memalloc_nofs_save() API instead of direct calls with the GFP_NOFS

 - Refactor and simplify xfile abstraction. Lower level APIs in shmem.c
   are required to be exported in order to achieve this

 - Skip checking alignment constraints for inode chunk allocations when
   block size is larger than inode chunk size

 - Do not submit delwri buffers collected during log recovery when an
   error has been encountered

 - Fix SEEK_HOLE/DATA for file regions which have active COW extents

 - Fix lock order inversion when executing error handling path during
   shrinking a filesystem

 - Remove duplicate ifdefs

* tag 'xfs-6.9-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (183 commits)
  xfs: shrink failure needs to hold AGI buffer
  mm/shmem.c: Use new form of *@param in kernel-doc
  kernel-doc: Add unary operator * to $type_param_ref
  xfs: use kvfree() in xlog_cil_free_logvec()
  xfs: xfs_btree_bload_prep_block() should use __GFP_NOFAIL
  xfs: fix scrub stats file permissions
  xfs: fix log recovery erroring out on refcount recovery failure
  xfs: move symlink target write function to libxfs
  xfs: move remote symlink target read function to libxfs
  xfs: move xfs_symlink_remote.c declarations to xfs_symlink_remote.h
  xfs: xfs_bmap_finish_one should map unwritten extents properly
  xfs: support deferred bmap updates on the attr fork
  xfs: support recovering bmap intent items targetting realtime extents
  xfs: add a realtime flag to the bmap update log redo items
  xfs: add a xattr_entry helper
  xfs: fix xfs_bunmapi to allow unmapping of partial rt extents
  xfs: move xfs_bmap_defer_add to xfs_bmap_item.c
  xfs: reuse xfs_bmap_update_cancel_item
  xfs: add a bi_entry helper
  xfs: remove xfs_trans_set_bmap_flags
  ...
This commit is contained in:
Linus Torvalds
2024-03-13 13:52:24 -07:00
186 changed files with 13274 additions and 3585 deletions

View File

@@ -1915,19 +1915,13 @@ four of those five higher level data structures.
The fifth use case is discussed in the :ref:`realtime summary <rtsummary>` case
study.
The most general storage interface supported by the xfile enables the reading
and writing of arbitrary quantities of data at arbitrary offsets in the xfile.
This capability is provided by ``xfile_pread`` and ``xfile_pwrite`` functions,
which behave similarly to their userspace counterparts.
XFS is very record-based, which suggests that the ability to load and store
complete records is important.
To support these cases, a pair of ``xfile_obj_load`` and ``xfile_obj_store``
functions are provided to read and persist objects into an xfile.
They are internally the same as pread and pwrite, except that they treat any
error as an out of memory error.
For online repair, squashing error conditions in this manner is an acceptable
behavior because the only reaction is to abort the operation back to userspace.
All five xfile usecases can be serviced by these four functions.
To support these cases, a pair of ``xfile_load`` and ``xfile_store``
functions are provided to read and persist objects into an xfile that treat any
error as an out of memory error. For online repair, squashing error conditions
in this manner is an acceptable behavior because the only reaction is to abort
the operation back to userspace.
However, no discussion of file access idioms is complete without answering the
question, "But what about mmap?"
@@ -1939,15 +1933,14 @@ tmpfs can only push a pagecache folio to the swap cache if the folio is neither
pinned nor locked, which means the xfile must not pin too many folios.
Short term direct access to xfile contents is done by locking the pagecache
folio and mapping it into kernel address space.
Programmatic access (e.g. pread and pwrite) uses this mechanism.
Folio locks are not supposed to be held for long periods of time, so long
term direct access to xfile contents is done by bumping the folio refcount,
folio and mapping it into kernel address space. Object load and store uses this
mechanism. Folio locks are not supposed to be held for long periods of time, so
long term direct access to xfile contents is done by bumping the folio refcount,
mapping it into kernel address space, and dropping the folio lock.
These long term users *must* be responsive to memory reclaim by hooking into
the shrinker infrastructure to know when to release folios.
The ``xfile_get_page`` and ``xfile_put_page`` functions are provided to
The ``xfile_get_folio`` and ``xfile_put_folio`` functions are provided to
retrieve the (locked) folio that backs part of an xfile and to release it.
The only code to use these folio lease functions are the xfarray
:ref:`sorting<xfarray_sort>` algorithms and the :ref:`in-memory
@@ -2277,13 +2270,12 @@ follows:
pointing to the xfile.
3. Pass the buffer cache target, buffer ops, and other information to
``xfbtree_create`` to write an initial tree header and root block to the
xfile.
``xfbtree_init`` to initialize the passed in ``struct xfbtree`` and write an
initial root block to the xfile.
Each btree type should define a wrapper that passes necessary arguments to
the creation function.
For example, rmap btrees define ``xfs_rmapbt_mem_create`` to take care of
all the necessary details for callers.
A ``struct xfbtree`` object will be returned.
4. Pass the xfbtree object to the btree cursor creation function for the
btree type.

View File

@@ -124,12 +124,24 @@ config XFS_DRAIN_INTENTS
bool
select JUMP_LABEL if HAVE_ARCH_JUMP_LABEL
config XFS_LIVE_HOOKS
bool
select JUMP_LABEL if HAVE_ARCH_JUMP_LABEL
config XFS_MEMORY_BUFS
bool
config XFS_BTREE_IN_MEM
bool
config XFS_ONLINE_SCRUB
bool "XFS online metadata check support"
default n
depends on XFS_FS
depends on TMPFS && SHMEM
select XFS_LIVE_HOOKS
select XFS_DRAIN_INTENTS
select XFS_MEMORY_BUFS
help
If you say Y here you will be able to check metadata on a
mounted XFS filesystem. This feature is intended to reduce
@@ -164,6 +176,7 @@ config XFS_ONLINE_REPAIR
bool "XFS online metadata repair support"
default n
depends on XFS_FS && XFS_ONLINE_SCRUB
select XFS_BTREE_IN_MEM
help
If you say Y here you will be able to repair metadata on a
mounted XFS filesystem. This feature is intended to reduce

View File

@@ -92,8 +92,7 @@ xfs-y += xfs_aops.o \
xfs_symlink.o \
xfs_sysfs.o \
xfs_trans.o \
xfs_xattr.o \
kmem.o
xfs_xattr.o
# low-level transaction/log code
xfs-y += xfs_log.o \
@@ -137,6 +136,9 @@ xfs-$(CONFIG_FS_DAX) += xfs_notify_failure.o
endif
xfs-$(CONFIG_XFS_DRAIN_INTENTS) += xfs_drain.o
xfs-$(CONFIG_XFS_LIVE_HOOKS) += xfs_hooks.o
xfs-$(CONFIG_XFS_MEMORY_BUFS) += xfs_buf_mem.o
xfs-$(CONFIG_XFS_BTREE_IN_MEM) += libxfs/xfs_btree_mem.o
# online scrub/repair
ifeq ($(CONFIG_XFS_ONLINE_SCRUB),y)
@@ -159,6 +161,8 @@ xfs-y += $(addprefix scrub/, \
health.o \
ialloc.o \
inode.o \
iscan.o \
nlinks.o \
parent.o \
readdir.o \
refcount.o \
@@ -179,6 +183,7 @@ xfs-$(CONFIG_XFS_RT) += $(addprefix scrub/, \
xfs-$(CONFIG_XFS_QUOTA) += $(addprefix scrub/, \
dqiterate.o \
quota.o \
quotacheck.o \
)
# online repair
@@ -188,12 +193,17 @@ xfs-y += $(addprefix scrub/, \
alloc_repair.o \
bmap_repair.o \
cow_repair.o \
fscounters_repair.o \
ialloc_repair.o \
inode_repair.o \
newbt.o \
nlinks_repair.o \
rcbag_btree.o \
rcbag.o \
reap.o \
refcount_repair.o \
repair.o \
rmap_repair.o \
)
xfs-$(CONFIG_XFS_RT) += $(addprefix scrub/, \
@@ -202,6 +212,7 @@ xfs-$(CONFIG_XFS_RT) += $(addprefix scrub/, \
xfs-$(CONFIG_XFS_QUOTA) += $(addprefix scrub/, \
quota_repair.o \
quotacheck_repair.o \
)
endif
endif

View File

@@ -1,30 +0,0 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (c) 2000-2005 Silicon Graphics, Inc.
* All Rights Reserved.
*/
#include "xfs.h"
#include "xfs_message.h"
#include "xfs_trace.h"
void *
kmem_alloc(size_t size, xfs_km_flags_t flags)
{
int retries = 0;
gfp_t lflags = kmem_flags_convert(flags);
void *ptr;
trace_kmem_alloc(size, flags, _RET_IP_);
do {
ptr = kmalloc(size, lflags);
if (ptr || (flags & KM_MAYFAIL))
return ptr;
if (!(++retries % 100))
xfs_err(NULL,
"%s(%u) possible memory allocation deadlock size %u in %s (mode:0x%x)",
current->comm, current->pid,
(unsigned int)size, __func__, lflags);
memalloc_retry_wait(lflags);
} while (1);
}

View File

@@ -1,83 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000-2005 Silicon Graphics, Inc.
* All Rights Reserved.
*/
#ifndef __XFS_SUPPORT_KMEM_H__
#define __XFS_SUPPORT_KMEM_H__
#include <linux/slab.h>
#include <linux/sched.h>
#include <linux/mm.h>
#include <linux/vmalloc.h>
/*
* General memory allocation interfaces
*/
typedef unsigned __bitwise xfs_km_flags_t;
#define KM_NOFS ((__force xfs_km_flags_t)0x0004u)
#define KM_MAYFAIL ((__force xfs_km_flags_t)0x0008u)
#define KM_ZERO ((__force xfs_km_flags_t)0x0010u)
#define KM_NOLOCKDEP ((__force xfs_km_flags_t)0x0020u)
/*
* We use a special process flag to avoid recursive callbacks into
* the filesystem during transactions. We will also issue our own
* warnings, so we explicitly skip any generic ones (silly of us).
*/
static inline gfp_t
kmem_flags_convert(xfs_km_flags_t flags)
{
gfp_t lflags;
BUG_ON(flags & ~(KM_NOFS | KM_MAYFAIL | KM_ZERO | KM_NOLOCKDEP));
lflags = GFP_KERNEL | __GFP_NOWARN;
if (flags & KM_NOFS)
lflags &= ~__GFP_FS;
/*
* Default page/slab allocator behavior is to retry for ever
* for small allocations. We can override this behavior by using
* __GFP_RETRY_MAYFAIL which will tell the allocator to retry as long
* as it is feasible but rather fail than retry forever for all
* request sizes.
*/
if (flags & KM_MAYFAIL)
lflags |= __GFP_RETRY_MAYFAIL;
if (flags & KM_ZERO)
lflags |= __GFP_ZERO;
if (flags & KM_NOLOCKDEP)
lflags |= __GFP_NOLOCKDEP;
return lflags;
}
extern void *kmem_alloc(size_t, xfs_km_flags_t);
static inline void kmem_free(const void *ptr)
{
kvfree(ptr);
}
static inline void *
kmem_zalloc(size_t size, xfs_km_flags_t flags)
{
return kmem_alloc(size, flags | KM_ZERO);
}
/*
* Zone interfaces
*/
static inline struct page *
kmem_to_page(void *addr)
{
if (is_vmalloc_addr(addr))
return vmalloc_to_page(addr);
return virt_to_page(addr);
}
#endif /* __XFS_SUPPORT_KMEM_H__ */

View File

@@ -217,6 +217,7 @@ xfs_initialize_perag_data(
*/
if (fdblocks > sbp->sb_dblocks || ifree > ialloc) {
xfs_alert(mp, "AGF corruption. Please run xfs_repair.");
xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS);
error = -EFSCORRUPTED;
goto out;
}
@@ -241,7 +242,7 @@ __xfs_free_perag(
struct xfs_perag *pag = container_of(head, struct xfs_perag, rcu_head);
ASSERT(!delayed_work_pending(&pag->pag_blockgc_work));
kmem_free(pag);
kfree(pag);
}
/*
@@ -263,7 +264,7 @@ xfs_free_perag(
xfs_defer_drain_free(&pag->pag_intents_drain);
cancel_delayed_work_sync(&pag->pag_blockgc_work);
xfs_buf_hash_destroy(pag);
xfs_buf_cache_destroy(&pag->pag_bcache);
/* drop the mount's active reference */
xfs_perag_rele(pag);
@@ -351,9 +352,9 @@ xfs_free_unused_perag_range(
spin_unlock(&mp->m_perag_lock);
if (!pag)
break;
xfs_buf_hash_destroy(pag);
xfs_buf_cache_destroy(&pag->pag_bcache);
xfs_defer_drain_free(&pag->pag_intents_drain);
kmem_free(pag);
kfree(pag);
}
}
@@ -381,7 +382,7 @@ xfs_initialize_perag(
continue;
}
pag = kmem_zalloc(sizeof(*pag), KM_MAYFAIL);
pag = kzalloc(sizeof(*pag), GFP_KERNEL | __GFP_RETRY_MAYFAIL);
if (!pag) {
error = -ENOMEM;
goto out_unwind_new_pags;
@@ -389,7 +390,7 @@ xfs_initialize_perag(
pag->pag_agno = index;
pag->pag_mount = mp;
error = radix_tree_preload(GFP_NOFS);
error = radix_tree_preload(GFP_KERNEL | __GFP_RETRY_MAYFAIL);
if (error)
goto out_free_pag;
@@ -416,9 +417,10 @@ xfs_initialize_perag(
init_waitqueue_head(&pag->pag_active_wq);
pag->pagb_count = 0;
pag->pagb_tree = RB_ROOT;
xfs_hooks_init(&pag->pag_rmap_update_hooks);
#endif /* __KERNEL__ */
error = xfs_buf_hash_init(pag);
error = xfs_buf_cache_init(&pag->pag_bcache);
if (error)
goto out_remove_pag;
@@ -453,7 +455,7 @@ out_remove_pag:
radix_tree_delete(&mp->m_perag_tree, index);
spin_unlock(&mp->m_perag_lock);
out_free_pag:
kmem_free(pag);
kfree(pag);
out_unwind_new_pags:
/* unwind any prior newly initialized pags */
xfs_free_unused_perag_range(mp, first_initialised, agcount);
@@ -491,7 +493,7 @@ xfs_btroot_init(
struct xfs_buf *bp,
struct aghdr_init_data *id)
{
xfs_btree_init_block(mp, bp, id->type, 0, 0, id->agno);
xfs_btree_init_buf(mp, bp, id->bc_ops, 0, 0, id->agno);
}
/* Finish initializing a free space btree. */
@@ -549,7 +551,7 @@ xfs_freesp_init_recs(
}
/*
* Alloc btree root block init functions
* bnobt/cntbt btree root block init functions
*/
static void
xfs_bnoroot_init(
@@ -557,17 +559,7 @@ xfs_bnoroot_init(
struct xfs_buf *bp,
struct aghdr_init_data *id)
{
xfs_btree_init_block(mp, bp, XFS_BTNUM_BNO, 0, 0, id->agno);
xfs_freesp_init_recs(mp, bp, id);
}
static void
xfs_cntroot_init(
struct xfs_mount *mp,
struct xfs_buf *bp,
struct aghdr_init_data *id)
{
xfs_btree_init_block(mp, bp, XFS_BTNUM_CNT, 0, 0, id->agno);
xfs_btree_init_buf(mp, bp, id->bc_ops, 0, 0, id->agno);
xfs_freesp_init_recs(mp, bp, id);
}
@@ -583,7 +575,7 @@ xfs_rmaproot_init(
struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp);
struct xfs_rmap_rec *rrec;
xfs_btree_init_block(mp, bp, XFS_BTNUM_RMAP, 0, 4, id->agno);
xfs_btree_init_buf(mp, bp, id->bc_ops, 0, 4, id->agno);
/*
* mark the AG header regions as static metadata The BNO
@@ -678,14 +670,13 @@ xfs_agfblock_init(
agf->agf_versionnum = cpu_to_be32(XFS_AGF_VERSION);
agf->agf_seqno = cpu_to_be32(id->agno);
agf->agf_length = cpu_to_be32(id->agsize);
agf->agf_roots[XFS_BTNUM_BNOi] = cpu_to_be32(XFS_BNO_BLOCK(mp));
agf->agf_roots[XFS_BTNUM_CNTi] = cpu_to_be32(XFS_CNT_BLOCK(mp));
agf->agf_levels[XFS_BTNUM_BNOi] = cpu_to_be32(1);
agf->agf_levels[XFS_BTNUM_CNTi] = cpu_to_be32(1);
agf->agf_bno_root = cpu_to_be32(XFS_BNO_BLOCK(mp));
agf->agf_cnt_root = cpu_to_be32(XFS_CNT_BLOCK(mp));
agf->agf_bno_level = cpu_to_be32(1);
agf->agf_cnt_level = cpu_to_be32(1);
if (xfs_has_rmapbt(mp)) {
agf->agf_roots[XFS_BTNUM_RMAPi] =
cpu_to_be32(XFS_RMAP_BLOCK(mp));
agf->agf_levels[XFS_BTNUM_RMAPi] = cpu_to_be32(1);
agf->agf_rmap_root = cpu_to_be32(XFS_RMAP_BLOCK(mp));
agf->agf_rmap_level = cpu_to_be32(1);
agf->agf_rmap_blocks = cpu_to_be32(1);
}
@@ -796,7 +787,7 @@ struct xfs_aghdr_grow_data {
size_t numblks;
const struct xfs_buf_ops *ops;
aghdr_init_work_f work;
xfs_btnum_t type;
const struct xfs_btree_ops *bc_ops;
bool need_init;
};
@@ -850,13 +841,15 @@ xfs_ag_init_headers(
.numblks = BTOBB(mp->m_sb.sb_blocksize),
.ops = &xfs_bnobt_buf_ops,
.work = &xfs_bnoroot_init,
.bc_ops = &xfs_bnobt_ops,
.need_init = true
},
{ /* CNT root block */
.daddr = XFS_AGB_TO_DADDR(mp, id->agno, XFS_CNT_BLOCK(mp)),
.numblks = BTOBB(mp->m_sb.sb_blocksize),
.ops = &xfs_cntbt_buf_ops,
.work = &xfs_cntroot_init,
.work = &xfs_bnoroot_init,
.bc_ops = &xfs_cntbt_ops,
.need_init = true
},
{ /* INO root block */
@@ -864,7 +857,7 @@ xfs_ag_init_headers(
.numblks = BTOBB(mp->m_sb.sb_blocksize),
.ops = &xfs_inobt_buf_ops,
.work = &xfs_btroot_init,
.type = XFS_BTNUM_INO,
.bc_ops = &xfs_inobt_ops,
.need_init = true
},
{ /* FINO root block */
@@ -872,7 +865,7 @@ xfs_ag_init_headers(
.numblks = BTOBB(mp->m_sb.sb_blocksize),
.ops = &xfs_finobt_buf_ops,
.work = &xfs_btroot_init,
.type = XFS_BTNUM_FINO,
.bc_ops = &xfs_finobt_ops,
.need_init = xfs_has_finobt(mp)
},
{ /* RMAP root block */
@@ -880,6 +873,7 @@ xfs_ag_init_headers(
.numblks = BTOBB(mp->m_sb.sb_blocksize),
.ops = &xfs_rmapbt_buf_ops,
.work = &xfs_rmaproot_init,
.bc_ops = &xfs_rmapbt_ops,
.need_init = xfs_has_rmapbt(mp)
},
{ /* REFC root block */
@@ -887,7 +881,7 @@ xfs_ag_init_headers(
.numblks = BTOBB(mp->m_sb.sb_blocksize),
.ops = &xfs_refcountbt_buf_ops,
.work = &xfs_btroot_init,
.type = XFS_BTNUM_REFC,
.bc_ops = &xfs_refcountbt_ops,
.need_init = xfs_has_reflink(mp)
},
{ /* NULL terminating block */
@@ -905,7 +899,7 @@ xfs_ag_init_headers(
id->daddr = dp->daddr;
id->numblks = dp->numblks;
id->type = dp->type;
id->bc_ops = dp->bc_ops;
error = xfs_ag_init_hdr(mp, id, dp->work, dp->ops);
if (error)
break;
@@ -950,8 +944,10 @@ xfs_ag_shrink_space(
agf = agfbp->b_addr;
aglen = be32_to_cpu(agi->agi_length);
/* some extra paranoid checks before we shrink the ag */
if (XFS_IS_CORRUPT(mp, agf->agf_length != agi->agi_length))
if (XFS_IS_CORRUPT(mp, agf->agf_length != agi->agi_length)) {
xfs_ag_mark_sick(pag, XFS_SICK_AG_AGF);
return -EFSCORRUPTED;
}
if (delta >= aglen)
return -EINVAL;
@@ -979,14 +975,23 @@ xfs_ag_shrink_space(
if (error) {
/*
* if extent allocation fails, need to roll the transaction to
* If extent allocation fails, need to roll the transaction to
* ensure that the AGFL fixup has been committed anyway.
*
* We need to hold the AGF across the roll to ensure nothing can
* access the AG for allocation until the shrink is fully
* cleaned up. And due to the resetting of the AG block
* reservation space needing to lock the AGI, we also have to
* hold that so we don't get AGI/AGF lock order inversions in
* the error handling path.
*/
xfs_trans_bhold(*tpp, agfbp);
xfs_trans_bhold(*tpp, agibp);
err2 = xfs_trans_roll(tpp);
if (err2)
return err2;
xfs_trans_bjoin(*tpp, agfbp);
xfs_trans_bjoin(*tpp, agibp);
goto resv_init_out;
}

View File

@@ -36,8 +36,9 @@ struct xfs_perag {
atomic_t pag_active_ref; /* active reference count */
wait_queue_head_t pag_active_wq;/* woken active_ref falls to zero */
unsigned long pag_opstate;
uint8_t pagf_levels[XFS_BTNUM_AGF];
/* # of levels in bno & cnt btree */
uint8_t pagf_bno_level; /* # of levels in bno btree */
uint8_t pagf_cnt_level; /* # of levels in cnt btree */
uint8_t pagf_rmap_level;/* # of levels in rmap btree */
uint32_t pagf_flcount; /* count of blocks in freelist */
xfs_extlen_t pagf_freeblks; /* total free blocks */
xfs_extlen_t pagf_longest; /* longest free space */
@@ -86,8 +87,10 @@ struct xfs_perag {
* Alternate btree heights so that online repair won't trip the write
* verifiers while rebuilding the AG btrees.
*/
uint8_t pagf_repair_levels[XFS_BTNUM_AGF];
uint8_t pagf_repair_bno_level;
uint8_t pagf_repair_cnt_level;
uint8_t pagf_repair_refcount_level;
uint8_t pagf_repair_rmap_level;
#endif
spinlock_t pag_state_lock;
@@ -104,9 +107,7 @@ struct xfs_perag {
int pag_ici_reclaimable; /* reclaimable inodes */
unsigned long pag_ici_reclaim_cursor; /* reclaim restart point */
/* buffer cache index */
spinlock_t pag_buf_lock; /* lock for pag_buf_hash */
struct rhashtable pag_buf_hash;
struct xfs_buf_cache pag_bcache;
/* background prealloc block trimming */
struct delayed_work pag_blockgc_work;
@@ -119,6 +120,9 @@ struct xfs_perag {
* inconsistencies.
*/
struct xfs_defer_drain pag_intents_drain;
/* Hook to feed rmapbt updates to an active online repair. */
struct xfs_hooks pag_rmap_update_hooks;
#endif /* __KERNEL__ */
};
@@ -331,7 +335,7 @@ struct aghdr_init_data {
/* per header data */
xfs_daddr_t daddr; /* header location */
size_t numblks; /* size of header */
xfs_btnum_t type; /* type of btree root block */
const struct xfs_btree_ops *bc_ops; /* btree ops */
};
int xfs_ag_init_headers(struct xfs_mount *mp, struct aghdr_init_data *id);

File diff suppressed because it is too large Load Diff

View File

@@ -16,6 +16,7 @@
#include "xfs_alloc.h"
#include "xfs_extent_busy.h"
#include "xfs_error.h"
#include "xfs_health.h"
#include "xfs_trace.h"
#include "xfs_trans.h"
#include "xfs_ag.h"
@@ -23,13 +24,22 @@
static struct kmem_cache *xfs_allocbt_cur_cache;
STATIC struct xfs_btree_cur *
xfs_allocbt_dup_cursor(
xfs_bnobt_dup_cursor(
struct xfs_btree_cur *cur)
{
return xfs_allocbt_init_cursor(cur->bc_mp, cur->bc_tp,
cur->bc_ag.agbp, cur->bc_ag.pag, cur->bc_btnum);
return xfs_bnobt_init_cursor(cur->bc_mp, cur->bc_tp, cur->bc_ag.agbp,
cur->bc_ag.pag);
}
STATIC struct xfs_btree_cur *
xfs_cntbt_dup_cursor(
struct xfs_btree_cur *cur)
{
return xfs_cntbt_init_cursor(cur->bc_mp, cur->bc_tp, cur->bc_ag.agbp,
cur->bc_ag.pag);
}
STATIC void
xfs_allocbt_set_root(
struct xfs_btree_cur *cur,
@@ -38,13 +48,18 @@ xfs_allocbt_set_root(
{
struct xfs_buf *agbp = cur->bc_ag.agbp;
struct xfs_agf *agf = agbp->b_addr;
int btnum = cur->bc_btnum;
ASSERT(ptr->s != 0);
agf->agf_roots[btnum] = ptr->s;
be32_add_cpu(&agf->agf_levels[btnum], inc);
cur->bc_ag.pag->pagf_levels[btnum] += inc;
if (xfs_btree_is_bno(cur->bc_ops)) {
agf->agf_bno_root = ptr->s;
be32_add_cpu(&agf->agf_bno_level, inc);
cur->bc_ag.pag->pagf_bno_level += inc;
} else {
agf->agf_cnt_root = ptr->s;
be32_add_cpu(&agf->agf_cnt_level, inc);
cur->bc_ag.pag->pagf_cnt_level += inc;
}
xfs_alloc_log_agf(cur->bc_tp, agbp, XFS_AGF_ROOTS | XFS_AGF_LEVELS);
}
@@ -116,7 +131,7 @@ xfs_allocbt_update_lastrec(
__be32 len;
int numrecs;
ASSERT(cur->bc_btnum == XFS_BTNUM_CNT);
ASSERT(!xfs_btree_is_bno(cur->bc_ops));
switch (reason) {
case LASTREC_UPDATE:
@@ -226,7 +241,10 @@ xfs_allocbt_init_ptr_from_cur(
ASSERT(cur->bc_ag.pag->pag_agno == be32_to_cpu(agf->agf_seqno));
ptr->s = agf->agf_roots[cur->bc_btnum];
if (xfs_btree_is_bno(cur->bc_ops))
ptr->s = agf->agf_bno_root;
else
ptr->s = agf->agf_cnt_root;
}
STATIC int64_t
@@ -299,13 +317,12 @@ xfs_allocbt_verify(
struct xfs_perag *pag = bp->b_pag;
xfs_failaddr_t fa;
unsigned int level;
xfs_btnum_t btnum = XFS_BTNUM_BNOi;
if (!xfs_verify_magic(bp, block->bb_magic))
return __this_address;
if (xfs_has_crc(mp)) {
fa = xfs_btree_sblock_v5hdr_verify(bp);
fa = xfs_btree_agblock_v5hdr_verify(bp);
if (fa)
return fa;
}
@@ -320,26 +337,32 @@ xfs_allocbt_verify(
* against.
*/
level = be16_to_cpu(block->bb_level);
if (bp->b_ops->magic[0] == cpu_to_be32(XFS_ABTC_MAGIC))
btnum = XFS_BTNUM_CNTi;
if (pag && xfs_perag_initialised_agf(pag)) {
unsigned int maxlevel = pag->pagf_levels[btnum];
unsigned int maxlevel, repair_maxlevel = 0;
#ifdef CONFIG_XFS_ONLINE_REPAIR
/*
* Online repair could be rewriting the free space btrees, so
* we'll validate against the larger of either tree while this
* is going on.
*/
maxlevel = max_t(unsigned int, maxlevel,
pag->pagf_repair_levels[btnum]);
if (bp->b_ops->magic[0] == cpu_to_be32(XFS_ABTC_MAGIC)) {
maxlevel = pag->pagf_cnt_level;
#ifdef CONFIG_XFS_ONLINE_REPAIR
repair_maxlevel = pag->pagf_repair_cnt_level;
#endif
if (level >= maxlevel)
} else {
maxlevel = pag->pagf_bno_level;
#ifdef CONFIG_XFS_ONLINE_REPAIR
repair_maxlevel = pag->pagf_repair_bno_level;
#endif
}
if (level >= max(maxlevel, repair_maxlevel))
return __this_address;
} else if (level >= mp->m_alloc_maxlevels)
return __this_address;
return xfs_btree_sblock_verify(bp, mp->m_alloc_mxr[level != 0]);
return xfs_btree_agblock_verify(bp, mp->m_alloc_mxr[level != 0]);
}
static void
@@ -348,7 +371,7 @@ xfs_allocbt_read_verify(
{
xfs_failaddr_t fa;
if (!xfs_btree_sblock_verify_crc(bp))
if (!xfs_btree_agblock_verify_crc(bp))
xfs_verifier_error(bp, -EFSBADCRC, __this_address);
else {
fa = xfs_allocbt_verify(bp);
@@ -372,7 +395,7 @@ xfs_allocbt_write_verify(
xfs_verifier_error(bp, -EFSCORRUPTED, fa);
return;
}
xfs_btree_sblock_calc_crc(bp);
xfs_btree_agblock_calc_crc(bp);
}
@@ -454,11 +477,19 @@ xfs_allocbt_keys_contiguous(
be32_to_cpu(key2->alloc.ar_startblock));
}
static const struct xfs_btree_ops xfs_bnobt_ops = {
const struct xfs_btree_ops xfs_bnobt_ops = {
.name = "bno",
.type = XFS_BTREE_TYPE_AG,
.rec_len = sizeof(xfs_alloc_rec_t),
.key_len = sizeof(xfs_alloc_key_t),
.ptr_len = XFS_BTREE_SHORT_PTR_LEN,
.dup_cursor = xfs_allocbt_dup_cursor,
.lru_refs = XFS_ALLOC_BTREE_REF,
.statoff = XFS_STATS_CALC_INDEX(xs_abtb_2),
.sick_mask = XFS_SICK_AG_BNOBT,
.dup_cursor = xfs_bnobt_dup_cursor,
.set_root = xfs_allocbt_set_root,
.alloc_block = xfs_allocbt_alloc_block,
.free_block = xfs_allocbt_free_block,
@@ -477,11 +508,20 @@ static const struct xfs_btree_ops xfs_bnobt_ops = {
.keys_contiguous = xfs_allocbt_keys_contiguous,
};
static const struct xfs_btree_ops xfs_cntbt_ops = {
const struct xfs_btree_ops xfs_cntbt_ops = {
.name = "cnt",
.type = XFS_BTREE_TYPE_AG,
.geom_flags = XFS_BTGEO_LASTREC_UPDATE,
.rec_len = sizeof(xfs_alloc_rec_t),
.key_len = sizeof(xfs_alloc_key_t),
.ptr_len = XFS_BTREE_SHORT_PTR_LEN,
.dup_cursor = xfs_allocbt_dup_cursor,
.lru_refs = XFS_ALLOC_BTREE_REF,
.statoff = XFS_STATS_CALC_INDEX(xs_abtc_2),
.sick_mask = XFS_SICK_AG_CNTBT,
.dup_cursor = xfs_cntbt_dup_cursor,
.set_root = xfs_allocbt_set_root,
.alloc_block = xfs_allocbt_alloc_block,
.free_block = xfs_allocbt_free_block,
@@ -500,76 +540,55 @@ static const struct xfs_btree_ops xfs_cntbt_ops = {
.keys_contiguous = NULL, /* not needed right now */
};
/* Allocate most of a new allocation btree cursor. */
STATIC struct xfs_btree_cur *
xfs_allocbt_init_common(
/*
* Allocate a new bnobt cursor.
*
* For staging cursors tp and agbp are NULL.
*/
struct xfs_btree_cur *
xfs_bnobt_init_cursor(
struct xfs_mount *mp,
struct xfs_trans *tp,
struct xfs_perag *pag,
xfs_btnum_t btnum)
struct xfs_buf *agbp,
struct xfs_perag *pag)
{
struct xfs_btree_cur *cur;
ASSERT(btnum == XFS_BTNUM_BNO || btnum == XFS_BTNUM_CNT);
cur = xfs_btree_alloc_cursor(mp, tp, btnum, mp->m_alloc_maxlevels,
xfs_allocbt_cur_cache);
cur->bc_ag.abt.active = false;
if (btnum == XFS_BTNUM_CNT) {
cur->bc_ops = &xfs_cntbt_ops;
cur->bc_statoff = XFS_STATS_CALC_INDEX(xs_abtc_2);
cur->bc_flags = XFS_BTREE_LASTREC_UPDATE;
} else {
cur->bc_ops = &xfs_bnobt_ops;
cur->bc_statoff = XFS_STATS_CALC_INDEX(xs_abtb_2);
}
cur = xfs_btree_alloc_cursor(mp, tp, &xfs_bnobt_ops,
mp->m_alloc_maxlevels, xfs_allocbt_cur_cache);
cur->bc_ag.pag = xfs_perag_hold(pag);
cur->bc_ag.agbp = agbp;
if (agbp) {
struct xfs_agf *agf = agbp->b_addr;
if (xfs_has_crc(mp))
cur->bc_flags |= XFS_BTREE_CRC_BLOCKS;
cur->bc_nlevels = be32_to_cpu(agf->agf_bno_level);
}
return cur;
}
/*
* Allocate a new allocation btree cursor.
* Allocate a new cntbt cursor.
*
* For staging cursors tp and agbp are NULL.
*/
struct xfs_btree_cur * /* new alloc btree cursor */
xfs_allocbt_init_cursor(
struct xfs_mount *mp, /* file system mount point */
struct xfs_trans *tp, /* transaction pointer */
struct xfs_buf *agbp, /* buffer for agf structure */
struct xfs_perag *pag,
xfs_btnum_t btnum) /* btree identifier */
{
struct xfs_agf *agf = agbp->b_addr;
struct xfs_btree_cur *cur;
cur = xfs_allocbt_init_common(mp, tp, pag, btnum);
if (btnum == XFS_BTNUM_CNT)
cur->bc_nlevels = be32_to_cpu(agf->agf_levels[XFS_BTNUM_CNT]);
else
cur->bc_nlevels = be32_to_cpu(agf->agf_levels[XFS_BTNUM_BNO]);
cur->bc_ag.agbp = agbp;
return cur;
}
/* Create a free space btree cursor with a fake root for staging. */
struct xfs_btree_cur *
xfs_allocbt_stage_cursor(
xfs_cntbt_init_cursor(
struct xfs_mount *mp,
struct xbtree_afakeroot *afake,
struct xfs_perag *pag,
xfs_btnum_t btnum)
struct xfs_trans *tp,
struct xfs_buf *agbp,
struct xfs_perag *pag)
{
struct xfs_btree_cur *cur;
cur = xfs_allocbt_init_common(mp, NULL, pag, btnum);
xfs_btree_stage_afakeroot(cur, afake);
cur = xfs_btree_alloc_cursor(mp, tp, &xfs_cntbt_ops,
mp->m_alloc_maxlevels, xfs_allocbt_cur_cache);
cur->bc_ag.pag = xfs_perag_hold(pag);
cur->bc_ag.agbp = agbp;
if (agbp) {
struct xfs_agf *agf = agbp->b_addr;
cur->bc_nlevels = be32_to_cpu(agf->agf_cnt_level);
}
return cur;
}
@@ -588,16 +607,16 @@ xfs_allocbt_commit_staged_btree(
ASSERT(cur->bc_flags & XFS_BTREE_STAGING);
agf->agf_roots[cur->bc_btnum] = cpu_to_be32(afake->af_root);
agf->agf_levels[cur->bc_btnum] = cpu_to_be32(afake->af_levels);
if (xfs_btree_is_bno(cur->bc_ops)) {
agf->agf_bno_root = cpu_to_be32(afake->af_root);
agf->agf_bno_level = cpu_to_be32(afake->af_levels);
} else {
agf->agf_cnt_root = cpu_to_be32(afake->af_root);
agf->agf_cnt_level = cpu_to_be32(afake->af_levels);
}
xfs_alloc_log_agf(tp, agbp, XFS_AGF_ROOTS | XFS_AGF_LEVELS);
if (cur->bc_btnum == XFS_BTNUM_BNO) {
xfs_btree_commit_afakeroot(cur, tp, agbp, &xfs_bnobt_ops);
} else {
cur->bc_flags |= XFS_BTREE_LASTREC_UPDATE;
xfs_btree_commit_afakeroot(cur, tp, agbp, &xfs_cntbt_ops);
}
xfs_btree_commit_afakeroot(cur, tp, agbp);
}
/* Calculate number of records in an alloc btree block. */

View File

@@ -47,12 +47,12 @@ struct xbtree_afakeroot;
(maxrecs) * sizeof(xfs_alloc_key_t) + \
((index) - 1) * sizeof(xfs_alloc_ptr_t)))
extern struct xfs_btree_cur *xfs_allocbt_init_cursor(struct xfs_mount *mp,
struct xfs_btree_cur *xfs_bnobt_init_cursor(struct xfs_mount *mp,
struct xfs_trans *tp, struct xfs_buf *bp,
struct xfs_perag *pag, xfs_btnum_t btnum);
struct xfs_btree_cur *xfs_allocbt_stage_cursor(struct xfs_mount *mp,
struct xbtree_afakeroot *afake, struct xfs_perag *pag,
xfs_btnum_t btnum);
struct xfs_perag *pag);
struct xfs_btree_cur *xfs_cntbt_init_cursor(struct xfs_mount *mp,
struct xfs_trans *tp, struct xfs_buf *bp,
struct xfs_perag *pag);
extern int xfs_allocbt_maxrecs(struct xfs_mount *, int, int);
extern xfs_extlen_t xfs_allocbt_calc_size(struct xfs_mount *mp,
unsigned long long len);

View File

@@ -224,7 +224,7 @@ int
xfs_attr_get_ilocked(
struct xfs_da_args *args)
{
ASSERT(xfs_isilocked(args->dp, XFS_ILOCK_SHARED | XFS_ILOCK_EXCL));
xfs_assert_ilocked(args->dp, XFS_ILOCK_SHARED | XFS_ILOCK_EXCL);
if (!xfs_inode_hasattr(args->dp))
return -ENOATTR;
@@ -891,7 +891,8 @@ xfs_attr_defer_add(
struct xfs_attr_intent *new;
new = kmem_cache_zalloc(xfs_attr_intent_cache, GFP_NOFS | __GFP_NOFAIL);
new = kmem_cache_zalloc(xfs_attr_intent_cache,
GFP_KERNEL | __GFP_NOFAIL);
new->xattri_op_flags = op_flags;
new->xattri_da_args = args;

View File

@@ -29,6 +29,7 @@
#include "xfs_log.h"
#include "xfs_ag.h"
#include "xfs_errortag.h"
#include "xfs_health.h"
/*
@@ -879,8 +880,7 @@ xfs_attr_shortform_to_leaf(
trace_xfs_attr_sf_to_leaf(args);
tmpbuffer = kmem_alloc(size, 0);
ASSERT(tmpbuffer != NULL);
tmpbuffer = kmalloc(size, GFP_KERNEL | __GFP_NOFAIL);
memcpy(tmpbuffer, ifp->if_data, size);
sf = (struct xfs_attr_sf_hdr *)tmpbuffer;
@@ -924,7 +924,7 @@ xfs_attr_shortform_to_leaf(
}
error = 0;
out:
kmem_free(tmpbuffer);
kfree(tmpbuffer);
return error;
}
@@ -1059,7 +1059,7 @@ xfs_attr3_leaf_to_shortform(
trace_xfs_attr_leaf_to_sf(args);
tmpbuffer = kmem_alloc(args->geo->blksize, 0);
tmpbuffer = kmalloc(args->geo->blksize, GFP_KERNEL | __GFP_NOFAIL);
if (!tmpbuffer)
return -ENOMEM;
@@ -1125,7 +1125,7 @@ xfs_attr3_leaf_to_shortform(
error = 0;
out:
kmem_free(tmpbuffer);
kfree(tmpbuffer);
return error;
}
@@ -1533,7 +1533,7 @@ xfs_attr3_leaf_compact(
trace_xfs_attr_leaf_compact(args);
tmpbuffer = kmem_alloc(args->geo->blksize, 0);
tmpbuffer = kmalloc(args->geo->blksize, GFP_KERNEL | __GFP_NOFAIL);
memcpy(tmpbuffer, bp->b_addr, args->geo->blksize);
memset(bp->b_addr, 0, args->geo->blksize);
leaf_src = (xfs_attr_leafblock_t *)tmpbuffer;
@@ -1571,7 +1571,7 @@ xfs_attr3_leaf_compact(
*/
xfs_trans_log_buf(trans, bp, 0, args->geo->blksize - 1);
kmem_free(tmpbuffer);
kfree(tmpbuffer);
}
/*
@@ -2250,7 +2250,8 @@ xfs_attr3_leaf_unbalance(
struct xfs_attr_leafblock *tmp_leaf;
struct xfs_attr3_icleaf_hdr tmphdr;
tmp_leaf = kmem_zalloc(state->args->geo->blksize, 0);
tmp_leaf = kzalloc(state->args->geo->blksize,
GFP_KERNEL | __GFP_NOFAIL);
/*
* Copy the header into the temp leaf so that all the stuff
@@ -2290,7 +2291,7 @@ xfs_attr3_leaf_unbalance(
}
memcpy(save_leaf, tmp_leaf, state->args->geo->blksize);
savehdr = tmphdr; /* struct copy */
kmem_free(tmp_leaf);
kfree(tmp_leaf);
}
xfs_attr3_leaf_hdr_to_disk(state->args->geo, save_leaf, &savehdr);
@@ -2343,6 +2344,7 @@ xfs_attr3_leaf_lookup_int(
entries = xfs_attr3_leaf_entryp(leaf);
if (ichdr.count >= args->geo->blksize / 8) {
xfs_buf_mark_corrupt(bp);
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
@@ -2362,10 +2364,12 @@ xfs_attr3_leaf_lookup_int(
}
if (!(probe >= 0 && (!ichdr.count || probe < ichdr.count))) {
xfs_buf_mark_corrupt(bp);
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
if (!(span <= 4 || be32_to_cpu(entry->hashval) == hashval)) {
xfs_buf_mark_corrupt(bp);
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}

View File

@@ -22,6 +22,7 @@
#include "xfs_attr_remote.h"
#include "xfs_trace.h"
#include "xfs_error.h"
#include "xfs_health.h"
#define ATTR_RMTVALUE_MAPSIZE 1 /* # of map entries at once */
@@ -276,17 +277,18 @@ xfs_attr3_rmt_hdr_set(
*/
STATIC int
xfs_attr_rmtval_copyout(
struct xfs_mount *mp,
struct xfs_buf *bp,
xfs_ino_t ino,
int *offset,
int *valuelen,
uint8_t **dst)
struct xfs_mount *mp,
struct xfs_buf *bp,
struct xfs_inode *dp,
int *offset,
int *valuelen,
uint8_t **dst)
{
char *src = bp->b_addr;
xfs_daddr_t bno = xfs_buf_daddr(bp);
int len = BBTOB(bp->b_length);
int blksize = mp->m_attr_geo->blksize;
char *src = bp->b_addr;
xfs_ino_t ino = dp->i_ino;
xfs_daddr_t bno = xfs_buf_daddr(bp);
int len = BBTOB(bp->b_length);
int blksize = mp->m_attr_geo->blksize;
ASSERT(len >= blksize);
@@ -302,6 +304,7 @@ xfs_attr_rmtval_copyout(
xfs_alert(mp,
"remote attribute header mismatch bno/off/len/owner (0x%llx/0x%x/Ox%x/0x%llx)",
bno, *offset, byte_cnt, ino);
xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
return -EFSCORRUPTED;
}
hdr_size = sizeof(struct xfs_attr3_rmt_hdr);
@@ -418,10 +421,12 @@ xfs_attr_rmtval_get(
dblkcnt = XFS_FSB_TO_BB(mp, map[i].br_blockcount);
error = xfs_buf_read(mp->m_ddev_targp, dblkno, dblkcnt,
0, &bp, &xfs_attr3_rmt_buf_ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(args->dp, XFS_ATTR_FORK);
if (error)
return error;
error = xfs_attr_rmtval_copyout(mp, bp, args->dp->i_ino,
error = xfs_attr_rmtval_copyout(mp, bp, args->dp,
&offset, &valuelen,
&dst);
xfs_buf_relse(bp);
@@ -545,11 +550,13 @@ xfs_attr_rmtval_stale(
struct xfs_buf *bp;
int error;
ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
xfs_assert_ilocked(ip, XFS_ILOCK_EXCL);
if (XFS_IS_CORRUPT(mp, map->br_startblock == DELAYSTARTBLOCK) ||
XFS_IS_CORRUPT(mp, map->br_startblock == HOLESTARTBLOCK))
XFS_IS_CORRUPT(mp, map->br_startblock == HOLESTARTBLOCK)) {
xfs_bmap_mark_sick(ip, XFS_ATTR_FORK);
return -EFSCORRUPTED;
}
error = xfs_buf_incore(mp->m_ddev_targp,
XFS_FSB_TO_DADDR(mp, map->br_startblock),
@@ -659,8 +666,10 @@ xfs_attr_rmtval_invalidate(
blkcnt, &map, &nmap, XFS_BMAPI_ATTRFORK);
if (error)
return error;
if (XFS_IS_CORRUPT(args->dp->i_mount, nmap != 1))
if (XFS_IS_CORRUPT(args->dp->i_mount, nmap != 1)) {
xfs_bmap_mark_sick(args->dp, XFS_ATTR_FORK);
return -EFSCORRUPTED;
}
error = xfs_attr_rmtval_stale(args->dp, &map, XBF_TRYLOCK);
if (error)
return error;

File diff suppressed because it is too large Load Diff

View File

@@ -232,6 +232,10 @@ enum xfs_bmap_intent_type {
XFS_BMAP_UNMAP,
};
#define XFS_BMAP_INTENT_STRINGS \
{ XFS_BMAP_MAP, "map" }, \
{ XFS_BMAP_UNMAP, "unmap" }
struct xfs_bmap_intent {
struct list_head bi_list;
enum xfs_bmap_intent_type bi_type;
@@ -241,14 +245,11 @@ struct xfs_bmap_intent {
struct xfs_bmbt_irec bi_bmap;
};
void xfs_bmap_update_get_group(struct xfs_mount *mp,
struct xfs_bmap_intent *bi);
int xfs_bmap_finish_one(struct xfs_trans *tp, struct xfs_bmap_intent *bi);
void xfs_bmap_map_extent(struct xfs_trans *tp, struct xfs_inode *ip,
struct xfs_bmbt_irec *imap);
int whichfork, struct xfs_bmbt_irec *imap);
void xfs_bmap_unmap_extent(struct xfs_trans *tp, struct xfs_inode *ip,
struct xfs_bmbt_irec *imap);
int whichfork, struct xfs_bmbt_irec *imap);
static inline uint32_t xfs_bmap_fork_to_state(int whichfork)
{
@@ -280,4 +281,12 @@ extern struct kmem_cache *xfs_bmap_intent_cache;
int __init xfs_bmap_intent_init_cache(void);
void xfs_bmap_intent_destroy_cache(void);
typedef int (*xfs_bmap_query_range_fn)(
struct xfs_btree_cur *cur,
struct xfs_bmbt_irec *rec,
void *priv);
int xfs_bmap_query_all(struct xfs_btree_cur *cur, xfs_bmap_query_range_fn fn,
void *priv);
#endif /* __XFS_BMAP_H__ */

View File

@@ -26,6 +26,22 @@
static struct kmem_cache *xfs_bmbt_cur_cache;
void
xfs_bmbt_init_block(
struct xfs_inode *ip,
struct xfs_btree_block *buf,
struct xfs_buf *bp,
__u16 level,
__u16 numrecs)
{
if (bp)
xfs_btree_init_buf(ip->i_mount, bp, &xfs_bmbt_ops, level,
numrecs, ip->i_ino);
else
xfs_btree_init_block(ip->i_mount, buf, &xfs_bmbt_ops, level,
numrecs, ip->i_ino);
}
/*
* Convert on-disk form of btree root to in-memory form.
*/
@@ -44,9 +60,7 @@ xfs_bmdr_to_bmbt(
xfs_bmbt_key_t *tkp;
__be64 *tpp;
xfs_btree_init_block_int(mp, rblock, XFS_BUF_DADDR_NULL,
XFS_BTNUM_BMAP, 0, 0, ip->i_ino,
XFS_BTREE_LONG_PTRS);
xfs_bmbt_init_block(ip, rblock, NULL, 0, 0);
rblock->bb_level = dblock->bb_level;
ASSERT(be16_to_cpu(rblock->bb_level) > 0);
rblock->bb_numrecs = dblock->bb_numrecs;
@@ -171,13 +185,8 @@ xfs_bmbt_dup_cursor(
new = xfs_bmbt_init_cursor(cur->bc_mp, cur->bc_tp,
cur->bc_ino.ip, cur->bc_ino.whichfork);
/*
* Copy the firstblock, dfops, and flags values,
* since init cursor doesn't get them.
*/
new->bc_ino.flags = cur->bc_ino.flags;
new->bc_flags |= (cur->bc_flags &
(XFS_BTREE_BMBT_INVALID_OWNER | XFS_BTREE_BMBT_WASDEL));
return new;
}
@@ -189,10 +198,10 @@ xfs_bmbt_update_cursor(
ASSERT((dst->bc_tp->t_highest_agno != NULLAGNUMBER) ||
(dst->bc_ino.ip->i_diflags & XFS_DIFLAG_REALTIME));
dst->bc_ino.allocated += src->bc_ino.allocated;
dst->bc_bmap.allocated += src->bc_bmap.allocated;
dst->bc_tp->t_highest_agno = src->bc_tp->t_highest_agno;
src->bc_ino.allocated = 0;
src->bc_bmap.allocated = 0;
}
STATIC int
@@ -211,7 +220,7 @@ xfs_bmbt_alloc_block(
xfs_rmap_ino_bmbt_owner(&args.oinfo, cur->bc_ino.ip->i_ino,
cur->bc_ino.whichfork);
args.minlen = args.maxlen = args.prod = 1;
args.wasdel = cur->bc_ino.flags & XFS_BTCUR_BMBT_WASDEL;
args.wasdel = cur->bc_flags & XFS_BTREE_BMBT_WASDEL;
if (!args.wasdel && args.tp->t_blk_res == 0)
return -ENOSPC;
@@ -247,7 +256,7 @@ xfs_bmbt_alloc_block(
}
ASSERT(args.len == 1);
cur->bc_ino.allocated++;
cur->bc_bmap.allocated++;
cur->bc_ino.ip->i_nblocks++;
xfs_trans_log_inode(args.tp, cur->bc_ino.ip, XFS_ILOG_CORE);
xfs_trans_mod_dquot_byino(args.tp, cur->bc_ino.ip,
@@ -360,14 +369,6 @@ xfs_bmbt_init_rec_from_cur(
xfs_bmbt_disk_set_all(&rec->bmbt, &cur->bc_rec.b);
}
STATIC void
xfs_bmbt_init_ptr_from_cur(
struct xfs_btree_cur *cur,
union xfs_btree_ptr *ptr)
{
ptr->l = 0;
}
STATIC int64_t
xfs_bmbt_key_diff(
struct xfs_btree_cur *cur,
@@ -419,7 +420,7 @@ xfs_bmbt_verify(
* XXX: need a better way of verifying the owner here. Right now
* just make sure there has been one set.
*/
fa = xfs_btree_lblock_v5hdr_verify(bp, XFS_RMAP_OWN_UNKNOWN);
fa = xfs_btree_fsblock_v5hdr_verify(bp, XFS_RMAP_OWN_UNKNOWN);
if (fa)
return fa;
}
@@ -435,7 +436,7 @@ xfs_bmbt_verify(
if (level > max(mp->m_bm_maxlevels[0], mp->m_bm_maxlevels[1]))
return __this_address;
return xfs_btree_lblock_verify(bp, mp->m_bmap_dmxr[level != 0]);
return xfs_btree_fsblock_verify(bp, mp->m_bmap_dmxr[level != 0]);
}
static void
@@ -444,7 +445,7 @@ xfs_bmbt_read_verify(
{
xfs_failaddr_t fa;
if (!xfs_btree_lblock_verify_crc(bp))
if (!xfs_btree_fsblock_verify_crc(bp))
xfs_verifier_error(bp, -EFSBADCRC, __this_address);
else {
fa = xfs_bmbt_verify(bp);
@@ -468,7 +469,7 @@ xfs_bmbt_write_verify(
xfs_verifier_error(bp, -EFSCORRUPTED, fa);
return;
}
xfs_btree_lblock_calc_crc(bp);
xfs_btree_fsblock_calc_crc(bp);
}
const struct xfs_buf_ops xfs_bmbt_buf_ops = {
@@ -515,9 +516,16 @@ xfs_bmbt_keys_contiguous(
be64_to_cpu(key2->bmbt.br_startoff));
}
static const struct xfs_btree_ops xfs_bmbt_ops = {
const struct xfs_btree_ops xfs_bmbt_ops = {
.name = "bmap",
.type = XFS_BTREE_TYPE_INODE,
.rec_len = sizeof(xfs_bmbt_rec_t),
.key_len = sizeof(xfs_bmbt_key_t),
.ptr_len = XFS_BTREE_LONG_PTR_LEN,
.lru_refs = XFS_BMAP_BTREE_REF,
.statoff = XFS_STATS_CALC_INDEX(xs_bmbt_2),
.dup_cursor = xfs_bmbt_dup_cursor,
.update_cursor = xfs_bmbt_update_cursor,
@@ -529,7 +537,6 @@ static const struct xfs_btree_ops xfs_bmbt_ops = {
.init_key_from_rec = xfs_bmbt_init_key_from_rec,
.init_high_key_from_rec = xfs_bmbt_init_high_key_from_rec,
.init_rec_from_cur = xfs_bmbt_init_rec_from_cur,
.init_ptr_from_cur = xfs_bmbt_init_ptr_from_cur,
.key_diff = xfs_bmbt_key_diff,
.diff_two_keys = xfs_bmbt_diff_two_keys,
.buf_ops = &xfs_bmbt_buf_ops,
@@ -538,35 +545,10 @@ static const struct xfs_btree_ops xfs_bmbt_ops = {
.keys_contiguous = xfs_bmbt_keys_contiguous,
};
static struct xfs_btree_cur *
xfs_bmbt_init_common(
struct xfs_mount *mp,
struct xfs_trans *tp,
struct xfs_inode *ip,
int whichfork)
{
struct xfs_btree_cur *cur;
ASSERT(whichfork != XFS_COW_FORK);
cur = xfs_btree_alloc_cursor(mp, tp, XFS_BTNUM_BMAP,
mp->m_bm_maxlevels[whichfork], xfs_bmbt_cur_cache);
cur->bc_statoff = XFS_STATS_CALC_INDEX(xs_bmbt_2);
cur->bc_ops = &xfs_bmbt_ops;
cur->bc_flags = XFS_BTREE_LONG_PTRS | XFS_BTREE_ROOT_IN_INODE;
if (xfs_has_crc(mp))
cur->bc_flags |= XFS_BTREE_CRC_BLOCKS;
cur->bc_ino.ip = ip;
cur->bc_ino.allocated = 0;
cur->bc_ino.flags = 0;
return cur;
}
/*
* Allocate a new bmap btree cursor.
* Create a new bmap btree cursor.
*
* For staging cursors -1 in passed in whichfork.
*/
struct xfs_btree_cur *
xfs_bmbt_init_cursor(
@@ -575,15 +557,34 @@ xfs_bmbt_init_cursor(
struct xfs_inode *ip,
int whichfork)
{
struct xfs_ifork *ifp = xfs_ifork_ptr(ip, whichfork);
struct xfs_btree_cur *cur;
unsigned int maxlevels;
cur = xfs_bmbt_init_common(mp, tp, ip, whichfork);
ASSERT(whichfork != XFS_COW_FORK);
cur->bc_nlevels = be16_to_cpu(ifp->if_broot->bb_level) + 1;
cur->bc_ino.forksize = xfs_inode_fork_size(ip, whichfork);
/*
* The Data fork always has larger maxlevel, so use that for staging
* cursors.
*/
switch (whichfork) {
case XFS_STAGING_FORK:
maxlevels = mp->m_bm_maxlevels[XFS_DATA_FORK];
break;
default:
maxlevels = mp->m_bm_maxlevels[whichfork];
break;
}
cur = xfs_btree_alloc_cursor(mp, tp, &xfs_bmbt_ops, maxlevels,
xfs_bmbt_cur_cache);
cur->bc_ino.ip = ip;
cur->bc_ino.whichfork = whichfork;
cur->bc_bmap.allocated = 0;
if (whichfork != XFS_STAGING_FORK) {
struct xfs_ifork *ifp = xfs_ifork_ptr(ip, whichfork);
cur->bc_nlevels = be16_to_cpu(ifp->if_broot->bb_level) + 1;
cur->bc_ino.forksize = xfs_inode_fork_size(ip, whichfork);
}
return cur;
}
@@ -598,33 +599,6 @@ xfs_bmbt_block_maxrecs(
return blocklen / (sizeof(xfs_bmbt_key_t) + sizeof(xfs_bmbt_ptr_t));
}
/*
* Allocate a new bmap btree cursor for reloading an inode block mapping data
* structure. Note that callers can use the staged cursor to reload extents
* format inode forks if they rebuild the iext tree and commit the staged
* cursor immediately.
*/
struct xfs_btree_cur *
xfs_bmbt_stage_cursor(
struct xfs_mount *mp,
struct xfs_inode *ip,
struct xbtree_ifakeroot *ifake)
{
struct xfs_btree_cur *cur;
struct xfs_btree_ops *ops;
/* data fork always has larger maxheight */
cur = xfs_bmbt_init_common(mp, NULL, ip, XFS_DATA_FORK);
cur->bc_nlevels = ifake->if_levels;
cur->bc_ino.forksize = ifake->if_fork_size;
/* Don't let anyone think we're attached to the real fork yet. */
cur->bc_ino.whichfork = -1;
xfs_btree_stage_ifakeroot(cur, ifake, &ops);
ops->update_cursor = NULL;
return cur;
}
/*
* Swap in the new inode fork root. Once we pass this point the newly rebuilt
* mappings are in place and we have to kill off any old btree blocks.
@@ -665,7 +639,7 @@ xfs_bmbt_commit_staged_btree(
break;
}
xfs_trans_log_inode(tp, cur->bc_ino.ip, flags);
xfs_btree_commit_ifakeroot(cur, tp, whichfork, &xfs_bmbt_ops);
xfs_btree_commit_ifakeroot(cur, tp, whichfork);
}
/*
@@ -751,7 +725,7 @@ xfs_bmbt_change_owner(
ASSERT(xfs_ifork_ptr(ip, whichfork)->if_format == XFS_DINODE_FMT_BTREE);
cur = xfs_bmbt_init_cursor(ip->i_mount, tp, ip, whichfork);
cur->bc_ino.flags |= XFS_BTCUR_BMBT_INVALID_OWNER;
cur->bc_flags |= XFS_BTREE_BMBT_INVALID_OWNER;
error = xfs_btree_change_owner(cur, new_owner, buffer_list);
xfs_btree_del_cursor(cur, error);

View File

@@ -107,8 +107,6 @@ extern int xfs_bmbt_change_owner(struct xfs_trans *tp, struct xfs_inode *ip,
extern struct xfs_btree_cur *xfs_bmbt_init_cursor(struct xfs_mount *,
struct xfs_trans *, struct xfs_inode *, int);
struct xfs_btree_cur *xfs_bmbt_stage_cursor(struct xfs_mount *mp,
struct xfs_inode *ip, struct xbtree_ifakeroot *ifake);
void xfs_bmbt_commit_staged_btree(struct xfs_btree_cur *cur,
struct xfs_trans *tp, int whichfork);
@@ -120,4 +118,7 @@ unsigned int xfs_bmbt_maxlevels_ondisk(void);
int __init xfs_bmbt_init_cur_cache(void);
void xfs_bmbt_destroy_cur_cache(void);
void xfs_bmbt_init_block(struct xfs_inode *ip, struct xfs_btree_block *buf,
struct xfs_buf *bp, __u16 level, __u16 numrecs);
#endif /* __XFS_BMAP_BTREE_H__ */

File diff suppressed because it is too large Load Diff

View File

@@ -55,15 +55,8 @@ union xfs_btree_rec {
#define XFS_LOOKUP_LE ((xfs_lookup_t)XFS_LOOKUP_LEi)
#define XFS_LOOKUP_GE ((xfs_lookup_t)XFS_LOOKUP_GEi)
#define XFS_BTNUM_BNO ((xfs_btnum_t)XFS_BTNUM_BNOi)
#define XFS_BTNUM_CNT ((xfs_btnum_t)XFS_BTNUM_CNTi)
#define XFS_BTNUM_BMAP ((xfs_btnum_t)XFS_BTNUM_BMAPi)
#define XFS_BTNUM_INO ((xfs_btnum_t)XFS_BTNUM_INOi)
#define XFS_BTNUM_FINO ((xfs_btnum_t)XFS_BTNUM_FINOi)
#define XFS_BTNUM_RMAP ((xfs_btnum_t)XFS_BTNUM_RMAPi)
#define XFS_BTNUM_REFC ((xfs_btnum_t)XFS_BTNUM_REFCi)
uint32_t xfs_btree_magic(int crc, xfs_btnum_t btnum);
struct xfs_btree_ops;
uint32_t xfs_btree_magic(struct xfs_mount *mp, const struct xfs_btree_ops *ops);
/*
* For logging record fields.
@@ -86,9 +79,11 @@ uint32_t xfs_btree_magic(int crc, xfs_btnum_t btnum);
* Generic stats interface
*/
#define XFS_BTREE_STATS_INC(cur, stat) \
XFS_STATS_INC_OFF((cur)->bc_mp, (cur)->bc_statoff + __XBTS_ ## stat)
XFS_STATS_INC_OFF((cur)->bc_mp, \
(cur)->bc_ops->statoff + __XBTS_ ## stat)
#define XFS_BTREE_STATS_ADD(cur, stat, val) \
XFS_STATS_ADD_OFF((cur)->bc_mp, (cur)->bc_statoff + __XBTS_ ## stat, val)
XFS_STATS_ADD_OFF((cur)->bc_mp, \
(cur)->bc_ops->statoff + __XBTS_ ## stat, val)
enum xbtree_key_contig {
XBTREE_KEY_GAP = 0,
@@ -111,10 +106,37 @@ static inline enum xbtree_key_contig xbtree_key_contig(uint64_t x, uint64_t y)
return XBTREE_KEY_OVERLAP;
}
#define XFS_BTREE_LONG_PTR_LEN (sizeof(__be64))
#define XFS_BTREE_SHORT_PTR_LEN (sizeof(__be32))
enum xfs_btree_type {
XFS_BTREE_TYPE_AG,
XFS_BTREE_TYPE_INODE,
XFS_BTREE_TYPE_MEM,
};
struct xfs_btree_ops {
/* size of the key and record structures */
size_t key_len;
size_t rec_len;
const char *name;
/* Type of btree - AG-rooted or inode-rooted */
enum xfs_btree_type type;
/* XFS_BTGEO_* flags that determine the geometry of the btree */
unsigned int geom_flags;
/* size of the key, pointer, and record structures */
size_t key_len;
size_t ptr_len;
size_t rec_len;
/* LRU refcount to set on each btree buffer created */
unsigned int lru_refs;
/* offset of btree stats array */
unsigned int statoff;
/* sick mask for health reporting (only for XFS_BTREE_TYPE_AG) */
unsigned int sick_mask;
/* cursor operations */
struct xfs_btree_cur *(*dup_cursor)(struct xfs_btree_cur *);
@@ -199,6 +221,10 @@ struct xfs_btree_ops {
const union xfs_btree_key *mask);
};
/* btree geometry flags */
#define XFS_BTGEO_LASTREC_UPDATE (1U << 0) /* track last rec externally */
#define XFS_BTGEO_OVERLAPPING (1U << 1) /* overlapping intervals */
/*
* Reasons for the update_lastrec method to be called.
*/
@@ -215,39 +241,6 @@ union xfs_btree_irec {
struct xfs_refcount_irec rc;
};
/* Per-AG btree information. */
struct xfs_btree_cur_ag {
struct xfs_perag *pag;
union {
struct xfs_buf *agbp;
struct xbtree_afakeroot *afake; /* for staging cursor */
};
union {
struct {
unsigned int nr_ops; /* # record updates */
unsigned int shape_changes; /* # of extent splits */
} refc;
struct {
bool active; /* allocation cursor state */
} abt;
};
};
/* Btree-in-inode cursor information */
struct xfs_btree_cur_ino {
struct xfs_inode *ip;
struct xbtree_ifakeroot *ifake; /* for staging cursor */
int allocated;
short forksize;
char whichfork;
char flags;
/* We are converting a delalloc reservation */
#define XFS_BTCUR_BMBT_WASDEL (1 << 0)
/* For extent swap, ignore owner check in verifier */
#define XFS_BTCUR_BMBT_INVALID_OWNER (1 << 1)
};
struct xfs_btree_level {
/* buffer pointer */
struct xfs_buf *bp;
@@ -272,21 +265,38 @@ struct xfs_btree_cur
const struct xfs_btree_ops *bc_ops;
struct kmem_cache *bc_cache; /* cursor cache */
unsigned int bc_flags; /* btree features - below */
xfs_btnum_t bc_btnum; /* identifies which btree type */
union xfs_btree_irec bc_rec; /* current insert/search record value */
uint8_t bc_nlevels; /* number of levels in the tree */
uint8_t bc_maxlevels; /* maximum levels for this btree type */
int bc_statoff; /* offset of btree stats array */
/*
* Short btree pointers need an agno to be able to turn the pointers
* into physical addresses for IO, so the btree cursor switches between
* bc_ino and bc_ag based on whether XFS_BTREE_LONG_PTRS is set for the
* cursor.
*/
/* per-type information */
union {
struct xfs_btree_cur_ag bc_ag;
struct xfs_btree_cur_ino bc_ino;
struct {
struct xfs_inode *ip;
short forksize;
char whichfork;
struct xbtree_ifakeroot *ifake; /* for staging cursor */
} bc_ino;
struct {
struct xfs_perag *pag;
struct xfs_buf *agbp;
struct xbtree_afakeroot *afake; /* for staging cursor */
} bc_ag;
struct {
struct xfbtree *xfbtree;
struct xfs_perag *pag;
} bc_mem;
};
/* per-format private data */
union {
struct {
int allocated;
} bc_bmap; /* bmapbt */
struct {
unsigned int nr_ops; /* # record updates */
unsigned int shape_changes; /* # of extent splits */
} bc_refc; /* refcountbt */
};
/* Must be at the end of the struct! */
@@ -304,18 +314,22 @@ xfs_btree_cur_sizeof(unsigned int nlevels)
return struct_size_t(struct xfs_btree_cur, bc_levels, nlevels);
}
/* cursor flags */
#define XFS_BTREE_LONG_PTRS (1<<0) /* pointers are 64bits long */
#define XFS_BTREE_ROOT_IN_INODE (1<<1) /* root may be variable size */
#define XFS_BTREE_LASTREC_UPDATE (1<<2) /* track last rec externally */
#define XFS_BTREE_CRC_BLOCKS (1<<3) /* uses extended btree blocks */
#define XFS_BTREE_OVERLAPPING (1<<4) /* overlapping intervals */
/* cursor state flags */
/*
* The root of this btree is a fakeroot structure so that we can stage a btree
* rebuild without leaving it accessible via primary metadata. The ops struct
* is dynamically allocated and must be freed when the cursor is deleted.
*/
#define XFS_BTREE_STAGING (1<<5)
#define XFS_BTREE_STAGING (1U << 0)
/* We are converting a delalloc reservation (only for bmbt btrees) */
#define XFS_BTREE_BMBT_WASDEL (1U << 1)
/* For extent swap, ignore owner check in verifier (only for bmbt btrees) */
#define XFS_BTREE_BMBT_INVALID_OWNER (1U << 2)
/* Cursor is active (only for allocbt btrees) */
#define XFS_BTREE_ALLOCBT_ACTIVE (1U << 3)
#define XFS_BTREE_NOERROR 0
#define XFS_BTREE_ERROR 1
@@ -325,14 +339,10 @@ xfs_btree_cur_sizeof(unsigned int nlevels)
*/
#define XFS_BUF_TO_BLOCK(bp) ((struct xfs_btree_block *)((bp)->b_addr))
/*
* Internal long and short btree block checks. They return NULL if the
* block is ok or the address of the failed check otherwise.
*/
xfs_failaddr_t __xfs_btree_check_lblock(struct xfs_btree_cur *cur,
struct xfs_btree_block *block, int level, struct xfs_buf *bp);
xfs_failaddr_t __xfs_btree_check_sblock(struct xfs_btree_cur *cur,
xfs_failaddr_t __xfs_btree_check_block(struct xfs_btree_cur *cur,
struct xfs_btree_block *block, int level, struct xfs_buf *bp);
int __xfs_btree_check_ptr(struct xfs_btree_cur *cur,
const union xfs_btree_ptr *ptr, int index, int level);
/*
* Check that block header is ok.
@@ -344,24 +354,6 @@ xfs_btree_check_block(
int level, /* level of the btree block */
struct xfs_buf *bp); /* buffer containing block, if any */
/*
* Check that (long) pointer is ok.
*/
bool /* error (0 or EFSCORRUPTED) */
xfs_btree_check_lptr(
struct xfs_btree_cur *cur, /* btree cursor */
xfs_fsblock_t fsbno, /* btree block disk address */
int level); /* btree block level */
/*
* Check that (short) pointer is ok.
*/
bool /* error (0 or EFSCORRUPTED) */
xfs_btree_check_sptr(
struct xfs_btree_cur *cur, /* btree cursor */
xfs_agblock_t agbno, /* btree block disk address */
int level); /* btree block level */
/*
* Delete the btree cursor.
*/
@@ -391,64 +383,15 @@ xfs_btree_offsets(
int *first, /* output: first byte offset */
int *last); /* output: last byte offset */
/*
* Get a buffer for the block, return it read in.
* Long-form addressing.
*/
int /* error */
xfs_btree_read_bufl(
struct xfs_mount *mp, /* file system mount point */
struct xfs_trans *tp, /* transaction pointer */
xfs_fsblock_t fsbno, /* file system block number */
struct xfs_buf **bpp, /* buffer for fsbno */
int refval, /* ref count value for buffer */
const struct xfs_buf_ops *ops);
/*
* Read-ahead the block, don't wait for it, don't return a buffer.
* Long-form addressing.
*/
void /* error */
xfs_btree_reada_bufl(
struct xfs_mount *mp, /* file system mount point */
xfs_fsblock_t fsbno, /* file system block number */
xfs_extlen_t count, /* count of filesystem blocks */
const struct xfs_buf_ops *ops);
/*
* Read-ahead the block, don't wait for it, don't return a buffer.
* Short-form addressing.
*/
void /* error */
xfs_btree_reada_bufs(
struct xfs_mount *mp, /* file system mount point */
xfs_agnumber_t agno, /* allocation group number */
xfs_agblock_t agbno, /* allocation group block number */
xfs_extlen_t count, /* count of filesystem blocks */
const struct xfs_buf_ops *ops);
/*
* Initialise a new btree block header
*/
void
xfs_btree_init_block(
struct xfs_mount *mp,
struct xfs_buf *bp,
xfs_btnum_t btnum,
__u16 level,
__u16 numrecs,
__u64 owner);
void
xfs_btree_init_block_int(
struct xfs_mount *mp,
struct xfs_btree_block *buf,
xfs_daddr_t blkno,
xfs_btnum_t btnum,
__u16 level,
__u16 numrecs,
__u64 owner,
unsigned int flags);
void xfs_btree_init_buf(struct xfs_mount *mp, struct xfs_buf *bp,
const struct xfs_btree_ops *ops, __u16 level, __u16 numrecs,
__u64 owner);
void xfs_btree_init_block(struct xfs_mount *mp,
struct xfs_btree_block *buf, const struct xfs_btree_ops *ops,
__u16 level, __u16 numrecs, __u64 owner);
/*
* Common btree core entry points.
@@ -467,10 +410,10 @@ int xfs_btree_change_owner(struct xfs_btree_cur *cur, uint64_t new_owner,
/*
* btree block CRC helpers
*/
void xfs_btree_lblock_calc_crc(struct xfs_buf *);
bool xfs_btree_lblock_verify_crc(struct xfs_buf *);
void xfs_btree_sblock_calc_crc(struct xfs_buf *);
bool xfs_btree_sblock_verify_crc(struct xfs_buf *);
void xfs_btree_fsblock_calc_crc(struct xfs_buf *);
bool xfs_btree_fsblock_verify_crc(struct xfs_buf *);
void xfs_btree_agblock_calc_crc(struct xfs_buf *);
bool xfs_btree_agblock_verify_crc(struct xfs_buf *);
/*
* Internal btree helpers also used by xfs_bmap.c.
@@ -510,12 +453,14 @@ static inline int xfs_btree_get_level(const struct xfs_btree_block *block)
#define XFS_FILBLKS_MIN(a,b) min_t(xfs_filblks_t, (a), (b))
#define XFS_FILBLKS_MAX(a,b) max_t(xfs_filblks_t, (a), (b))
xfs_failaddr_t xfs_btree_sblock_v5hdr_verify(struct xfs_buf *bp);
xfs_failaddr_t xfs_btree_sblock_verify(struct xfs_buf *bp,
xfs_failaddr_t xfs_btree_agblock_v5hdr_verify(struct xfs_buf *bp);
xfs_failaddr_t xfs_btree_agblock_verify(struct xfs_buf *bp,
unsigned int max_recs);
xfs_failaddr_t xfs_btree_lblock_v5hdr_verify(struct xfs_buf *bp,
xfs_failaddr_t xfs_btree_fsblock_v5hdr_verify(struct xfs_buf *bp,
uint64_t owner);
xfs_failaddr_t xfs_btree_lblock_verify(struct xfs_buf *bp,
xfs_failaddr_t xfs_btree_fsblock_verify(struct xfs_buf *bp,
unsigned int max_recs);
xfs_failaddr_t xfs_btree_memblock_verify(struct xfs_buf *bp,
unsigned int max_recs);
unsigned int xfs_btree_compute_maxlevels(const unsigned int *limits,
@@ -690,7 +635,7 @@ xfs_btree_islastblock(
block = xfs_btree_get_block(cur, level, &bp);
if (cur->bc_flags & XFS_BTREE_LONG_PTRS)
if (cur->bc_ops->ptr_len == XFS_BTREE_LONG_PTR_LEN)
return block->bb_u.l.bb_rightsib == cpu_to_be64(NULLFSBLOCK);
return block->bb_u.s.bb_rightsib == cpu_to_be32(NULLAGBLOCK);
}
@@ -714,21 +659,28 @@ void xfs_btree_copy_ptrs(struct xfs_btree_cur *cur,
void xfs_btree_copy_keys(struct xfs_btree_cur *cur,
union xfs_btree_key *dst_key,
const union xfs_btree_key *src_key, int numkeys);
void xfs_btree_init_ptr_from_cur(struct xfs_btree_cur *cur,
union xfs_btree_ptr *ptr);
static inline struct xfs_btree_cur *
xfs_btree_alloc_cursor(
struct xfs_mount *mp,
struct xfs_trans *tp,
xfs_btnum_t btnum,
const struct xfs_btree_ops *ops,
uint8_t maxlevels,
struct kmem_cache *cache)
{
struct xfs_btree_cur *cur;
cur = kmem_cache_zalloc(cache, GFP_NOFS | __GFP_NOFAIL);
ASSERT(ops->ptr_len == XFS_BTREE_LONG_PTR_LEN ||
ops->ptr_len == XFS_BTREE_SHORT_PTR_LEN);
/* BMBT allocations can come through from non-transactional context. */
cur = kmem_cache_zalloc(cache,
GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL);
cur->bc_ops = ops;
cur->bc_tp = tp;
cur->bc_mp = mp;
cur->bc_btnum = btnum;
cur->bc_maxlevels = maxlevels;
cur->bc_cache = cache;
@@ -740,4 +692,14 @@ void xfs_btree_destroy_cur_caches(void);
int xfs_btree_goto_left_edge(struct xfs_btree_cur *cur);
/* Does this level of the cursor point to the inode root (and not a block)? */
static inline bool
xfs_btree_at_iroot(
const struct xfs_btree_cur *cur,
int level)
{
return cur->bc_ops->type == XFS_BTREE_TYPE_INODE &&
level == cur->bc_nlevels - 1;
}
#endif /* __XFS_BTREE_H__ */

View File

@@ -0,0 +1,347 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2021-2024 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <djwong@kernel.org>
*/
#include "xfs.h"
#include "xfs_fs.h"
#include "xfs_shared.h"
#include "xfs_format.h"
#include "xfs_log_format.h"
#include "xfs_trans_resv.h"
#include "xfs_mount.h"
#include "xfs_trans.h"
#include "xfs_btree.h"
#include "xfs_error.h"
#include "xfs_buf_mem.h"
#include "xfs_btree_mem.h"
#include "xfs_ag.h"
#include "xfs_buf_item.h"
#include "xfs_trace.h"
/* Set the root of an in-memory btree. */
void
xfbtree_set_root(
struct xfs_btree_cur *cur,
const union xfs_btree_ptr *ptr,
int inc)
{
ASSERT(cur->bc_ops->type == XFS_BTREE_TYPE_MEM);
cur->bc_mem.xfbtree->root = *ptr;
cur->bc_mem.xfbtree->nlevels += inc;
}
/* Initialize a pointer from the in-memory btree header. */
void
xfbtree_init_ptr_from_cur(
struct xfs_btree_cur *cur,
union xfs_btree_ptr *ptr)
{
ASSERT(cur->bc_ops->type == XFS_BTREE_TYPE_MEM);
*ptr = cur->bc_mem.xfbtree->root;
}
/* Duplicate an in-memory btree cursor. */
struct xfs_btree_cur *
xfbtree_dup_cursor(
struct xfs_btree_cur *cur)
{
struct xfs_btree_cur *ncur;
ASSERT(cur->bc_ops->type == XFS_BTREE_TYPE_MEM);
ncur = xfs_btree_alloc_cursor(cur->bc_mp, cur->bc_tp, cur->bc_ops,
cur->bc_maxlevels, cur->bc_cache);
ncur->bc_flags = cur->bc_flags;
ncur->bc_nlevels = cur->bc_nlevels;
ncur->bc_mem.xfbtree = cur->bc_mem.xfbtree;
if (cur->bc_mem.pag)
ncur->bc_mem.pag = xfs_perag_hold(cur->bc_mem.pag);
return ncur;
}
/* Close the btree xfile and release all resources. */
void
xfbtree_destroy(
struct xfbtree *xfbt)
{
xfs_buftarg_drain(xfbt->target);
}
/* Compute the number of bytes available for records. */
static inline unsigned int
xfbtree_rec_bytes(
struct xfs_mount *mp,
const struct xfs_btree_ops *ops)
{
return XMBUF_BLOCKSIZE - XFS_BTREE_LBLOCK_CRC_LEN;
}
/* Initialize an empty leaf block as the btree root. */
STATIC int
xfbtree_init_leaf_block(
struct xfs_mount *mp,
struct xfbtree *xfbt,
const struct xfs_btree_ops *ops)
{
struct xfs_buf *bp;
xfbno_t bno = xfbt->highest_bno++;
int error;
error = xfs_buf_get(xfbt->target, xfbno_to_daddr(bno), XFBNO_BBSIZE,
&bp);
if (error)
return error;
trace_xfbtree_create_root_buf(xfbt, bp);
bp->b_ops = ops->buf_ops;
xfs_btree_init_buf(mp, bp, ops, 0, 0, xfbt->owner);
xfs_buf_relse(bp);
xfbt->root.l = cpu_to_be64(bno);
return 0;
}
/*
* Create an in-memory btree root that can be used with the given xmbuf.
* Callers must set xfbt->owner.
*/
int
xfbtree_init(
struct xfs_mount *mp,
struct xfbtree *xfbt,
struct xfs_buftarg *btp,
const struct xfs_btree_ops *ops)
{
unsigned int blocklen = xfbtree_rec_bytes(mp, ops);
unsigned int keyptr_len;
int error;
/* Requires a long-format CRC-format btree */
if (!xfs_has_crc(mp)) {
ASSERT(xfs_has_crc(mp));
return -EINVAL;
}
if (ops->ptr_len != XFS_BTREE_LONG_PTR_LEN) {
ASSERT(ops->ptr_len == XFS_BTREE_LONG_PTR_LEN);
return -EINVAL;
}
memset(xfbt, 0, sizeof(*xfbt));
xfbt->target = btp;
/* Set up min/maxrecs for this btree. */
keyptr_len = ops->key_len + sizeof(__be64);
xfbt->maxrecs[0] = blocklen / ops->rec_len;
xfbt->maxrecs[1] = blocklen / keyptr_len;
xfbt->minrecs[0] = xfbt->maxrecs[0] / 2;
xfbt->minrecs[1] = xfbt->maxrecs[1] / 2;
xfbt->highest_bno = 0;
xfbt->nlevels = 1;
/* Initialize the empty btree. */
error = xfbtree_init_leaf_block(mp, xfbt, ops);
if (error)
goto err_freesp;
trace_xfbtree_init(mp, xfbt, ops);
return 0;
err_freesp:
xfs_buftarg_drain(xfbt->target);
return error;
}
/* Allocate a block to our in-memory btree. */
int
xfbtree_alloc_block(
struct xfs_btree_cur *cur,
const union xfs_btree_ptr *start,
union xfs_btree_ptr *new,
int *stat)
{
struct xfbtree *xfbt = cur->bc_mem.xfbtree;
xfbno_t bno = xfbt->highest_bno++;
ASSERT(cur->bc_ops->type == XFS_BTREE_TYPE_MEM);
trace_xfbtree_alloc_block(xfbt, cur, bno);
/* Fail if the block address exceeds the maximum for the buftarg. */
if (!xfbtree_verify_bno(xfbt, bno)) {
ASSERT(xfbtree_verify_bno(xfbt, bno));
*stat = 0;
return 0;
}
new->l = cpu_to_be64(bno);
*stat = 1;
return 0;
}
/* Free a block from our in-memory btree. */
int
xfbtree_free_block(
struct xfs_btree_cur *cur,
struct xfs_buf *bp)
{
struct xfbtree *xfbt = cur->bc_mem.xfbtree;
xfs_daddr_t daddr = xfs_buf_daddr(bp);
xfbno_t bno = xfs_daddr_to_xfbno(daddr);
ASSERT(cur->bc_ops->type == XFS_BTREE_TYPE_MEM);
trace_xfbtree_free_block(xfbt, cur, bno);
if (bno + 1 == xfbt->highest_bno)
xfbt->highest_bno--;
return 0;
}
/* Return the minimum number of records for a btree block. */
int
xfbtree_get_minrecs(
struct xfs_btree_cur *cur,
int level)
{
struct xfbtree *xfbt = cur->bc_mem.xfbtree;
return xfbt->minrecs[level != 0];
}
/* Return the maximum number of records for a btree block. */
int
xfbtree_get_maxrecs(
struct xfs_btree_cur *cur,
int level)
{
struct xfbtree *xfbt = cur->bc_mem.xfbtree;
return xfbt->maxrecs[level != 0];
}
/* If this log item is a buffer item that came from the xfbtree, return it. */
static inline struct xfs_buf *
xfbtree_buf_match(
struct xfbtree *xfbt,
const struct xfs_log_item *lip)
{
const struct xfs_buf_log_item *bli;
struct xfs_buf *bp;
if (lip->li_type != XFS_LI_BUF)
return NULL;
bli = container_of(lip, struct xfs_buf_log_item, bli_item);
bp = bli->bli_buf;
if (bp->b_target != xfbt->target)
return NULL;
return bp;
}
/*
* Commit changes to the incore btree immediately by writing all dirty xfbtree
* buffers to the backing xfile. This detaches all xfbtree buffers from the
* transaction, even on failure. The buffer locks are dropped between the
* delwri queue and submit, so the caller must synchronize btree access.
*
* Normally we'd let the buffers commit with the transaction and get written to
* the xfile via the log, but online repair stages ephemeral btrees in memory
* and uses the btree_staging functions to write new btrees to disk atomically.
* The in-memory btree (and its backing store) are discarded at the end of the
* repair phase, which means that xfbtree buffers cannot commit with the rest
* of a transaction.
*
* In other words, online repair only needs the transaction to collect buffer
* pointers and to avoid buffer deadlocks, not to guarantee consistency of
* updates.
*/
int
xfbtree_trans_commit(
struct xfbtree *xfbt,
struct xfs_trans *tp)
{
struct xfs_log_item *lip, *n;
bool tp_dirty = false;
int error = 0;
/*
* For each xfbtree buffer attached to the transaction, write the dirty
* buffers to the xfile and release them.
*/
list_for_each_entry_safe(lip, n, &tp->t_items, li_trans) {
struct xfs_buf *bp = xfbtree_buf_match(xfbt, lip);
if (!bp) {
if (test_bit(XFS_LI_DIRTY, &lip->li_flags))
tp_dirty |= true;
continue;
}
trace_xfbtree_trans_commit_buf(xfbt, bp);
xmbuf_trans_bdetach(tp, bp);
/*
* If the buffer fails verification, note the failure but
* continue walking the transaction items so that we remove all
* ephemeral btree buffers.
*/
if (!error)
error = xmbuf_finalize(bp);
xfs_buf_relse(bp);
}
/*
* Reset the transaction's dirty flag to reflect the dirty state of the
* log items that are still attached.
*/
tp->t_flags = (tp->t_flags & ~XFS_TRANS_DIRTY) |
(tp_dirty ? XFS_TRANS_DIRTY : 0);
return error;
}
/*
* Cancel changes to the incore btree by detaching all the xfbtree buffers.
* Changes are not undone, so callers must not access the btree ever again.
*/
void
xfbtree_trans_cancel(
struct xfbtree *xfbt,
struct xfs_trans *tp)
{
struct xfs_log_item *lip, *n;
bool tp_dirty = false;
list_for_each_entry_safe(lip, n, &tp->t_items, li_trans) {
struct xfs_buf *bp = xfbtree_buf_match(xfbt, lip);
if (!bp) {
if (test_bit(XFS_LI_DIRTY, &lip->li_flags))
tp_dirty |= true;
continue;
}
trace_xfbtree_trans_cancel_buf(xfbt, bp);
xmbuf_trans_bdetach(tp, bp);
xfs_buf_relse(bp);
}
/*
* Reset the transaction's dirty flag to reflect the dirty state of the
* log items that are still attached.
*/
tp->t_flags = (tp->t_flags & ~XFS_TRANS_DIRTY) |
(tp_dirty ? XFS_TRANS_DIRTY : 0);
}

Some files were not shown because too many files have changed in this diff Show More