mirror of
https://github.com/linux-apfs/linux-apfs.git
synced 2026-05-01 15:00:59 -07:00
Merge tag 'xfs-reflink-for-linus-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs
 __________________________________
< XFS has gained super CoW powers! >
 ----------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
Pull XFS support for shared data extents from Dave Chinner:
"This is the second part of the XFS updates for this merge cycle. This
pullreq contains the new shared data extents feature for XFS.
Given the complexity and size of this change I am expecting - like the
addition of reverse mapping last cycle - that there will be some
follow-up bug fixes and cleanups around the -rc3 stage for issues that
I'm sure will show up once the code hits a wider userbase.
What it is:
At the most basic level we are simply adding shared data extents to
XFS - i.e. a single extent on disk can now have multiple owners. To do
this we have to add new on-disk features to both track the shared
extents and the number of times they've been shared. This is done by
the new "refcount" btree that sits in every allocation group. When we
share or unshare an extent, this tree gets updated.
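
To make the refcount idea concrete, here is a minimal in-memory sketch in C. It is not the XFS btree code: the names (refc_table, refc_share, refc_unshare, refc_owners) are hypothetical, a flat array stands in for the per-AG btree, and extents are assumed never to split. It only illustrates that a record exists while an extent has more than one owner.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory stand-in for one refcount record. */
struct refc_rec {
	uint32_t startblock;	/* first block of the extent */
	uint32_t blockcount;	/* length in blocks */
	uint32_t refcount;	/* number of owners; only > 1 is stored */
};

#define REFC_MAX 16

struct refc_table {
	struct refc_rec recs[REFC_MAX];
	size_t nr;
};

struct refc_rec *refc_find(struct refc_table *t, uint32_t start)
{
	for (size_t i = 0; i < t->nr; i++)
		if (t->recs[i].startblock == start)
			return &t->recs[i];
	return NULL;
}

/* Owners of the extent starting at 'start'; untracked extents have one. */
uint32_t refc_owners(struct refc_table *t, uint32_t start)
{
	struct refc_rec *r = refc_find(t, start);

	return r ? r->refcount : 1;
}

/* Add an owner. Returns 0 on success, -1 if the table is full. */
int refc_share(struct refc_table *t, uint32_t start, uint32_t len)
{
	struct refc_rec *r = refc_find(t, start);

	if (r) {
		assert(r->blockcount == len);	/* sketch: no extent splitting */
		r->refcount++;
		return 0;
	}
	if (t->nr == REFC_MAX)
		return -1;
	t->recs[t->nr++] = (struct refc_rec){ start, len, 2 };
	return 0;
}

/* Drop an owner; the record disappears when only one owner remains. */
void refc_unshare(struct refc_table *t, uint32_t start)
{
	struct refc_rec *r = refc_find(t, start);

	assert(r != NULL);	/* only shared extents may be unshared */
	if (--r->refcount == 1)
		*r = t->recs[--t->nr];	/* overwrite with the last record */
}
```

Sharing an untracked extent creates a record with a count of 2, and the record is dropped again once a single owner is left, which mirrors the rule that single-owner extents are not tracked.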
Along with this new tree, the reverse mapping tree needs to be updated
to track each owner of a shared extent. This also needs to be updated
on every share/unshare operation. These interactions at extent allocation
and freeing time have complex ordering and recovery constraints, so
there's a significant amount of new intent-based transaction code to
ensure that operations are performed atomically from both the runtime
and integrity/crash recovery perspectives.
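
As a rough illustration of the intent-based approach, the sketch below models a log as an array of "intent" and "done" entries and finds the intents that recovery would still have to replay. This is a simplified model under stated assumptions, not the XFS log item code: log_entry, log_kind and log_find_unfinished are hypothetical names, and real log items carry far more state than an id.

```c
#include <assert.h>
#include <stddef.h>

enum log_kind { LOG_INTENT, LOG_DONE };

struct log_entry {
	enum log_kind kind;
	unsigned int id;	/* pairs an intent with its done record */
};

/*
 * Collect the ids of intents that have no matching done record.
 * Returns how many ids were written to 'out' (at most 'max').
 */
size_t log_find_unfinished(const struct log_entry *log, size_t n,
			   unsigned int *out, size_t max)
{
	size_t found = 0;

	assert(log != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		int done = 0;

		if (log[i].kind != LOG_INTENT)
			continue;
		/* A done record can only follow its intent in the log. */
		for (size_t j = i + 1; j < n; j++) {
			if (log[j].kind == LOG_DONE && log[j].id == log[i].id) {
				done = 1;
				break;
			}
		}
		if (!done && found < max)
			out[found++] = log[i].id;
	}
	return found;
}
```

An operation whose intent reached the log but whose done record did not is the one a crash interrupted; replaying exactly those entries is what makes a multi-step update appear atomic after recovery.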
We also need to break sharing when writes hit a shared extent - this
is where the new copy-on-write implementation comes in. We allocate
new storage and copy the original data along with the overwrite data
into the new location. We only do this for data as we don't share
metadata at all - each inode has its own metadata that tracks the
shared data extents, the extents undergoing CoW and its own private
extents.
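
The copy step described above can be sketched in a few lines of C. This is an illustration only: cow_write and BLOCK_SIZE are hypothetical, the "block" is a small heap buffer, and nothing here reflects how XFS allocates or maps the new storage.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 16	/* tiny block size, for illustration only */

/*
 * Return a newly allocated block holding the original contents with
 * 'len' bytes from 'data' applied at offset 'off'. The shared original
 * is never modified. The caller frees the result; NULL on failure.
 */
unsigned char *cow_write(const unsigned char *shared, size_t off,
			 const void *data, size_t len)
{
	unsigned char *copy;

	assert(off <= BLOCK_SIZE && len <= BLOCK_SIZE - off);

	copy = malloc(BLOCK_SIZE);
	if (!copy)
		return NULL;
	memcpy(copy, shared, BLOCK_SIZE);	/* copy the original data */
	memcpy(copy + off, data, len);		/* apply the overwrite */
	return copy;
}
```

The writer ends up with a private block containing both the old bytes and the new ones, while every other owner of the shared block still sees the unmodified data.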
Of course, being XFS, nothing is simple - we use delayed allocation
for CoW similar to how we use it for normal writes. ENOSPC is a
significant issue here - we build on the reservation code added in
4.8-rc1 with the reverse mapping feature to ensure we don't get
spurious ENOSPC issues part way through a CoW operation. These
mechanisms also help minimise fragmentation due to repeated CoW
operations. To further reduce fragmentation overhead, we've also
introduced a CoW extent size hint, which indicates how large a region
we should allocate when we execute a CoW operation.
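
One plausible way to read the extent size hint is as an alignment unit: a small CoW write is widened to whole multiples of the hint so that neighbouring writes land in the same allocation. The arithmetic below is a sketch under that assumption; cow_align and blk_range are hypothetical names, and the value 32 in the usage note is taken from the "default CoW extent size of 32 blocks" commit in the shortlog.

```c
#include <assert.h>
#include <stdint.h>

struct blk_range {
	uint64_t start;	/* first block */
	uint64_t len;	/* number of blocks */
};

/* Widen [start, start + len) outward to multiples of 'hint' blocks. */
struct blk_range cow_align(uint64_t start, uint64_t len, uint32_t hint)
{
	struct blk_range r;
	uint64_t end;

	assert(hint > 0 && len > 0);

	end = start + len;
	r.start = start - (start % hint);		/* round start down */
	end = ((end + hint - 1) / hint) * hint;		/* round end up */
	r.len = end - r.start;
	return r;
}
```

With a hint of 32 blocks, a 3-block write at block 70 becomes a 32-block allocation starting at block 64, and a 5-block write at block 30 that straddles a boundary becomes a 64-block allocation starting at block 0.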
With all this functionality in place, we can hook up .copy_file_range,
.clone_file_range and .dedupe_file_range and we gain all the
capabilities of reflink and other vfs provided functionality that
enables manipulation of shared extents. We also added a fallocate mode
that explicitly unshares a range of a file, which we implemented as an
explicit CoW of all the shared extents in a file.
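
From userspace, these hooks are reached through existing system calls. The sketch below wraps the two most direct ones: the FICLONE ioctl, which asks the filesystem to share one file's extents with another, and fallocate() with FALLOC_FL_UNSHARE_RANGE. The wrapper names reflink_fd and unshare_range are hypothetical, and the fallback #define is only there for older headers; whether either call succeeds depends on the filesystem the files live on.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <linux/fs.h>		/* FICLONE */

#ifndef FALLOC_FL_UNSHARE_RANGE
#define FALLOC_FL_UNSHARE_RANGE 0x40	/* value from linux/falloc.h */
#endif

/* Make dst_fd share all of src_fd's extents. Returns 0 or -errno. */
int reflink_fd(int src_fd, int dst_fd)
{
	if (ioctl(dst_fd, FICLONE, src_fd) < 0)
		return -errno;
	return 0;
}

/*
 * Force private copies of any shared extents in [off, off + len).
 * Returns 0 or -errno.
 */
int unshare_range(int fd, off_t off, off_t len)
{
	assert(len > 0);	/* fallocate() rejects empty ranges */

	if (fallocate(fd, FALLOC_FL_UNSHARE_RANGE, off, len) < 0)
		return -errno;
	return 0;
}
```

A caller would typically open both files, call reflink_fd(src, dst), and later call unshare_range(dst, 0, size) when it wants dst to stop sharing blocks. On filesystems without reflink support both wrappers simply return a negative errno value.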
As such, it's a huge chunk of new functionality with new on-disk
format features and internal infrastructure. It warns at mount time as
an experimental feature and that it may eat data (as we do with all
new on-disk features until they stabilise). We have not released
userspace support for it yet - userspace support currently requires
download from Darrick's xfsprogs repo and build from source, so the
access to this feature is really developer/tester only at this point.
Initial userspace support will be released at the same time the kernel
with this code in it is released.
The new code causes 5-6 new failures with xfstests - these aren't
serious functional failures but things like the output of tests changing
slightly due to perturbations in layouts, space usage, etc. OTOH,
we've added 150+ new tests to xfstests that specifically exercise this
new functionality so it's got far better test coverage than any
functionality we've previously added to XFS.
Darrick has done a pretty amazing job getting us to this stage, and
special mention also needs to go to Christoph (review, testing,
improvements and bug fixes) and Brian (caught several intricate bugs
during review) for the effort they've also put in.
Summary:
- unshare range (FALLOC_FL_UNSHARE) support for fallocate
- copy-on-write extent size hints (FS_XFLAG_COWEXTSIZE) for fsxattr
interface
- shared extent support for XFS
- copy-on-write support for shared extents
- copy_file_range support
- clone_file_range support (implements reflink)
- dedupe_file_range support
- defrag support for reverse mapping enabled filesystems"
* tag 'xfs-reflink-for-linus-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (71 commits)
xfs: convert COW blocks to real blocks before unwritten extent conversion
xfs: rework refcount cow recovery error handling
xfs: clear reflink flag if setting realtime flag
xfs: fix error initialization
xfs: fix label inaccuracies
xfs: remove isize check from unshare operation
xfs: reduce stack usage of _reflink_clear_inode_flag
xfs: check inode reflink flag before calling reflink functions
xfs: implement swapext for rmap filesystems
xfs: refactor swapext code
xfs: various swapext cleanups
xfs: recognize the reflink feature bit
xfs: simulate per-AG reservations being critically low
xfs: don't mix reflink and DAX mode for now
xfs: check for invalid inode reflink flags
xfs: set a default CoW extent size of 32 blocks
xfs: convert unwritten status of reverse mappings for shared files
xfs: use interval query for rmap alloc operations on shared files
xfs: add shared rmap map/unmap/convert log item types
xfs: increase log reservations for reflink
...
@@ -267,6 +267,11 @@ int vfs_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 	    (mode & ~FALLOC_FL_INSERT_RANGE))
 		return -EINVAL;
 
+	/* Unshare range should only be used with allocate mode. */
+	if ((mode & FALLOC_FL_UNSHARE_RANGE) &&
+	    (mode & ~(FALLOC_FL_UNSHARE_RANGE | FALLOC_FL_KEEP_SIZE)))
+		return -EINVAL;
+
 	if (!(file->f_mode & FMODE_WRITE))
 		return -EBADF;
 
@@ -55,6 +55,8 @@ xfs-y += $(addprefix libxfs/, \
 				   xfs_ag_resv.o \
 				   xfs_rmap.o \
 				   xfs_rmap_btree.o \
+				   xfs_refcount.o \
+				   xfs_refcount_btree.o \
 				   xfs_sb.o \
 				   xfs_symlink_remote.o \
 				   xfs_trans_resv.o \
@@ -88,6 +90,7 @@ xfs-y += xfs_aops.o \
 				   xfs_message.o \
 				   xfs_mount.o \
 				   xfs_mru_cache.o \
+				   xfs_reflink.o \
 				   xfs_stats.o \
 				   xfs_super.o \
 				   xfs_symlink.o \
@@ -100,16 +103,20 @@ xfs-y += xfs_aops.o \
 # low-level transaction/log code
 xfs-y += xfs_log.o \
 				   xfs_log_cil.o \
+				   xfs_bmap_item.o \
 				   xfs_buf_item.o \
 				   xfs_extfree_item.o \
 				   xfs_icreate_item.o \
 				   xfs_inode_item.o \
+				   xfs_refcount_item.o \
 				   xfs_rmap_item.o \
 				   xfs_log_recover.o \
 				   xfs_trans_ail.o \
+				   xfs_trans_bmap.o \
 				   xfs_trans_buf.o \
 				   xfs_trans_extfree.o \
 				   xfs_trans_inode.o \
+				   xfs_trans_refcount.o \
 				   xfs_trans_rmap.o \
 
 # optional features
@@ -38,6 +38,7 @@
 #include "xfs_trans_space.h"
 #include "xfs_rmap_btree.h"
 #include "xfs_btree.h"
+#include "xfs_refcount_btree.h"
 
 /*
  * Per-AG Block Reservations
@@ -108,7 +109,9 @@ xfs_ag_resv_critical(
 	trace_xfs_ag_resv_critical(pag, type, avail);
 
 	/* Critically low if less than 10% or max btree height remains. */
-	return avail < orig / 10 || avail < XFS_BTREE_MAXLEVELS;
+	return XFS_TEST_ERROR(avail < orig / 10 || avail < XFS_BTREE_MAXLEVELS,
+			pag->pag_mount, XFS_ERRTAG_AG_RESV_CRITICAL,
+			XFS_RANDOM_AG_RESV_CRITICAL);
 }
 
 /*
@@ -228,6 +231,11 @@ xfs_ag_resv_init(
 	if (pag->pag_meta_resv.ar_asked == 0) {
 		ask = used = 0;
 
+		error = xfs_refcountbt_calc_reserves(pag->pag_mount,
+				pag->pag_agno, &ask, &used);
+		if (error)
+			goto out;
+
 		error = __xfs_ag_resv_init(pag, XFS_AG_RESV_METADATA,
 				ask, used);
 		if (error)
@@ -238,6 +246,11 @@ xfs_ag_resv_init(
 	if (pag->pag_agfl_resv.ar_asked == 0) {
 		ask = used = 0;
 
+		error = xfs_rmapbt_calc_reserves(pag->pag_mount, pag->pag_agno,
+				&ask, &used);
+		if (error)
+			goto out;
+
 		error = __xfs_ag_resv_init(pag, XFS_AG_RESV_AGFL, ask, used);
 		if (error)
 			goto out;
@@ -52,10 +52,23 @@ STATIC int xfs_alloc_ag_vextent_size(xfs_alloc_arg_t *);
 STATIC int xfs_alloc_ag_vextent_small(xfs_alloc_arg_t *,
 		xfs_btree_cur_t *, xfs_agblock_t *, xfs_extlen_t *, int *);
 
+unsigned int
+xfs_refc_block(
+	struct xfs_mount *mp)
+{
+	if (xfs_sb_version_hasrmapbt(&mp->m_sb))
+		return XFS_RMAP_BLOCK(mp) + 1;
+	if (xfs_sb_version_hasfinobt(&mp->m_sb))
+		return XFS_FIBT_BLOCK(mp) + 1;
+	return XFS_IBT_BLOCK(mp) + 1;
+}
+
 xfs_extlen_t
 xfs_prealloc_blocks(
 	struct xfs_mount *mp)
 {
+	if (xfs_sb_version_hasreflink(&mp->m_sb))
+		return xfs_refc_block(mp) + 1;
 	if (xfs_sb_version_hasrmapbt(&mp->m_sb))
 		return XFS_RMAP_BLOCK(mp) + 1;
 	if (xfs_sb_version_hasfinobt(&mp->m_sb))
@@ -115,6 +128,8 @@ xfs_alloc_ag_max_usable(
 		blocks++; /* finobt root block */
 	if (xfs_sb_version_hasrmapbt(&mp->m_sb))
 		blocks++; /* rmap root block */
+	if (xfs_sb_version_hasreflink(&mp->m_sb))
+		blocks++; /* refcount root block */
 
 	return mp->m_sb.sb_agblocks - blocks;
 }
@@ -2321,6 +2336,9 @@ xfs_alloc_log_agf(
 		offsetof(xfs_agf_t, agf_btreeblks),
 		offsetof(xfs_agf_t, agf_uuid),
 		offsetof(xfs_agf_t, agf_rmap_blocks),
+		offsetof(xfs_agf_t, agf_refcount_blocks),
+		offsetof(xfs_agf_t, agf_refcount_root),
+		offsetof(xfs_agf_t, agf_refcount_level),
 		/* needed so that we don't log the whole rest of the structure: */
 		offsetof(xfs_agf_t, agf_spare64),
 		sizeof(xfs_agf_t)
@@ -2458,6 +2476,10 @@ xfs_agf_verify(
 	    be32_to_cpu(agf->agf_btreeblks) > be32_to_cpu(agf->agf_length))
 		return false;
 
+	if (xfs_sb_version_hasreflink(&mp->m_sb) &&
+	    be32_to_cpu(agf->agf_refcount_level) > XFS_BTREE_MAXLEVELS)
+		return false;
+
 	return true;;
 
 }
@@ -2578,6 +2600,7 @@ xfs_alloc_read_agf(
 			be32_to_cpu(agf->agf_levels[XFS_BTNUM_CNTi]);
 		pag->pagf_levels[XFS_BTNUM_RMAPi] =
 			be32_to_cpu(agf->agf_levels[XFS_BTNUM_RMAPi]);
+		pag->pagf_refcount_level = be32_to_cpu(agf->agf_refcount_level);
 		spin_lock_init(&pag->pagb_lock);
 		pag->pagb_count = 0;
 		pag->pagb_tree = RB_ROOT;
[File diff suppressed because it is too large: +534 -41]
@@ -97,6 +97,19 @@ struct xfs_extent_free_item
  */
 #define XFS_BMAPI_ZERO 0x080
 
+/*
+ * Map the inode offset to the block given in ap->firstblock. Primarily
+ * used for reflink. The range must be in a hole, and this flag cannot be
+ * turned on with PREALLOC or CONVERT, and cannot be used on the attr fork.
+ *
+ * For bunmapi, this flag unmaps the range without adjusting quota, reducing
+ * refcount, or freeing the blocks.
+ */
+#define XFS_BMAPI_REMAP 0x100
+
+/* Map something in the CoW fork. */
+#define XFS_BMAPI_COWFORK 0x200
+
 #define XFS_BMAPI_FLAGS \
 	{ XFS_BMAPI_ENTIRE, "ENTIRE" }, \
 	{ XFS_BMAPI_METADATA, "METADATA" }, \
@@ -105,12 +118,24 @@ struct xfs_extent_free_item
 	{ XFS_BMAPI_IGSTATE, "IGSTATE" }, \
 	{ XFS_BMAPI_CONTIG, "CONTIG" }, \
 	{ XFS_BMAPI_CONVERT, "CONVERT" }, \
-	{ XFS_BMAPI_ZERO, "ZERO" }
+	{ XFS_BMAPI_ZERO, "ZERO" }, \
+	{ XFS_BMAPI_REMAP, "REMAP" }, \
+	{ XFS_BMAPI_COWFORK, "COWFORK" }
 
 
 static inline int xfs_bmapi_aflag(int w)
 {
-	return (w == XFS_ATTR_FORK ? XFS_BMAPI_ATTRFORK : 0);
+	return (w == XFS_ATTR_FORK ? XFS_BMAPI_ATTRFORK :
+	       (w == XFS_COW_FORK ? XFS_BMAPI_COWFORK : 0));
 }
 
+static inline int xfs_bmapi_whichfork(int bmapi_flags)
+{
+	if (bmapi_flags & XFS_BMAPI_COWFORK)
+		return XFS_COW_FORK;
+	else if (bmapi_flags & XFS_BMAPI_ATTRFORK)
+		return XFS_ATTR_FORK;
+	return XFS_DATA_FORK;
+}
+
 /*
@@ -131,13 +156,15 @@ static inline int xfs_bmapi_aflag(int w)
 #define BMAP_LEFT_VALID (1 << 6)
 #define BMAP_RIGHT_VALID (1 << 7)
 #define BMAP_ATTRFORK (1 << 8)
+#define BMAP_COWFORK (1 << 9)
 
 #define XFS_BMAP_EXT_FLAGS \
 	{ BMAP_LEFT_CONTIG, "LC" }, \
 	{ BMAP_RIGHT_CONTIG, "RC" }, \
 	{ BMAP_LEFT_FILLING, "LF" }, \
 	{ BMAP_RIGHT_FILLING, "RF" }, \
-	{ BMAP_ATTRFORK, "ATTR" }
+	{ BMAP_ATTRFORK, "ATTR" }, \
+	{ BMAP_COWFORK, "COW" }
 
 
 /*
@@ -186,10 +213,15 @@ int xfs_bmapi_write(struct xfs_trans *tp, struct xfs_inode *ip,
 		xfs_fsblock_t *firstblock, xfs_extlen_t total,
 		struct xfs_bmbt_irec *mval, int *nmap,
 		struct xfs_defer_ops *dfops);
+int __xfs_bunmapi(struct xfs_trans *tp, struct xfs_inode *ip,
+		xfs_fileoff_t bno, xfs_filblks_t *rlen, int flags,
+		xfs_extnum_t nexts, xfs_fsblock_t *firstblock,
+		struct xfs_defer_ops *dfops);
 int xfs_bunmapi(struct xfs_trans *tp, struct xfs_inode *ip,
 		xfs_fileoff_t bno, xfs_filblks_t len, int flags,
 		xfs_extnum_t nexts, xfs_fsblock_t *firstblock,
 		struct xfs_defer_ops *dfops, int *done);
+int xfs_bunmapi_cow(struct xfs_inode *ip, struct xfs_bmbt_irec *del);
 int xfs_check_nostate_extents(struct xfs_ifork *ifp, xfs_extnum_t idx,
 		xfs_extnum_t num);
 uint xfs_default_attroffset(struct xfs_inode *ip);
@@ -203,8 +235,31 @@ struct xfs_bmbt_rec_host *
 xfs_bmap_search_extents(struct xfs_inode *ip, xfs_fileoff_t bno,
 		int fork, int *eofp, xfs_extnum_t *lastxp,
 		struct xfs_bmbt_irec *gotp, struct xfs_bmbt_irec *prevp);
-int xfs_bmapi_reserve_delalloc(struct xfs_inode *ip, xfs_fileoff_t aoff,
-		xfs_filblks_t len, struct xfs_bmbt_irec *got,
-		struct xfs_bmbt_irec *prev, xfs_extnum_t *lastx, int eof);
+int xfs_bmapi_reserve_delalloc(struct xfs_inode *ip, int whichfork,
+		xfs_fileoff_t aoff, xfs_filblks_t len,
+		struct xfs_bmbt_irec *got, struct xfs_bmbt_irec *prev,
+		xfs_extnum_t *lastx, int eof);
+
+enum xfs_bmap_intent_type {
+	XFS_BMAP_MAP = 1,
+	XFS_BMAP_UNMAP,
+};
+
+struct xfs_bmap_intent {
+	struct list_head bi_list;
+	enum xfs_bmap_intent_type bi_type;
+	struct xfs_inode *bi_owner;
+	int bi_whichfork;
+	struct xfs_bmbt_irec bi_bmap;
+};
+
+int xfs_bmap_finish_one(struct xfs_trans *tp, struct xfs_defer_ops *dfops,
+		struct xfs_inode *ip, enum xfs_bmap_intent_type type,
+		int whichfork, xfs_fileoff_t startoff, xfs_fsblock_t startblock,
+		xfs_filblks_t blockcount, xfs_exntst_t state);
+int xfs_bmap_map_extent(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
+		struct xfs_inode *ip, struct xfs_bmbt_irec *imap);
+int xfs_bmap_unmap_extent(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
+		struct xfs_inode *ip, struct xfs_bmbt_irec *imap);
 
 #endif /* __XFS_BMAP_H__ */
@@ -453,6 +453,7 @@ xfs_bmbt_alloc_block(
 
 	if (args.fsbno == NULLFSBLOCK) {
 		args.fsbno = be64_to_cpu(start->l);
+try_another_ag:
 		args.type = XFS_ALLOCTYPE_START_BNO;
 		/*
 		 * Make sure there is sufficient room left in the AG to
@@ -482,6 +483,22 @@ xfs_bmbt_alloc_block(
 	if (error)
 		goto error0;
 
+	/*
+	 * During a CoW operation, the allocation and bmbt updates occur in
+	 * different transactions. The mapping code tries to put new bmbt
+	 * blocks near extents being mapped, but the only way to guarantee this
+	 * is if the alloc and the mapping happen in a single transaction that
+	 * has a block reservation. That isn't the case here, so if we run out
+	 * of space we'll try again with another AG.
+	 */
+	if (xfs_sb_version_hasreflink(&cur->bc_mp->m_sb) &&
+	    args.fsbno == NULLFSBLOCK &&
+	    args.type == XFS_ALLOCTYPE_NEAR_BNO) {
+		cur->bc_private.b.dfops->dop_low = true;
+		args.fsbno = cur->bc_private.b.firstblock;
+		goto try_another_ag;
+	}
+
 	if (args.fsbno == NULLFSBLOCK && args.minleft) {
 		/*
 		 * Could not find an AG with enough free space to satisfy
@@ -777,6 +794,7 @@ xfs_bmbt_init_cursor(
 {
 	struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, whichfork);
 	struct xfs_btree_cur *cur;
+	ASSERT(whichfork != XFS_COW_FORK);
 
 	cur = kmem_zone_zalloc(xfs_btree_cur_zone, KM_SLEEP);
 
@@ -45,9 +45,10 @@ kmem_zone_t *xfs_btree_cur_zone;
  */
 static const __uint32_t xfs_magics[2][XFS_BTNUM_MAX] = {
 	{ XFS_ABTB_MAGIC, XFS_ABTC_MAGIC, 0, XFS_BMAP_MAGIC, XFS_IBT_MAGIC,
-	  XFS_FIBT_MAGIC },
+	  XFS_FIBT_MAGIC, 0 },
 	{ XFS_ABTB_CRC_MAGIC, XFS_ABTC_CRC_MAGIC, XFS_RMAP_CRC_MAGIC,
-	  XFS_BMAP_CRC_MAGIC, XFS_IBT_CRC_MAGIC, XFS_FIBT_CRC_MAGIC }
+	  XFS_BMAP_CRC_MAGIC, XFS_IBT_CRC_MAGIC, XFS_FIBT_CRC_MAGIC,
+	  XFS_REFC_CRC_MAGIC }
 };
 #define xfs_btree_magic(cur) \
 	xfs_magics[!!((cur)->bc_flags & XFS_BTREE_CRC_BLOCKS)][cur->bc_btnum]
@@ -1216,6 +1217,9 @@ xfs_btree_set_refs(
 	case XFS_BTNUM_RMAP:
 		xfs_buf_set_ref(bp, XFS_RMAP_BTREE_REF);
 		break;
+	case XFS_BTNUM_REFC:
+		xfs_buf_set_ref(bp, XFS_REFC_BTREE_REF);
+		break;
 	default:
 		ASSERT(0);
 	}
@@ -49,6 +49,7 @@ union xfs_btree_key {
 	struct xfs_inobt_key inobt;
 	struct xfs_rmap_key rmap;
 	struct xfs_rmap_key __rmap_bigkey[2];
+	struct xfs_refcount_key refc;
 };
 
 union xfs_btree_rec {
@@ -57,6 +58,7 @@ union xfs_btree_rec {
 	struct xfs_alloc_rec alloc;
 	struct xfs_inobt_rec inobt;
 	struct xfs_rmap_rec rmap;
+	struct xfs_refcount_rec refc;
 };
 
 /*
@@ -72,6 +74,7 @@ union xfs_btree_rec {
 #define XFS_BTNUM_INO ((xfs_btnum_t)XFS_BTNUM_INOi)
 #define XFS_BTNUM_FINO ((xfs_btnum_t)XFS_BTNUM_FINOi)
 #define XFS_BTNUM_RMAP ((xfs_btnum_t)XFS_BTNUM_RMAPi)
+#define XFS_BTNUM_REFC ((xfs_btnum_t)XFS_BTNUM_REFCi)
 
 /*
  * For logging record fields.
@@ -105,6 +108,7 @@ do { \
 	case XFS_BTNUM_INO: __XFS_BTREE_STATS_INC(__mp, ibt, stat); break; \
 	case XFS_BTNUM_FINO: __XFS_BTREE_STATS_INC(__mp, fibt, stat); break; \
 	case XFS_BTNUM_RMAP: __XFS_BTREE_STATS_INC(__mp, rmap, stat); break; \
+	case XFS_BTNUM_REFC: __XFS_BTREE_STATS_INC(__mp, refcbt, stat); break; \
 	case XFS_BTNUM_MAX: ASSERT(0); /* fucking gcc */ ; break; \
 	} \
 } while (0)
@@ -127,6 +131,8 @@ do { \
 		__XFS_BTREE_STATS_ADD(__mp, fibt, stat, val); break; \
 	case XFS_BTNUM_RMAP: \
 		__XFS_BTREE_STATS_ADD(__mp, rmap, stat, val); break; \
+	case XFS_BTNUM_REFC: \
+		__XFS_BTREE_STATS_ADD(__mp, refcbt, stat, val); break; \
 	case XFS_BTNUM_MAX: ASSERT(0); /* fucking gcc */ ; break; \
 	} \
 } while (0)
@@ -217,6 +223,15 @@ union xfs_btree_irec {
 	struct xfs_bmbt_irec b;
 	struct xfs_inobt_rec_incore i;
 	struct xfs_rmap_irec r;
+	struct xfs_refcount_irec rc;
 };
+
+/* Per-AG btree private information. */
+union xfs_btree_cur_private {
+	struct {
+		unsigned long nr_ops; /* # record updates */
+		int shape_changes; /* # of extent splits */
+	} refc;
+};
 
 /*
@@ -243,6 +258,7 @@ typedef struct xfs_btree_cur
 			struct xfs_buf *agbp; /* agf/agi buffer pointer */
 			struct xfs_defer_ops *dfops; /* deferred updates */
 			xfs_agnumber_t agno; /* ag number */
+			union xfs_btree_cur_private priv;
 		} a;
 		struct { /* needed for BMAP */
 			struct xfs_inode *ip; /* pointer to our inode */
@@ -51,6 +51,8 @@ struct xfs_defer_pending {
  * find all the space it needs.
  */
 enum xfs_defer_ops_type {
+	XFS_DEFER_OPS_TYPE_BMAP,
+	XFS_DEFER_OPS_TYPE_REFCOUNT,
 	XFS_DEFER_OPS_TYPE_RMAP,
 	XFS_DEFER_OPS_TYPE_FREE,
 	XFS_DEFER_OPS_TYPE_MAX,
@@ -456,9 +456,11 @@ xfs_sb_has_compat_feature(
 
 #define XFS_SB_FEAT_RO_COMPAT_FINOBT (1 << 0) /* free inode btree */
 #define XFS_SB_FEAT_RO_COMPAT_RMAPBT (1 << 1) /* reverse map btree */
+#define XFS_SB_FEAT_RO_COMPAT_REFLINK (1 << 2) /* reflinked files */
 #define XFS_SB_FEAT_RO_COMPAT_ALL \
 		(XFS_SB_FEAT_RO_COMPAT_FINOBT | \
-		 XFS_SB_FEAT_RO_COMPAT_RMAPBT)
+		 XFS_SB_FEAT_RO_COMPAT_RMAPBT | \
+		 XFS_SB_FEAT_RO_COMPAT_REFLINK)
 #define XFS_SB_FEAT_RO_COMPAT_UNKNOWN ~XFS_SB_FEAT_RO_COMPAT_ALL
 static inline bool
 xfs_sb_has_ro_compat_feature(
@@ -546,6 +548,12 @@ static inline bool xfs_sb_version_hasrmapbt(struct xfs_sb *sbp)
 		(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_RMAPBT);
 }
 
+static inline bool xfs_sb_version_hasreflink(struct xfs_sb *sbp)
+{
+	return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 &&
+		(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_REFLINK);
+}
+
 /*
  * end of superblock version macros
  */
@@ -641,14 +649,17 @@ typedef struct xfs_agf {
 	uuid_t agf_uuid; /* uuid of filesystem */
 
 	__be32 agf_rmap_blocks; /* rmapbt blocks used */
-	__be32 agf_padding; /* padding */
+	__be32 agf_refcount_blocks; /* refcountbt blocks used */
+
+	__be32 agf_refcount_root; /* refcount tree root block */
+	__be32 agf_refcount_level; /* refcount btree levels */
 
 	/*
 	 * reserve some contiguous space for future logged fields before we add
 	 * the unlogged fields. This makes the range logging via flags and
 	 * structure offsets much simpler.
 	 */
-	__be64 agf_spare64[15];
+	__be64 agf_spare64[14];
 
 	/* unlogged fields, written during buffer writeback. */
 	__be64 agf_lsn; /* last write sequence */
@@ -674,8 +685,11 @@ typedef struct xfs_agf {
 #define XFS_AGF_BTREEBLKS 0x00000800
 #define XFS_AGF_UUID 0x00001000
 #define XFS_AGF_RMAP_BLOCKS 0x00002000
-#define XFS_AGF_SPARE64 0x00004000
-#define XFS_AGF_NUM_BITS 15
+#define XFS_AGF_REFCOUNT_BLOCKS 0x00004000
+#define XFS_AGF_REFCOUNT_ROOT 0x00008000
+#define XFS_AGF_REFCOUNT_LEVEL 0x00010000
+#define XFS_AGF_SPARE64 0x00020000
+#define XFS_AGF_NUM_BITS 18
 #define XFS_AGF_ALL_BITS ((1 << XFS_AGF_NUM_BITS) - 1)
 
 #define XFS_AGF_FLAGS \
@@ -693,6 +707,9 @@ typedef struct xfs_agf {
 	{ XFS_AGF_BTREEBLKS, "BTREEBLKS" }, \
 	{ XFS_AGF_UUID, "UUID" }, \
 	{ XFS_AGF_RMAP_BLOCKS, "RMAP_BLOCKS" }, \
+	{ XFS_AGF_REFCOUNT_BLOCKS, "REFCOUNT_BLOCKS" }, \
+	{ XFS_AGF_REFCOUNT_ROOT, "REFCOUNT_ROOT" }, \
+	{ XFS_AGF_REFCOUNT_LEVEL, "REFCOUNT_LEVEL" }, \
 	{ XFS_AGF_SPARE64, "SPARE64" }
 
 /* disk block (xfs_daddr_t) in the AG */
@@ -885,7 +902,8 @@ typedef struct xfs_dinode {
 	__be64 di_changecount; /* number of attribute changes */
 	__be64 di_lsn; /* flush sequence */
 	__be64 di_flags2; /* more random flags */
-	__u8 di_pad2[16]; /* more padding for future expansion */
+	__be32 di_cowextsize; /* basic cow extent size for file */
+	__u8 di_pad2[12]; /* more padding for future expansion */
 
 	/* fields only written to during inode creation */
 	xfs_timestamp_t di_crtime; /* time created */
@@ -1041,9 +1059,14 @@ static inline void xfs_dinode_put_rdev(struct xfs_dinode *dip, xfs_dev_t rdev)
  * 16 bits of the XFS_XFLAG_s range.
  */
 #define XFS_DIFLAG2_DAX_BIT 0 /* use DAX for this inode */
+#define XFS_DIFLAG2_REFLINK_BIT 1 /* file's blocks may be shared */
+#define XFS_DIFLAG2_COWEXTSIZE_BIT 2 /* copy on write extent size hint */
 #define XFS_DIFLAG2_DAX (1 << XFS_DIFLAG2_DAX_BIT)
+#define XFS_DIFLAG2_REFLINK (1 << XFS_DIFLAG2_REFLINK_BIT)
+#define XFS_DIFLAG2_COWEXTSIZE (1 << XFS_DIFLAG2_COWEXTSIZE_BIT)
 
-#define XFS_DIFLAG2_ANY (XFS_DIFLAG2_DAX)
+#define XFS_DIFLAG2_ANY \
+	(XFS_DIFLAG2_DAX | XFS_DIFLAG2_REFLINK | XFS_DIFLAG2_COWEXTSIZE)
 
 /*
  * Inode number format:
@@ -1353,7 +1376,9 @@ struct xfs_owner_info {
 #define XFS_RMAP_OWN_AG (-5ULL) /* AG freespace btree blocks */
 #define XFS_RMAP_OWN_INOBT (-6ULL) /* Inode btree blocks */
 #define XFS_RMAP_OWN_INODES (-7ULL) /* Inode chunk */
-#define XFS_RMAP_OWN_MIN (-8ULL) /* guard */
+#define XFS_RMAP_OWN_REFC (-8ULL) /* refcount tree */
+#define XFS_RMAP_OWN_COW (-9ULL) /* cow allocations */
+#define XFS_RMAP_OWN_MIN (-10ULL) /* guard */
 
 #define XFS_RMAP_NON_INODE_OWNER(owner) (!!((owner) & (1ULL << 63)))
 
@@ -1433,6 +1458,62 @@ typedef __be32 xfs_rmap_ptr_t;
 	 XFS_FIBT_BLOCK(mp) + 1 : \
 	 XFS_IBT_BLOCK(mp) + 1)
 
+/*
+ * Reference Count Btree format definitions
+ *
+ */
+#define XFS_REFC_CRC_MAGIC 0x52334643 /* 'R3FC' */
+
+unsigned int xfs_refc_block(struct xfs_mount *mp);
+
+/*
+ * Data record/key structure
+ *
+ * Each record associates a range of physical blocks (starting at
+ * rc_startblock and ending rc_blockcount blocks later) with a reference
+ * count (rc_refcount). Extents that are being used to stage a copy on
+ * write (CoW) operation are recorded in the refcount btree with a
+ * refcount of 1. All other records must have a refcount > 1 and must
+ * track an extent mapped only by file data forks.
+ *
+ * Extents with a single owner (attributes, metadata, non-shared file
+ * data) are not tracked here. Free space is also not tracked here.
+ * This is consistent with pre-reflink XFS.
+ */
+
+/*
+ * Extents that are being used to stage a copy on write are stored
+ * in the refcount btree with a refcount of 1 and the upper bit set
+ * on the startblock. This speeds up mount time deletion of stale
+ * staging extents because they're all at the right side of the tree.
+ */
+#define XFS_REFC_COW_START ((xfs_agblock_t)(1U << 31))
+#define REFCNTBT_COWFLAG_BITLEN 1
+#define REFCNTBT_AGBLOCK_BITLEN 31
+
+struct xfs_refcount_rec {
+	__be32 rc_startblock; /* starting block number */
+	__be32 rc_blockcount; /* count of blocks */
+	__be32 rc_refcount; /* number of inodes linked here */
+};
+
+struct xfs_refcount_key {
+	__be32 rc_startblock; /* starting block number */
+};
+
+struct xfs_refcount_irec {
+	xfs_agblock_t rc_startblock; /* starting block number */
+	xfs_extlen_t rc_blockcount; /* count of free blocks */
+	xfs_nlink_t rc_refcount; /* number of inodes linked here */
+};
+
+#define MAXREFCOUNT ((xfs_nlink_t)~0U)
+#define MAXREFCEXTLEN ((xfs_extlen_t)~0U)
+
+/* btree pointer type */
+typedef __be32 xfs_refcount_ptr_t;
+
+
 /*
  * BMAP Btree format definitions
  *
@@ -81,14 +81,16 @@ struct getbmapx {
 #define BMV_IF_PREALLOC 0x4 /* rtn status BMV_OF_PREALLOC if req */
 #define BMV_IF_DELALLOC 0x8 /* rtn status BMV_OF_DELALLOC if req */
 #define BMV_IF_NO_HOLES 0x10 /* Do not return holes */
+#define BMV_IF_COWFORK 0x20 /* return CoW fork rather than data */
 #define BMV_IF_VALID \
 	(BMV_IF_ATTRFORK|BMV_IF_NO_DMAPI_READ|BMV_IF_PREALLOC| \
-	 BMV_IF_DELALLOC|BMV_IF_NO_HOLES)
+	 BMV_IF_DELALLOC|BMV_IF_NO_HOLES|BMV_IF_COWFORK)
 
 /* bmv_oflags values - returned for each non-header segment */
 #define BMV_OF_PREALLOC 0x1 /* segment = unwritten pre-allocation */
 #define BMV_OF_DELALLOC 0x2 /* segment = delayed allocation */
 #define BMV_OF_LAST 0x4 /* segment is the last in the file */
+#define BMV_OF_SHARED 0x8 /* segment shared with another file */
 
 /*
  * Structure for XFS_IOC_FSSETDM.
@@ -206,7 +208,8 @@ typedef struct xfs_fsop_resblks {
 #define XFS_FSOP_GEOM_FLAGS_FTYPE 0x10000 /* inode directory types */
 #define XFS_FSOP_GEOM_FLAGS_FINOBT 0x20000 /* free inode btree */
 #define XFS_FSOP_GEOM_FLAGS_SPINODES 0x40000 /* sparse inode chunks */
-#define XFS_FSOP_GEOM_FLAGS_RMAPBT 0x80000 /* Reverse mapping btree */
+#define XFS_FSOP_GEOM_FLAGS_RMAPBT 0x80000 /* reverse mapping btree */
+#define XFS_FSOP_GEOM_FLAGS_REFLINK 0x100000 /* files can share blocks */
 
 /*
  * Minimum and maximum sizes need for growth checks.
@@ -275,7 +278,8 @@ typedef struct xfs_bstat {
 #define bs_projid bs_projid_lo /* (previously just bs_projid) */
 	__u16 bs_forkoff; /* inode fork offset in bytes */
 	__u16 bs_projid_hi; /* higher part of project id */
-	unsigned char bs_pad[10]; /* pad space, unused */
+	unsigned char bs_pad[6]; /* pad space, unused */
+	__u32 bs_cowextsize; /* cow extent size */
 	__u32 bs_dmevmask; /* DMIG event mask */
 	__u16 bs_dmstate; /* DMIG state info */
 	__u16 bs_aextents; /* attribute number of extents */
@@ -256,6 +256,7 @@ xfs_inode_from_disk(
 		to->di_crtime.t_sec = be32_to_cpu(from->di_crtime.t_sec);
 		to->di_crtime.t_nsec = be32_to_cpu(from->di_crtime.t_nsec);
 		to->di_flags2 = be64_to_cpu(from->di_flags2);
+		to->di_cowextsize = be32_to_cpu(from->di_cowextsize);
 	}
 }
 
@@ -305,7 +306,7 @@ xfs_inode_to_disk(
 		to->di_crtime.t_sec = cpu_to_be32(from->di_crtime.t_sec);
 		to->di_crtime.t_nsec = cpu_to_be32(from->di_crtime.t_nsec);
 		to->di_flags2 = cpu_to_be64(from->di_flags2);
-
+		to->di_cowextsize = cpu_to_be32(from->di_cowextsize);
 		to->di_ino = cpu_to_be64(ip->i_ino);
 		to->di_lsn = cpu_to_be64(lsn);
 		memset(to->di_pad2, 0, sizeof(to->di_pad2));
@@ -357,6 +358,7 @@ xfs_log_dinode_to_disk(
 		to->di_crtime.t_sec = cpu_to_be32(from->di_crtime.t_sec);
 		to->di_crtime.t_nsec = cpu_to_be32(from->di_crtime.t_nsec);
 		to->di_flags2 = cpu_to_be64(from->di_flags2);
+		to->di_cowextsize = cpu_to_be32(from->di_cowextsize);
 		to->di_ino = cpu_to_be64(from->di_ino);
 		to->di_lsn = cpu_to_be64(from->di_lsn);
 		memcpy(to->di_pad2, from->di_pad2, sizeof(to->di_pad2));
@@ -373,6 +375,9 @@ xfs_dinode_verify(
 	struct xfs_inode *ip,
 	struct xfs_dinode *dip)
 {
+	uint16_t flags;
+	uint64_t flags2;
+
 	if (dip->di_magic != cpu_to_be16(XFS_DINODE_MAGIC))
 		return false;
 
@@ -389,6 +394,23 @@ xfs_dinode_verify(
|
||||
return false;
|
||||
if (!uuid_equal(&dip->di_uuid, &mp->m_sb.sb_meta_uuid))
|
||||
return false;
|
||||
|
||||
flags = be16_to_cpu(dip->di_flags);
|
||||
flags2 = be64_to_cpu(dip->di_flags2);
|
||||
|
||||
/* don't allow reflink/cowextsize if we don't have reflink */
|
||||
if ((flags2 & (XFS_DIFLAG2_REFLINK | XFS_DIFLAG2_COWEXTSIZE)) &&
|
||||
!xfs_sb_version_hasreflink(&mp->m_sb))
|
||||
return false;
|
||||
|
||||
/* don't let reflink and realtime mix */
|
||||
if ((flags2 & XFS_DIFLAG2_REFLINK) && (flags & XFS_DIFLAG_REALTIME))
|
||||
return false;
|
||||
|
||||
/* don't let reflink and dax mix */
|
||||
if ((flags2 & XFS_DIFLAG2_REFLINK) && (flags2 & XFS_DIFLAG2_DAX))
|
||||
return false;
|
||||
|
||||
return true;
|
||||
}
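The three checks added to xfs_dinode_verify() above can be read as a small predicate over the two flag words. The following is a minimal userspace sketch of that logic, not the kernel code itself: the SKETCH_* bit values and the function name are illustrative assumptions (the real definitions live in the XFS format headers), and the superblock feature test is reduced to a boolean argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; the kernel defines the real ones. */
#define SKETCH_DIFLAG_REALTIME     (1u << 0)
#define SKETCH_DIFLAG2_DAX         (1ull << 0)
#define SKETCH_DIFLAG2_REFLINK     (1ull << 1)
#define SKETCH_DIFLAG2_COWEXTSIZE  (1ull << 2)

/*
 * Hypothetical helper: returns true when the flag combination passes
 * the same three rules the hunk adds to the inode verifier.
 */
static bool sketch_reflink_flags_ok(uint16_t flags, uint64_t flags2,
                                    bool fs_has_reflink)
{
    /* reflink/cowextsize flags require the reflink feature */
    if ((flags2 & (SKETCH_DIFLAG2_REFLINK | SKETCH_DIFLAG2_COWEXTSIZE)) &&
        !fs_has_reflink)
        return false;

    /* reflink and realtime must not mix */
    if ((flags2 & SKETCH_DIFLAG2_REFLINK) &&
        (flags & SKETCH_DIFLAG_REALTIME))
        return false;

    /* reflink and DAX must not mix */
    if ((flags2 & SKETCH_DIFLAG2_REFLINK) &&
        (flags2 & SKETCH_DIFLAG2_DAX))
        return false;

    return true;
}
```

Note that, as in the hunk, a cowextsize hint on its own is only rejected when the filesystem lacks the reflink feature; the realtime and DAX exclusions apply to the reflink flag specifically.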
|
||||
|
||||
|
||||
@@ -47,6 +47,7 @@ struct xfs_icdinode {
|
||||
__uint16_t di_flags; /* random flags, XFS_DIFLAG_... */
|
||||
|
||||
__uint64_t di_flags2; /* more random flags */
|
||||
__uint32_t di_cowextsize; /* basic cow extent size for file */
|
||||
|
||||
xfs_ictimestamp_t di_crtime; /* time created */
|
||||
};
|
||||
|
||||
@@ -121,6 +121,26 @@ xfs_iformat_fork(
|
||||
return -EFSCORRUPTED;
|
||||
}
|
||||
|
||||
if (unlikely(xfs_is_reflink_inode(ip) &&
|
||||
(VFS_I(ip)->i_mode & S_IFMT) != S_IFREG)) {
|
||||
xfs_warn(ip->i_mount,
|
||||
"corrupt dinode %llu, wrong file type for reflink.",
|
||||
ip->i_ino);
|
||||
XFS_CORRUPTION_ERROR("xfs_iformat(reflink)",
|
||||
XFS_ERRLEVEL_LOW, ip->i_mount, dip);
|
||||
return -EFSCORRUPTED;
|
||||
}
|
||||
|
||||
if (unlikely(xfs_is_reflink_inode(ip) &&
|
||||
(ip->i_d.di_flags & XFS_DIFLAG_REALTIME))) {
|
||||
xfs_warn(ip->i_mount,
|
||||
"corrupt dinode %llu, has reflink+realtime flag set.",
|
||||
ip->i_ino);
|
||||
XFS_CORRUPTION_ERROR("xfs_iformat(reflink)",
|
||||
XFS_ERRLEVEL_LOW, ip->i_mount, dip);
|
||||
return -EFSCORRUPTED;
|
||||
}
|
||||
|
||||
switch (VFS_I(ip)->i_mode & S_IFMT) {
|
||||
case S_IFIFO:
|
||||
case S_IFCHR:
|
||||
@@ -186,9 +206,14 @@ xfs_iformat_fork(
|
||||
XFS_ERROR_REPORT("xfs_iformat(7)", XFS_ERRLEVEL_LOW, ip->i_mount);
|
||||
return -EFSCORRUPTED;
|
||||
}
|
||||
if (error) {
|
||||
if (error)
|
||||
return error;
|
||||
|
||||
if (xfs_is_reflink_inode(ip)) {
|
||||
ASSERT(ip->i_cowfp == NULL);
|
||||
xfs_ifork_init_cow(ip);
|
||||
}
|
||||
|
||||
if (!XFS_DFORK_Q(dip))
|
||||
return 0;
|
||||
|
||||
@@ -208,7 +233,8 @@ xfs_iformat_fork(
|
||||
XFS_CORRUPTION_ERROR("xfs_iformat(8)",
|
||||
XFS_ERRLEVEL_LOW,
|
||||
ip->i_mount, dip);
|
||||
return -EFSCORRUPTED;
|
||||
error = -EFSCORRUPTED;
|
||||
break;
|
||||
}
|
||||
|
||||
error = xfs_iformat_local(ip, dip, XFS_ATTR_FORK, size);
|
||||
@@ -226,6 +252,9 @@ xfs_iformat_fork(
|
||||
if (error) {
|
||||
kmem_zone_free(xfs_ifork_zone, ip->i_afp);
|
||||
ip->i_afp = NULL;
|
||||
if (ip->i_cowfp)
|
||||
kmem_zone_free(xfs_ifork_zone, ip->i_cowfp);
|
||||
ip->i_cowfp = NULL;
|
||||
xfs_idestroy_fork(ip, XFS_DATA_FORK);
|
||||
}
|
||||
return error;
|
||||
@@ -740,6 +769,9 @@ xfs_idestroy_fork(
|
||||
if (whichfork == XFS_ATTR_FORK) {
|
||||
kmem_zone_free(xfs_ifork_zone, ip->i_afp);
|
||||
ip->i_afp = NULL;
|
||||
} else if (whichfork == XFS_COW_FORK) {
|
||||
kmem_zone_free(xfs_ifork_zone, ip->i_cowfp);
|
||||
ip->i_cowfp = NULL;
|
||||
}
|
||||
}
|
||||
|
||||
@@ -927,6 +959,19 @@ xfs_iext_get_ext(
|
||||
}
|
||||
}
|
||||
|
||||
/* Convert bmap state flags to an inode fork. */
|
||||
struct xfs_ifork *
|
||||
xfs_iext_state_to_fork(
|
||||
struct xfs_inode *ip,
|
||||
int state)
|
||||
{
|
||||
if (state & BMAP_COWFORK)
|
||||
return ip->i_cowfp;
|
||||
else if (state & BMAP_ATTRFORK)
|
||||
return ip->i_afp;
|
||||
return &ip->i_df;
|
||||
}
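The helper above turns the old two-way choice (attr fork or data fork) into a three-way choice that also knows about the CoW fork. A minimal standalone sketch of the same selection order follows; the struct layouts and the SKETCH_BMAP_* bit values are hypothetical stand-ins, since the real BMAP_* state flags are defined elsewhere in the tree.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the incore fork and inode structures. */
struct sketch_ifork { int id; };

struct sketch_inode {
    struct sketch_ifork  i_df;     /* data fork, embedded */
    struct sketch_ifork *i_afp;    /* attr fork, may be NULL */
    struct sketch_ifork *i_cowfp;  /* CoW fork, may be NULL */
};

/* Illustrative state bits; not the kernel's actual values. */
#define SKETCH_BMAP_ATTRFORK (1 << 2)
#define SKETCH_BMAP_COWFORK  (1 << 3)

/* CoW fork is tested first, then attr fork, with data fork as default. */
static struct sketch_ifork *
sketch_state_to_fork(struct sketch_inode *ip, int state)
{
    if (state & SKETCH_BMAP_COWFORK)
        return ip->i_cowfp;
    else if (state & SKETCH_BMAP_ATTRFORK)
        return ip->i_afp;
    return &ip->i_df;
}
```

Because the data fork is the fall-through case, callers that pass no fork bit at all keep their previous behaviour.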
|
||||
|
||||
/*
|
||||
* Insert new item(s) into the extent records for incore inode
|
||||
* fork 'ifp'. 'count' new items are inserted at index 'idx'.
|
||||
@@ -939,7 +984,7 @@ xfs_iext_insert(
|
||||
xfs_bmbt_irec_t *new, /* items to insert */
|
||||
int state) /* type of extent conversion */
|
||||
{
|
||||
xfs_ifork_t *ifp = (state & BMAP_ATTRFORK) ? ip->i_afp : &ip->i_df;
|
||||
xfs_ifork_t *ifp = xfs_iext_state_to_fork(ip, state);
|
||||
xfs_extnum_t i; /* extent record index */
|
||||
|
||||
trace_xfs_iext_insert(ip, idx, new, state, _RET_IP_);
|
||||
@@ -1189,7 +1234,7 @@ xfs_iext_remove(
|
||||
int ext_diff, /* number of extents to remove */
|
||||
int state) /* type of extent conversion */
|
||||
{
|
||||
xfs_ifork_t *ifp = (state & BMAP_ATTRFORK) ? ip->i_afp : &ip->i_df;
|
||||
xfs_ifork_t *ifp = xfs_iext_state_to_fork(ip, state);
|
||||
xfs_extnum_t nextents; /* number of extents in file */
|
||||
int new_size; /* size of extents after removal */
|
||||
|
||||
@@ -1934,3 +1979,20 @@ xfs_iext_irec_update_extoffs(
|
||||
ifp->if_u1.if_ext_irec[i].er_extoff += ext_diff;
|
||||
}
|
||||
}
|
||||
|
||||
/*
|
||||
* Initialize an inode's copy-on-write fork.
|
||||
*/
|
||||
void
|
||||
xfs_ifork_init_cow(
|
||||
struct xfs_inode *ip)
|
||||
{
|
||||
if (ip->i_cowfp)
|
||||
return;
|
||||
|
||||
ip->i_cowfp = kmem_zone_zalloc(xfs_ifork_zone,
|
||||
KM_SLEEP | KM_NOFS);
|
||||
ip->i_cowfp->if_flags = XFS_IFEXTENTS;
|
||||
ip->i_cformat = XFS_DINODE_FMT_EXTENTS;
|
||||
ip->i_cnextents = 0;
|
||||
}
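xfs_ifork_init_cow() above is an idempotent lazy initializer: it returns early if the CoW fork already exists, otherwise it allocates a zeroed fork and marks it as an empty extents-format fork. A userspace sketch of that pattern is shown below. The struct layouts and the SKETCH_* constants are assumptions for illustration, and calloc() stands in for the zone allocator, so unlike the kernel version this sketch has to report allocation failure.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures. */
struct sketch_ifork { unsigned int if_flags; };

struct sketch_inode {
    struct sketch_ifork *i_cowfp;
    int                  i_cformat;
    unsigned int         i_cnextents;
};

#define SKETCH_IFEXTENTS    0x02  /* illustrative flag value */
#define SKETCH_FMT_EXTENTS  2     /* illustrative format code */

/* Returns 0 on success (including "already initialized"), -1 on ENOMEM. */
static int sketch_init_cow(struct sketch_inode *ip)
{
    if (ip->i_cowfp)
        return 0;               /* second call is a no-op */

    ip->i_cowfp = calloc(1, sizeof(*ip->i_cowfp));
    if (!ip->i_cowfp)
        return -1;

    ip->i_cowfp->if_flags = SKETCH_IFEXTENTS;
    ip->i_cformat = SKETCH_FMT_EXTENTS;
    ip->i_cnextents = 0;
    return 0;
}
```

The early return is what lets callers such as the fork-format path invoke the initializer without first checking whether a CoW fork is attached.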
|
||||
|
||||
@@ -92,7 +92,9 @@ typedef struct xfs_ifork {
|
||||
#define XFS_IFORK_PTR(ip,w) \
|
||||
((w) == XFS_DATA_FORK ? \
|
||||
&(ip)->i_df : \
|
||||
(ip)->i_afp)
|
||||
((w) == XFS_ATTR_FORK ? \
|
||||
(ip)->i_afp : \
|
||||
(ip)->i_cowfp))
|
||||
#define XFS_IFORK_DSIZE(ip) \
|
||||
(XFS_IFORK_Q(ip) ? \
|
||||
XFS_IFORK_BOFF(ip) : \
|
||||
@@ -105,26 +107,38 @@ typedef struct xfs_ifork {
|
||||
#define XFS_IFORK_SIZE(ip,w) \
|
||||
((w) == XFS_DATA_FORK ? \
|
||||
XFS_IFORK_DSIZE(ip) : \
|
||||
XFS_IFORK_ASIZE(ip))
|
||||
((w) == XFS_ATTR_FORK ? \
|
||||
XFS_IFORK_ASIZE(ip) : \
|
||||
0))
|
||||
#define XFS_IFORK_FORMAT(ip,w) \
|
||||
((w) == XFS_DATA_FORK ? \
|
||||
(ip)->i_d.di_format : \
|
||||
(ip)->i_d.di_aformat)
|
||||
((w) == XFS_ATTR_FORK ? \
|
||||
(ip)->i_d.di_aformat : \
|
||||
(ip)->i_cformat))
|
||||
#define XFS_IFORK_FMT_SET(ip,w,n) \
|
||||
((w) == XFS_DATA_FORK ? \
|
||||
((ip)->i_d.di_format = (n)) : \
|
||||
((ip)->i_d.di_aformat = (n)))
|
||||
((w) == XFS_ATTR_FORK ? \
|
||||
((ip)->i_d.di_aformat = (n)) : \
|
||||
((ip)->i_cformat = (n))))
|
||||
#define XFS_IFORK_NEXTENTS(ip,w) \
|
||||
((w) == XFS_DATA_FORK ? \
|
||||
(ip)->i_d.di_nextents : \
|
||||
(ip)->i_d.di_anextents)
|
||||
((w) == XFS_ATTR_FORK ? \
|
||||
(ip)->i_d.di_anextents : \
|
||||
(ip)->i_cnextents))
|
||||
#define XFS_IFORK_NEXT_SET(ip,w,n) \
|
||||
((w) == XFS_DATA_FORK ? \
|
||||
((ip)->i_d.di_nextents = (n)) : \
|
||||
((ip)->i_d.di_anextents = (n)))
|
||||
((w) == XFS_ATTR_FORK ? \
|
||||
((ip)->i_d.di_anextents = (n)) : \
|
||||
((ip)->i_cnextents = (n))))
|
||||
#define XFS_IFORK_MAXEXT(ip, w) \
|
||||
(XFS_IFORK_SIZE(ip, w) / sizeof(xfs_bmbt_rec_t))
|
||||
|
||||
struct xfs_ifork *xfs_iext_state_to_fork(struct xfs_inode *ip, int state);
|
||||
|
||||
int xfs_iformat_fork(struct xfs_inode *, struct xfs_dinode *);
|
||||
void xfs_iflush_fork(struct xfs_inode *, struct xfs_dinode *,
|
||||
struct xfs_inode_log_item *, int);
|
||||
@@ -169,4 +183,6 @@ void xfs_iext_irec_update_extoffs(struct xfs_ifork *, int, int);
|
||||
|
||||
extern struct kmem_zone *xfs_ifork_zone;
|
||||
|
||||
extern void xfs_ifork_init_cow(struct xfs_inode *ip);
|
||||
|
||||
#endif /* __XFS_INODE_FORK_H__ */
|
||||
|
||||
@@ -112,7 +112,11 @@ static inline uint xlog_get_cycle(char *ptr)
|
||||
#define XLOG_REG_TYPE_ICREATE 20
|
||||
#define XLOG_REG_TYPE_RUI_FORMAT 21
|
||||
#define XLOG_REG_TYPE_RUD_FORMAT 22
|
||||
#define XLOG_REG_TYPE_MAX 22
|
||||
#define XLOG_REG_TYPE_CUI_FORMAT 23
|
||||
#define XLOG_REG_TYPE_CUD_FORMAT 24
|
||||
#define XLOG_REG_TYPE_BUI_FORMAT 25
|
||||
#define XLOG_REG_TYPE_BUD_FORMAT 26
|
||||
#define XLOG_REG_TYPE_MAX 26
|
||||
|
||||
/*
|
||||
* Flags to log operation header
|
||||
@@ -231,6 +235,10 @@ typedef struct xfs_trans_header {
|
||||
#define XFS_LI_ICREATE 0x123f
|
||||
#define XFS_LI_RUI 0x1240 /* rmap update intent */
|
||||
#define XFS_LI_RUD 0x1241
|
||||
#define XFS_LI_CUI 0x1242 /* refcount update intent */
|
||||
#define XFS_LI_CUD 0x1243
|
||||
#define XFS_LI_BUI 0x1244 /* bmbt update intent */
|
||||
#define XFS_LI_BUD 0x1245
|
||||
|
||||
#define XFS_LI_TYPE_DESC \
|
||||
{ XFS_LI_EFI, "XFS_LI_EFI" }, \
|
||||
@@ -242,7 +250,11 @@ typedef struct xfs_trans_header {
|
||||
{ XFS_LI_QUOTAOFF, "XFS_LI_QUOTAOFF" }, \
|
||||
{ XFS_LI_ICREATE, "XFS_LI_ICREATE" }, \
|
||||
{ XFS_LI_RUI, "XFS_LI_RUI" }, \
|
||||
{ XFS_LI_RUD, "XFS_LI_RUD" }
|
||||
{ XFS_LI_RUD, "XFS_LI_RUD" }, \
|
||||
{ XFS_LI_CUI, "XFS_LI_CUI" }, \
|
||||
{ XFS_LI_CUD, "XFS_LI_CUD" }, \
|
||||
{ XFS_LI_BUI, "XFS_LI_BUI" }, \
|
||||
{ XFS_LI_BUD, "XFS_LI_BUD" }
|
||||
|
||||
/*
|
||||
* Inode Log Item Format definitions.
|
||||
@@ -411,7 +423,8 @@ struct xfs_log_dinode {
|
||||
__uint64_t di_changecount; /* number of attribute changes */
|
||||
xfs_lsn_t di_lsn; /* flush sequence */
|
||||
__uint64_t di_flags2; /* more random flags */
|
||||
__uint8_t di_pad2[16]; /* more padding for future expansion */
|
||||
__uint32_t di_cowextsize; /* basic cow extent size for file */
|
||||
__uint8_t di_pad2[12]; /* more padding for future expansion */
|
||||
|
||||
/* fields only written to during inode creation */
|
||||
xfs_ictimestamp_t di_crtime; /* time created */
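The hunk above carves the new 32-bit di_cowextsize field out of existing padding: di_pad2 shrinks from 16 to 12 bytes, so the overall size of the logged inode should not change. The sketch below checks that arithmetic on just the affected tail of the structure. The two sketch_* structs are hypothetical excerpts, not the full xfs_log_dinode, and the check relies on C11 _Static_assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tail of the structure before the change (excerpt only). */
struct sketch_tail_old {
    uint64_t di_flags2;
    uint8_t  di_pad2[16];
};

/* Tail of the structure after the change (excerpt only). */
struct sketch_tail_new {
    uint64_t di_flags2;
    uint32_t di_cowextsize;  /* new field, taken from the padding */
    uint8_t  di_pad2[12];
};

/* 4 bytes of field + 12 bytes of pad replace 16 bytes of pad. */
_Static_assert(sizeof(struct sketch_tail_old) == sizeof(struct sketch_tail_new),
               "adding di_cowextsize must not change the record size");
_Static_assert(offsetof(struct sketch_tail_new, di_pad2) ==
               offsetof(struct sketch_tail_old, di_pad2) + sizeof(uint32_t),
               "padding starts 4 bytes later");
```

Keeping the size constant this way means fields that follow the padding stay at their old offsets.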
|
||||
@@ -622,8 +635,11 @@ struct xfs_map_extent {
|
||||
|
||||
/* rmap me_flags: upper bits are flags, lower byte is type code */
|
||||
#define XFS_RMAP_EXTENT_MAP 1
|
||||
#define XFS_RMAP_EXTENT_MAP_SHARED 2
|
||||
#define XFS_RMAP_EXTENT_UNMAP 3
|
||||
#define XFS_RMAP_EXTENT_UNMAP_SHARED 4
|
||||
#define XFS_RMAP_EXTENT_CONVERT 5
|
||||
#define XFS_RMAP_EXTENT_CONVERT_SHARED 6
|
||||
#define XFS_RMAP_EXTENT_ALLOC 7
|
||||
#define XFS_RMAP_EXTENT_FREE 8
|
||||
#define XFS_RMAP_EXTENT_TYPE_MASK 0xFF
|
||||
@@ -670,6 +686,102 @@ struct xfs_rud_log_format {
|
||||
__uint64_t rud_rui_id; /* id of corresponding rui */
|
||||
};
|
||||
|
||||
/*
|
||||
* CUI/CUD (refcount update) log format definitions
|
||||
*/
|
||||
struct xfs_phys_extent {
|
||||
__uint64_t pe_startblock;
|
||||
__uint32_t pe_len;
|
||||
__uint32_t pe_flags;
|
||||
};
|
||||
|
||||
/* refcount pe_flags: upper bits are flags, lower byte is type code */
|
||||
/* Type codes are taken directly from enum xfs_refcount_intent_type. */
|
||||
#define XFS_REFCOUNT_EXTENT_TYPE_MASK 0xFF
|
||||
|
||||
#define XFS_REFCOUNT_EXTENT_FLAGS (XFS_REFCOUNT_EXTENT_TYPE_MASK)
|
||||
|
||||
/*
|
||||
* This is the structure used to lay out a cui log item in the
|
||||
* log. The cui_extents field is a variable size array whose
|
||||
* size is given by cui_nextents.
|
||||
*/
|
||||
struct xfs_cui_log_format {
|
||||
__uint16_t cui_type; /* cui log item type */
|
||||
__uint16_t cui_size; /* size of this item */
|
||||
__uint32_t cui_nextents; /* # extents to free */
|
||||
__uint64_t cui_id; /* cui identifier */
|
||||
struct xfs_phys_extent cui_extents[]; /* array of extents */
|
||||
};
|
||||
|
||||
static inline size_t
|
||||
xfs_cui_log_format_sizeof(
|
||||
unsigned int nr)
|
||||
{
|
||||
return sizeof(struct xfs_cui_log_format) +
|
||||
nr * sizeof(struct xfs_phys_extent);
|
||||
}
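xfs_cui_log_format_sizeof() above is the usual sizing idiom for a structure that ends in a flexible array member: the fixed header plus nr trailing elements. A standalone sketch using fixed-width userspace types in place of the kernel typedefs is given below; the sketch_* names are hypothetical, while the field order follows the structures shown in the hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Mirrors struct xfs_phys_extent from the hunk above. */
struct sketch_phys_extent {
    uint64_t pe_startblock;
    uint32_t pe_len;
    uint32_t pe_flags;
};

/* Mirrors struct xfs_cui_log_format; the array has no fixed length. */
struct sketch_cui_log_format {
    uint16_t cui_type;
    uint16_t cui_size;
    uint32_t cui_nextents;
    uint64_t cui_id;
    struct sketch_phys_extent cui_extents[];
};

/* Header size plus room for nr extents. */
static size_t sketch_cui_sizeof(unsigned int nr)
{
    return sizeof(struct sketch_cui_log_format) +
           nr * sizeof(struct sketch_phys_extent);
}

/* Allocate a zeroed item with space for nr extents; NULL on failure. */
static struct sketch_cui_log_format *sketch_cui_alloc(unsigned int nr)
{
    struct sketch_cui_log_format *cui = calloc(1, sketch_cui_sizeof(nr));

    if (cui)
        cui->cui_nextents = nr;
    return cui;
}
```

With these layouts both the header and each extent occupy 16 bytes, so an item carrying three extents needs 64 bytes.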
|
||||
|
||||
/*
|
||||
* This is the structure used to lay out a cud log item in the
|
||||
* log. The cud_extents array is a variable size array whose
|
||||
* size is given by cud_nextents;
|
||||
*/
|
||||
struct xfs_cud_log_format {
|
||||
__uint16_t cud_type; /* cud log item type */
|
||||
__uint16_t cud_size; /* size of this item */
|
||||
__uint32_t __pad;
|
||||
__uint64_t cud_cui_id; /* id of corresponding cui */
|
||||
};
|
||||
|
||||
/*
|
||||
* BUI/BUD (inode block mapping) log format definitions
|
||||
*/
|
||||
|
||||
/* bmbt me_flags: upper bits are flags, lower byte is type code */
|
||||
/* Type codes are taken directly from enum xfs_bmap_intent_type. */
|
||||
#define XFS_BMAP_EXTENT_TYPE_MASK 0xFF
|
||||
|
||||
#define XFS_BMAP_EXTENT_ATTR_FORK (1U << 31)
|
||||
#define XFS_BMAP_EXTENT_UNWRITTEN (1U << 30)
|
||||
|
||||
#define XFS_BMAP_EXTENT_FLAGS (XFS_BMAP_EXTENT_TYPE_MASK | \
|
||||
XFS_BMAP_EXTENT_ATTR_FORK | \
|
||||
XFS_BMAP_EXTENT_UNWRITTEN)
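The definitions above pack two things into one 32-bit me_flags word: an intent type code in the low byte and boolean flags in the top bits. The following sketch shows one way to pack, unpack and validate such a word. The mask and bit positions are copied from the hunk; the helper names are hypothetical, and the type code used in any example is just an arbitrary small integer because enum xfs_bmap_intent_type is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_TYPE_MASK  0xFFu
#define SKETCH_ATTR_FORK  (1u << 31)
#define SKETCH_UNWRITTEN  (1u << 30)
#define SKETCH_ALL_FLAGS  (SKETCH_TYPE_MASK | SKETCH_ATTR_FORK | \
                           SKETCH_UNWRITTEN)

/* Combine a type code and the two boolean flags into one word. */
static uint32_t sketch_pack(uint32_t type, bool attr_fork, bool unwritten)
{
    uint32_t v = type & SKETCH_TYPE_MASK;

    if (attr_fork)
        v |= SKETCH_ATTR_FORK;
    if (unwritten)
        v |= SKETCH_UNWRITTEN;
    return v;
}

/* The type code lives in the low byte. */
static uint32_t sketch_type(uint32_t me_flags)
{
    return me_flags & SKETCH_TYPE_MASK;
}

/* Any bit outside the known set marks the word as invalid. */
static bool sketch_flags_valid(uint32_t me_flags)
{
    return (me_flags & ~SKETCH_ALL_FLAGS) == 0;
}
```

A combined "all known bits" mask like XFS_BMAP_EXTENT_FLAGS is what makes the validity test a single AND, which is convenient when checking intent items read back from the log.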
|
||||
|
||||
/*
|
||||
* This is the structure used to lay out a bui log item in the
|
||||
* log. The bui_extents field is a variable size array whose
|
||||
* size is given by bui_nextents.
|
||||
*/
|
||||
struct xfs_bui_log_format {
|
||||
__uint16_t bui_type; /* bui log item type */
|
||||
__uint16_t bui_size; /* size of this item */
|
||||
__uint32_t bui_nextents; /* # extents to free */
|
||||
__uint64_t bui_id; /* bui identifier */
|
||||
struct xfs_map_extent bui_extents[]; /* array of extents to bmap */
|
||||
};
|
||||
|
||||
static inline size_t
|
||||
xfs_bui_log_format_sizeof(
|
||||
unsigned int nr)
|
||||
{
|
||||
return sizeof(struct xfs_bui_log_format) +
|
||||
nr * sizeof(struct xfs_map_extent);
|
||||
}
|
||||
|
||||
/*
|
||||
* This is the structure used to lay out a bud log item in the
|
||||
* log. The bud_extents array is a variable size array whose
|
||||
* size is given by bud_nextents;
|
||||
*/
|
||||
struct xfs_bud_log_format {
|
||||
__uint16_t bud_type; /* bud log item type */
|
||||
__uint16_t bud_size; /* size of this item */
|
||||
__uint32_t __pad;
|
||||
__uint64_t bud_bui_id; /* id of corresponding bui */
|
||||
};
|
||||
|
||||
/*
|
||||
* Dquot Log format definitions.
|
||||
*
|
||||
|
||||
File diff suppressed because it is too large
@@ -0,0 +1,70 @@
|
||||
/*
|
||||
* Copyright (C) 2016 Oracle. All Rights Reserved.
|
||||
*
|
||||
* Author: Darrick J. Wong <darrick.wong@oracle.com>
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU General Public License
|
||||
* as published by the Free Software Foundation; either version 2
|
||||
* of the License, or (at your option) any later version.
|
||||
*
|
||||
* This program is distributed in the hope that it would be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
* GNU General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU General Public License
|
||||
* along with this program; if not, write the Free Software Foundation,
|
||||
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
|
||||
*/
|
||||
#ifndef __XFS_REFCOUNT_H__
|
||||
#define __XFS_REFCOUNT_H__
|
||||
|
||||
extern int xfs_refcount_lookup_le(struct xfs_btree_cur *cur,
|
||||
xfs_agblock_t bno, int *stat);
|
||||
extern int xfs_refcount_lookup_ge(struct xfs_btree_cur *cur,
|
||||
xfs_agblock_t bno, int *stat);
|
||||
extern int xfs_refcount_get_rec(struct xfs_btree_cur *cur,
|
||||
struct xfs_refcount_irec *irec, int *stat);
|
||||
|
||||
enum xfs_refcount_intent_type {
|
||||
XFS_REFCOUNT_INCREASE = 1,
|
||||
XFS_REFCOUNT_DECREASE,
|
||||
XFS_REFCOUNT_ALLOC_COW,
|
||||
XFS_REFCOUNT_FREE_COW,
|
||||
};
|
||||
|
||||
struct xfs_refcount_intent {
|
||||
struct list_head ri_list;
|
||||
enum xfs_refcount_intent_type ri_type;
|
||||
xfs_fsblock_t ri_startblock;
|
||||
xfs_extlen_t ri_blockcount;
|
||||
};
|
||||
|
||||
extern int xfs_refcount_increase_extent(struct xfs_mount *mp,
|
||||
struct xfs_defer_ops *dfops, struct xfs_bmbt_irec *irec);
|
||||
extern int xfs_refcount_decrease_extent(struct xfs_mount *mp,
|
||||
struct xfs_defer_ops *dfops, struct xfs_bmbt_irec *irec);
|
||||
|
||||
extern void xfs_refcount_finish_one_cleanup(struct xfs_trans *tp,
|
||||
struct xfs_btree_cur *rcur, int error);
|
||||
extern int xfs_refcount_finish_one(struct xfs_trans *tp,
|
||||
struct xfs_defer_ops *dfops, enum xfs_refcount_intent_type type,
|
||||
xfs_fsblock_t startblock, xfs_extlen_t blockcount,
|
||||
xfs_fsblock_t *new_fsb, xfs_extlen_t *new_len,
|
||||
struct xfs_btree_cur **pcur);
|
||||
|
||||
extern int xfs_refcount_find_shared(struct xfs_btree_cur *cur,
|
||||
xfs_agblock_t agbno, xfs_extlen_t aglen, xfs_agblock_t *fbno,
|
||||
xfs_extlen_t *flen, bool find_end_of_shared);
|
||||
|
||||
extern int xfs_refcount_alloc_cow_extent(struct xfs_mount *mp,
|
||||
struct xfs_defer_ops *dfops, xfs_fsblock_t fsb,
|
||||
xfs_extlen_t len);
|
||||
extern int xfs_refcount_free_cow_extent(struct xfs_mount *mp,
|
||||
struct xfs_defer_ops *dfops, xfs_fsblock_t fsb,
|
||||
xfs_extlen_t len);
|
||||
extern int xfs_refcount_recover_cow_leftovers(struct xfs_mount *mp,
|
||||
xfs_agnumber_t agno);
|
||||
|
||||
#endif /* __XFS_REFCOUNT_H__ */
|
||||
@@ -0,0 +1,451 @@
|
||||
/*
|
||||
* Copyright (C) 2016 Oracle. All Rights Reserved.
|
||||
*
|
||||
* Author: Darrick J. Wong <darrick.wong@oracle.com>
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or
|
||||
* modify it under the terms of the GNU General Public License
|
||||
* as published by the Free Software Foundation; either version 2
|
||||
* of the License, or (at your option) any later version.
|
||||
*
|
||||
* This program is distributed in the hope that it would be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
* GNU General Public License for more details.
|
||||
*
|
||||
* You should have received a copy of the GNU General Public License
|
||||
* along with this program; if not, write the Free Software Foundation,
|
||||
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
|
||||
*/
|
||||
#include "xfs.h"
|
||||
#include "xfs_fs.h"
|
||||
#include "xfs_shared.h"
|
||||
#include "xfs_format.h"
|
||||
#include "xfs_log_format.h"
|
||||
#include "xfs_trans_resv.h"
|
||||
#include "xfs_sb.h"
|
||||
#include "xfs_mount.h"
|
||||
#include "xfs_btree.h"
|
||||
#include "xfs_bmap.h"
|
||||
#include "xfs_refcount_btree.h"
|
||||
#include "xfs_alloc.h"
|
||||
#include "xfs_error.h"
|
||||
#include "xfs_trace.h"
|
||||
#include "xfs_cksum.h"
|
||||
#include "xfs_trans.h"
|
||||
#include "xfs_bit.h"
|
||||
#include "xfs_rmap.h"
|
||||
|
||||
static struct xfs_btree_cur *
|
||||
xfs_refcountbt_dup_cursor(
|
||||
struct xfs_btree_cur *cur)
|
||||
{
|
||||
return xfs_refcountbt_init_cursor(cur->bc_mp, cur->bc_tp,
|
||||
cur->bc_private.a.agbp, cur->bc_private.a.agno,
|
||||
cur->bc_private.a.dfops);
|
||||
}
|
||||
|
||||
STATIC void
|
||||
xfs_refcountbt_set_root(
|
||||
struct xfs_btree_cur *cur,
|
||||
union xfs_btree_ptr *ptr,
|
||||
int inc)
|
||||
{
|
||||
struct xfs_buf *agbp = cur->bc_private.a.agbp;
|
||||
struct xfs_agf *agf = XFS_BUF_TO_AGF(agbp);
|
||||
xfs_agnumber_t seqno = be32_to_cpu(agf->agf_seqno);
|
||||
struct xfs_perag *pag = xfs_perag_get(cur->bc_mp, seqno);
|
||||
|
||||
ASSERT(ptr->s != 0);
|
||||
|
||||
agf->agf_refcount_root = ptr->s;
|
||||
be32_add_cpu(&agf->agf_refcount_level, inc);
|
||||
pag->pagf_refcount_level += inc;
|
||||
xfs_perag_put(pag);
|
||||
|
||||
xfs_alloc_log_agf(cur->bc_tp, agbp,
|
||||
XFS_AGF_REFCOUNT_ROOT | XFS_AGF_REFCOUNT_LEVEL);
|
||||
}
|
||||
|
||||
STATIC int
|
||||
xfs_refcountbt_alloc_block(
|
||||
struct xfs_btree_cur *cur,
|
||||
union xfs_btree_ptr *start,
|
||||
union xfs_btree_ptr *new,
|
||||
int *stat)
|
||||
{
|
||||
struct xfs_buf *agbp = cur->bc_private.a.agbp;
|
||||
struct xfs_agf *agf = XFS_BUF_TO_AGF(agbp);
|
||||
struct xfs_alloc_arg args; /* block allocation args */
|
||||
int error; /* error return value */
|
||||
|
||||
XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY);
|
||||
|
||||
memset(&args, 0, sizeof(args));
|
||||
args.tp = cur->bc_tp;
|
||||
args.mp = cur->bc_mp;
|
||||
args.type = XFS_ALLOCTYPE_NEAR_BNO;
|
||||
args.fsbno = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno,
|
||||
xfs_refc_block(args.mp));
|
||||
args.firstblock = args.fsbno;
|
||||
xfs_rmap_ag_owner(&args.oinfo, XFS_RMAP_OWN_REFC);
|
||||
args.minlen = args.maxlen = args.prod = 1;
|
||||
args.resv = XFS_AG_RESV_METADATA;
|
||||
|
||||
error = xfs_alloc_vextent(&args);
|
||||
if (error)
|
||||
goto out_error;
|
||||
trace_xfs_refcountbt_alloc_block(cur->bc_mp, cur->bc_private.a.agno,
|
||||
args.agbno, 1);
|
||||
if (args.fsbno == NULLFSBLOCK) {
|
||||
XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT);
|
||||
*stat = 0;
|
||||
return 0;
|
||||
}
|
||||
ASSERT(args.agno == cur->bc_private.a.agno);
|
||||
ASSERT(args.len == 1);
|
||||
|
||||
new->s = cpu_to_be32(args.agbno);
|
||||
be32_add_cpu(&agf->agf_refcount_blocks, 1);
|
||||
xfs_alloc_log_agf(cur->bc_tp, agbp, XFS_AGF_REFCOUNT_BLOCKS);
|
||||
|
||||
XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT);
|
||||
*stat = 1;
|
||||
return 0;
|
||||
|
||||
out_error:
|
||||
XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR);
|
||||
return error;
|
||||
}
|
||||
|
||||
STATIC int
|
||||
xfs_refcountbt_free_block(
|
||||
struct xfs_btree_cur *cur,
|
||||
struct xfs_buf *bp)
|
||||
{
|
||||
struct xfs_mount *mp = cur->bc_mp;
|
||||
struct xfs_buf *agbp = cur->bc_private.a.agbp;
|
||||
struct xfs_agf *agf = XFS_BUF_TO_AGF(agbp);
|
||||
xfs_fsblock_t fsbno = XFS_DADDR_TO_FSB(mp, XFS_BUF_ADDR(bp));
|
||||
struct xfs_owner_info oinfo;
|
||||
int error;
|
||||
|
||||
trace_xfs_refcountbt_free_block(cur->bc_mp, cur->bc_private.a.agno,
|
||||
XFS_FSB_TO_AGBNO(cur->bc_mp, fsbno), 1);
|
||||
xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_REFC);
|
||||
be32_add_cpu(&agf->agf_refcount_blocks, -1);
|
||||
xfs_alloc_log_agf(cur->bc_tp, agbp, XFS_AGF_REFCOUNT_BLOCKS);
|
||||
error = xfs_free_extent(cur->bc_tp, fsbno, 1, &oinfo,
|
||||
XFS_AG_RESV_METADATA);
|
||||
if (error)
|
||||
return error;
|
||||
|
||||
return error;
|
||||
}
|
||||
|
||||
STATIC int
|
||||
xfs_refcountbt_get_minrecs(
|
||||
struct xfs_btree_cur *cur,
|
||||
int level)
|
||||
{
|
||||
return cur->bc_mp->m_refc_mnr[level != 0];
|
||||
}
|
||||
|
||||
STATIC int
|
||||
xfs_refcountbt_get_maxrecs(
|
||||
struct xfs_btree_cur *cur,
|
||||
int level)
|
||||
{
|
||||
return cur->bc_mp->m_refc_mxr[level != 0];
|
||||
}
|
||||
|
||||
STATIC void
|
||||
xfs_refcountbt_init_key_from_rec(
|
||||
union xfs_btree_key *key,
|
||||
union xfs_btree_rec *rec)
|
||||
{
|
||||
key->refc.rc_startblock = rec->refc.rc_startblock;
|
||||
}
|
||||
|
||||
STATIC void
|
||||
xfs_refcountbt_init_high_key_from_rec(
|
||||
union xfs_btree_key *key,
|
||||
union xfs_btree_rec *rec)
|
||||
{
|
||||
__u32 x;
|
||||
|
||||
x = be32_to_cpu(rec->refc.rc_startblock);
|
||||
x += be32_to_cpu(rec->refc.rc_blockcount) - 1;
|
||||
key->refc.rc_startblock = cpu_to_be32(x);
|
||||
}
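The high key computed above is the last block covered by a record, that is startblock + blockcount - 1. A minimal sketch of that arithmetic in CPU byte order follows; the kernel version additionally converts from and to big-endian on-disk values, which this illustration omits, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Last block of the extent [start, start + count).
 * Assumes count >= 1, as an empty extent has no last block.
 */
static uint32_t sketch_refc_high_key(uint32_t start, uint32_t count)
{
    return start + count - 1;
}
```

For example, an extent starting at block 100 with 8 blocks covers blocks 100 through 107, so its low key is 100 and its high key is 107.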
|
||||
|
||||
STATIC void
|
||||
xfs_refcountbt_init_rec_from_cur(
|
||||
struct xfs_btree_cur *cur,
|
||||
union xfs_btree_rec *rec)
|
||||
{
|
||||
rec->refc.rc_startblock = cpu_to_be32(cur->bc_rec.rc.rc_startblock);
|
||||
rec->refc.rc_blockcount = cpu_to_be32(cur->bc_rec.rc.rc_blockcount);
|
||||
rec->refc.rc_refcount = cpu_to_be32(cur->bc_rec.rc.rc_refcount);
|
||||
}
|
||||
|
||||
STATIC void
|
||||
xfs_refcountbt_init_ptr_from_cur(
|
||||
struct xfs_btree_cur *cur,
|
||||
union xfs_btree_ptr *ptr)
|
||||
{
|
||||
struct xfs_agf *agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp);
|
||||
|
||||
ASSERT(cur->bc_private.a.agno == be32_to_cpu(agf->agf_seqno));
|
||||
ASSERT(agf->agf_refcount_root != 0);
|
||||
|
||||
ptr->s = agf->agf_refcount_root;
|
||||
}
|
||||
|
||||
STATIC __int64_t
|
||||
xfs_refcountbt_key_diff(
|
||||
struct xfs_btree_cur *cur,
|
||||
union xfs_btree_key *key)
|
||||
{
|
||||
struct xfs_refcount_irec *rec = &cur->bc_rec.rc;
|
||||
struct xfs_refcount_key *kp = &key->refc;
|
||||
|
||||
return (__int64_t)be32_to_cpu(kp->rc_startblock) - rec->rc_startblock;
|
||||
}
|
||||
|
||||
STATIC __int64_t
|
||||
xfs_refcountbt_diff_two_keys(
|
||||
struct xfs_btree_cur *cur,
|
||||
union xfs_btree_key *k1,
|
||||
union xfs_btree_key *k2)
|
||||
{
|
||||
return (__int64_t)be32_to_cpu(k1->refc.rc_startblock) -
|
||||
be32_to_cpu(k2->refc.rc_startblock);
|
||||
}
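Both comparison callbacks above widen the first 32-bit operand to a signed 64-bit value before subtracting. Without that cast the subtraction would be done in unsigned 32-bit arithmetic and a "smaller minus larger" result would wrap to a large positive number. The sketch below contrasts the two; it works on CPU-order values and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Signed difference of two 32-bit block numbers, as in the callbacks. */
static int64_t sketch_key_diff(uint32_t a, uint32_t b)
{
    return (int64_t)a - b;
}

/* What happens without the cast: the result wraps modulo 2^32. */
static uint32_t sketch_key_diff_wrapped(uint32_t a, uint32_t b)
{
    return a - b;
}
```

With a = 5 and b = 10 the widened version yields -5, while the unwidened one yields 4294967291, which would sort the keys in the wrong order.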
|
||||
|
||||
STATIC bool
|
||||
xfs_refcountbt_verify(
|
||||
struct xfs_buf *bp)
|
||||
{
|
||||
struct xfs_mount *mp = bp->b_target->bt_mount;
|
||||
struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp);
|
||||
struct xfs_perag *pag = bp->b_pag;
|
||||
unsigned int level;
|
||||
|
||||
if (block->bb_magic != cpu_to_be32(XFS_REFC_CRC_MAGIC))
|
||||
return false;
|
||||
|
||||
if (!xfs_sb_version_hasreflink(&mp->m_sb))
|
||||
return false;
|
||||
if (!xfs_btree_sblock_v5hdr_verify(bp))
|
||||
return false;
|
||||
|
||||
level = be16_to_cpu(block->bb_level);
|
||||
if (pag && pag->pagf_init) {
|
||||
if (level >= pag->pagf_refcount_level)
|
||||
return false;
|
||||
} else if (level >= mp->m_refc_maxlevels)
|
||||
return false;
|
||||
|
||||
return xfs_btree_sblock_verify(bp, mp->m_refc_mxr[level != 0]);
|
||||
}
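The level check in the verifier above picks its bound depending on what is known: if the per-AG data has been initialized, a block's level must be below the current tree height for that AG, otherwise it only has to be below the filesystem-wide maximum. A sketch of just that decision, with hypothetical parameter names standing in for the pag and mount fields, could look like this:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * level:       level stored in the btree block header
 * pag_init:    whether per-AG info (current tree height) is available
 * pag_levels:  current height of this AG's refcount btree
 * max_levels:  largest height any refcount btree may reach
 */
static bool sketch_level_ok(unsigned int level, bool pag_init,
                            unsigned int pag_levels,
                            unsigned int max_levels)
{
    if (pag_init)
        return level < pag_levels;
    return level < max_levels;
}
```

The tighter per-AG bound catches more corruption, while the fallback keeps the verifier usable before the AGF has been read.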
|
||||
|
||||
STATIC void
|
||||
xfs_refcountbt_read_verify(
|
||||
struct xfs_buf *bp)
|
||||
{
|
||||
if (!xfs_btree_sblock_verify_crc(bp))
|
||||
xfs_buf_ioerror(bp, -EFSBADCRC);
|
||||
else if (!xfs_refcountbt_verify(bp))
|
||||
xfs_buf_ioerror(bp, -EFSCORRUPTED);
|
||||
|
||||
if (bp->b_error) {
|
||||
trace_xfs_btree_corrupt(bp, _RET_IP_);
|
||||
xfs_verifier_error(bp);
|
||||
}
|
||||
}
|
||||
|
||||
STATIC void
|
||||
xfs_refcountbt_write_verify(
|
||||
struct xfs_buf *bp)
|
||||
{
|
||||
if (!xfs_refcountbt_verify(bp)) {
|
||||
trace_xfs_btree_corrupt(bp, _RET_IP_);
|
||||
xfs_buf_ioerror(bp, -EFSCORRUPTED);
|
||||
xfs_verifier_error(bp);
|
||||
return;
|
||||
}
|
||||
xfs_btree_sblock_calc_crc(bp);
|
||||
|
||||
}
|
||||
|
||||
const struct xfs_buf_ops xfs_refcountbt_buf_ops = {
|
||||
.name = "xfs_refcountbt",
|
||||
.verify_read = xfs_refcountbt_read_verify,
|
||||
.verify_write = xfs_refcountbt_write_verify,
|
||||
};
|
||||
|
||||
#if defined(DEBUG) || defined(XFS_WARN)
|
||||
STATIC int
|
||||
xfs_refcountbt_keys_inorder(
|
||||
struct xfs_btree_cur *cur,
|
||||
union xfs_btree_key *k1,
|
||||
union xfs_btree_key *k2)
|
||||
{
|
||||
return be32_to_cpu(k1->refc.rc_startblock) <
|
||||
be32_to_cpu(k2->refc.rc_startblock);
|
||||
}
|
||||
|
||||
STATIC int
|
||||
xfs_refcountbt_recs_inorder(
|
||||
struct xfs_btree_cur *cur,
|
||||
union xfs_btree_rec *r1,
|
||||
union xfs_btree_rec *r2)
|
||||
{
|
||||
return be32_to_cpu(r1->refc.rc_startblock) +
|
||||
be32_to_cpu(r1->refc.rc_blockcount) <=
|
||||
be32_to_cpu(r2->refc.rc_startblock);
|
||||
}
|
||||
#endif
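The two debug-only ordering checks above encode slightly different rules: keys must be strictly increasing by start block, while records must additionally not overlap, meaning one record has to end at or before the start of the next. The sketch below applies the record rule to a whole array. It uses CPU-order values, hypothetical names, and widens the sum to 64 bits to sidestep wraparound, which is a small deviation from the 32-bit arithmetic in the hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_refc_rec {
    uint32_t start;  /* first block of the extent */
    uint32_t count;  /* number of blocks */
};

/* r1 must end at or before the first block of r2. */
static bool sketch_recs_inorder(const struct sketch_refc_rec *r1,
                                const struct sketch_refc_rec *r2)
{
    return (uint64_t)r1->start + r1->count <= r2->start;
}

/* Check every adjacent pair in an array of n records. */
static bool sketch_all_inorder(const struct sketch_refc_rec *recs, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (!sketch_recs_inorder(&recs[i - 1], &recs[i]))
            return false;
    return true;
}
```

Adjacent extents that touch (for instance blocks 0..3 followed by 4..5) pass the check; only genuine overlap or reversed order fails it.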
|
||||
|
||||
static const struct xfs_btree_ops xfs_refcountbt_ops = {
|
||||
.rec_len = sizeof(struct xfs_refcount_rec),
|
||||
.key_len = sizeof(struct xfs_refcount_key),
|
||||
|
||||
.dup_cursor = xfs_refcountbt_dup_cursor,
|
||||
.set_root = xfs_refcountbt_set_root,
|
||||
.alloc_block = xfs_refcountbt_alloc_block,
|
||||
.free_block = xfs_refcountbt_free_block,
|
||||
.get_minrecs = xfs_refcountbt_get_minrecs,
|
||||
.get_maxrecs = xfs_refcountbt_get_maxrecs,
|
||||
.init_key_from_rec = xfs_refcountbt_init_key_from_rec,
|
||||
.init_high_key_from_rec = xfs_refcountbt_init_high_key_from_rec,
|
||||
.init_rec_from_cur = xfs_refcountbt_init_rec_from_cur,
|
||||
.init_ptr_from_cur = xfs_refcountbt_init_ptr_from_cur,
|
||||
.key_diff = xfs_refcountbt_key_diff,
|
||||
.buf_ops = &xfs_refcountbt_buf_ops,
|
||||
.diff_two_keys = xfs_refcountbt_diff_two_keys,
|
||||
#if defined(DEBUG) || defined(XFS_WARN)
|
||||
.keys_inorder = xfs_refcountbt_keys_inorder,
|
||||
.recs_inorder = xfs_refcountbt_recs_inorder,
|
||||
#endif
|
||||
};
|
||||
|
||||
/*
|
||||
* Allocate a new refcount btree cursor.
|
||||
*/
|
||||
struct xfs_btree_cur *
|
||||
xfs_refcountbt_init_cursor(
|
||||
struct xfs_mount *mp,
|
||||
struct xfs_trans *tp,
|
||||
struct xfs_buf *agbp,
|
||||
xfs_agnumber_t agno,
|
||||
struct xfs_defer_ops *dfops)
|
||||
{
|
||||
struct xfs_agf *agf = XFS_BUF_TO_AGF(agbp);
|
||||
struct xfs_btree_cur *cur;
|
||||
|
||||
ASSERT(agno != NULLAGNUMBER);
|
||||
ASSERT(agno < mp->m_sb.sb_agcount);
|
||||
cur = kmem_zone_zalloc(xfs_btree_cur_zone, KM_NOFS);
|
||||
|
||||
cur->bc_tp = tp;
|
||||
cur->bc_mp = mp;
|
||||
cur->bc_btnum = XFS_BTNUM_REFC;
|
||||
cur->bc_blocklog = mp->m_sb.sb_blocklog;
|
||||
cur->bc_ops = &xfs_refcountbt_ops;
|
||||
|
||||
cur->bc_nlevels = be32_to_cpu(agf->agf_refcount_level);
|
||||
|
||||
cur->bc_private.a.agbp = agbp;
|
||||
cur->bc_private.a.agno = agno;
|
||||
cur->bc_private.a.dfops = dfops;
|
||||
cur->bc_flags |= XFS_BTREE_CRC_BLOCKS;
|
||||
|
||||
cur->bc_private.a.priv.refc.nr_ops = 0;
|
||||
cur->bc_private.a.priv.refc.shape_changes = 0;
|
||||
|
||||
return cur;
|
||||
}
|
||||
|
||||
/*
|
||||
* Calculate the number of records in a refcount btree block.
|
||||
*/
|
||||
int
|
||||
xfs_refcountbt_maxrecs(
|
||||
struct xfs_mount *mp,
|
||||
int blocklen,
|
||||
bool leaf)
|
||||
{
|
||||
blocklen -= XFS_REFCOUNT_BLOCK_LEN;
|
||||
|
||||
if (leaf)
|
||||
return blocklen / sizeof(struct xfs_refcount_rec);
|
||||
return blocklen / (sizeof(struct xfs_refcount_key) +
|
||||
sizeof(xfs_refcount_ptr_t));
|
||||
}
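The capacity calculation above subtracts the block header from the block size and then divides by the per-entry size: one record per entry in a leaf, or one key plus one pointer per entry in an interior node. A worked sketch follows. The sizes are assumptions for illustration (a 56-byte header, 12-byte records made of three 32-bit fields, 4-byte keys and 4-byte pointers) rather than values taken from this diff, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed sizes, in bytes; the kernel derives these from its structs. */
#define SKETCH_BLOCK_HDR_LEN  56
#define SKETCH_REC_LEN        12  /* startblock + blockcount + refcount */
#define SKETCH_KEY_LEN         4
#define SKETCH_PTR_LEN         4

/* Number of entries that fit in one btree block of blocklen bytes. */
static int sketch_refcountbt_maxrecs(int blocklen, bool leaf)
{
    blocklen -= SKETCH_BLOCK_HDR_LEN;

    if (leaf)
        return blocklen / SKETCH_REC_LEN;
    return blocklen / (SKETCH_KEY_LEN + SKETCH_PTR_LEN);
}
```

Under these assumptions a 4096-byte block leaves 4040 usable bytes, giving 336 records per leaf and 505 key/pointer pairs per interior node.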
|
||||
|
||||
/* Compute the maximum height of a refcount btree. */
|
||||
void
|
||||
xfs_refcountbt_compute_maxlevels(
|
||||
struct xfs_mount *mp)
|
||||
{
|
||||
mp->m_refc_maxlevels = xfs_btree_compute_maxlevels(mp,
|
||||
mp->m_refc_mnr, mp->m_sb.sb_agblocks);
|
||||
}
|
||||
|
||||
/* Calculate the refcount btree size for some records. */
|
||||
xfs_extlen_t
|
||||
xfs_refcountbt_calc_size(
|
||||
struct xfs_mount *mp,
|
||||
unsigned long long len)
|
||||
{
|
||||
return xfs_btree_calc_size(mp, mp->m_refc_mnr, len);
|
||||
}
|
||||
|
||||
/*
|
||||
* Calculate the maximum refcount btree size.
|
||||
*/
|
||||
xfs_extlen_t
|
||||
xfs_refcountbt_max_size(
|
||||
struct xfs_mount *mp)
|
||||
{
|
||||
/* Bail out if we're uninitialized, which can happen in mkfs. */
|
||||
if (mp->m_refc_mxr[0] == 0)
|
||||
return 0;
|
||||
|
||||
return xfs_refcountbt_calc_size(mp, mp->m_sb.sb_agblocks);
|
||||
}
|
||||
|
||||
/*
|
||||
* Figure out how many blocks to reserve and how many are used by this btree.
|
||||
*/
|
||||
int
|
||||
xfs_refcountbt_calc_reserves(
|
||||
struct xfs_mount *mp,
|
||||
xfs_agnumber_t agno,
|
||||
xfs_extlen_t *ask,
|
||||
xfs_extlen_t *used)
|
||||
{
|
||||
struct xfs_buf *agbp;
|
||||
struct xfs_agf *agf;
|
||||
xfs_extlen_t tree_len;
|
||||
int error;
|
||||
|
||||
if (!xfs_sb_version_hasreflink(&mp->m_sb))
|
||||
return 0;
|
||||
|
||||
*ask += xfs_refcountbt_max_size(mp);
|
||||
|
||||
error = xfs_alloc_read_agf(mp, NULL, agno, 0, &agbp);
|
||||
if (error)
|
||||
return error;
|
||||
|
||||
agf = XFS_BUF_TO_AGF(agbp);
|
||||
tree_len = be32_to_cpu(agf->agf_refcount_blocks);
|
||||
xfs_buf_relse(agbp);
|
||||
|
||||
*used += tree_len;
|
||||
|
||||
return error;
|
||||
}
|
||||
Some files were not shown because too many files have changed in this diff