Obtain a reference to the crc32c driver if needed for the v2 checksum.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add in the necessary code so that journal clients can enable the new
journal checksumming features.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Define flags and allocate space in on-disk journal structures to support
checksumming of journal metadata.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Previously we were only enabling the 64-bit jbd2 feature if the number
of blocks in the file system was greater 2**32-1. The problem with
this is that it makes it harder to test the 64-bit journal code paths
with small file systems, since a small test file system would with the
64-bit ext4 feature enable would use a 64-bit file system on-disk data
structures, but use a 32-bit journal.
This would also cause problems when trying to do an online resize to
grow the filesystem above the 2**32-1 boundary. Fortunately the patch
to support online resize for 64-bit file systems hasn't been merged
yet, so this problem hasn't arisen in practice.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
None of this function callers ever pass in a NULL inode pointer, so
this check is unnecessary, and the else clause is dead code. (This
change should make the code coverage people a little happier. :-)
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
metadata_csum supersedes uninit_bg. Convert the ROCOMPAT uninit_bg
flag check to a helper function that covers both, and make the
checksum calculation algorithm use either crc16 or the metadata_csum
chosen algorithm depending on which flag is set. Print a warning if
we try to mount a filesystem with both feature flags set.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Calculate and verify the checksums of extended attribute blocks. This
only applies to separate EA blocks that are pointed to by
inode->i_file_acl (i.e. external EA blocks); the checksum lives in
the EA header.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Calculate and verify the checksums for directory leaf blocks
(i.e. blocks that only contain actual directory entries). The
checksum lives in what looks to be an unused directory entry with a 0
name_len at the end of the block. This scheme is not used for
internal htree nodes because the mechanism in place there only costs
one dx_entry, whereas the "empty" directory entry would cost two
dx_entries.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Calculate and verify the checksum for directory index tree (htree)
node blocks. The checksum is stored in the last 4 bytes of the htree
block and requires the dx_entry array to stop 1 dx_entry short of the
end of the block.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Calculate and verify the checksum for each extent tree block. The
checksum is located in the space immediately after the last possible
ext4_extent in the block. The space is is typically the last 4-8
bytes in the block.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Compute and verify the checksum of the block bitmap; this checksum is
stored in the block group descriptor.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Compute and verify the checksum of the inode bitmap; the checkum is
stored in the block group descriptor.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This patch introduces to ext4 the ability to calculate and verify
inode checksums. This requires the use of a new ro compatibility flag
and some accompanying e2fsprogs patches to provide the relevant
features in tune2fs and e2fsck. The inode generation changes have
been integrated into this patch.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Calculate and verify the superblock checksum. Since the UUID and
block group number are embedded in each copy of the superblock, we
need only checksum the entire block. Refactor some of the code to
eliminate open-coding of the checksum update call.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Obtain a reference to the cryptoapi and crc32c if we mount a
filesystem with metadata checksumming enabled.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Record the type of checksum algorithm we're using for metadata in the
superblock, in case we ever want/need to change the algorithm.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Define flags and change structure definitions to allow checksumming of
ext4 metadata.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Create a new BH_Verified flag to indicate that we've verified all the
data in a buffer_head for correctness. This allows us to bypass
expensive verification steps when they are not necessary without
missing them when they are.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Pull power management fixes from Rafael J. Wysocki:
"Fix for an issue causing hibernation to hang on systems with highmem
(that practically means i386) due to broken memory management (bug
introduced in 3.2, so -stable material) and PM documentation update
making the freezer documentation follow the code again after some
recent updates."
* tag 'pm-for-3.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / Freezer / Docs: Update documentation about freezing of tasks
PM / Hibernate: fix the number of pages used for hibernate/thaw buffering
The autofs packet size has had a very unfortunate size problem on x86:
because the alignment of 'u64' differs in 32-bit and 64-bit modes, and
because the packet data was not 8-byte aligned, the size of the autofsv5
packet structure differed between 32-bit and 64-bit modes despite
looking otherwise identical (300 vs 304 bytes respectively).
We first fixed that up by making the 64-bit compat mode know about this
problem in commit a32744d4ab ("autofs: work around unhappy compat
problem on x86-64"), and that made a 32-bit 'systemd' work happily on a
64-bit kernel because everything then worked the same way as on a 32-bit
kernel.
But it turned out that 'automount' had actually known and worked around
this problem in user space, so fixing the kernel to do the proper 32-bit
compatibility handling actually *broke* 32-bit automount on a 64-bit
kernel, because it knew that the packet sizes were wrong and expected
those incorrect sizes.
As a result, we ended up reverting that compatibility mode fix, and
thus breaking systemd again, in commit fcbf94b9de.
With both automount and systemd doing a single read() system call, and
verifying that they get *exactly* the size they expect but using
different sizes, it seemed that fixing one of them inevitably seemed to
break the other. At one point, a patch I seriously considered applying
from Michael Tokarev did a "strcmp()" to see if it was automount that
was doing the operation. Ugly, ugly.
However, a prettier solution exists now thanks to the packetized pipe
mode. By marking the communication pipe as being packetized (by simply
setting the O_DIRECT flag), we can always just write the bigger packet
size, and if user-space does a smaller read, it will just get that
partial end result and the extra alignment padding will simply be thrown
away.
This makes both automount and systemd happy, since they now get the size
they asked for, and the kernel side of autofs simply no longer needs to
care - it could pad out the packet arbitrarily.
Of course, if there is some *other* user of autofs (please, please,
please tell me it ain't so - and we haven't heard of any) that tries to
read the packets with multiple writes, that other user will now be
broken - the whole point of the packetized mode is that one system call
gets exactly one packet, and you cannot read a packet in pieces.
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: David Miller <davem@davemloft.net>
Cc: Ian Kent <raven@themaw.net>
Cc: Thomas Meyer <thomas@m3y3r.de>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The file Documentation/power/freezing-of-tasks.txt was still referencing
the TIF_FREEZE flag, that was removed by the commit
d88e4cb67197d007fb778d62fe17360e970d5bfa(freezer: remove now unused
TIF_FREEZE).
This patch removes all the references of TIF_FREEZE that were left
behind.
Signed-off-by: Marcos Paulo de Souza <marcos.souza.org@gmail.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
The actual internal pipe implementation is already really about
individual packets (called "pipe buffers"), and this simply exposes that
as a special packetized mode.
When we are in the packetized mode (marked by O_DIRECT as suggested by
Alan Cox), a write() on a pipe will not merge the new data with previous
writes, so each write will get a pipe buffer of its own. The pipe
buffer is then marked with the PIPE_BUF_FLAG_PACKET flag, which in turn
will tell the reader side to break the read at that boundary (and throw
away any partial packet contents that do not fit in the read buffer).
End result: as long as you do writes less than PIPE_BUF in size (so that
the pipe doesn't have to split them up), you can now treat the pipe as a
packet interface, where each read() system call will read one packet at
a time. You can just use a sufficiently big read buffer (PIPE_BUF is
sufficient, since bigger than that doesn't guarantee atomicity anyway),
and the return value of the read() will naturally give you the size of
the packet.
NOTE! We do not support zero-sized packets, and zero-sized reads and
writes to a pipe continue to be no-ops. Also note that big packets will
currently be split at write time, but that the size at which that
happens is not really specified (except that it's bigger than PIPE_BUF).
Currently that limit is the system page size, but we might want to
explicitly support bigger packets some day.
The main user for this is going to be the autofs packet interface,
allowing us to stop having to care so deeply about exact packet sizes
(which have had bugs with 32/64-bit compatibility modes). But user
space can create packetized pipes with "pipe2(fd, O_DIRECT)", which will
fail with an EINVAL on kernels that do not support this interface.
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: David Miller <davem@davemloft.net>
Cc: Ian Kent <raven@themaw.net>
Cc: Thomas Meyer <thomas@m3y3r.de>
Cc: stable@kernel.org # needed for systemd/autofs interaction fix
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull staging tree fixes from Greg Kroah-Hartman:
"Here are some tiny drivers/staging/ bugfixes. Some build fixes that
were recently reported, as well as one kfree bug that is hitting a
number of users."
* tag 'staging-3.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: ozwpan: Fix bug where kfree is called twice.
staging: octeon-ethernet: fix build errors by including interrupt.h
staging: zcache: fix Kconfig crypto dependency
staging: tidspbridge: remove usage of OMAP2_L4_IO_ADDRESS