Merge branch 'mauro' into docs-next

Mauro sez:

  There are lots of plain text documents under Documentation/filesystems.
  Manually convert several of those to ReST and add them to the index file.

Signed-off-by: Jonathan Corbet <corbet@lwn.net>
This commit is contained in:
Jonathan Corbet
2020-03-02 14:11:43 -07:00
47 changed files with 2744 additions and 2040 deletions
@@ -1,7 +1,10 @@
v9fs: Plan 9 Resource Sharing for Linux
=======================================
.. SPDX-License-Identifier: GPL-2.0
ABOUT
=======================================
v9fs: Plan 9 Resource Sharing for Linux
=======================================
About
=====
v9fs is a Unix implementation of the Plan 9 9p remote filesystem protocol.
@@ -14,32 +17,34 @@ and Maya Gokhale. Additional development by Greg Watson
The best detailed explanation of the Linux implementation and applications of
the 9p client is available in the form of a USENIX paper:
http://www.usenix.org/events/usenix05/tech/freenix/hensbergen.html
Other applications are described in the following papers:
* XCPU & Clustering
http://xcpu.org/papers/xcpu-talk.pdf
* KVMFS: control file system for KVM
http://xcpu.org/papers/kvmfs.pdf
* CellFS: A New Programming Model for the Cell BE
http://xcpu.org/papers/cellfs-talk.pdf
* PROSE I/O: Using 9p to enable Application Partitions
http://plan9.escet.urjc.es/iwp9/cready/PROSE_iwp9_2006.pdf
* VirtFS: A Virtualization Aware File System pass-through
http://goo.gl/3WPDg
USAGE
* XCPU & Clustering
http://xcpu.org/papers/xcpu-talk.pdf
* KVMFS: control file system for KVM
http://xcpu.org/papers/kvmfs.pdf
* CellFS: A New Programming Model for the Cell BE
http://xcpu.org/papers/cellfs-talk.pdf
* PROSE I/O: Using 9p to enable Application Partitions
http://plan9.escet.urjc.es/iwp9/cready/PROSE_iwp9_2006.pdf
* VirtFS: A Virtualization Aware File System pass-through
http://goo.gl/3WPDg
Usage
=====
For remote file server:
For remote file server::
mount -t 9p 10.10.1.2 /mnt/9
For Plan 9 From User Space applications (http://swtch.com/plan9)
For Plan 9 From User Space applications (http://swtch.com/plan9)::
mount -t 9p `namespace`/acme /mnt/9 -o trans=unix,uname=$USER
For server running on QEMU host with virtio transport:
For server running on QEMU host with virtio transport::
mount -t 9p -o trans=virtio <mount_tag> /mnt/9
@@ -48,18 +53,22 @@ mount points. Each 9P export is seen by the client as a virtio device with an
associated "mount_tag" property. Available mount tags can be
seen by reading /sys/bus/virtio/drivers/9pnet_virtio/virtio<n>/mount_tag files.
OPTIONS
Options
=======
============= ===============================================================
trans=name select an alternative transport. Valid options are
currently:
unix - specifying a named pipe mount point
tcp - specifying a normal TCP/IP connection
fd - used passed file descriptors for connection
(see rfdno and wfdno)
virtio - connect to the next virtio channel available
(from QEMU with trans_virtio module)
rdma - connect to a specified RDMA channel
======== ============================================
unix specifying a named pipe mount point
tcp specifying a normal TCP/IP connection
fd used passed file descriptors for connection
(see rfdno and wfdno)
virtio connect to the next virtio channel available
(from QEMU with trans_virtio module)
rdma connect to a specified RDMA channel
======== ============================================
uname=name user name to attempt mount as on the remote server. The
server may override or ignore this value. Certain user
@@ -69,28 +78,36 @@ OPTIONS
offering several exported file systems.
cache=mode specifies a caching policy. By default, no caches are used.
none = default no cache policy, metadata and data
none
default no cache policy, metadata and data
alike are synchronous.
loose = no attempts are made at consistency,
loose
no attempts are made at consistency,
intended for exclusive, read-only mounts
fscache = use FS-Cache for a persistent, read-only
fscache
use FS-Cache for a persistent, read-only
cache backend.
mmap = minimal cache that is only used for read-write
mmap
minimal cache that is only used for read-write
mmap. Northing else is cached, like cache=none
debug=n specifies debug level. The debug level is a bitmask.
0x01 = display verbose error messages
0x02 = developer debug (DEBUG_CURRENT)
0x04 = display 9p trace
0x08 = display VFS trace
0x10 = display Marshalling debug
0x20 = display RPC debug
0x40 = display transport debug
0x80 = display allocation debug
0x100 = display protocol message debug
0x200 = display Fid debug
0x400 = display packet debug
0x800 = display fscache tracing debug
===== ================================
0x01 display verbose error messages
0x02 developer debug (DEBUG_CURRENT)
0x04 display 9p trace
0x08 display VFS trace
0x10 display Marshalling debug
0x20 display RPC debug
0x40 display transport debug
0x80 display allocation debug
0x100 display protocol message debug
0x200 display Fid debug
0x400 display packet debug
0x800 display fscache tracing debug
===== ================================
rfdno=n the file descriptor for reading with trans=fd
@@ -103,9 +120,12 @@ OPTIONS
noextend force legacy mode (no 9p2000.u or 9p2000.L semantics)
version=name Select 9P protocol version. Valid options are:
9p2000 - Legacy mode (same as noextend)
9p2000.u - Use 9P2000.u protocol
9p2000.L - Use 9P2000.L protocol
======== ==============================
9p2000 Legacy mode (same as noextend)
9p2000.u Use 9P2000.u protocol
9p2000.L Use 9P2000.L protocol
======== ==============================
dfltuid attempt to mount as a particular uid
@@ -118,22 +138,27 @@ OPTIONS
hosts. This functionality will be expanded in later versions.
access there are four access modes.
user = if a user tries to access a file on v9fs
user
if a user tries to access a file on v9fs
filesystem for the first time, v9fs sends an
attach command (Tattach) for that user.
This is the default mode.
<uid> = allows only user with uid=<uid> to access
<uid>
allows only user with uid=<uid> to access
the files on the mounted filesystem
any = v9fs does single attach and performs all
any
v9fs does single attach and performs all
operations as one user
client = ACL based access check on the 9p client
clien
ACL based access check on the 9p client
side for access validation
cachetag cache tag to use the specified persistent cache.
cache tags for existing cache sessions can be listed at
/sys/fs/9p/caches. (applies only to cache=fscache)
============= ===============================================================
RESOURCES
Resources
=========
Protocol specifications are maintained on github:
@@ -158,4 +183,3 @@ http://plan9.bell-labs.com/plan9
For information on Plan 9 from User Space (Plan 9 applications and libraries
ported to Linux/BSD/OSX/etc) check out http://swtch.com/plan9
@@ -1,3 +1,9 @@
.. SPDX-License-Identifier: GPL-2.0
===============================
Acorn Disc Filing System - ADFS
===============================
Filesystems supported by ADFS
-----------------------------
@@ -25,6 +31,7 @@ directory updates, specifically updating the access mode and timestamp.
Mount options for ADFS
----------------------
============ ======================================================
uid=nnn All files in the partition will be owned by
user id nnn. Default 0 (root).
gid=nnn All files in the partition will be in group
@@ -36,22 +43,23 @@ Mount options for ADFS
ftsuffix=n When ftsuffix=0, no file type suffix will be applied.
When ftsuffix=1, a hexadecimal suffix corresponding to
the RISC OS file type will be added. Default 0.
============ ======================================================
Mapping of ADFS permissions to Linux permissions
------------------------------------------------
ADFS permissions consist of the following:
Owner read
Owner write
Other read
Other write
- Owner read
- Owner write
- Other read
- Other write
(In older versions, an 'execute' permission did exist, but this
does not hold the same meaning as the Linux 'execute' permission
and is now obsolete).
does not hold the same meaning as the Linux 'execute' permission
and is now obsolete).
The mapping is performed as follows:
The mapping is performed as follows::
Owner read -> -r--r--r--
Owner write -> --w--w---w
@@ -66,17 +74,18 @@ Mapping of ADFS permissions to Linux permissions
Possible other mode permissions -> ----rwxrwx
Hence, with the default masks, if a file is owner read/write, and
not a UnixExec filetype, then the permissions will be:
not a UnixExec filetype, then the permissions will be::
-rw-------
However, if the masks were ownmask=0770,othmask=0007, then this would
be modified to:
be modified to::
-rw-rw----
There is no restriction on what you can do with these masks. You may
wish that either read bits give read access to the file for all, but
keep the default write protection (ownmask=0755,othmask=0577):
keep the default write protection (ownmask=0755,othmask=0577)::
-rw-r--r--
@@ -1,9 +1,13 @@
.. SPDX-License-Identifier: GPL-2.0
=============================
Overview of Amiga Filesystems
=============================
Not all varieties of the Amiga filesystems are supported for reading and
writing. The Amiga currently knows six different filesystems:
============== ===============================================================
DOS\0 The old or original filesystem, not really suited for
hard disks and normally not used on them, either.
Supported read/write.
@@ -23,6 +27,7 @@ DOS\4 The original filesystem with directory cache. The directory
sense on hard disks. Supported read only.
DOS\5 The Fast File System with directory cache. Supported read only.
============== ===============================================================
All of the above filesystems allow block sizes from 512 to 32K bytes.
Supported block sizes are: 512, 1024, 2048 and 4096 bytes. Larger blocks
@@ -36,14 +41,18 @@ are supported, too.
Mount options for the AFFS
==========================
protect If this option is set, the protection bits cannot be altered.
protect
If this option is set, the protection bits cannot be altered.
setuid[=uid] This sets the owner of all files and directories in the file
setuid[=uid]
This sets the owner of all files and directories in the file
system to uid or the uid of the current user, respectively.
setgid[=gid] Same as above, but for gid.
setgid[=gid]
Same as above, but for gid.
mode=mode Sets the mode flags to the given (octal) value, regardless
mode=mode
Sets the mode flags to the given (octal) value, regardless
of the original permissions. Directories will get an x
permission if the corresponding r bit is set.
This is useful since most of the plain AmigaOS files
@@ -53,33 +62,41 @@ nofilenametruncate
The file system will return an error when filename exceeds
standard maximum filename length (30 characters).
reserved=num Sets the number of reserved blocks at the start of the
reserved=num
Sets the number of reserved blocks at the start of the
partition to num. You should never need this option.
Default is 2.
root=block Sets the block number of the root block. This should never
root=block
Sets the block number of the root block. This should never
be necessary.
bs=blksize Sets the blocksize to blksize. Valid block sizes are 512,
bs=blksize
Sets the blocksize to blksize. Valid block sizes are 512,
1024, 2048 and 4096. Like the root option, this should
never be necessary, as the affs can figure it out itself.
quiet The file system will not return an error for disallowed
quiet
The file system will not return an error for disallowed
mode changes.
verbose The volume name, file system type and block size will
verbose
The volume name, file system type and block size will
be written to the syslog when the filesystem is mounted.
mufs The filesystem is really a muFS, also it doesn't
mufs
The filesystem is really a muFS, also it doesn't
identify itself as one. This option is necessary if
the filesystem wasn't formatted as muFS, but is used
as one.
prefix=path Path will be prefixed to every absolute path name of
prefix=path
Path will be prefixed to every absolute path name of
symbolic links on an AFFS partition. Default = "/".
(See below.)
volume=name When symbolic links with an absolute path are created
volume=name
When symbolic links with an absolute path are created
on an AFFS partition, name will be prepended as the
volume name. Default = "" (empty string).
(See below.)
@@ -119,7 +136,7 @@ The Linux rwxrwxrwx file mode is handled as follows:
- All other flags (suid, sgid, ...) are ignored and will
not be retained.
Newly created files and directories will get the user and group ID
of the current user and a mode according to the umask.
@@ -148,11 +165,13 @@ might be "User", "WB" and "Graphics", the mount points /amiga/User,
Examples
========
Command line:
Command line::
mount Archive/Amiga/Workbench3.1.adf /mnt -t affs -o loop,verbose
mount /dev/sda3 /Amiga -t affs
/etc/fstab entry:
/etc/fstab entry::
/dev/sdb5 /amiga/Workbench affs noauto,user,exec,verbose 0 0
IMPORTANT NOTE
@@ -170,7 +189,8 @@ before booting Windows!
If the damage is already done, the following should fix the RDB
(where <disk> is the device name).
DO AT YOUR OWN RISK:
DO AT YOUR OWN RISK::
dd if=/dev/<disk> of=rdb.tmp count=1
cp rdb.tmp rdb.fixed
@@ -189,10 +209,14 @@ By default, filenames are truncated to 30 characters without warning.
'nofilenametruncate' mount option can change that behavior.
Case is ignored by the affs in filename matching, but Linux shells
do care about the case. Example (with /wb being an affs mounted fs):
do care about the case. Example (with /wb being an affs mounted fs)::
rm /wb/WRONGCASE
will remove /mnt/wrongcase, but
will remove /mnt/wrongcase, but::
rm /wb/WR*
will not since the names are matched by the shell.
The block allocation is designed for hard disk partitions. If more
@@ -219,4 +243,4 @@ due to an incompatibility with the Amiga floppy controller.
If you are interested in an Amiga Emulator for Linux, look at
http://web.archive.org/web/*/http://www.freiburg.linux.de/~uae/
http://web.archive.org/web/%2E/http://www.freiburg.linux.de/~uae/
@@ -1,8 +1,10 @@
====================
kAFS: AFS FILESYSTEM
====================
.. SPDX-License-Identifier: GPL-2.0
Contents:
====================
kAFS: AFS FILESYSTEM
====================
.. Contents:
- Overview.
- Usage.
@@ -14,8 +16,7 @@ Contents:
- The @sys substitution.
========
OVERVIEW
Overview
========
This filesystem provides a fairly simple secure AFS filesystem driver. It is
@@ -35,35 +36,33 @@ It does not yet support the following AFS features:
(*) pioctl() system call.
===========
COMPILATION
Compilation
===========
The filesystem should be enabled by turning on the kernel configuration
options:
options::
CONFIG_AF_RXRPC - The RxRPC protocol transport
CONFIG_RXKAD - The RxRPC Kerberos security handler
CONFIG_AFS - The AFS filesystem
Additionally, the following can be turned on to aid debugging:
Additionally, the following can be turned on to aid debugging::
CONFIG_AF_RXRPC_DEBUG - Permit AF_RXRPC debugging to be enabled
CONFIG_AFS_DEBUG - Permit AFS debugging to be enabled
They permit the debugging messages to be turned on dynamically by manipulating
the masks in the following files:
the masks in the following files::
/sys/module/af_rxrpc/parameters/debug
/sys/module/kafs/parameters/debug
=====
USAGE
Usage
=====
When inserting the driver modules the root cell must be specified along with a
list of volume location server IP addresses:
list of volume location server IP addresses::
modprobe rxrpc
modprobe kafs rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91
@@ -77,14 +76,14 @@ The second module is the kerberos RxRPC security driver, and the third module
is the actual filesystem driver for the AFS filesystem.
Once the module has been loaded, more modules can be added by the following
procedure:
procedure::
echo add grand.central.org 18.9.48.14:128.2.203.61:130.237.48.87 >/proc/fs/afs/cells
Where the parameters to the "add" command are the name of a cell and a list of
volume location servers within that cell, with the latter separated by colons.
Filesystems can be mounted anywhere by commands similar to the following:
Filesystems can be mounted anywhere by commands similar to the following::
mount -t afs "%cambridge.redhat.com:root.afs." /afs
mount -t afs "#cambridge.redhat.com:root.cell." /afs/cambridge
@@ -104,8 +103,7 @@ named volume will be looked up in the cell specified during modprobe.
Additional cells can be added through /proc (see later section).
===========
MOUNTPOINTS
Mountpoints
===========
AFS has a concept of mountpoints. In AFS terms, these are specially formatted
@@ -123,42 +121,40 @@ culled first. If all are culled, then the requested volume will also be
unmounted, otherwise error EBUSY will be returned.
This can be used by the administrator to attempt to unmount the whole AFS tree
mounted on /afs in one go by doing:
mounted on /afs in one go by doing::
umount /afs
============
DYNAMIC ROOT
Dynamic Root
============
A mount option is available to create a serverless mount that is only usable
for dynamic lookup. Creating such a mount can be done by, for example:
for dynamic lookup. Creating such a mount can be done by, for example::
mount -t afs none /afs -o dyn
This creates a mount that just has an empty directory at the root. Attempting
to look up a name in this directory will cause a mountpoint to be created that
looks up a cell of the same name, for example:
looks up a cell of the same name, for example::
ls /afs/grand.central.org/
===============
PROC FILESYSTEM
Proc Filesystem
===============
The AFS modules creates a "/proc/fs/afs/" directory and populates it:
(*) A "cells" file that lists cells currently known to the afs module and
their usage counts:
their usage counts::
[root@andromeda ~]# cat /proc/fs/afs/cells
USE NAME
3 cambridge.redhat.com
(*) A directory per cell that contains files that list volume location
servers, volumes, and active servers known within that cell.
servers, volumes, and active servers known within that cell::
[root@andromeda ~]# cat /proc/fs/afs/cambridge.redhat.com/servers
USE ADDR STATE
@@ -171,8 +167,7 @@ The AFS modules creates a "/proc/fs/afs/" directory and populates it:
1 Val 20000000 20000001 20000002 root.afs
=================
THE CELL DATABASE
The Cell Database
=================
The filesystem maintains an internal database of all the cells it knows and the
@@ -181,7 +176,7 @@ the system belongs is added to the database when modprobe is performed by the
"rootcell=" argument or, if compiled in, using a "kafs.rootcell=" argument on
the kernel command line.
Further cells can be added by commands similar to the following:
Further cells can be added by commands similar to the following::
echo add CELLNAME VLADDR[:VLADDR][:VLADDR]... >/proc/fs/afs/cells
echo add grand.central.org 18.9.48.14:128.2.203.61:130.237.48.87 >/proc/fs/afs/cells
@@ -189,8 +184,7 @@ Further cells can be added by commands similar to the following:
No other cell database operations are available at this time.
========
SECURITY
Security
========
Secure operations are initiated by acquiring a key using the klog program. A
@@ -198,17 +192,17 @@ very primitive klog program is available at:
http://people.redhat.com/~dhowells/rxrpc/klog.c
This should be compiled by:
This should be compiled by::
make klog LDLIBS="-lcrypto -lcrypt -lkrb4 -lkeyutils"
And then run as:
And then run as::
./klog
Assuming it's successful, this adds a key of type RxRPC, named for the service
and cell, eg: "afs@<cellname>". This can be viewed with the keyctl program or
by cat'ing /proc/keys:
by cat'ing /proc/keys::
[root@andromeda ~]# keyctl show
Session Keyring
@@ -232,20 +226,19 @@ socket), then the operations on the file will be made with key that was used to
open the file.
=====================
THE @SYS SUBSTITUTION
The @sys Substitution
=====================
The list of up to 16 @sys substitutions for the current network namespace can
be configured by writing a list to /proc/fs/afs/sysname:
be configured by writing a list to /proc/fs/afs/sysname::
[root@andromeda ~]# echo foo amd64_linux_26 >/proc/fs/afs/sysname
or cleared entirely by writing an empty list:
or cleared entirely by writing an empty list::
[root@andromeda ~]# echo >/proc/fs/afs/sysname
The current list for current network namespace can be retrieved by:
The current list for current network namespace can be retrieved by::
[root@andromeda ~]# cat /proc/fs/afs/sysname
foo
@@ -1,4 +1,6 @@
.. SPDX-License-Identifier: GPL-2.0
====================================================================
Miscellaneous Device control operations for the autofs kernel module
====================================================================
@@ -36,24 +38,24 @@ For example, there are two types of automount maps, direct (in the kernel
module source you will see a third type called an offset, which is just
a direct mount in disguise) and indirect.
Here is a master map with direct and indirect map entries:
Here is a master map with direct and indirect map entries::
/- /etc/auto.direct
/test /etc/auto.indirect
/- /etc/auto.direct
/test /etc/auto.indirect
and the corresponding map files:
and the corresponding map files::
/etc/auto.direct:
/etc/auto.direct:
/automount/dparse/g6 budgie:/autofs/export1
/automount/dparse/g1 shark:/autofs/export1
and so on.
/automount/dparse/g6 budgie:/autofs/export1
/automount/dparse/g1 shark:/autofs/export1
and so on.
/etc/auto.indirect:
/etc/auto.indirect::
g1 shark:/autofs/export1
g6 budgie:/autofs/export1
and so on.
g1 shark:/autofs/export1
g6 budgie:/autofs/export1
and so on.
For the above indirect map an autofs file system is mounted on /test and
mounts are triggered for each sub-directory key by the inode lookup
@@ -69,23 +71,23 @@ use the follow_link inode operation to trigger the mount.
But, each entry in direct and indirect maps can have offsets (making
them multi-mount map entries).
For example, an indirect mount map entry could also be:
For example, an indirect mount map entry could also be::
g1 \
/ shark:/autofs/export5/testing/test \
/s1 shark:/autofs/export/testing/test/s1 \
/s2 shark:/autofs/export5/testing/test/s2 \
/s1/ss1 shark:/autofs/export1 \
/s2/ss2 shark:/autofs/export2
g1 \
/ shark:/autofs/export5/testing/test \
/s1 shark:/autofs/export/testing/test/s1 \
/s2 shark:/autofs/export5/testing/test/s2 \
/s1/ss1 shark:/autofs/export1 \
/s2/ss2 shark:/autofs/export2
and a similarly a direct mount map entry could also be:
and a similarly a direct mount map entry could also be::
/automount/dparse/g1 \
/ shark:/autofs/export5/testing/test \
/s1 shark:/autofs/export/testing/test/s1 \
/s2 shark:/autofs/export5/testing/test/s2 \
/s1/ss1 shark:/autofs/export2 \
/s2/ss2 shark:/autofs/export2
/automount/dparse/g1 \
/ shark:/autofs/export5/testing/test \
/s1 shark:/autofs/export/testing/test/s1 \
/s2 shark:/autofs/export5/testing/test/s2 \
/s1/ss1 shark:/autofs/export2 \
/s2/ss2 shark:/autofs/export2
One of the issues with version 4 of autofs was that, when mounting an
entry with a large number of offsets, possibly with nesting, we needed
@@ -170,32 +172,32 @@ autofs Miscellaneous Device mount control interface
The control interface is opening a device node, typically /dev/autofs.
All the ioctls use a common structure to pass the needed parameter
information and return operation results:
information and return operation results::
struct autofs_dev_ioctl {
__u32 ver_major;
__u32 ver_minor;
__u32 size; /* total size of data passed in
* including this struct */
__s32 ioctlfd; /* automount command fd */
struct autofs_dev_ioctl {
__u32 ver_major;
__u32 ver_minor;
__u32 size; /* total size of data passed in
* including this struct */
__s32 ioctlfd; /* automount command fd */
/* Command parameters */
union {
struct args_protover protover;
struct args_protosubver protosubver;
struct args_openmount openmount;
struct args_ready ready;
struct args_fail fail;
struct args_setpipefd setpipefd;
struct args_timeout timeout;
struct args_requester requester;
struct args_expire expire;
struct args_askumount askumount;
struct args_ismountpoint ismountpoint;
};
/* Command parameters */
union {
struct args_protover protover;
struct args_protosubver protosubver;
struct args_openmount openmount;
struct args_ready ready;
struct args_fail fail;
struct args_setpipefd setpipefd;
struct args_timeout timeout;
struct args_requester requester;
struct args_expire expire;
struct args_askumount askumount;
struct args_ismountpoint ismountpoint;
};
char path[0];
};
char path[0];
};
The ioctlfd field is a mount point file descriptor of an autofs mount
point. It is returned by the open call and is used by all calls except
@@ -212,7 +214,7 @@ is used account for the increased structure length when translating the
structure sent from user space.
This structure can be initialized before setting specific fields by using
the void function call init_autofs_dev_ioctl(struct autofs_dev_ioctl *).
the void function call init_autofs_dev_ioctl(``struct autofs_dev_ioctl *``).
All of the ioctls perform a copy of this structure from user space to
kernel space and return -EINVAL if the size parameter is smaller than
@@ -1,48 +1,54 @@
.. SPDX-License-Identifier: GPL-2.0
=========================
BeOS filesystem for Linux
=========================
Document last updated: Dec 6, 2001
WARNING
Warning
=======
Make sure you understand that this is alpha software. This means that the
implementation is neither complete nor well-tested.
implementation is neither complete nor well-tested.
I DISCLAIM ALL RESPONSIBILITY FOR ANY POSSIBLE BAD EFFECTS OF THIS CODE!
LICENSE
=====
This software is covered by the GNU General Public License.
License
=======
This software is covered by the GNU General Public License.
See the file COPYING for the complete text of the license.
Or the GNU website: <http://www.gnu.org/licenses/licenses.html>
AUTHOR
=====
Author
======
The largest part of the code written by Will Dyson <will_dyson@pobox.com>
He has been working on the code since Aug 13, 2001. See the changelog for
details.
Original Author: Makoto Kato <m_kato@ga2.so-net.ne.jp>
His original code can still be found at:
<http://hp.vector.co.jp/authors/VA008030/bfs/>
Does anyone know of a more current email address for Makoto? He doesn't
respond to the address given above...
This filesystem doesn't have a maintainer.
WHAT IS THIS DRIVER?
==================
This module implements the native filesystem of BeOS http://www.beincorporated.com/
What is this Driver?
====================
This module implements the native filesystem of BeOS http://www.beincorporated.com/
for the linux 2.4.1 and later kernels. Currently it is a read-only
implementation.
Which is it, BFS or BEFS?
================
Be, Inc said, "BeOS Filesystem is officially called BFS, not BeFS".
=========================
Be, Inc said, "BeOS Filesystem is officially called BFS, not BeFS".
But Unixware Boot Filesystem is called bfs, too. And they are already in
the kernel. Because of this naming conflict, on Linux the BeOS
filesystem is called befs.
HOW TO INSTALL
How to Install
==============
step 1. Install the BeFS patch into the source code tree of linux.
@@ -54,16 +60,16 @@ is called patch-befs-xxx, you would do the following:
patch -p1 < /path/to/patch-befs-xxx
if the patching step fails (i.e. there are rejected hunks), you can try to
figure it out yourself (it shouldn't be hard), or mail the maintainer
figure it out yourself (it shouldn't be hard), or mail the maintainer
(Will Dyson <will_dyson@pobox.com>) for help.
step 2. Configuration & make kernel
The linux kernel has many compile-time options. Most of them are beyond the
scope of this document. I suggest the Kernel-HOWTO document as a good general
reference on this topic. http://www.linuxdocs.org/HOWTOs/Kernel-HOWTO-4.html
reference on this topic. http://www.linuxdocs.org/HOWTOs/Kernel-HOWTO-4.html
However, to use the BeFS module, you must enable it at configure time.
However, to use the BeFS module, you must enable it at configure time::
cd /foo/bar/linux
make menuconfig (or xconfig)
@@ -82,35 +88,40 @@ step 3. Install
See the kernel howto <http://www.linux.com/howto/Kernel-HOWTO.html> for
instructions on this critical step.
USING BFS
Using BFS
=========
To use the BeOS filesystem, use filesystem type 'befs'.
ex)
ex::
mount -t befs /dev/fd0 /beos
MOUNT OPTIONS
Mount Options
=============
============= ===========================================================
uid=nnn All files in the partition will be owned by user id nnn.
gid=nnn All files in the partition will be in group nnn.
iocharset=xxx Use xxx as the name of the NLS translation table.
debug The driver will output debugging information to the syslog.
============= ===========================================================
HOW TO GET LASTEST VERSION
How to Get Lastest Version
==========================
The latest version is currently available at:
<http://befs-driver.sourceforge.net/>
ANY KNOWN BUGS?
===========
Any Known Bugs?
===============
As of Jan 20, 2002:
None
SPECIAL THANKS
Special Thanks
==============
Dominic Giampalo ... Writing "Practical file system design with Be filesystem"
Hiroyuki Yamada ... Testing LinuxPPC.
@@ -1,4 +1,7 @@
BFS FILESYSTEM FOR LINUX
.. SPDX-License-Identifier: GPL-2.0
========================
BFS Filesystem for Linux
========================
The BFS filesystem is used by SCO UnixWare OS for the /stand slice, which
@@ -9,22 +12,22 @@ In order to access /stand partition under Linux you obviously need to
know the partition number and the kernel must support UnixWare disk slices
(CONFIG_UNIXWARE_DISKLABEL config option). However BFS support does not
depend on having UnixWare disklabel support because one can also mount
BFS filesystem via loopback:
BFS filesystem via loopback::
# losetup /dev/loop0 stand.img
# mount -t bfs /dev/loop0 /mnt/stand
# losetup /dev/loop0 stand.img
# mount -t bfs /dev/loop0 /mnt/stand
where stand.img is a file containing the image of BFS filesystem.
where stand.img is a file containing the image of BFS filesystem.
When you have finished using it and umounted you need to also deallocate
/dev/loop0 device by:
/dev/loop0 device by::
# losetup -d /dev/loop0
# losetup -d /dev/loop0
You can simplify mounting by just typing:
You can simplify mounting by just typing::
# mount -t bfs -o loop stand.img /mnt/stand
# mount -t bfs -o loop stand.img /mnt/stand
this will allocate the first available loopback device (and load loop.o
this will allocate the first available loopback device (and load loop.o
kernel module if necessary) automatically. If the loopback driver is not
loaded automatically, make sure that you have compiled the module and
that modprobe is functioning. Beware that umount will not deallocate
@@ -33,21 +36,21 @@ that modprobe is functioning. Beware that umount will not deallocate
losetup(8). Read losetup(8) manpage for more info.
To create the BFS image under UnixWare you need to find out first which
slice contains it. The command prtvtoc(1M) is your friend:
slice contains it. The command prtvtoc(1M) is your friend::
# prtvtoc /dev/rdsk/c0b0t0d0s0
# prtvtoc /dev/rdsk/c0b0t0d0s0
(assuming your root disk is on target=0, lun=0, bus=0, controller=0). Then you
look for the slice with tag "STAND", which is usually slice 10. With this
information you can use dd(1) to create the BFS image:
information you can use dd(1) to create the BFS image::
# umount /stand
# dd if=/dev/rdsk/c0b0t0d0sa of=stand.img bs=512
# umount /stand
# dd if=/dev/rdsk/c0b0t0d0sa of=stand.img bs=512
Just in case, you can verify that you have done the right thing by checking
the magic number:
the magic number::
# od -Ad -tx4 stand.img | more
# od -Ad -tx4 stand.img | more
The first 4 bytes should be 0x1badface.
@@ -1,3 +1,6 @@
.. SPDX-License-Identifier: GPL-2.0
=====
BTRFS
=====
@@ -1,3 +1,6 @@
.. SPDX-License-Identifier: GPL-2.0
============================
Ceph Distributed File System
============================
@@ -15,6 +18,7 @@ Basic features include:
* Easy deployment: most FS components are userspace daemons
Also,
* Flexible snapshots (on any directory)
* Recursive accounting (nested files, directories, bytes)
@@ -63,7 +67,7 @@ no 'du' or similar recursive scan of the file system is required.
Finally, Ceph also allows quotas to be set on any directory in the system.
The quota can restrict the number of bytes or the number of files stored
beneath that point in the directory hierarchy. Quotas can be set using
extended attributes 'ceph.quota.max_files' and 'ceph.quota.max_bytes', eg:
extended attributes 'ceph.quota.max_files' and 'ceph.quota.max_bytes', eg::
setfattr -n ceph.quota.max_bytes -v 100000000 /some/dir
getfattr -n ceph.quota.max_bytes /some/dir
@@ -76,7 +80,7 @@ from writing as much data as it needs.
Mount Syntax
============
The basic mount syntax is:
The basic mount syntax is::
# mount -t ceph monip[:port][,monip2[:port]...]:/[subdir] mnt
@@ -84,7 +88,7 @@ You only need to specify a single monitor, as the client will get the
full list when it connects. (However, if the monitor you specify
happens to be down, the mount won't succeed.) The port can be left
off if the monitor is using the default. So if the monitor is at
1.2.3.4,
1.2.3.4::
# mount -t ceph 1.2.3.4:/ /mnt/ceph
@@ -163,14 +167,14 @@ Mount Options
available modes are "no" and "clean". The default is "no".
* no: never attempt to reconnect when client detects that it has been
blacklisted. Operations will generally fail after being blacklisted.
blacklisted. Operations will generally fail after being blacklisted.
* clean: client reconnects to the ceph cluster automatically when it
detects that it has been blacklisted. During reconnect, client drops
dirty data/metadata, invalidates page caches and writable file handles.
After reconnect, file locks become stale because the MDS loses track
of them. If an inode contains any stale file locks, read/write on the
inode is not allowed until applications release all stale file locks.
detects that it has been blacklisted. During reconnect, client drops
dirty data/metadata, invalidates page caches and writable file handles.
After reconnect, file locks become stale because the MDS loses track
of them. If an inode contains any stale file locks, read/write on the
inode is not allowed until applications release all stale file locks.
More Information
================
@@ -179,8 +183,8 @@ For more information on Ceph, see the home page at
https://ceph.com/
The Linux kernel client source tree is available at
https://github.com/ceph/ceph-client.git
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git
- https://github.com/ceph/ceph-client.git
- git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git
and the source for the full system is at
https://github.com/ceph/ceph.git
@@ -1,12 +1,15 @@
.. SPDX-License-Identifier: GPL-2.0
Cramfs - cram a filesystem onto a small ROM
===========================================
Cramfs - cram a filesystem onto a small ROM
===========================================
cramfs is designed to be simple and small, and to compress things well.
cramfs is designed to be simple and small, and to compress things well.
It uses the zlib routines to compress a file one page at a time, and
allows random page access. The meta-data is not compressed, but is
expressed in a very terse representation to make it use much less
diskspace than traditional filesystems.
diskspace than traditional filesystems.
You can't write to a cramfs filesystem (making it compressible and
compact also makes it _very_ hard to update on-the-fly), so you have to
@@ -28,9 +31,9 @@ issue.
Hard links are supported, but hard linked files
will still have a link count of 1 in the cramfs image.
Cramfs directories have no `.' or `..' entries. Directories (like
Cramfs directories have no ``.`` or ``..`` entries. Directories (like
every other file on cramfs) always have a link count of 1. (There's
no need to use -noleaf in `find', btw.)
no need to use -noleaf in ``find``, btw.)
No timestamps are stored in a cramfs, so these default to the epoch
(1970 GMT). Recently-accessed files may have updated timestamps, but
@@ -70,9 +73,9 @@ MTD drivers are cfi_cmdset_0001 (Intel/Sharp CFI flash) or physmap
(Flash device in physical memory map). MTD partitions based on such devices
are fine too. Then that device should be specified with the "mtd:" prefix
as the mount device argument. For example, to mount the MTD device named
"fs_partition" on the /mnt directory:
"fs_partition" on the /mnt directory::
$ mount -t cramfs mtd:fs_partition /mnt
$ mount -t cramfs mtd:fs_partition /mnt
To boot a kernel with this as root filesystem, suffice to specify
something like "root=mtd:fs_partition" on the kernel command line.
@@ -90,6 +93,7 @@ https://github.com/npitre/cramfs-tools
For /usr/share/magic
--------------------
===== ======================= =======================
0 ulelong 0x28cd3d45 Linux cramfs offset 0
>4 ulelong x size %d
>8 ulelong x flags 0x%x
@@ -110,6 +114,7 @@ For /usr/share/magic
>552 ulelong x fsid.blocks %d
>556 ulelong x fsid.files %d
>560 string >\0 name "%.16s"
===== ======================= =======================
Hacker Notes
@@ -1,4 +1,11 @@
Copyright 2009 Jonathan Corbet <corbet@lwn.net>
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
=======
DebugFS
=======
Copyright |copy| 2009 Jonathan Corbet <corbet@lwn.net>
Debugfs exists as a simple way for kernel developers to make information
available to user space. Unlike /proc, which is only meant for information
@@ -6,11 +13,11 @@ about a process, or sysfs, which has strict one-value-per-file rules,
debugfs has no rules at all. Developers can put any information they want
there. The debugfs filesystem is also intended to not serve as a stable
ABI to user space; in theory, there are no stability constraints placed on
files exported there. The real world is not always so simple, though [1];
files exported there. The real world is not always so simple, though [1]_;
even debugfs interfaces are best designed with the idea that they will need
to be maintained forever.
Debugfs is typically mounted with a command like:
Debugfs is typically mounted with a command like::
mount -t debugfs none /sys/kernel/debug
@@ -23,7 +30,7 @@ Note that the debugfs API is exported GPL-only to modules.
Code using debugfs should include <linux/debugfs.h>. Then, the first order
of business will be to create at least one directory to hold a set of
debugfs files:
debugfs files::
struct dentry *debugfs_create_dir(const char *name, struct dentry *parent);
@@ -36,7 +43,7 @@ something went wrong. If ERR_PTR(-ENODEV) is returned, that is an
indication that the kernel has been built without debugfs support and none
of the functions described below will work.
The most general way to create a file within a debugfs directory is with:
The most general way to create a file within a debugfs directory is with::
struct dentry *debugfs_create_file(const char *name, umode_t mode,
struct dentry *parent, void *data,
@@ -53,7 +60,7 @@ ERR_PTR(-ERROR) on error, or ERR_PTR(-ENODEV) if debugfs support is
missing.
Create a file with an initial size, the following function can be used
instead:
instead::
struct dentry *debugfs_create_file_size(const char *name, umode_t mode,
struct dentry *parent, void *data,
@@ -66,7 +73,7 @@ as the function debugfs_create_file.
In a number of cases, the creation of a set of file operations is not
actually necessary; the debugfs code provides a number of helper functions
for simple situations. Files containing a single integer value can be
created with any of:
created with any of::
void debugfs_create_u8(const char *name, umode_t mode,
struct dentry *parent, u8 *value);
@@ -80,7 +87,7 @@ created with any of:
These files support both reading and writing the given value; if a specific
file should not be written to, simply set the mode bits accordingly. The
values in these files are in decimal; if hexadecimal is more appropriate,
the following functions can be used instead:
the following functions can be used instead::
void debugfs_create_x8(const char *name, umode_t mode,
struct dentry *parent, u8 *value);
@@ -94,7 +101,7 @@ the following functions can be used instead:
These functions are useful as long as the developer knows the size of the
value to be exported. Some types can have different widths on different
architectures, though, complicating the situation somewhat. There are
functions meant to help out in such special cases:
functions meant to help out in such special cases::
void debugfs_create_size_t(const char *name, umode_t mode,
struct dentry *parent, size_t *value);
@@ -103,7 +110,7 @@ As might be expected, this function will create a debugfs file to represent
a variable of type size_t.
Similarly, there are helpers for variables of type unsigned long, in decimal
and hexadecimal:
and hexadecimal::
struct dentry *debugfs_create_ulong(const char *name, umode_t mode,
struct dentry *parent,
@@ -111,7 +118,7 @@ and hexadecimal:
void debugfs_create_xul(const char *name, umode_t mode,
struct dentry *parent, unsigned long *value);
Boolean values can be placed in debugfs with:
Boolean values can be placed in debugfs with::
struct dentry *debugfs_create_bool(const char *name, umode_t mode,
struct dentry *parent, bool *value);
@@ -120,7 +127,7 @@ A read on the resulting file will yield either Y (for non-zero values) or
N, followed by a newline. If written to, it will accept either upper- or
lower-case values, or 1 or 0. Any other input will be silently ignored.
Also, atomic_t values can be placed in debugfs with:
Also, atomic_t values can be placed in debugfs with::
void debugfs_create_atomic_t(const char *name, umode_t mode,
struct dentry *parent, atomic_t *value)
@@ -129,7 +136,7 @@ A read of this file will get atomic_t values, and a write of this file
will set atomic_t values.
Another option is exporting a block of arbitrary binary data, with
this structure and function:
this structure and function::
struct debugfs_blob_wrapper {
void *data;
@@ -151,7 +158,7 @@ If you want to dump a block of registers (something that happens quite
often during development, even if little such code reaches mainline.
Debugfs offers two functions: one to make a registers-only file, and
another to insert a register block in the middle of another sequential
file.
file::
struct debugfs_reg32 {
char *name;
@@ -175,7 +182,7 @@ The "base" argument may be 0, but you may want to build the reg32 array
using __stringify, and a number of register names (macros) are actually
byte offsets over a base for the register block.
If you want to dump an u32 array in debugfs, you can create file with:
If you want to dump an u32 array in debugfs, you can create file with::
void debugfs_create_u32_array(const char *name, umode_t mode,
struct dentry *parent,
@@ -185,7 +192,7 @@ The "array" argument provides data, and the "elements" argument is
the number of elements in the array. Note: Once array is created its
size can not be changed.
There is a helper function to create device related seq_file:
There is a helper function to create device related seq_file::
struct dentry *debugfs_create_devm_seqfile(struct device *dev,
const char *name,
@@ -197,14 +204,14 @@ The "dev" argument is the device related to this debugfs file, and
the "read_fn" is a function pointer which to be called to print the
seq_file content.
There are a couple of other directory-oriented helper functions:
There are a couple of other directory-oriented helper functions::
struct dentry *debugfs_rename(struct dentry *old_dir,
struct dentry *debugfs_rename(struct dentry *old_dir,
struct dentry *old_dentry,
struct dentry *new_dir,
struct dentry *new_dir,
const char *new_name);
struct dentry *debugfs_create_symlink(const char *name,
struct dentry *debugfs_create_symlink(const char *name,
struct dentry *parent,
const char *target);
@@ -219,7 +226,7 @@ module is unloaded without explicitly removing debugfs entries, the result
will be a lot of stale pointers and no end of highly antisocial behavior.
So all debugfs users - at least those which can be built as modules - must
be prepared to remove all files and directories they create there. A file
can be removed with:
can be removed with::
void debugfs_remove(struct dentry *dentry);
@@ -229,7 +236,7 @@ be removed.
Once upon a time, debugfs users were required to remember the dentry
pointer for every debugfs file they created so that all files could be
cleaned up. We live in more civilized times now, though, and debugfs users
can call:
can call::
void debugfs_remove_recursive(struct dentry *dentry);
@@ -237,5 +244,4 @@ If this function is passed a pointer for the dentry corresponding to the
top-level directory, the entire hierarchy below that directory will be
removed.
Notes:
[1] http://lwn.net/Articles/309298/
.. [1] http://lwn.net/Articles/309298/
@@ -1,20 +1,25 @@
dlmfs
==================
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
=====
DLMFS
=====
A minimal DLM userspace interface implemented via a virtual file
system.
dlmfs is built with OCFS2 as it requires most of its infrastructure.
Project web page: http://ocfs2.wiki.kernel.org
Tools web page: https://github.com/markfasheh/ocfs2-tools
OCFS2 mailing lists: http://oss.oracle.com/projects/ocfs2/mailman/
:Project web page: http://ocfs2.wiki.kernel.org
:Tools web page: https://github.com/markfasheh/ocfs2-tools
:OCFS2 mailing lists: http://oss.oracle.com/projects/ocfs2/mailman/
All code copyright 2005 Oracle except when otherwise noted.
CREDITS
Credits
=======
Some code taken from ramfs which is Copyright (C) 2000 Linus Torvalds
Some code taken from ramfs which is Copyright |copy| 2000 Linus Torvalds
and Transmeta Corp.
Mark Fasheh <mark.fasheh@oracle.com>
@@ -96,14 +101,19 @@ operation. If the lock succeeds, you'll get an fd.
open(2) with O_CREAT to ensure the resource inode is created - dlmfs does
not automatically create inodes for existing lock resources.
============ ===========================
Open Flag Lock Request Type
--------- -----------------
============ ===========================
O_RDONLY Shared Read
O_RDWR Exclusive
============ ===========================
============ ===========================
Open Flag Resulting Locking Behavior
--------- --------------------------
============ ===========================
O_NONBLOCK Trylock operation
============ ===========================
You must provide exactly one of O_RDONLY or O_RDWR.
@@ -1,14 +1,18 @@
.. SPDX-License-Identifier: GPL-2.0
======================================================
eCryptfs: A stacked cryptographic filesystem for Linux
======================================================
eCryptfs is free software. Please see the file COPYING for details.
For documentation, please see the files in the doc/ subdirectory. For
building and installation instructions please see the INSTALL file.
Maintainer: Phillip Hellewell
Lead developer: Michael A. Halcrow <mhalcrow@us.ibm.com>
Developers: Michael C. Thompson
Kent Yoder
Web Site: http://ecryptfs.sf.net
:Maintainer: Phillip Hellewell
:Lead developer: Michael A. Halcrow <mhalcrow@us.ibm.com>
:Developers: Michael C. Thompson
Kent Yoder
:Web Site: http://ecryptfs.sf.net
This software is currently undergoing development. Make sure to
maintain a backup copy of any data you write into eCryptfs.
@@ -19,13 +23,15 @@ SourceForge site:
http://sourceforge.net/projects/ecryptfs/
Userspace requirements include:
- David Howells' userspace keyring headers and libraries (version
1.0 or higher), obtainable from
http://people.redhat.com/~dhowells/keyutils/
- Libgcrypt
- David Howells' userspace keyring headers and libraries (version
1.0 or higher), obtainable from
http://people.redhat.com/~dhowells/keyutils/
- Libgcrypt
NOTES
Notes
=====
In the beta/experimental releases of eCryptfs, when you upgrade
eCryptfs, you should copy the files to an unencrypted location and
@@ -33,20 +39,21 @@ then copy the files back into the new eCryptfs mount to migrate the
files.
MOUNT-WIDE PASSPHRASE
Mount-wide Passphrase
=====================
Create a new directory into which eCryptfs will write its encrypted
files (i.e., /root/crypt). Then, create the mount point directory
(i.e., /mnt/crypt). Now it's time to mount eCryptfs:
(i.e., /mnt/crypt). Now it's time to mount eCryptfs::
mount -t ecryptfs /root/crypt /mnt/crypt
mount -t ecryptfs /root/crypt /mnt/crypt
You should be prompted for a passphrase and a salt (the salt may be
blank).
Try writing a new file:
Try writing a new file::
echo "Hello, World" > /mnt/crypt/hello.txt
echo "Hello, World" > /mnt/crypt/hello.txt
The operation will complete. Notice that there is a new file in
/root/crypt that is at least 12288 bytes in size (depending on your
@@ -59,10 +66,13 @@ keyctl clear @u
Then umount /mnt/crypt and mount again per the instructions given
above.
cat /mnt/crypt/hello.txt
::
cat /mnt/crypt/hello.txt
NOTES
Notes
=====
eCryptfs version 0.1 should only be mounted on (1) empty directories
or (2) directories containing files only created by eCryptfs. If you
@@ -1,5 +1,8 @@
.. SPDX-License-Identifier: GPL-2.0
=======================================
efivarfs - a (U)EFI variable filesystem
=======================================
The efivarfs filesystem was created to address the shortcomings of
using entries in sysfs to maintain EFI variables. The old sysfs EFI
@@ -11,7 +14,7 @@ than a single page, sysfs isn't the best interface for this.
Variables can be created, deleted and modified with the efivarfs
filesystem.
efivarfs is typically mounted like this,
efivarfs is typically mounted like this::
mount -t efivarfs none /sys/firmware/efi/efivars
@@ -1,3 +1,9 @@
.. SPDX-License-Identifier: GPL-2.0
======================================
Enhanced Read-Only File System - EROFS
======================================
Overview
========
@@ -6,6 +12,7 @@ from other read-only file systems, it aims to be designed for flexibility,
scalability, but be kept simple and high performance.
It is designed as a better filesystem solution for the following scenarios:
- read-only storage media or
- part of a fully trusted read-only solution, which means it needs to be
@@ -17,6 +24,7 @@ It is designed as a better filesystem solution for the following scenarios:
for those embedded devices with limited memory (ex, smartphone);
Here is the main features of EROFS:
- Little endian on-disk design;
- Currently 4KB block size (nobh) and therefore maximum 16TB address space;
@@ -24,13 +32,17 @@ Here is the main features of EROFS:
- Metadata & data could be mixed by design;
- 2 inode versions for different requirements:
===================== ============ =====================================
compact (v1) extended (v2)
Inode metadata size: 32 bytes 64 bytes
Max file size: 4 GB 16 EB (also limited by max. vol size)
Max uids/gids: 65536 4294967296
File change time: no yes (64 + 32-bit timestamp)
Max hardlinks: 65536 4294967296
Metadata reserved: 4 bytes 14 bytes
===================== ============ =====================================
Inode metadata size 32 bytes 64 bytes
Max file size 4 GB 16 EB (also limited by max. vol size)
Max uids/gids 65536 4294967296
File change time no yes (64 + 32-bit timestamp)
Max hardlinks 65536 4294967296
Metadata reserved 4 bytes 14 bytes
===================== ============ =====================================
- Support extended attributes (xattrs) as an option;
@@ -43,29 +55,36 @@ Here is the main features of EROFS:
The following git tree provides the file system user-space tools under
development (ex, formatting tool mkfs.erofs):
>> git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git
- git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git
Bugs and patches are welcome, please kindly help us and send to the following
linux-erofs mailing list:
>> linux-erofs mailing list <linux-erofs@lists.ozlabs.org>
- linux-erofs mailing list <linux-erofs@lists.ozlabs.org>
Mount options
=============
=================== =========================================================
(no)user_xattr Setup Extended User Attributes. Note: xattr is enabled
by default if CONFIG_EROFS_FS_XATTR is selected.
(no)acl Setup POSIX Access Control List. Note: acl is enabled
by default if CONFIG_EROFS_FS_POSIX_ACL is selected.
cache_strategy=%s Select a strategy for cached decompression from now on:
disabled: In-place I/O decompression only;
readahead: Cache the last incomplete compressed physical
========== =============================================
disabled In-place I/O decompression only;
readahead Cache the last incomplete compressed physical
cluster for further reading. It still does
in-place I/O decompression for the rest
compressed physical clusters;
readaround: Cache the both ends of incomplete compressed
readaround Cache the both ends of incomplete compressed
physical clusters for further reading.
It still does in-place I/O decompression
for the rest compressed physical clusters.
========== =============================================
=================== =========================================================
On-disk details
===============
@@ -73,7 +92,7 @@ On-disk details
Summary
-------
Different from other read-only file systems, an EROFS volume is designed
to be as simple as possible:
to be as simple as possible::
|-> aligned with the block size
____________________________________________________________
@@ -83,41 +102,45 @@ to be as simple as possible:
All data areas should be aligned with the block size, but metadata areas
may not. All metadatas can be now observed in two different spaces (views):
1. Inode metadata space
Each valid inode should be aligned with an inode slot, which is a fixed
value (32 bytes) and designed to be kept in line with compact inode size.
Each inode can be directly found with the following formula:
inode offset = meta_blkaddr * block_size + 32 * nid
|-> aligned with 8B
|-> followed closely
+ meta_blkaddr blocks |-> another slot
_____________________________________________________________________
| ... | inode | xattrs | extents | data inline | ... | inode ...
|________|_______|(optional)|(optional)|__(optional)_|_____|__________
|-> aligned with the inode slot size
. .
. .
. .
. .
. .
. .
.____________________________________________________|-> aligned with 4B
| xattr_ibody_header | shared xattrs | inline xattrs |
|____________________|_______________|_______________|
|-> 12 bytes <-|->x * 4 bytes<-| .
. . .
. . .
. . .
._______________________________.______________________.
| id | id | id | id | ... | id | ent | ... | ent| ... |
|____|____|____|____|______|____|_____|_____|____|_____|
|-> aligned with 4B
|-> aligned with 4B
::
|-> aligned with 8B
|-> followed closely
+ meta_blkaddr blocks |-> another slot
_____________________________________________________________________
| ... | inode | xattrs | extents | data inline | ... | inode ...
|________|_______|(optional)|(optional)|__(optional)_|_____|__________
|-> aligned with the inode slot size
. .
. .
. .
. .
. .
. .
.____________________________________________________|-> aligned with 4B
| xattr_ibody_header | shared xattrs | inline xattrs |
|____________________|_______________|_______________|
|-> 12 bytes <-|->x * 4 bytes<-| .
. . .
. . .
. . .
._______________________________.______________________.
| id | id | id | id | ... | id | ent | ... | ent| ... |
|____|____|____|____|______|____|_____|_____|____|_____|
|-> aligned with 4B
|-> aligned with 4B
Inode could be 32 or 64 bytes, which can be distinguished from a common
field which all inode versions have -- i_format:
field which all inode versions have -- i_format::
__________________ __________________
| i_format | | i_format |
@@ -132,16 +155,19 @@ may not. All metadatas can be now observed in two different spaces (views):
proper alignment, and they could be optional for different data mappings.
_currently_ total 4 valid data mappings are supported:
== ====================================================================
0 flat file data without data inline (no extent);
1 fixed-sized output data compression (with non-compacted indexes);
2 flat file data with tail packing data inline (no extent);
3 fixed-sized output data compression (with compacted indexes, v5.3+).
== ====================================================================
The size of the optional xattrs is indicated by i_xattr_count in inode
header. Large xattrs or xattrs shared by many different files can be
stored in shared xattrs metadata rather than inlined right after inode.
2. Shared xattrs metadata space
Shared xattrs space is similar to the above inode space, started with
a specific block indicated by xattr_blkaddr, organized one by one with
proper align.
@@ -149,11 +175,13 @@ may not. All metadatas can be now observed in two different spaces (views):
Each share xattr can also be directly found by the following formula:
xattr offset = xattr_blkaddr * block_size + 4 * xattr_id
|-> aligned by 4 bytes
+ xattr_blkaddr blocks |-> aligned with 4 bytes
_________________________________________________________________________
| ... | xattr_entry | xattr data | ... | xattr_entry | xattr data ...
|________|_____________|_____________|_____|______________|_______________
::
|-> aligned by 4 bytes
+ xattr_blkaddr blocks |-> aligned with 4 bytes
_________________________________________________________________________
| ... | xattr_entry | xattr data | ... | xattr_entry | xattr data ...
|________|_____________|_____________|_____|______________|_______________
Directories
-----------
@@ -163,19 +191,21 @@ random file lookup, and all directory entries are _strictly_ recorded in
alphabetical order in order to support improved prefix binary search
algorithm (could refer to the related source code).
___________________________
/ |
/ ______________|________________
/ / | nameoff1 | nameoffN-1
____________.______________._______________v________________v__________
| dirent | dirent | ... | dirent | filename | filename | ... | filename |
|___.0___|____1___|_____|___N-1__|____0_____|____1_____|_____|___N-1____|
\ ^
\ | * could have
\ | trailing '\0'
\________________________| nameoff0
::
Directory block
___________________________
/ |
/ ______________|________________
/ / | nameoff1 | nameoffN-1
____________.______________._______________v________________v__________
| dirent | dirent | ... | dirent | filename | filename | ... | filename |
|___.0___|____1___|_____|___N-1__|____0_____|____1_____|_____|___N-1____|
\ ^
\ | * could have
\ | trailing '\0'
\________________________| nameoff0
Directory block
Note that apart from the offset of the first filename, nameoff0 also indicates
the total number of directory entries in this block since it is no need to
@@ -184,28 +214,27 @@ introduce another on-disk field at all.
Compression
-----------
Currently, EROFS supports 4KB fixed-sized output transparent file compression,
as illustrated below:
as illustrated below::
|---- Variant-Length Extent ----|-------- VLE --------|----- VLE -----
clusterofs clusterofs clusterofs
| | | logical data
_________v_______________________________v_____________________v_______________
... | . | | . | | . | ...
____|____.________|_____________|________.____|_____________|__.__________|____
|-> cluster <-|-> cluster <-|-> cluster <-|-> cluster <-|-> cluster <-|
size size size size size
. . . .
. . . .
. . . .
_______._____________._____________._____________._____________________
... | | | | ... physical data
_______|_____________|_____________|_____________|_____________________
|-> cluster <-|-> cluster <-|-> cluster <-|
size size size
|---- Variant-Length Extent ----|-------- VLE --------|----- VLE -----
clusterofs clusterofs clusterofs
| | | logical data
_________v_______________________________v_____________________v_______________
... | . | | . | | . | ...
____|____.________|_____________|________.____|_____________|__.__________|____
|-> cluster <-|-> cluster <-|-> cluster <-|-> cluster <-|-> cluster <-|
size size size size size
. . . .
. . . .
. . . .
_______._____________._____________._____________._____________________
... | | | | ... physical data
_______|_____________|_____________|_____________|_____________________
|-> cluster <-|-> cluster <-|-> cluster <-|
size size size
Currently each on-disk physical cluster can contain 4KB (un)compressed data
at most. For each logical cluster, there is a corresponding on-disk index to
describe its cluster type, physical cluster address, etc.
See "struct z_erofs_vle_decompressed_index" in erofs_fs.h for more details.
@@ -1,3 +1,5 @@
.. SPDX-License-Identifier: GPL-2.0
The Second Extended Filesystem
==============================
@@ -14,8 +16,9 @@ Options
Most defaults are determined by the filesystem superblock, and can be
set using tune2fs(8). Kernel-determined defaults are indicated by (*).
bsddf (*) Makes `df' act like BSD.
minixdf Makes `df' act like Minix.
==================== === ================================================
bsddf (*) Makes ``df`` act like BSD.
minixdf Makes ``df`` act like Minix.
check=none, nocheck (*) Don't do extra checking of bitmaps on mount
(check=normal and check=strict options removed)
@@ -62,6 +65,7 @@ quota, usrquota Enable user disk quota support
grpquota Enable group disk quota support
(requires CONFIG_QUOTA).
==================== === ================================================
noquota option ls silently ignored by ext2.
@@ -294,9 +298,9 @@ respective fsck programs.
If you're exceptionally paranoid, there are 3 ways of making metadata
writes synchronous on ext2:
per-file if you have the program source: use the O_SYNC flag to open()
per-file if you don't have the source: use "chattr +S" on the file
per-filesystem: add the "sync" option to mount (or in /etc/fstab)
- per-file if you have the program source: use the O_SYNC flag to open()
- per-file if you don't have the source: use "chattr +S" on the file
- per-filesystem: add the "sync" option to mount (or in /etc/fstab)
the first and last are not ext2 specific but do force the metadata to
be written synchronously. See also Journaling below.
@@ -316,10 +320,12 @@ Most of these limits could be overcome with slight changes in the on-disk
format and using a compatibility flag to signal the format change (at
the expense of some compatibility).
Filesystem block size: 1kB 2kB 4kB 8kB
File size limit: 16GB 256GB 2048GB 2048GB
Filesystem size limit: 2047GB 8192GB 16384GB 32768GB
===================== ======= ======= ======= ========
Filesystem block size 1kB 2kB 4kB 8kB
===================== ======= ======= ======= ========
File size limit 16GB 256GB 2048GB 2048GB
Filesystem size limit 2047GB 8192GB 16384GB 32768GB
===================== ======= ======= ======= ========
There is a 2.4 kernel limit of 2048GB for a single block device, so no
filesystem larger than that can be created at this time. There is also
@@ -370,19 +376,24 @@ ext4 and journaling.
References
==========
======================= ===============================================
The kernel source file:/usr/src/linux/fs/ext2/
e2fsprogs (e2fsck) http://e2fsprogs.sourceforge.net/
Design & Implementation http://e2fsprogs.sourceforge.net/ext2intro.html
Journaling (ext3) ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/
Filesystem Resizing http://ext2resize.sourceforge.net/
Compression (*) http://e2compr.sourceforge.net/
Compression [1]_ http://e2compr.sourceforge.net/
======================= ===============================================
Implementations for:
Windows 95/98/NT/2000 http://www.chrysocome.net/explore2fs
Windows 95 (*) http://www.yipton.net/content.html#FSDEXT2
DOS client (*) ftp://metalab.unc.edu/pub/Linux/system/filesystems/ext2/
OS/2 (+) ftp://metalab.unc.edu/pub/Linux/system/filesystems/ext2/
RISC OS client http://www.esw-heim.tu-clausthal.de/~marco/smorbrod/IscaFS/
(*) no longer actively developed/supported (as of Apr 2001)
(+) no longer actively developed/supported (as of Mar 2009)
======================= ===========================================================
Windows 95/98/NT/2000 http://www.chrysocome.net/explore2fs
Windows 95 [1]_ http://www.yipton.net/content.html#FSDEXT2
DOS client [1]_ ftp://metalab.unc.edu/pub/Linux/system/filesystems/ext2/
OS/2 [2]_ ftp://metalab.unc.edu/pub/Linux/system/filesystems/ext2/
RISC OS client http://www.esw-heim.tu-clausthal.de/~marco/smorbrod/IscaFS/
======================= ===========================================================
.. [1] no longer actively developed/supported (as of Apr 2001)
.. [2] no longer actively developed/supported (as of Mar 2009)
@@ -1,4 +1,6 @@
.. SPDX-License-Identifier: GPL-2.0
===============
Ext3 Filesystem
===============
@@ -1,6 +1,8 @@
================================================================================
.. SPDX-License-Identifier: GPL-2.0
==========================================
WHAT IS Flash-Friendly File System (F2FS)?
================================================================================
==========================================
NAND flash memory-based storage devices, such as SSD, eMMC, and SD cards, have
been equipped on a variety systems ranging from mobile to server systems. Since
@@ -20,14 +22,15 @@ layout, but also for selecting allocation and cleaning algorithms.
The following git tree provides the file system formatting tool (mkfs.f2fs),
a consistency checking tool (fsck.f2fs), and a debugging tool (dump.f2fs).
>> git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs-tools.git
- git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs-tools.git
For reporting bugs and sending patches, please use the following mailing list:
>> linux-f2fs-devel@lists.sourceforge.net
================================================================================
BACKGROUND AND DESIGN ISSUES
================================================================================
- linux-f2fs-devel@lists.sourceforge.net
Background and Design issues
============================
Log-structured File System (LFS)
--------------------------------
@@ -61,6 +64,7 @@ needs to reclaim these obsolete blocks seamlessly to users. This job is called
as a cleaning process.
The process consists of three operations as follows.
1. A victim segment is selected through referencing segment usage table.
2. It loads parent index structures of all the data in the victim identified by
segment summary blocks.
@@ -71,9 +75,8 @@ This cleaning job may cause unexpected long delays, so the most important goal
is to hide the latencies to users. And also definitely, it should reduce the
amount of valid data to be moved, and move them quickly as well.
================================================================================
KEY FEATURES
================================================================================
Key Features
============
Flash Awareness
---------------
@@ -94,10 +97,11 @@ Cleaning Overhead
- Support multi-head logs for static/dynamic hot and cold data separation
- Introduce adaptive logging for efficient block allocation
================================================================================
MOUNT OPTIONS
================================================================================
Mount Options
=============
====================== ============================================================
background_gc=%s Turn on/off cleaning operations, namely garbage
collection, triggered in background when I/O subsystem is
idle. If background_gc=on, it will turn on the garbage
@@ -167,7 +171,10 @@ fault_injection=%d Enable fault injection in all supported types with
fault_type=%d Support configuring fault injection type, should be
enabled with fault_injection option, fault type value
is shown below, it supports single or combined type.
=================== ===========
Type_Name Type_Value
=================== ===========
FAULT_KMALLOC 0x000000001
FAULT_KVMALLOC 0x000000002
FAULT_PAGE_ALLOC 0x000000004
@@ -183,6 +190,7 @@ fault_type=%d Support configuring fault injection type, should be
FAULT_CHECKPOINT 0x000001000
FAULT_DISCARD 0x000002000
FAULT_WRITE_IO 0x000004000
=================== ===========
mode=%s Control block allocation mode which supports "adaptive"
and "lfs". In "lfs" mode, there should be no random
writes towards main area.
@@ -219,7 +227,7 @@ fsync_mode=%s Control the policy of fsync. Currently supports "posix",
non-atomic files likewise "nobarrier" mount option.
test_dummy_encryption Enable dummy encryption, which provides a fake fscrypt
context. The fake fscrypt context is used by xfstests.
checkpoint=%s[:%u[%]] Set to "disable" to turn off checkpointing. Set to "enable"
checkpoint=%s[:%u[%]] Set to "disable" to turn off checkpointing. Set to "enable"
to reenable checkpointing. Is enabled by default. While
disabled, any unmounting or unexpected shutdowns will cause
the filesystem contents to appear as they did when the
@@ -246,22 +254,22 @@ compress_extension=%s Support adding specified extension, so that f2fs can enab
on compression extension list and enable compression on
these file by default rather than to enable it via ioctl.
For other files, we can still enable compression via ioctl.
====================== ============================================================
================================================================================
DEBUGFS ENTRIES
================================================================================
Debugfs Entries
===============
/sys/kernel/debug/f2fs/ contains information about all the partitions mounted as
f2fs. Each file shows the whole f2fs information.
/sys/kernel/debug/f2fs/status includes:
- major file system information managed by f2fs currently
- average SIT information about whole segments
- current memory footprint consumed by f2fs.
================================================================================
SYSFS ENTRIES
================================================================================
Sysfs Entries
=============
Information about mounted f2fs file systems can be found in
/sys/fs/f2fs. Each mounted filesystem will have a directory in
@@ -271,22 +279,24 @@ The files in each per-device directory are shown in table below.
Files in /sys/fs/f2fs/<devname>
(see also Documentation/ABI/testing/sysfs-fs-f2fs)
================================================================================
USAGE
================================================================================
Usage
=====
1. Download userland tools and compile them.
2. Skip, if f2fs was compiled statically inside kernel.
Otherwise, insert the f2fs.ko module.
# insmod f2fs.ko
Otherwise, insert the f2fs.ko module::
3. Create a directory trying to mount
# mkdir /mnt/f2fs
# insmod f2fs.ko
4. Format the block device, and then mount as f2fs
# mkfs.f2fs -l label /dev/block_device
# mount -t f2fs /dev/block_device /mnt/f2fs
3. Create a directory trying to mount::
# mkdir /mnt/f2fs
4. Format the block device, and then mount as f2fs::
# mkfs.f2fs -l label /dev/block_device
# mount -t f2fs /dev/block_device /mnt/f2fs
mkfs.f2fs
---------
@@ -294,18 +304,26 @@ The mkfs.f2fs is for the use of formatting a partition as the f2fs filesystem,
which builds a basic on-disk layout.
The options consist of:
-l [label] : Give a volume label, up to 512 unicode name.
-a [0 or 1] : Split start location of each area for heap-based allocation.
1 is set by default, which performs this.
-o [int] : Set overprovision ratio in percent over volume size.
5 is set by default.
-s [int] : Set the number of segments per section.
1 is set by default.
-z [int] : Set the number of sections per zone.
1 is set by default.
-e [str] : Set basic extension list. e.g. "mp3,gif,mov"
-t [0 or 1] : Disable discard command or not.
1 is set by default, which conducts discard.
=============== ===========================================================
``-l [label]`` Give a volume label, up to 512 unicode name.
``-a [0 or 1]`` Split start location of each area for heap-based allocation.
1 is set by default, which performs this.
``-o [int]`` Set overprovision ratio in percent over volume size.
5 is set by default.
``-s [int]`` Set the number of segments per section.
1 is set by default.
``-z [int]`` Set the number of sections per zone.
1 is set by default.
``-e [str]`` Set basic extension list. e.g. "mp3,gif,mov"
``-t [0 or 1]`` Disable discard command or not.
1 is set by default, which conducts discard.
=============== ===========================================================
fsck.f2fs
---------
@@ -314,7 +332,8 @@ partition, which examines whether the filesystem metadata and user-made data
are cross-referenced correctly or not.
Note that, initial version of the tool does not fix any inconsistency.
The options consist of:
The options consist of::
-d debug level [default:0]
dump.f2fs
@@ -327,20 +346,21 @@ It shows on-disk inode information recognized by a given inode number, and is
able to dump all the SSA and SIT entries into predefined files, ./dump_ssa and
./dump_sit respectively.
The options consist of:
The options consist of::
-d debug level [default:0]
-i inode no (hex)
-s [SIT dump segno from #1~#2 (decimal), for all 0~-1]
-a [SSA dump segno from #1~#2 (decimal), for all 0~-1]
Examples:
# dump.f2fs -i [ino] /dev/sdx
# dump.f2fs -s 0~-1 /dev/sdx (SIT dump)
# dump.f2fs -a 0~-1 /dev/sdx (SSA dump)
Examples::
================================================================================
DESIGN
================================================================================
# dump.f2fs -i [ino] /dev/sdx
# dump.f2fs -s 0~-1 /dev/sdx (SIT dump)
# dump.f2fs -a 0~-1 /dev/sdx (SSA dump)
Design
======
On-disk Layout
--------------
@@ -351,7 +371,7 @@ consists of a set of sections. By default, section and zone sizes are set to one
segment size identically, but users can easily modify the sizes by mkfs.
F2FS splits the entire volume into six areas, and all the areas except superblock
consists of multiple segments as described below.
consists of multiple segments as described below::
align with the zone size <-|
|-> align with the segment size
@@ -373,28 +393,28 @@ consists of multiple segments as described below.
|__zone__|
- Superblock (SB)
: It is located at the beginning of the partition, and there exist two copies
It is located at the beginning of the partition, and there exist two copies
to avoid file system crash. It contains basic partition information and some
default parameters of f2fs.
- Checkpoint (CP)
: It contains file system information, bitmaps for valid NAT/SIT sets, orphan
It contains file system information, bitmaps for valid NAT/SIT sets, orphan
inode lists, and summary entries of current active segments.
- Segment Information Table (SIT)
: It contains segment information such as valid block count and bitmap for the
It contains segment information such as valid block count and bitmap for the
validity of all the blocks.
- Node Address Table (NAT)
: It is composed of a block address table for all the node blocks stored in
It is composed of a block address table for all the node blocks stored in
Main area.
- Segment Summary Area (SSA)
: It contains summary entries which contains the owner information of all the
It contains summary entries which contains the owner information of all the
data and node blocks stored in Main area.
- Main Area
: It contains file and directory data including their indices.
It contains file and directory data including their indices.
In order to avoid misalignment between file system and flash-based storage, F2FS
aligns the start block address of CP with the segment size. Also, it aligns the
@@ -414,7 +434,7 @@ One of them always indicates the last valid data, which is called as shadow copy
mechanism. In addition to CP, NAT and SIT also adopt the shadow copy mechanism.
For file system consistency, each CP points to which NAT and SIT copies are
valid, as shown as below.
valid, as shown as below::
+--------+----------+---------+
| CP | SIT | NAT |
@@ -438,7 +458,7 @@ indirect node. F2FS assigns 4KB to an inode block which contains 923 data block
indices, two direct node pointers, two indirect node pointers, and one double
indirect node pointer as described below. One direct node block contains 1018
data blocks, and one indirect node block contains also 1018 node blocks. Thus,
one inode block (i.e., a file) covers:
one inode block (i.e., a file) covers::
4KB * (923 + 2 * 1018 + 2 * 1018 * 1018 + 1018 * 1018 * 1018) := 3.94TB.
@@ -473,6 +493,8 @@ A dentry block consists of 214 dentry slots and file names. Therein a bitmap is
used to represent whether each dentry is valid or not. A dentry block occupies
4KB with the following composition.
::
Dentry Block(4 K) = bitmap (27 bytes) + reserved (3 bytes) +
dentries(11 * 214 bytes) + file name (8 * 214 bytes)
@@ -498,23 +520,25 @@ F2FS implements multi-level hash tables for directory structure. Each level has
a hash table with dedicated number of hash buckets as shown below. Note that
"A(2B)" means a bucket includes 2 data blocks.
----------------------
A : bucket
B : block
N : MAX_DIR_HASH_DEPTH
----------------------
::
level #0 | A(2B)
|
level #1 | A(2B) - A(2B)
|
level #2 | A(2B) - A(2B) - A(2B) - A(2B)
. | . . . .
level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
. | . . . .
level #N | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B)
----------------------
A : bucket
B : block
N : MAX_DIR_HASH_DEPTH
----------------------
The number of blocks and buckets are determined by,
level #0 | A(2B)
|
level #1 | A(2B) - A(2B)
|
level #2 | A(2B) - A(2B) - A(2B) - A(2B)
. | . . . .
level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
. | . . . .
level #N | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B)
The number of blocks and buckets are determined by::
,- 2, if n < MAX_DIR_HASH_DEPTH / 2,
# of blocks in level #n = |
@@ -532,7 +556,7 @@ dentry consisting of the file name and its inode number. If not found, F2FS
scans the next hash table in level #1. In this way, F2FS scans hash tables in
each levels incrementally from 1 to N. In each levels F2FS needs to scan only
one bucket determined by the following equation, which shows O(log(# of files))
complexity.
complexity::
bucket number to scan in level #n = (hash value) % (# of buckets in level #n)
@@ -540,7 +564,8 @@ In the case of file creation, F2FS finds empty consecutive slots that cover the
file name. F2FS searches the empty slots in the hash tables of whole levels from
1 to N in the same way as the lookup operation.
The following figure shows an example of two cases holding children.
The following figure shows an example of two cases holding children::
--------------> Dir <--------------
| |
child child
@@ -611,14 +636,15 @@ Write-hint Policy
2) whint_mode=user-based. F2FS tries to pass down hints given by
users.
===================== ======================== ===================
User F2FS Block
---- ---- -----
===================== ======================== ===================
META WRITE_LIFE_NOT_SET
HOT_NODE "
WARM_NODE "
COLD_NODE "
*ioctl(COLD) COLD_DATA WRITE_LIFE_EXTREME
*extension list " "
ioctl(COLD) COLD_DATA WRITE_LIFE_EXTREME
extension list " "
-- buffered io
WRITE_LIFE_EXTREME COLD_DATA WRITE_LIFE_EXTREME
@@ -635,11 +661,13 @@ WRITE_LIFE_NOT_SET WARM_DATA WRITE_LIFE_NOT_SET
WRITE_LIFE_NONE " WRITE_LIFE_NONE
WRITE_LIFE_MEDIUM " WRITE_LIFE_MEDIUM
WRITE_LIFE_LONG " WRITE_LIFE_LONG
===================== ======================== ===================
3) whint_mode=fs-based. F2FS passes down hints with its policy.
===================== ======================== ===================
User F2FS Block
---- ---- -----
===================== ======================== ===================
META WRITE_LIFE_MEDIUM;
HOT_NODE WRITE_LIFE_NOT_SET
WARM_NODE "
@@ -662,6 +690,7 @@ WRITE_LIFE_NOT_SET WARM_DATA WRITE_LIFE_NOT_SET
WRITE_LIFE_NONE " WRITE_LIFE_NONE
WRITE_LIFE_MEDIUM " WRITE_LIFE_MEDIUM
WRITE_LIFE_LONG " WRITE_LIFE_LONG
===================== ======================== ===================
Fallocate(2) Policy
-------------------
@@ -681,6 +710,7 @@ Allocating disk space
However, once F2FS receives ioctl(fd, F2FS_IOC_SET_PIN_FILE) in prior to
fallocate(fd, DEFAULT_MODE), it allocates on-disk blocks addressess having
zero or random data, which is useful to the below scenario where:
1. create(fd)
2. ioctl(fd, F2FS_IOC_SET_PIN_FILE)
3. fallocate(fd, 0, 0, size)
@@ -692,39 +722,41 @@ Compression implementation
--------------------------
- New term named cluster is defined as basic unit of compression, file can
be divided into multiple clusters logically. One cluster includes 4 << n
(n >= 0) logical pages, compression size is also cluster size, each of
cluster can be compressed or not.
be divided into multiple clusters logically. One cluster includes 4 << n
(n >= 0) logical pages, compression size is also cluster size, each of
cluster can be compressed or not.
- In cluster metadata layout, one special block address is used to indicate
cluster is compressed one or normal one, for compressed cluster, following
metadata maps cluster to [1, 4 << n - 1] physical blocks, in where f2fs
stores data including compress header and compressed data.
cluster is compressed one or normal one, for compressed cluster, following
metadata maps cluster to [1, 4 << n - 1] physical blocks, in where f2fs
stores data including compress header and compressed data.
- In order to eliminate write amplification during overwrite, F2FS only
support compression on write-once file, data can be compressed only when
all logical blocks in file are valid and cluster compress ratio is lower
than specified threshold.
support compression on write-once file, data can be compressed only when
all logical blocks in file are valid and cluster compress ratio is lower
than specified threshold.
- To enable compression on regular inode, there are three ways:
* chattr +c file
* chattr +c dir; touch dir/file
* mount w/ -o compress_extension=ext; touch file.ext
Compress metadata layout:
[Dnode Structure]
+-----------------------------------------------+
| cluster 1 | cluster 2 | ......... | cluster N |
+-----------------------------------------------+
. . . .
. . . .
. Compressed Cluster . . Normal Cluster .
+----------+---------+---------+---------+ +---------+---------+---------+---------+
|compr flag| block 1 | block 2 | block 3 | | block 1 | block 2 | block 3 | block 4 |
+----------+---------+---------+---------+ +---------+---------+---------+---------+
. .
. .
. .
+-------------+-------------+----------+----------------------------+
| data length | data chksum | reserved | compressed data |
+-------------+-------------+----------+----------------------------+
* chattr +c file
* chattr +c dir; touch dir/file
* mount w/ -o compress_extension=ext; touch file.ext
Compress metadata layout::
[Dnode Structure]
+-----------------------------------------------+
| cluster 1 | cluster 2 | ......... | cluster N |
+-----------------------------------------------+
. . . .
. . . .
. Compressed Cluster . . Normal Cluster .
+----------+---------+---------+---------+ +---------+---------+---------+---------+
|compr flag| block 1 | block 2 | block 3 | | block 1 | block 2 | block 3 | block 4 |
+----------+---------+---------+---------+ +---------+---------+---------+---------+
. .
. .
. .
+-------------+-------------+----------+----------------------------+
| data length | data chksum | reserved | compressed data |
+-------------+-------------+----------+----------------------------+
@@ -1,14 +1,18 @@
uevents and GFS2
==================
.. SPDX-License-Identifier: GPL-2.0
================
uevents and GFS2
================
During the lifetime of a GFS2 mount, a number of uevents are generated.
This document explains what the events are and what they are used
for (by gfs_controld in gfs2-utils).
A list of GFS2 uevents
-----------------------
======================
1. ADD
------
The ADD event occurs at mount time. It will always be the first
uevent generated by the newly created filesystem. If the mount
@@ -21,6 +25,7 @@ with no journal assigned), and read-only (with journal assigned) status
of the filesystem respectively.
2. ONLINE
---------
The ONLINE uevent is generated after a successful mount or remount. It
has the same environment variables as the ADD uevent. The ONLINE
@@ -29,6 +34,7 @@ RDONLY are a relatively recent addition (2.6.32-rc+) and will not
be generated by older kernels.
3. CHANGE
---------
The CHANGE uevent is used in two places. One is when reporting the
successful mount of the filesystem by the first node (FIRSTMOUNT=Done).
@@ -52,6 +58,7 @@ cluster. For this reason the ONLINE uevent was used when adding a new
uevent for a successful mount or remount.
4. OFFLINE
----------
The OFFLINE uevent is only generated due to filesystem errors and is used
as part of the "withdraw" mechanism. Currently this doesn't give any
@@ -59,6 +66,7 @@ information about what the error is, which is something that needs to
be fixed.
5. REMOVE
---------
The REMOVE uevent is generated at the end of an unsuccessful mount
or at the end of a umount of the filesystem. All REMOVE uevents will
@@ -68,9 +76,10 @@ kobject subsystem.
Information common to all GFS2 uevents (uevent environment variables)
----------------------------------------------------------------------
=====================================================================
1. LOCKTABLE=
--------------
The LOCKTABLE is a string, as supplied on the mount command
line (locktable=) or via fstab. It is used as a filesystem label
@@ -78,6 +87,7 @@ as well as providing the information for a lock_dlm mount to be
able to join the cluster.
2. LOCKPROTO=
-------------
The LOCKPROTO is a string, and its value depends on what is set
on the mount command line, or via fstab. It will be either
@@ -85,12 +95,14 @@ lock_nolock or lock_dlm. In the future other lock managers
may be supported.
3. JOURNALID=
-------------
If a journal is in use by the filesystem (journals are not
assigned for spectator mounts) then this will give the
numeric journal id in all GFS2 uevents.
4. UUID=
--------
With recent versions of gfs2-utils, mkfs.gfs2 writes a UUID
into the filesystem superblock. If it exists, this will
@@ -1,5 +1,8 @@
.. SPDX-License-Identifier: GPL-2.0
==================
Global File System
------------------
==================
https://fedorahosted.org/cluster/wiki/HomePage
@@ -14,16 +17,18 @@ on one machine show up immediately on all other machines in the cluster.
GFS uses interchangeable inter-node locking mechanisms, the currently
supported mechanisms are:
lock_nolock -- allows gfs to be used as a local file system
lock_nolock
- allows gfs to be used as a local file system
lock_dlm -- uses a distributed lock manager (dlm) for inter-node locking
The dlm is found at linux/fs/dlm/
lock_dlm
- uses a distributed lock manager (dlm) for inter-node locking.
The dlm is found at linux/fs/dlm/
Lock_dlm depends on user space cluster management systems found
at the URL above.
To use gfs as a local file system, no external clustering systems are
needed, simply:
needed, simply::
$ mkfs -t gfs2 -p lock_nolock -j 1 /dev/block_device
$ mount -t gfs2 /dev/block_device /dir
@@ -37,9 +42,12 @@ GFS2 is not on-disk compatible with previous versions of GFS, but it
is pretty close.
The following man pages can be found at the URL above:
============ =============================================
fsck.gfs2 to repair a filesystem
gfs2_grow to expand a filesystem online
gfs2_jadd to add journals to a filesystem online
tunegfs2 to manipulate, examine and tune a filesystem
gfs2_convert to convert a gfs filesystem to gfs2 in-place
gfs2_convert to convert a gfs filesystem to gfs2 in-place
mkfs.gfs2 to make a filesystem
============ =============================================

Some files were not shown because too many files have changed in this diff Show More