Commit Graph

166 Commits

Author SHA1 Message Date
Christoph Lameter
1b2a1a7e8a drivers/char/random: Replace __get_cpu_var uses
A single case of using __get_cpu_var for address calculation.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2014-08-26 13:45:45 -04:00
Linus Torvalds
f4f142ed4e Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random
Pull randomness updates from Ted Ts'o:
 "Cleanups and bug fixes to /dev/random, add a new getrandom(2) system
  call, which is a superset of OpenBSD's getentropy(2) call, for use
  with userspace crypto libraries such as LibreSSL.

  Also add the ability to have a kernel thread to pull entropy from
  hardware rng devices into /dev/random"

* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
  hwrng: Pass entropy to add_hwgenerator_randomness() in bits, not bytes
  random: limit the contribution of the hw rng to at most half
  random: introduce getrandom(2) system call
  hw_random: fix sparse warning (NULL vs 0 for pointer)
  random: use registers from interrupted code for CPU's w/o a cycle counter
  hwrng: add per-device entropy derating
  hwrng: create filler thread
  random: add_hwgenerator_randomness() for feeding entropy from devices
  random: use an improved fast_mix() function
  random: clean up interrupt entropy accounting for archs w/o cycle counters
  random: only update the last_pulled time if we actually transferred entropy
  random: remove unneeded hash of a portion of the entropy pool
  random: always update the entropy pool under the spinlock
2014-08-06 08:16:24 -07:00
Theodore Ts'o
48d6be955a random: limit the contribution of the hw rng to at most half
For people who don't trust a hardware RNG which can not be audited,
the changes to add support for RDSEED can be troubling since 97% or
more of the entropy will be contributed from the in-CPU hardware RNG.

We now have a in-kernel khwrngd, so for those people who do want to
implicitly trust the CPU-based system, we could create an arch-rng
hw_random driver, and allow khwrng refill the entropy pool.  This
allows system administrator whether or not they trust the CPU (I
assume the NSA will trust RDRAND/RDSEED implicitly :-), and if so,
what level of entropy derating they want to use.

The reason why this is a really good idea is that if different people
use different levels of entropy derating, it will make it much more
difficult to design a backdoor'ed hwrng that can be generally
exploited in terms of the output of /dev/random when different attack
targets are using differing levels of entropy derating.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-08-05 16:41:50 -04:00
Theodore Ts'o
c6e9d6f388 random: introduce getrandom(2) system call
The getrandom(2) system call was requested by the LibreSSL Portable
developers.  It is analoguous to the getentropy(2) system call in
OpenBSD.

The rationale of this system call is to provide resiliance against
file descriptor exhaustion attacks, where the attacker consumes all
available file descriptors, forcing the use of the fallback code where
/dev/[u]random is not available.  Since the fallback code is often not
well-tested, it is better to eliminate this potential failure mode
entirely.

The other feature provided by this new system call is the ability to
request randomness from the /dev/urandom entropy pool, but to block
until at least 128 bits of entropy has been accumulated in the
/dev/urandom entropy pool.  Historically, the emphasis in the
/dev/urandom development has been to ensure that urandom pool is
initialized as quickly as possible after system boot, and preferably
before the init scripts start execution.

This is because changing /dev/urandom reads to block represents an
interface change that could potentially break userspace which is not
acceptable.  In practice, on most x86 desktop and server systems, in
general the entropy pool can be initialized before it is needed (and
in modern kernels, we will printk a warning message if not).  However,
on an embedded system, this may not be the case.  And so with this new
interface, we can provide the functionality of blocking until the
urandom pool has been initialized.  Any userspace program which uses
this new functionality must take care to assure that if it is used
during the boot process, that it will not cause the init scripts or
other portions of the system startup to hang indefinitely.

SYNOPSIS
	#include <linux/random.h>

	int getrandom(void *buf, size_t buflen, unsigned int flags);

DESCRIPTION
	The system call getrandom() fills the buffer pointed to by buf
	with up to buflen random bytes which can be used to seed user
	space random number generators (i.e., DRBG's) or for other
	cryptographic uses.  It should not be used for Monte Carlo
	simulations or other programs/algorithms which are doing
	probabilistic sampling.

	If the GRND_RANDOM flags bit is set, then draw from the
	/dev/random pool instead of the /dev/urandom pool.  The
	/dev/random pool is limited based on the entropy that can be
	obtained from environmental noise, so if there is insufficient
	entropy, the requested number of bytes may not be returned.
	If there is no entropy available at all, getrandom(2) will
	either block, or return an error with errno set to EAGAIN if
	the GRND_NONBLOCK bit is set in flags.

	If the GRND_RANDOM bit is not set, then the /dev/urandom pool
	will be used.  Unlike using read(2) to fetch data from
	/dev/urandom, if the urandom pool has not been sufficiently
	initialized, getrandom(2) will block (or return -1 with the
	errno set to EAGAIN if the GRND_NONBLOCK bit is set in flags).

	The getentropy(2) system call in OpenBSD can be emulated using
	the following function:

            int getentropy(void *buf, size_t buflen)
            {
                    int     ret;

                    if (buflen > 256)
                            goto failure;
                    ret = getrandom(buf, buflen, 0);
                    if (ret < 0)
                            return ret;
                    if (ret == buflen)
                            return 0;
            failure:
                    errno = EIO;
                    return -1;
            }

RETURN VALUE
       On success, the number of bytes that was filled in the buf is
       returned.  This may not be all the bytes requested by the
       caller via buflen if insufficient entropy was present in the
       /dev/random pool, or if the system call was interrupted by a
       signal.

       On error, -1 is returned, and errno is set appropriately.

ERRORS
	EINVAL		An invalid flag was passed to getrandom(2)

	EFAULT		buf is outside the accessible address space.

	EAGAIN		The requested entropy was not available, and
			getentropy(2) would have blocked if the
			GRND_NONBLOCK flag was not set.

	EINTR		While blocked waiting for entropy, the call was
			interrupted by a signal handler; see the description
			of how interrupted read(2) calls on "slow" devices
			are handled with and without the SA_RESTART flag
			in the signal(7) man page.

NOTES
	For small requests (buflen <= 256) getrandom(2) will not
	return EINTR when reading from the urandom pool once the
	entropy pool has been initialized, and it will return all of
	the bytes that have been requested.  This is the recommended
	way to use getrandom(2), and is designed for compatibility
	with OpenBSD's getentropy() system call.

	However, if you are using GRND_RANDOM, then getrandom(2) may
	block until the entropy accounting determines that sufficient
	environmental noise has been gathered such that getrandom(2)
	will be operating as a NRBG instead of a DRBG for those people
	who are working in the NIST SP 800-90 regime.  Since it may
	block for a long time, these guarantees do *not* apply.  The
	user may want to interrupt a hanging process using a signal,
	so blocking until all of the requested bytes are returned
	would be unfriendly.

	For this reason, the user of getrandom(2) MUST always check
	the return value, in case it returns some error, or if fewer
	bytes than requested was returned.  In the case of
	!GRND_RANDOM and small request, the latter should never
	happen, but the careful userspace code (and all crypto code
	should be careful) should check for this anyway!

	Finally, unless you are doing long-term key generation (and
	perhaps not even then), you probably shouldn't be using
	GRND_RANDOM.  The cryptographic algorithms used for
	/dev/urandom are quite conservative, and so should be
	sufficient for all purposes.  The disadvantage of GRND_RANDOM
	is that it can block, and the increased complexity required to
	deal with partially fulfilled getrandom(2) requests.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Zach Brown <zab@zabbo.net>
2014-08-05 16:41:22 -04:00
Hannes Frederic Sowa
79a8468747 random: check for increase of entropy_count because of signed conversion
The expression entropy_count -= ibytes << (ENTROPY_SHIFT + 3) could
actually increase entropy_count if during assignment of the unsigned
expression on the RHS (mind the -=) we reduce the value modulo
2^width(int) and assign it to entropy_count. Trinity found this.

[ Commit modified by tytso to add an additional safety check for a
  negative entropy_count -- which should never happen, and to also add
  an additional paranoia check to prevent overly large count values to
  be passed into urandom_read().  ]

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2014-07-19 01:42:13 -04:00
Theodore Ts'o
ee3e00e9e7 random: use registers from interrupted code for CPU's w/o a cycle counter
For CPU's that don't have a cycle counter, or something equivalent
which can be used for random_get_entropy(), random_get_entropy() will
always return 0.  In that case, substitute with the saved interrupt
registers to add a bit more unpredictability.

Some folks have suggested hashing all of the registers
unconditionally, but this would increase the overhead of
add_interrupt_randomness() by at least an order of magnitude, and this
would very likely be unacceptable.

The changes in this commit have been benchmarked as mostly unaffecting
the overhead of add_interrupt_randomness() if the entropy counter is
present, and doubling the overhead if it is not present.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Jörn Engel <joern@logfs.org>
2014-07-15 04:49:41 -04:00
Torsten Duwe
c84dbf61a7 random: add_hwgenerator_randomness() for feeding entropy from devices
This patch adds an interface to the random pool for feeding entropy
in-kernel.

Signed-off-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Acked-by: H. Peter Anvin <hpa@zytor.com>
2014-07-15 04:49:40 -04:00
Theodore Ts'o
43759d4f42 random: use an improved fast_mix() function
Use more efficient fast_mix() function.  Thanks to George Spelvin for
doing the leg work to find a more efficient mixing function.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: George Spelvin <linux@horizon.com>
2014-07-15 04:49:40 -04:00
Theodore Ts'o
840f95077f random: clean up interrupt entropy accounting for archs w/o cycle counters
For architectures that don't have cycle counters, the algorithm for
deciding when to avoid giving entropy credit due to back-to-back timer
interrupts didn't make any sense, since we were checking every 64
interrupts.  Change it so that we only give an entropy credit if the
majority of the interrupts are not based on the timer.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: George Spelvin <linux@horizon.com>
2014-07-15 04:49:39 -04:00
Theodore Ts'o
cff850312c random: only update the last_pulled time if we actually transferred entropy
In xfer_secondary_pull(), check to make sure we need to pull from the
secondary pool before checking and potentially updating the
last_pulled time.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: George Spelvin <linux@horizon.com>
2014-07-15 04:49:39 -04:00
Theodore Ts'o
85608f8e16 random: remove unneeded hash of a portion of the entropy pool
We previously extracted a portion of the entropy pool in
mix_pool_bytes() and hashed it in to avoid racing CPU's from returning
duplicate random values.  Now that we are using a spinlock to prevent
this from happening, this is no longer necessary.  So remove it, to
simplify the code a bit.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: George Spelvin <linux@horizon.com>
2014-07-15 04:49:39 -04:00
Theodore Ts'o
91fcb532ef random: always update the entropy pool under the spinlock
Instead of using lockless techniques introduced in commit
902c098a36, use spin_trylock to try to grab entropy pool's lock.  If
we can't get the lock, then just try again on the next interrupt.

Based on discussions with George Spelvin.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: George Spelvin <linux@horizon.com>
2014-07-15 04:49:39 -04:00
Linus Torvalds
5ee22beeb2 Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random
Pull randomness bugfix from Ted Ts'o:
 "random: fix entropy accounting bug introduced in v3.15"

* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
  random: fix nasty entropy accounting bug
2014-06-17 14:23:14 -10:00
Theodore Ts'o
e33ba5fa7a random: fix nasty entropy accounting bug
Commit 0fb7a01af5 "random: simplify accounting code", introduced in
v3.15, has a very nasty accounting problem when the entropy pool has
has fewer bytes of entropy than the number of requested reserved
bytes.  In that case, "have_bytes - reserved" goes negative, and since
size_t is unsigned, the expression:

       ibytes = min_t(size_t, ibytes, have_bytes - reserved);

... does not do the right thing.  This is rather bad, because it
defeats the catastrophic reseeding feature in the
xfer_secondary_pool() path.

It also can cause the "BUG: spinlock trylock failure on UP" for some
kernel configurations when prandom_reseed() calls get_random_bytes()
in the early init, since when the entropy count gets corrupted,
credit_entropy_bits() erroneously believes that the nonblocking pool
has been fully initialized (when in fact it is not), and so it calls
prandom_reseed(true) recursively leading to the spinlock BUG.

The logic is *not* the same it was originally, but in the cases where
it matters, the behavior is the same, and the resulting code is
hopefully easier to read and understand.

Fixes: 0fb7a01af5 "random: simplify accounting code"
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Greg Price <price@mit.edu>
Cc: stable@vger.kernel.org  #v3.15
2014-06-15 21:04:32 -04:00
Joe Perches
5eb10d912e random: convert use of typedef ctl_table to struct ctl_table
This typedef is unnecessary and should just be removed.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:15 -07:00
Linus Torvalds
681a289548 Merge branch 'for-3.16/core' of git://git.kernel.dk/linux-block into next
Pull block core updates from Jens Axboe:
 "It's a big(ish) round this time, lots of development effort has gone
  into blk-mq in the last 3 months.  Generally we're heading to where
  3.16 will be a feature complete and performant blk-mq.  scsi-mq is
  progressing nicely and will hopefully be in 3.17.  A nvme port is in
  progress, and the Micron pci-e flash driver, mtip32xx, is converted
  and will be sent in with the driver pull request for 3.16.

  This pull request contains:

   - Lots of prep and support patches for scsi-mq have been integrated.
     All from Christoph.

   - API and code cleanups for blk-mq from Christoph.

   - Lots of good corner case and error handling cleanup fixes for
     blk-mq from Ming Lei.

   - A flew of blk-mq updates from me:

     * Provide strict mappings so that the driver can rely on the CPU
       to queue mapping.  This enables optimizations in the driver.

     * Provided a bitmap tagging instead of percpu_ida, which never
       really worked well for blk-mq.  percpu_ida relies on the fact
       that we have a lot more tags available than we really need, it
       fails miserably for cases where we exhaust (or are close to
       exhausting) the tag space.

     * Provide sane support for shared tag maps, as utilized by scsi-mq

     * Various fixes for IO timeouts.

     * API cleanups, and lots of perf tweaks and optimizations.

   - Remove 'buffer' from struct request.  This is ancient code, from
     when requests were always virtually mapped.  Kill it, to reclaim
     some space in struct request.  From me.

   - Remove 'magic' from blk_plug.  Since we store these on the stack
     and since we've never caught any actual bugs with this, lets just
     get rid of it.  From me.

   - Only call part_in_flight() once for IO completion, as includes two
     atomic reads.  Hopefully we'll get a better implementation soon, as
     the part IO stats are now one of the more expensive parts of doing
     IO on blk-mq.  From me.

   - File migration of block code from {mm,fs}/ to block/.  This
     includes bio.c, bio-integrity.c, bounce.c, and ioprio.c.  From me,
     from a discussion on lkml.

  That should describe the meat of the pull request.  Also has various
  little fixes and cleanups from Dave Jones, Shaohua Li, Duan Jiong,
  Fengguang Wu, Fabian Frederick, Randy Dunlap, Robert Elliott, and Sam
  Bradshaw"

* 'for-3.16/core' of git://git.kernel.dk/linux-block: (100 commits)
  blk-mq: push IPI or local end_io decision to __blk_mq_complete_request()
  blk-mq: remember to start timeout handler for direct queue
  block: ensure that the timer is always added
  blk-mq: blk_mq_unregister_hctx() can be static
  blk-mq: make the sysfs mq/ layout reflect current mappings
  blk-mq: blk_mq_tag_to_rq should handle flush request
  block: remove dead code in scsi_ioctl:blk_verify_command
  blk-mq: request initialization optimizations
  block: add queue flag for disabling SG merging
  block: remove 'magic' from struct blk_plug
  blk-mq: remove alloc_hctx and free_hctx methods
  blk-mq: add file comments and update copyright notices
  blk-mq: remove blk_mq_alloc_request_pinned
  blk-mq: do not use blk_mq_alloc_request_pinned in blk_mq_map_request
  blk-mq: remove blk_mq_wait_for_tags
  blk-mq: initialize request in __blk_mq_alloc_request
  blk-mq: merge blk_mq_alloc_reserved_request into blk_mq_alloc_request
  blk-mq: add helper to insert requests from irq context
  blk-mq: remove stale comment for blk_mq_complete_request()
  blk-mq: allow non-softirq completions
  ...
2014-06-02 09:29:34 -07:00
Theodore Ts'o
f9c6d4987b random: fix BUG_ON caused by accounting simplification
Commit ee1de406ba ("random: simplify accounting logic") simplified
things too much, in that it allows the following to trigger an
overflow that results in a BUG_ON crash:

dd if=/dev/urandom of=/dev/zero bs=67108707 count=1

Thanks to Peter Zihlstra for discovering the crash, and Hannes
Frederic for analyizing the root cause.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Greg Price <price@mit.edu>
2014-05-16 22:18:22 -04:00
Christoph Hellwig
bdcfa3e57c random: export add_disk_randomness
This will be needed for pending changes to the scsi midlayer that now
calls lower level block APIs, as well as any blk-mq driver that wants to
contribute to the random pool.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-04-28 09:29:55 -06:00
H. Peter Anvin
7b878d4b48 random: Add arch_has_random[_seed]()
Add predicate functions for having arch_get_random[_seed]*().  The
only current use is to avoid the loop in arch_random_refill() when
arch_get_random_seed_long() is unavailable.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-03-19 22:24:08 -04:00
H. Peter Anvin
331c6490c7 random: If we have arch_get_random_seed*(), try it before blocking
If we have arch_get_random_seed*(), try to use it for emergency refill
of the entropy pool before giving up and blocking on /dev/random.  It
may or may not work in the moment, but if it does work, it will give
the user better service than blocking will.

Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-03-19 22:22:06 -04:00
H. Peter Anvin
83664a6928 random: Use arch_get_random_seed*() at init time and once a second
Use arch_get_random_seed*() in two places in the Linux random
driver (drivers/char/random.c):

1. During entropy pool initialization, use RDSEED in favor of RDRAND,
   with a fallback to the latter.  Entropy exhaustion is unlikely to
   happen there on physical hardware as the machine is single-threaded
   at that point, but could happen in a virtual machine.  In that
   case, the fallback to RDRAND will still provide more than adequate
   entropy pool initialization.

2. Once a second, issue RDSEED and, if successful, feed it to the
   entropy pool.  To ensure an extra layer of security, only credit
   half the entropy just in case.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2014-03-19 22:22:06 -04:00
Theodore Ts'o
46884442fc random: use the architectural HWRNG for the SHA's IV in extract_buf()
To help assuage the fears of those who think the NSA can introduce a
massive hack into the instruction decode and out of order execution
engine in the CPU without hundreds of Intel engineers knowing about
it (only one of which woud need to have the conscience and courage of
Edward Snowden to spill the beans to the public), use the HWRNG to
initialize the SHA starting value, instead of xor'ing it in
afterwards.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-03-19 22:18:52 -04:00
Greg Price
2132a96f66 random: clarify bits/bytes in wakeup thresholds
These are a recurring cause of confusion, so rename them to
hopefully be clearer.

Signed-off-by: Greg Price <price@mit.edu>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-03-19 22:18:51 -04:00
Greg Price
7d1b08c40c random: entropy_bytes is actually bits
The variable 'entropy_bytes' is set from an expression that actually
counts bits.  Fortunately it's also only compared to values that also
count bits.  Rename it accordingly.

Signed-off-by: Greg Price <price@mit.edu>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-03-19 22:18:51 -04:00
Greg Price
0fb7a01af5 random: simplify accounting code
With this we handle "reserved" in just one place.  As a bonus the
code becomes less nested, and the "wakeup_write" flag variable
becomes unnecessary.  The variable "flags" was already unused.

This code behaves identically to the previous version except in
two pathological cases that don't occur.  If the argument "nbytes"
is already less than "min", then we didn't previously enforce
"min".  If r->limit is false while "reserved" is nonzero, then we
previously applied "reserved" in checking whether we had enough
bits, even though we don't apply it to actually limit how many we
take.  The callers of account() never exercise either of these cases.

Before the previous commit, it was possible for "nbytes" to be less
than "min" if userspace chose a pathological configuration, but no
longer.

Cc: Jiri Kosina <jkosina@suse.cz>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Greg Price <price@mit.edu>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2014-03-19 22:18:51 -04:00