In addition to mtd_block_isbad(), which checks if a block is bad or
reserved, it's needed to check if a block is reserved only (but not
bad). This commit adds an MTD interface for it, in a similar fashion to
mtd_block_isbad().
While here, fix mtd_block_isbad() so the out-of-bounds checking is done
before the callback check.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Tested-by: Pekon Gupta <pekon@ti.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
These new sysfs device attributes allow us to retrieve the ECC and bad
block stats by poking a sysfs file, which is often more convenient than
using the ioctl.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Tested-by: Pekon Gupta <pekon@ti.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
If a write to one time programmable memory (OTP) hits the end of this
memory area, no more data can be written. The count variable in
mtdchar_write() in drivers/mtd/mtdchar.c is not decreased anymore.
We are trapped in the loop forever, mtdchar_write() will never return
in this case.
The desired behavior of a write in such a case is described in [1]:
- Try to write as much data as possible, truncate the write to fit into
the available memory and return the number of bytes that actually
have been written.
- If no data could be written at all, return -ENOSPC.
This patch fixes the behavior of OTP write if there is not enough space
for all data:
1) mtd_write_user_prot_reg() in drivers/mtd/mtdcore.c is modified to
return -ENOSPC if no data could be written at all.
2) mtdchar_write() is modified to handle -ENOSPC correctly. Exit if a
write returned -ENOSPC and yield the correct return value, either
then number of bytes that could be written, or -ENOSPC, if no data
could be written at all.
Furthermore the patch harmonizes the behavior of the OTP memory write
in drivers/mtd/devices/mtd_dataflash.c with the other implementations
and the requirements from [1]. Instead of returning -EINVAL if the data
does not fit into the OTP memory, we try to write as much data as
possible/truncate the write.
[1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html
Signed-off-by: Christian Riesch <christian.riesch@omicron.at>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
This patch moves the char and block major number definitions
to major.h to be with the rest of the major numbers.
While doing this, include major.h in the files that need it.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
The current mtd_type_show() misses the MTD_MLCNANDFLASH case.
This patch adds the case for it, and also updates the ABI.
Signed-off-by: Huang Shijie <b32955@freescale.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Calling dev_set_name with a single paramter causes it to be handled as a
format string. Many callers are passing potentially dynamic string
content, so use "%s" in those cases to avoid any potential accidents,
including wrappers like device_create*() and bdi_register().
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull MTD update from David Woodhouse:
- Lots of cleanups from Artem, including deletion of some obsolete
drivers
- Support partitions larger than 4GiB in device tree
- Support for new SPI chips
* tag 'for-linus-20130509' of git://git.infradead.org/linux-mtd: (83 commits)
mtd: omap2: Use module_platform_driver()
mtd: bf5xx_nand: Use module_platform_driver()
mtd: denali_dt: Remove redundant use of of_match_ptr
mtd: denali_dt: Change return value to fix smatch warning
mtd: denali_dt: Use module_platform_driver()
mtd: denali_dt: Fix incorrect error check
mtd: nand: subpage write support for hardware based ECC schemes
mtd: omap2: use msecs_to_jiffies()
mtd: nand_ids: use size macros
mtd: nand_ids: improve LEGACY_ID_NAND macro a bit
mtd: add 4 Toshiba nand chips for the full-id case
mtd: add the support to parse out the full-id nand type
mtd: add new fields to nand_flash_dev{}
mtd: sh_flctl: Use of_match_ptr() macro
mtd: gpio: Use of_match_ptr() macro
mtd: gpio: Use devm_kzalloc()
mtd: davinci_nand: Use of_match_ptr()
mtd: dataflash: Use of_match_ptr() macro
mtd: remove h720x flash support
mtd: onenand: remove OneNAND simulator
...
The MTD subsystem has historically tried to be as configurable as possible. The
side-effect of this is that its configuration menu is rather large, and we are
gradually shrinking it. For example, we recently merged partitions support with
the mtdcore.
This patch does the next step - it merges the mtdchar module to mtdcore. And in
this case this is not only about eliminating too fine-grained separation and
simplifying the configuration menu. This is also about eliminating seemingly
useless kernel module.
Indeed, mtdchar is a module that allows user-space making use of MTD devices
via /dev/mtd* character devices. If users do not enable it, they simply cannot
use MTD devices at all. They cannot read or write the flash contents. Is it a
sane and useful setup? I believe not. And everyone just enables mtdchar.
Having mtdchar separate is also a little bit harmful. People sometimes miss the
fact that they need to enable an additional configuration option to have
user-space MTD interfaces, and then they wonder why on earth the kernel does
not allow using the flash? They spend time asking around.
Thus, let's just get rid of this module and make it part of mtd core.
Note, mtdchar had additional configuration option to enable OTP interfaces,
which are present on some flashes. I removed that option as well - it saves a
really tiny amount space.
[dwmw2: Strictly speaking, you can mount file systems on MTD devices just
fine without the mtdchar (or mtdblock) devices; you just can't do
other manipulations directly on the underlying device. But still I
agree that it makes sense to make this unconditional. And Yay! we
get to kill off an instance of checking CONFIG_foo_MODULE, which is
an abomination that should never happen.]
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Remove a couple of useles '#ifdef CONFIG_PROC_FS's around procfs functions
which anyway turn into empty function in 'proc_fs.h' file when CONFIG_PROC_FS
is not defined.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
'mtd_device_parse_register()' and 'parse_mtd_partitions()' functions accept a
an array of character pointers. These functions modify neither the pointers nor
the characters they point to. The characters are actually names of the MTD
parsers.
At the moment, the argument type is 'const char **', which means that only the
names of the parsers are constant. Let's turn the argument type into 'const
char * const *', which means that both names and the pointers which point to
them are constant.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This reverts commits a50915394f and
d7c3b937bd.
This is a revert of a revert of a revert. In addition, it reverts the
even older i915 change to stop using the __GFP_NO_KSWAPD flag due to the
original commits in linux-next.
It turns out that the original patch really was bogus, and that the
original revert was the correct thing to do after all. We thought we
had fixed the problem, and then reverted the revert, but the problem
really is fundamental: waking up kswapd simply isn't the right thing to
do, and direct reclaim sometimes simply _is_ the right thing to do.
When certain allocations fail, we simply should try some direct reclaim,
and if that fails, fail the allocation. That's the right thing to do
for THP allocations, which can easily fail, and the GPU allocations want
to do that too.
So starting kswapd is sometimes simply wrong, and removing the flag that
said "don't start kswapd" was a mistake. Let's hope we never revisit
this mistake again - and certainly not this many times ;)
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With "mm: vmscan: scale number of pages reclaimed by reclaim/compaction
based on failures" reverted, Zdenek Kabelac reported the following
Hmm, so it's just took longer to hit the problem and observe
kswapd0 spinning on my CPU again - it's not as endless like before -
but still it easily eats minutes - it helps to turn off Firefox
or TB (memory hungry apps) so kswapd0 stops soon - and restart
those apps again. (And I still have like >1GB of cached memory)
kswapd0 R running task 0 30 2 0x00000000
Call Trace:
preempt_schedule+0x42/0x60
_raw_spin_unlock+0x55/0x60
put_super+0x31/0x40
drop_super+0x22/0x30
prune_super+0x149/0x1b0
shrink_slab+0xba/0x510
The sysrq+m indicates the system has no swap so it'll never reclaim
anonymous pages as part of reclaim/compaction. That is one part of the
problem but not the root cause as file-backed pages could also be
reclaimed.
The likely underlying problem is that kswapd is woken up or kept awake
for each THP allocation request in the page allocator slow path.
If compaction fails for the requesting process then compaction will be
deferred for a time and direct reclaim is avoided. However, if there
are a storm of THP requests that are simply rejected, it will still be
the the case that kswapd is awake for a prolonged period of time as
pgdat->kswapd_max_order is updated each time. This is noticed by the
main kswapd() loop and it will not call kswapd_try_to_sleep(). Instead
it will loopp, shrinking a small number of pages and calling
shrink_slab() on each iteration.
The temptation is to supply a patch that checks if kswapd was woken for
THP and if so ignore pgdat->kswapd_max_order but it'll be a hack and not
backed up by proper testing. As 3.7 is very close to release and this
is not a bug we should release with, a safer path is to revert "mm:
remove __GFP_NO_KSWAPD" for now and revisit it with the view to ironing
out the balance_pgdat() logic in general.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Zdenek Kabelac <zkabelac@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When transparent huge pages were introduced, memory compaction and swap
storms were an issue, and the kernel had to be careful to not make THP
allocations cause pageout or compaction.
Now that we have working compaction deferral, kswapd is smart enough to
invoke compaction and the quadratic behaviour around isolate_free_pages
has been fixed, it should be safe to remove __GFP_NO_KSWAPD.
[minchan@kernel.org: Comment fix]
[mgorman@suse.de: Avoid direct reclaim for deferred compaction]
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mtd_read_oob() has some unexpected similarities to mtd_read(). For
instance, when ops->datbuf != NULL, nand_base.c might return max_bitflips;
however, when ops->datbuf == NULL, nand_base's code potentially could
return -EUCLEAN (no in-tree drivers do this yet). In any case where the
driver might return max_bitflips, we should translate this into an
appropriate return code using the bitflip_threshold.
Essentially, mtd_read_oob() duplicates the logic from mtd_read().
This prevents users of mtd_read_oob() from receiving a positive return
value (i.e., from max_bitflips) and interpreting it as an unknown error.
Artem: amend comments.
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Reviewed-by: Mike Dunn <mikedunn@newsguy.com>
Reviewed-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The drivers' _read() method, absent an error, returns a non-negative integer
indicating the maximum number of bit errors that were corrected in any one
region comprising an ecc step. MTD returns -EUCLEAN if this is >=
bitflip_threshold, 0 otherwise. If bitflip_threshold is zero, the comparison is
not made since these devices lack ECC and always return zero in the non-error
case (thanks Brian)¹. Note that this is a subtle change to the driver
interface.
This and the preceding patches in this set were tested with ubi on top of the
nandsim and docg4 devices, running the ubi test io_basic from mtd-utils.
¹ http://lists.infradead.org/pipermail/linux-mtd/2012-March/040468.html
Signed-off-by: Mike Dunn <mikedunn@newsguy.com>
Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
Acked-by: Brian Norris <computersforpeace@gmail.com>
Ivan Djelic <ivan.djelic@parrot.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
An element 'bitflip_threshold' is added to struct mtd_info, and also exposed as
a read/write variable in sysfs. This will be used to determine whether or not
mtd_read() returns -EUCLEAN or 0 (absent a hard error). If the driver leaves it
as zero, mtd will set it to a default value of ecc_strength.
This v2 adds the line that propagates bitflip_threshold from the master to the
partitions - thanks Ivan¹.
¹ http://lists.infradead.org/pipermail/linux-mtd/2012-April/040900.html
Signed-off-by: Mike Dunn <mikedunn@newsguy.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>