Commit Graph

713 Commits

Author SHA1 Message Date
Matthew Wilcox (Oracle)
522a0032af Add linux/cacheflush.h
Many architectures do not include asm-generic/cacheflush.h, so turn
the includes on their head and add linux/cacheflush.h which includes
asm/cacheflush.h.

Move the flush_dcache_folio() declaration from asm-generic/cacheflush.h
to linux/cacheflush.h and change linux/highmem.h to include
linux/cacheflush.h instead of asm/cacheflush.h so that all necessary
places will see flush_dcache_folio().

More functions should have their default implementations moved in the
future, but those are for follow-on patches.  This fixes csky, sparc and
sparc64 which were missed in the commit which added flush_dcache_folio().

Fixes: 08b0b0059b ("mm: Add flush_dcache_folio()")
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
2021-11-17 10:36:15 -05:00
Linus Torvalds
79ef0c0014 Merge tag 'trace-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:

 - kprobes: Restructured stack unwinder to show properly on x86 when a
   stack dump happens from a kretprobe callback.

 - Fix to bootconfig parsing

 - Have tracefs allow owner and group permissions by default (only
   denying others). There's been pressure to allow non root to tracefs
   in a controlled fashion, and using groups is probably the safest.

 - Bootconfig memory managament updates.

 - Bootconfig clean up to have the tools directory be less dependent on
   changes in the kernel tree.

 - Allow perf to be traced by function tracer.

 - Rewrite of function graph tracer to be a callback from the function
   tracer instead of having its own trampoline (this change will happen
   on an arch by arch basis, and currently only x86_64 implements it).

 - Allow multiple direct trampolines (bpf hooks to functions) be batched
   together in one synchronization.

 - Allow histogram triggers to add variables that can perform
   calculations against the event's fields.

 - Use the linker to determine architecture callbacks from the ftrace
   trampoline to allow for proper parameter prototypes and prevent
   warnings from the compiler.

 - Extend histogram triggers to key off of variables.

 - Have trace recursion use bit magic to determine preempt context over
   if branches.

 - Have trace recursion disable preemption as all use cases do anyway.

 - Added testing for verification of tracing utilities.

 - Various small clean ups and fixes.

* tag 'trace-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (101 commits)
  tracing/histogram: Fix semicolon.cocci warnings
  tracing/histogram: Fix documentation inline emphasis warning
  tracing: Increase PERF_MAX_TRACE_SIZE to handle Sentinel1 and docker together
  tracing: Show size of requested perf buffer
  bootconfig: Initialize ret in xbc_parse_tree()
  ftrace: do CPU checking after preemption disabled
  ftrace: disable preemption when recursion locked
  tracing/histogram: Document expression arithmetic and constants
  tracing/histogram: Optimize division by a power of 2
  tracing/histogram: Covert expr to const if both operands are constants
  tracing/histogram: Simplify handling of .sym-offset in expressions
  tracing: Fix operator precedence for hist triggers expression
  tracing: Add division and multiplication support for hist triggers
  tracing: Add support for creating hist trigger variables from literal
  selftests/ftrace: Stop tracing while reading the trace file by default
  MAINTAINERS: Update KPROBES and TRACING entries
  test_kprobes: Move it from kernel/ to lib/
  docs, kprobes: Remove invalid URL and add new reference
  samples/kretprobes: Fix return value if register_kretprobe() failed
  lib/bootconfig: Fix the xbc_get_info kerneldoc
  ...
2021-11-01 20:05:19 -07:00
Linus Torvalds
9a7e0a90a4 Merge tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Thomas Gleixner:

 - Revert the printk format based wchan() symbol resolution as it can
   leak the raw value in case that the symbol is not resolvable.

 - Make wchan() more robust and work with all kind of unwinders by
   enforcing that the task stays blocked while unwinding is in progress.

 - Prevent sched_fork() from accessing an invalid sched_task_group

 - Improve asymmetric packing logic

 - Extend scheduler statistics to RT and DL scheduling classes and add
   statistics for bandwith burst to the SCHED_FAIR class.

 - Properly account SCHED_IDLE entities

 - Prevent a potential deadlock when initial priority is assigned to a
   newly created kthread. A recent change to plug a race between cpuset
   and __sched_setscheduler() introduced a new lock dependency which is
   now triggered. Break the lock dependency chain by moving the priority
   assignment to the thread function.

 - Fix the idle time reporting in /proc/uptime for NOHZ enabled systems.

 - Improve idle balancing in general and especially for NOHZ enabled
   systems.

 - Provide proper interfaces for live patching so it does not have to
   fiddle with scheduler internals.

 - Add cluster aware scheduling support.

 - A small set of tweaks for RT (irqwork, wait_task_inactive(), various
   scheduler options and delaying mmdrop)

 - The usual small tweaks and improvements all over the place

* tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (69 commits)
  sched/fair: Cleanup newidle_balance
  sched/fair: Remove sysctl_sched_migration_cost condition
  sched/fair: Wait before decaying max_newidle_lb_cost
  sched/fair: Skip update_blocked_averages if we are defering load balance
  sched/fair: Account update_blocked_averages in newidle_balance cost
  x86: Fix __get_wchan() for !STACKTRACE
  sched,x86: Fix L2 cache mask
  sched/core: Remove rq_relock()
  sched: Improve wake_up_all_idle_cpus() take #2
  irq_work: Also rcuwait for !IRQ_WORK_HARD_IRQ on PREEMPT_RT
  irq_work: Handle some irq_work in a per-CPU thread on PREEMPT_RT
  irq_work: Allow irq_work_sync() to sleep if irq_work() no IRQ support.
  sched/rt: Annotate the RT balancing logic irqwork as IRQ_WORK_HARD_IRQ
  sched: Add cluster scheduler level for x86
  sched: Add cluster scheduler level in core and related Kconfig for ARM64
  topology: Represent clusters of CPUs within a die
  sched: Disable -Wunused-but-set-variable
  sched: Add wrapper for get_wchan() to keep task blocked
  x86: Fix get_wchan() to support the ORC unwinder
  proc: Use task_is_running() for wchan in /proc/$pid/stat
  ...
2021-11-01 13:48:52 -07:00
Linus Torvalds
49f8275c7d Merge tag 'folio-5.16' of git://git.infradead.org/users/willy/pagecache
Pull memory folios from Matthew Wilcox:
 "Add memory folios, a new type to represent either order-0 pages or the
  head page of a compound page. This should be enough infrastructure to
  support filesystems converting from pages to folios.

  The point of all this churn is to allow filesystems and the page cache
  to manage memory in larger chunks than PAGE_SIZE. The original plan
  was to use compound pages like THP does, but I ran into problems with
  some functions expecting only a head page while others expect the
  precise page containing a particular byte.

  The folio type allows a function to declare that it's expecting only a
  head page. Almost incidentally, this allows us to remove various calls
  to VM_BUG_ON(PageTail(page)) and compound_head().

  This converts just parts of the core MM and the page cache. For 5.17,
  we intend to convert various filesystems (XFS and AFS are ready; other
  filesystems may make it) and also convert more of the MM and page
  cache to folios. For 5.18, multi-page folios should be ready.

  The multi-page folios offer some improvement to some workloads. The
  80% win is real, but appears to be an artificial benchmark (postgres
  startup, which isn't a serious workload). Real workloads (eg building
  the kernel, running postgres in a steady state, etc) seem to benefit
  between 0-10%. I haven't heard of any performance losses as a result
  of this series. Nobody has done any serious performance tuning; I
  imagine that tweaking the readahead algorithm could provide some more
  interesting wins. There are also other places where we could choose to
  create large folios and currently do not, such as writes that are
  larger than PAGE_SIZE.

  I'd like to thank all my reviewers who've offered review/ack tags:
  Christoph Hellwig, David Howells, Jan Kara, Jeff Layton, Johannes
  Weiner, Kirill A. Shutemov, Michal Hocko, Mike Rapoport, Vlastimil
  Babka, William Kucharski, Yu Zhao and Zi Yan.

  I'd also like to thank those who gave feedback I incorporated but
  haven't offered up review tags for this part of the series: Nick
  Piggin, Mel Gorman, Ming Lei, Darrick Wong, Ted Ts'o, John Hubbard,
  Hugh Dickins, and probably a few others who I forget"

* tag 'folio-5.16' of git://git.infradead.org/users/willy/pagecache: (90 commits)
  mm/writeback: Add folio_write_one
  mm/filemap: Add FGP_STABLE
  mm/filemap: Add filemap_get_folio
  mm/filemap: Convert mapping_get_entry to return a folio
  mm/filemap: Add filemap_add_folio()
  mm/filemap: Add filemap_alloc_folio
  mm/page_alloc: Add folio allocation functions
  mm/lru: Add folio_add_lru()
  mm/lru: Convert __pagevec_lru_add_fn to take a folio
  mm: Add folio_evictable()
  mm/workingset: Convert workingset_refault() to take a folio
  mm/filemap: Add readahead_folio()
  mm/filemap: Add folio_mkwrite_check_truncate()
  mm/filemap: Add i_blocks_per_folio()
  mm/writeback: Add folio_redirty_for_writepage()
  mm/writeback: Add folio_account_redirty()
  mm/writeback: Add folio_clear_dirty_for_io()
  mm/writeback: Add folio_cancel_dirty()
  mm/writeback: Add folio_account_cleaned()
  mm/writeback: Add filemap_dirty_folio()
  ...
2021-11-01 08:47:59 -07:00
Matthew Wilcox (Oracle)
08b0b0059b mm: Add flush_dcache_folio()
This is a default implementation which calls flush_dcache_page() on
each page in the folio.  If architectures can do better, they should
implement their own version of it.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
2021-10-18 07:49:36 -04:00
Vineet Gupta
c3ca31ce0e ARC: fix potential build snafu
In the big pgtable header split, I inadvertently introduced a couple of
duplicate symbols.

Fixes: fe6cb7b043 ("ARC: mm: disintegrate pgtable.h into levels and flags")
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2021-10-15 18:06:32 -07:00
Kees Cook
42a20f86dc sched: Add wrapper for get_wchan() to keep task blocked
Having a stable wchan means the process must be blocked and for it to
stay that way while performing stack unwinding.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm]
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Link: https://lkml.kernel.org/r/20211008111626.332092234@infradead.org
2021-10-15 11:25:14 +02:00
Masami Hiramatsu
bb6121b11c ARC: Add instruction_pointer_set() API
Add instruction_pointer_set() API for arc.

Link: https://lkml.kernel.org/r/163163050148.489837.15187799269793560256.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-30 21:24:07 -04:00
Masami Hiramatsu
adf8a61a94 kprobes: treewide: Make it harder to refer kretprobe_trampoline directly
Since now there is kretprobe_trampoline_addr() for referring the
address of kretprobe trampoline code, we don't need to access
kretprobe_trampoline directly.

Make it harder to refer by renaming it to __kretprobe_trampoline().

Link: https://lkml.kernel.org/r/163163045446.489837.14510577516938803097.stgit@devnote2

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-30 21:24:06 -04:00
Linus Torvalds
e07af26266 Merge tag 'arc-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC updates from Vineet Gupta:
 "Finally a big pile of changes for ARC (atomics/mm). These are from our
  internal arc64 tree, preparing mainline for eventual arc64 support.
  I'm spreading them out to avoid tsunami of patches in one release.

   - MM rework:
       - Implement up to 4 paging levels
       - Enable STRICT_MM_TYPECHECK
       - switch pgtable_t back to 'struct page *'

   - Atomics rework / implement relaxed accessors

   - Retire legacy MMUv1,v2; ARC750 cores

   - A few other build errors, typos"

* tag 'arc-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (33 commits)
  ARC: mm: vmalloc sync from kernel to user table to update PMD ...
  ARC: mm: support 4 levels of page tables
  ARC: mm: support 3 levels of page tables
  ARC: mm: switch to asm-generic/pgalloc.h
  ARC: mm: switch pgtable_t back to struct page *
  ARC: mm: hack to allow 2 level build with 4 level code
  ARC: mm: disintegrate pgtable.h into levels and flags
  ARC: mm: disintegrate mmu.h (arcv2 bits out)
  ARC: mm: move MMU specific bits out of entry code ...
  ARC: mm: move MMU specific bits out of ASID allocator
  ARC: mm: non-functional code movement/cleanup
  ARC: mm: pmd_populate* to use the canonical set_pmd (and drop pmd_set)
  ARC: ioremap: use more commonly used PAGE_KERNEL based uncached flag
  ARC: mm: Enable STRICT_MM_TYPECHECKS
  ARC: mm: Fixes to allow STRICT_MM_TYPECHECKS
  ARC: mm: move mmu/cache externs out to setup.h
  ARC: mm: remove tlb paranoid code
  ARC: mm: use SCRATCH_DATA0 register for caching pgdir in ARCv2 only
  ARC: retire MMUv1 and MMUv2 support
  ARC: retire ARC750 support
  ...
2021-09-05 11:43:03 -07:00
Linus Torvalds
4cdc4cc2ad Merge tag 'asm-generic-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
 "The main content for 5.15 is a series that cleans up the handling of
  strncpy_from_user() and strnlen_user(), removing a lot of slightly
  incorrect versions of these in favor of the lib/strn*.c helpers that
  implement these correctly and more efficiently.

  The only architectures that retain a private version now are mips,
  ia64, um and parisc. I had offered to convert those at all, but Thomas
  Bogendoerfer wanted to keep the mips version for the moment until he
  had a chance to do regression testing.

  The branch also contains two patches for bitops and for ffs()"

* tag 'asm-generic-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  bitops/non-atomic: make @nr unsigned to avoid any DIV
  asm-generic: ffs: Drop bogus reference to ffz location
  asm-generic: reverse GENERIC_{STRNCPY_FROM,STRNLEN}_USER symbols
  asm-generic: remove extra strn{cpy_from,len}_user declarations
  asm-generic: uaccess: remove inline strncpy_from_user/strnlen_user
  s390: use generic strncpy/strnlen from_user
  microblaze: use generic strncpy/strnlen from_user
  csky: use generic strncpy/strnlen from_user
  arc: use generic strncpy/strnlen from_user
  hexagon: use generic strncpy/strnlen from_user
  h8300: remove stale strncpy_from_user
  asm-generic/uaccess.h: remove __strncpy_from_user/__strnlen_user
2021-09-01 15:13:02 -07:00
Vineet Gupta
8747ff704a ARC: mm: support 4 levels of page tables
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2021-08-26 13:43:19 -07:00
Vineet Gupta
2dde02ab6d ARC: mm: support 3 levels of page tables
ARCv2 MMU is software walked and Linux implements 2 levels of paging: pgd/pte.
Forthcoming hw will have multiple levels, so this change preps mm code
for same. It is also fun to try multi levels even on soft-walked code to
ensure generic mm code is robust to handle.

overview
________

2 levels {pgd, pte} : pmd is folded but pmd_* macros are valid and operate on pgd
3 levels {pgd, pmd, pte}:
  - pud is folded and pud_* macros point to pgd
  - pmd_* macros operate on actual pmd

code changes
____________

1. #include <asm-generic/pgtable-nopud.h>

2. Define CONFIG_PGTABLE_LEVELS 3

3a. Define PMD_SHIFT, PMD_SIZE, PMD_MASK, pmd_t
3b. Define pmd_val() which actually deals with pmd
    (pmd_offset(), pmd_index() are provided by generic code)
3c. pmd_alloc_one()/pmd_free() also provided by generic code
    (pmd_populate/pmd_free already exist)

4. Define pud_none(), pud_bad() macros based on generic pud_val() which
   internally pertains to pgd now.
4b. define pud_populate() to just setup pgd

Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2021-08-26 13:43:19 -07:00
Vineet Gupta
9f3c76aedc ARC: mm: switch to asm-generic/pgalloc.h
With previous patch ARC pgalloc functions are same as generic, hence
switch to that.

Suggested-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2021-08-26 13:43:19 -07:00
Vineet Gupta
d9820ff76f ARC: mm: switch pgtable_t back to struct page *
So far ARC pgtable_t has not been struct page based to avoid extra
page_address() calls involved. However the differences are down to
noise and get in the way of using generic code, hence this patch.

This also allows us to reuse generic THP depost/withdraw code.

There's some additional consideration for PGDIR_SHIFT in 4K page config.
Now due to page tables being PAGE_SIZE deep only, the address split
can't be really arbitrary.

Tested-by: kernel test robot <lkp@intel.com>
Suggested-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2021-08-26 13:42:42 -07:00
Vineet Gupta
fe6cb7b043 ARC: mm: disintegrate pgtable.h into levels and flags
- pgtable-bits-arcv2.h (MMU specific page table flags)
 - pgtable-levels.h (paging levels)

No functional changes, but paves way for easy addition of new MMU code
with different bits and levels etc

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2021-08-25 15:53:19 -07:00
Vineet Gupta
2cc1121bc9 ARC: mm: disintegrate mmu.h (arcv2 bits out)
non functional change

Tested-by: kernel test robot <lkp@intel.com>
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2021-08-25 15:53:19 -07:00
Vineet Gupta
a79a9c765f ARC: mm: move MMU specific bits out of entry code ...
... to avoid polluting shared entry code (across three ISA variants)
with ISA/MMU specific code.

Cc: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2021-08-24 14:25:48 -07:00
Vineet Gupta
89d0d42412 ARC: mm: move MMU specific bits out of ASID allocator
And while at it, rewrite commentary on ASID allocator

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2021-08-24 14:25:48 -07:00
Vineet Gupta
be43b096ed ARC: mm: non-functional code movement/cleanup
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2021-08-24 14:25:48 -07:00
Vineet Gupta
e93e59ac1e ARC: mm: pmd_populate* to use the canonical set_pmd (and drop pmd_set)
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2021-08-24 14:25:48 -07:00
Vineet Gupta
da773cf20e ARC: ioremap: use more commonly used PAGE_KERNEL based uncached flag
and remove the one off uncached definition for ARC

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2021-08-24 14:25:48 -07:00
Vineet Gupta
1b4013b9ae ARC: mm: Enable STRICT_MM_TYPECHECKS
In the past I've refrained from doing this (at least 2 times) due to the
slight code bloat due to ABI implications of pte_t etc becoming struct

Per ARC ABI, functions return struct via memory and not through register
r0, even if the struct would fit in register(s)

 - caller allocates space on stack and passes the address as first arg
   (r0), shifting rest of args by one

 - callee creates return struct in memory (referenced via r0)

This time around the code actually shrunk slightly (due to subtle
inlining heuristic effects), but still slightly inefficient due to
return values passed through memory. That however seems like a small
cost compared to maintenance burden given the impending new mmu support
for page walk etc

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2021-08-24 14:25:48 -07:00
Vineet Gupta
47910ca3ce ARC: mm: move mmu/cache externs out to setup.h
Don't pollute mmu.h and cache.h with ARC internal bootlog/setup
related functions. Move them aside to setup.h

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2021-08-24 14:25:48 -07:00
Vineet Gupta
12e7804c26 ARC: mm: remove tlb paranoid code
This was used back in arc700 days when ASID allocator was fragile.
Not needed in last 5 years

Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2021-08-24 14:25:48 -07:00