Commit Graph

525396 Commits

Author SHA1 Message Date
Liad Kaufman be88a1ada9 iwlwifi: nvm: remove mac address byte swapping in 8000 family
This fixes the byte order copying in the MAO (Mac Override
Section) section from the PNVM, as the byte swapping is not
required anymore in the 8000 family. Due to the byte
swapping, the driver was reporting an incorrect MAC
adddress.

CC: <stable@vger.kernel.org> [4.1]
Signed-off-by: Liad Kaufman <liad.kaufman@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2015-07-12 19:54:19 +03:00
Emmanuel Grumbach 255ba06533 Revert "iwlwifi: pcie: New RBD allocation model"
This reverts commit 5f17570354.

This patch introduced a high latency in buffer allocation
under extreme load. This latency caused a firmwre crash.
The same scenario works fine with this patch reverted.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2015-07-12 19:54:19 +03:00
Avraham Stern 8465fe6ac5 iwlwifi: mvm: Add preemptive flag to scheulded scan
Add preemptive flag to scheduled scan command flags. Without this
flag, all scan requests after scheduled scan was started will be
delayed until scheduled scan stops. As a result, P2P_FIND will be
blocked while scheduled scan is active.
This flag was omitted during refactoring.

Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2015-07-12 19:54:18 +03:00
Oren Givon b8eee75707 iwlwifi: edit the 3165 series and 8000 series PCI IDs
Add new 3165 devices support.
Add one new 8000 series device support.
Remove support for 0x0000, 0xC030 and 0xD030 sub-system IDs
in the 8000 series.

Signed-off-by: Oren Givon <oren.givon@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2015-07-12 19:54:18 +03:00
Johannes Berg 00fd233ac4 iwlwifi: mvm: check time-event vif to avoid bad deletion
The time event is initialized relatively late in interface (mvmvif)
initialization, so it's possible to fail before that happens. As a
consequence, the driver will crash if it ever tries to delete this
time event in case initialization was unsuccessful.

Avoid this by using the time event's vif pointer to indicate validity.
The vif pointer is != NULL whenever the id is != TE_MAX, except for
this special error case where the vif pointer will have the correct
property (as the whole memory is cleared on allocation) whereas the
id is 0, causing a crash in trying to delete the time event from the
list.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2015-07-12 19:54:17 +03:00
Emmanuel Grumbach f9e5554cd8 iwlwifi: pcie: prepare the device before accessing it
For 8000 series, we need to access the device to know what
firmware to load. Before we do so, we need to prepare the
device otherwise we might not be able to access the
hardware.

Fixes: c278754a21e6 ("iwlwifi: mvm: support family 8000 B2/C steps")
CC: <stable@vger.kernel.org> [4.1]
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2015-07-12 19:54:00 +03:00
Emmanuel Grumbach af3f2f7401 iwlwifi: pcie: don't panic if pcie transport alloc fails
iwl_trans_pcie_alloc needs to return a non-zero value
if it fails. Otherwise the iwl_drv_start will think that
the allocation succeeded.
Remove the duplication of err and ret variable and use ret
which is the name we usually use in other places of the
driver.

Fixes: c278754a21e6 ("iwlwifi: mvm: support family 8000 B2/C steps")
CC: <stable@vger.kernel.org> [4.1]
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2015-07-08 05:18:03 +03:00
Dreyfuss, Haim 66337b7c67 iwlwifi: pcie: Fix bug in NIC's PM registers access
While cleanig the access to those hw-dependent registers,
instead of using the product family type, wrong condition was added
mistakenly and enabled 8000 family devices a forbidden access
to HW registers, fix it.

Fixes: 95411d0455 ("iwlwifi: pcie: Control access to the NIC's PM registers via iwl_cfg")
Signed-off-by: Dreyfuss, Haim <haim.dreyfuss@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2015-07-08 05:17:58 +03:00
Matti Gottlieb 11828dbce6 iwlwifi: mvm: Avoid accessing Null pointer when setting igtk
Sometimes when setting an igtk key the station maybe NULL.
In the of case igtk the function will skip to the end, and
try to print sta->addr, if sta is Null - we will access a
Null pointer.

Avoid accessing a Null pointer when setting a igtk key &
the sta == NULL, and print a default MAC address instead.

Signed-off-by: Matti Gottlieb <matti.gottlieb@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2015-06-26 09:01:30 +03:00
Emmanuel Grumbach 923a8c1d80 iwlwifi: mvm: fix antenna selection when BT is active
When BT is active, we want to avoid the shared antenna for
management frame to make sure we don't disturb BT. There
was a bug in that code because it chose the antenna
BIT(ANT_A) where ANT_A is already a bitmap (0x1). This
means that the antenna chosen in the end was ANT_B.
While this is not optimal on devices with 2 antennas (it'd
disturb BT), it is critical on single antenna devices like
3160 which couldn't connect at all when BT was active.

This fixes:
https://bugzilla.kernel.org/show_bug.cgi?id=97181

CC: <stable@vger.kernel.org> [3.17+]
Fixes: 34c8b24ff2 ("iwlwifi: mvm: BT Coex - avoid the shared antenna for management frames")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
2015-06-26 09:01:29 +03:00
Linus Torvalds e0456717e4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Add TX fast path in mac80211, from Johannes Berg.

 2) Add TSO/GRO support to ibmveth, from Thomas Falcon

 3) Move away from cached routes in ipv6, just like ipv4, from Martin
    KaFai Lau.

 4) Lots of new rhashtable tests, from Thomas Graf.

 5) Run ingress qdisc lockless, from Alexei Starovoitov.

 6) Allow servers to fetch TCP packet headers for SYN packets of new
    connections, for fingerprinting.  From Eric Dumazet.

 7) Add mode parameter to pktgen, for testing receive.  From Alexei
    Starovoitov.

 8) Cache access optimizations via simplifications of build_skb(), from
    Alexander Duyck.

 9) Move page frag allocator under mm/, also from Alexander.

10) Add xmit_more support to hv_netvsc, from KY Srinivasan.

11) Add a counter guard in case we try to perform endless reclassify
    loops in the packet scheduler.

12) Extern flow dissector to be programmable and use it in new "Flower"
    classifier.  From Jiri Pirko.

13) AF_PACKET fanout rollover fixes, performance improvements, and new
    statistics.  From Willem de Bruijn.

14) Add netdev driver for GENEVE tunnels, from John W Linville.

15) Add ingress netfilter hooks and filtering, from Pablo Neira Ayuso.

16) Fix handling of epoll edge triggers in TCP, from Eric Dumazet.

17) Add an ECN retry fallback for the initial TCP handshake, from Daniel
    Borkmann.

18) Add tail call support to BPF, from Alexei Starovoitov.

19) Add several pktgen helper scripts, from Jesper Dangaard Brouer.

20) Add zerocopy support to AF_UNIX, from Hannes Frederic Sowa.

21) Favor even port numbers for allocation to connect() requests, and
    odd port numbers for bind(0), in an effort to help avoid
    ip_local_port_range exhaustion.  From Eric Dumazet.

22) Add Cavium ThunderX driver, from Sunil Goutham.

23) Allow bpf programs to access skb_iif and dev->ifindex SKB metadata,
    from Alexei Starovoitov.

24) Add support for T6 chips in cxgb4vf driver, from Hariprasad Shenai.

25) Double TCP Small Queues default to 256K to accomodate situations
    like the XEN driver and wireless aggregation.  From Wei Liu.

26) Add more entropy inputs to flow dissector, from Tom Herbert.

27) Add CDG congestion control algorithm to TCP, from Kenneth Klette
    Jonassen.

28) Convert ipset over to RCU locking, from Jozsef Kadlecsik.

29) Track and act upon link status of ipv4 route nexthops, from Andy
    Gospodarek.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1670 commits)
  bridge: vlan: flush the dynamically learned entries on port vlan delete
  bridge: multicast: add a comment to br_port_state_selection about blocking state
  net: inet_diag: export IPV6_V6ONLY sockopt
  stmmac: troubleshoot unexpected bits in des0 & des1
  net: ipv4 sysctl option to ignore routes when nexthop link is down
  net: track link-status of ipv4 nexthops
  net: switchdev: ignore unsupported bridge flags
  net: Cavium: Fix MAC address setting in shutdown state
  drivers: net: xgene: fix for ACPI support without ACPI
  ip: report the original address of ICMP messages
  net/mlx5e: Prefetch skb data on RX
  net/mlx5e: Pop cq outside mlx5e_get_cqe
  net/mlx5e: Remove mlx5e_cq.sqrq back-pointer
  net/mlx5e: Remove extra spaces
  net/mlx5e: Avoid TX CQE generation if more xmit packets expected
  net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion
  net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq()
  net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them
  net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues
  net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device
  ...
2015-06-24 16:49:49 -07:00
Linus Torvalds 98ec21a018 Merge branch 'sched-hrtimers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Thomas Gleixner:
 "This series of scheduler updates depends on sched/core and timers/core
  branches, which are already in your tree:

   - Scheduler balancing overhaul to plug a hard to trigger race which
     causes an oops in the balancer (Peter Zijlstra)

   - Lockdep updates which are related to the balancing updates (Peter
     Zijlstra)"

* 'sched-hrtimers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched,lockdep: Employ lock pinning
  lockdep: Implement lock pinning
  lockdep: Simplify lock_release()
  sched: Streamline the task migration locking a little
  sched: Move code around
  sched,dl: Fix sched class hopping CBS hole
  sched, dl: Convert switched_{from, to}_dl() / prio_changed_dl() to balance callbacks
  sched,dl: Remove return value from pull_dl_task()
  sched, rt: Convert switched_{from, to}_rt() / prio_changed_rt() to balance callbacks
  sched,rt: Remove return value from pull_rt_task()
  sched: Allow balance callbacks for check_class_changed()
  sched: Use replace normalize_task() with __sched_setscheduler()
  sched: Replace post_schedule with a balance callback list
2015-06-24 15:09:40 -07:00
Linus Torvalds a262948335 Merge branch 'sched-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Thomas Gleixner:
 "These locking updates depend on the alreay merged sched/core branch:

   - Lockless top waiter wakeup for rtmutex (Davidlohr)

   - Reduce hash bucket lock contention for PI futexes (Sebastian)

   - Documentation update (Davidlohr)"

* 'sched-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rtmutex: Update stale plist comments
  futex: Lower the lock contention on the HB lock during wake up
  locking/rtmutex: Implement lockless top-waiter wakeup
2015-06-24 14:46:01 -07:00
Linus Torvalds e3d8238d7f Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
 "Mostly refactoring/clean-up:

   - CPU ops and PSCI (Power State Coordination Interface) refactoring
     following the merging of the arm64 ACPI support, together with
     handling of Trusted (secure) OS instances

   - Using fixmap for permanent FDT mapping, removing the initial dtb
     placement requirements (within 512MB from the start of the kernel
     image).  This required moving the FDT self reservation out of the
     memreserve processing

   - Idmap (1:1 mapping used for MMU on/off) handling clean-up

   - Removing flush_cache_all() - not safe on ARM unless the MMU is off.
     Last stages of CPU power down/up are handled by firmware already

   - "Alternatives" (run-time code patching) refactoring and support for
     immediate branch patching, GICv3 CPU interface access

   - User faults handling clean-up

  And some fixes:

   - Fix for VDSO building with broken ELF toolchains

   - Fix another case of init_mm.pgd usage for user mappings (during
     ASID roll-over broadcasting)

   - Fix for FPSIMD reloading after CPU hotplug

   - Fix for missing syscall trace exit

   - Workaround for .inst asm bug

   - Compat fix for switching the user tls tpidr_el0 register"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (42 commits)
  arm64: use private ratelimit state along with show_unhandled_signals
  arm64: show unhandled SP/PC alignment faults
  arm64: vdso: work-around broken ELF toolchains in Makefile
  arm64: kernel: rename __cpu_suspend to keep it aligned with arm
  arm64: compat: print compat_sp instead of sp
  arm64: mm: Fix freeing of the wrong memmap entries with !SPARSEMEM_VMEMMAP
  arm64: entry: fix context tracking for el0_sp_pc
  arm64: defconfig: enable memtest
  arm64: mm: remove reference to tlb.S from comment block
  arm64: Do not attempt to use init_mm in reset_context()
  arm64: KVM: Switch vgic save/restore to alternative_insn
  arm64: alternative: Introduce feature for GICv3 CPU interface
  arm64: psci: fix !CONFIG_HOTPLUG_CPU build warning
  arm64: fix bug for reloading FPSIMD state after CPU hotplug.
  arm64: kernel thread don't need to save fpsimd context.
  arm64: fix missing syscall trace exit
  arm64: alternative: Work around .inst assembler bugs
  arm64: alternative: Merge alternative-asm.h into alternative.h
  arm64: alternative: Allow immediate branch as alternative instruction
  arm64: Rework alternate sequence for ARM erratum 845719
  ...
2015-06-24 10:02:15 -07:00
Linus Torvalds 4e241557fc Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull first batch of KVM updates from Paolo Bonzini:
 "The bulk of the changes here is for x86.  And for once it's not for
  silicon that no one owns: these are really new features for everyone.

  Details:

   - ARM:
        several features are in progress but missed the 4.2 deadline.
        So here is just a smattering of bug fixes, plus enabling the
        VFIO integration.

   - s390:
        Some fixes/refactorings/optimizations, plus support for 2GB
        pages.

   - x86:
        * host and guest support for marking kvmclock as a stable
          scheduler clock.
        * support for write combining.
        * support for system management mode, needed for secure boot in
          guests.
        * a bunch of cleanups required for the above
        * support for virtualized performance counters on AMD
        * legacy PCI device assignment is deprecated and defaults to "n"
          in Kconfig; VFIO replaces it

        On top of this there are also bug fixes and eager FPU context
        loading for FPU-heavy guests.

   - Common code:
        Support for multiple address spaces; for now it is used only for
        x86 SMM but the s390 folks also have plans"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (124 commits)
  KVM: s390: clear floating interrupt bitmap and parameters
  KVM: x86/vPMU: Enable PMU handling for AMD PERFCTRn and EVNTSELn MSRs
  KVM: x86/vPMU: Implement AMD vPMU code for KVM
  KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch
  KVM: x86/vPMU: introduce kvm_pmu_msr_idx_to_pmc
  KVM: x86/vPMU: reorder PMU functions
  KVM: x86/vPMU: whitespace and stylistic adjustments in PMU code
  KVM: x86/vPMU: use the new macros to go between PMC, PMU and VCPU
  KVM: x86/vPMU: introduce pmu.h header
  KVM: x86/vPMU: rename a few PMU functions
  KVM: MTRR: do not map huge page for non-consistent range
  KVM: MTRR: simplify kvm_mtrr_get_guest_memory_type
  KVM: MTRR: introduce mtrr_for_each_mem_type
  KVM: MTRR: introduce fixed_mtrr_addr_* functions
  KVM: MTRR: sort variable MTRRs
  KVM: MTRR: introduce var_mtrr_range
  KVM: MTRR: introduce fixed_mtrr_segment table
  KVM: MTRR: improve kvm_mtrr_get_guest_memory_type
  KVM: MTRR: do not split 64 bits MSR content
  KVM: MTRR: clean up mtrr default type
  ...
2015-06-24 09:36:49 -07:00
Linus Torvalds 08d183e3c1 Merge tag 'powerpc-4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Pull powerpc updates from Michael Ellerman:

 - disable the 32-bit vdso when building LE, so we can build with a
   64-bit only toolchain.

 - EEH fixes from Gavin & Richard.

 - enable the sys_kcmp syscall from Laurent.

 - sysfs control for fastsleep workaround from Shreyas.

 - expose OPAL events as an irq chip by Alistair.

 - MSI ops moved to pci_controller_ops by Daniel.

 - fix for kernel to userspace backtraces for perf from Anton.

 - merge pseries and pseries_le defconfigs from Cyril.

 - CXL in-kernel API from Mikey.

 - OPAL prd driver from Jeremy.

 - fix for DSCR handling & tests from Anshuman.

 - Powernv flash mtd driver from Cyril.

 - dynamic DMA Window support on powernv from Alexey.

 - LLVM clang fixes & workarounds from Anton.

 - reworked version of the patch to abort syscalls when transactional.

 - fix the swap encoding to support 4TB, from Aneesh.

 - various fixes as usual.

 - Freescale updates from Scott: Highlights include more 8xx
   optimizations, an e6500 hugetlb optimization, QMan device tree nodes,
   t1024/t1023 support, and various fixes and cleanup.

* tag 'powerpc-4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (180 commits)
  cxl: Fix typo in debug print
  cxl: Add CXL_KERNEL_API config option
  powerpc/powernv: Fix wrong IOMMU table in pnv_ioda_setup_bus_dma()
  powerpc/mm: Change the swap encoding in pte.
  powerpc/mm: PTE_RPN_MAX is not used, remove the same
  powerpc/tm: Abort syscalls in active transactions
  powerpc/iommu/ioda2: Enable compile with IOV=on and IOMMU_API=off
  powerpc/include: Add opal-prd to installed uapi headers
  powerpc/powernv: fix construction of opal PRD messages
  powerpc/powernv: Increase opal-irqchip initcall priority
  powerpc: Make doorbell check preemption safe
  powerpc/powernv: pnv_init_idle_states() should only run on powernv
  macintosh/nvram: Remove as unused
  powerpc: Don't use gcc specific options on clang
  powerpc: Don't use -mno-strict-align on clang
  powerpc: Only use -mtraceback=no, -mno-string and -msoft-float if toolchain supports it
  powerpc: Only use -mabi=altivec if toolchain supports it
  powerpc: Fix duplicate const clang warning in user access code
  vfio: powerpc/spapr: Support Dynamic DMA windows
  vfio: powerpc/spapr: Register memory and define IOMMU v2
  ...
2015-06-24 08:46:32 -07:00
Linus Torvalds 4b1f2af675 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
 "Pretty boring for a merge window pull.

  One change in behaviour is the patch for dasd driver, the module which
  provides the diagnose discipline is now loaded automatically.

  The SCLP code got a nice cleanup, a new global structure replaces a
  bunch of accessor functions.

  And a couple of random, small improvements"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/pci: improve handling of hotplug event 0x301
  s390/setup: fix DMA_API_DEBUG warnings
  s390/zcrypt: remove obsolete __constant
  s390/keyboard: avoid off-by-one when using strnlen_user()
  s390/sclp: pass timeout as HZ independent value
  s390/mm: s/specifiation/specification/, s/an specification/a specification/
  s390/sclp: Use DECLARE_BITMAP
  s390/dasd: Enable automatic loading of dasd_diag_mod
  s390/sclp: move sclp_facilities into "struct sclp"
  s390/sclp: get rid of sclp_get_mtid() and sclp_get_mtid_max()
  s390/sclp: unify basic sclp access by exposing "struct sclp"
  s390/sclp: prepare smp_fill_possible_mask for global "struct sclp"
2015-06-24 08:45:27 -07:00
Linus Torvalds aaa6448526 Merge tag 'microblaze-4.2-rc1' of git://git.monstr.eu/linux-2.6-microblaze
Pull Microblaze updates from Michal Simek:

 - some PCI fixups

 - add new MB versions

 - sparse fixups

* tag 'microblaze-4.2-rc1' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze/PCI: Remove unnecessary struct pci_dev declaration
  microblaze/PCI: Remove unnecessary pci_bus_find_capability() declaration
  microblaze/PCI: Remove unused declarations
  microblaze: Label local function static
  microblaze: Add missing release version code
2015-06-24 08:44:40 -07:00
Nikolay Aleksandrov 1ea2d020ba bridge: vlan: flush the dynamically learned entries on port vlan delete
Add a new argument to br_fdb_delete_by_port which allows to specify a
vid to match when flushing entries and use it in nbp_vlan_delete() to
flush the dynamically learned entries of the vlan/port pair when removing
a vlan from a port. Before this patch only the local mac was being
removed and the dynamically learned ones were left to expire.
Note that the do_all argument is still respected and if specified, the
vid will be ignored.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24 05:40:55 -07:00
Nikolay Aleksandrov 9aa6638216 bridge: multicast: add a comment to br_port_state_selection about blocking state
Add a comment to explain why we're not disabling port's multicast when it
goes in blocking state. Since there's a check in the timer's function which
bypasses the timer if the port's in blocking/disabled state, the timer will
simply expire and stop without sending more queries.

Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24 05:40:54 -07:00
David S. Miller 3a07bd6fea Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/mellanox/mlx4/main.c
	net/packet/af_packet.c

Both conflicts were cases of simple overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24 02:58:51 -07:00
Phil Sutter 204621551b net: inet_diag: export IPV6_V6ONLY sockopt
For AF_INET6 sockets, the value of struct ipv6_pinfo.ipv6only is
exported to userspace. It indicates whether a socket bound to in6addr_any
listens on IPv4 as well as IPv6. Since the socket is natively IPv6, it is not
listed by e.g. 'ss -l -4'.

This patch is accompanied by an appropriate one for iproute2 to enable
the additional information in 'ss -e'.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24 02:51:39 -07:00
Alexey Brodkin f1590670ce stmmac: troubleshoot unexpected bits in des0 & des1
Current implementation of descriptor init procedure only takes
care about setting/clearing ownership flag in "des0"/"des1"
fields while it is perfectly possible to get unexpected bits
set because of the following factors:

 [1] On driver probe underlying memory allocated with
     dma_alloc_coherent() might not be zeroed and so
     it will be filled with garbage.

 [2] During driver operation some bits could be set by SD/MMC
     controller (for example error flags etc).

And unexpected and/or randomly set flags in "des0"/"des1"
fields may lead to unpredictable behavior of GMAC DMA block.

This change addresses both items above with:

 [1] Use of dma_zalloc_coherent() instead of simple
     dma_alloc_coherent() to make sure allocated memory is
     zeroed. That shouldn't affect performance because
     this allocation only happens once on driver probe.

 [2] Do explicit zeroing of both "des0" and "des1" fields
     of all buffer descriptors during initialization of
     DMA transfer.

And while at it fixed identation of dma_free_coherent()
counterpart as well.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: arc-linux-dev@synopsys.com
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24 02:38:53 -07:00
David S. Miller f389a40ec3 Merge branch 'ipv4-nexthop-link-status'
Andy Gospodarek says:

====================
changes to make ipv4 routing table aware of next-hop link status

This series adds the ability to have the Linux kernel track whether or
not a particular route should be used based on the link-status of the
interface associated with the next-hop.

Before this patch any link-failure on an interface that was serving as a
gateway for some systems could result in those systems being isolated
from the rest of the network as the stack would continue to attempt to
send frames out of an interface that is actually linked-down.  When the
kernel is responsible for all forwarding, it should also be responsible
for taking action when the traffic can no longer be forwarded -- there
is no real need to outsource link-monitoring to userspace anymore.

This feature is only enabled with the new per-interface or ipv4 global
sysctls called 'ignore_routes_with_linkdown'.

net.ipv4.conf.all.ignore_routes_with_linkdown = 0
net.ipv4.conf.default.ignore_routes_with_linkdown = 0
net.ipv4.conf.lo.ignore_routes_with_linkdown = 0
...

When the above sysctls are set, the kernel will not only report to
userspace that the link is down, but it will also report to userspace
that a route is dead.  This will signal to userspace that the route will
not be selected.

With the new sysctls set, the following behavior can be observed
(interface p8p1 is link-down):

default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15
70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1
80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1 dead linkdown
90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1 dead linkdown
90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2
90.0.0.1 via 70.0.0.2 dev p7p1  src 70.0.0.1
    cache
local 80.0.0.1 dev lo  src 80.0.0.1
    cache <local>
80.0.0.2 via 10.0.5.2 dev p9p1  src 10.0.5.15
    cache

While the route does remain in the table (so it can be modified if
needed rather than being wiped away as it would be if IFF_UP was
cleared), the proper next-hop is chosen automatically when the link is
down.  Now interface p8p1 is linked-up:

default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15
70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1
80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1
90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1
90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2
192.168.56.0/24 dev p2p1  proto kernel  scope link  src 192.168.56.2
90.0.0.1 via 80.0.0.2 dev p8p1  src 80.0.0.1
    cache
local 80.0.0.1 dev lo  src 80.0.0.1
    cache <local>
80.0.0.2 dev p8p1  src 80.0.0.1
    cache

and the output changes to what one would expect.

If the global or interface sysctl is not set, the following output would
be expected when p8p1 is down:

default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15
70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1
80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1 linkdown
90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1 linkdown
90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2

If the dead flag does not appear there should be no expectation that the
kernel would skip using this route due to link being down.

v2: Split kernel changes into 2 patches: first to add linkdown flag and
second to add new sysctl settings.  Also took suggestion from Alex to
simplify code by only checking sysctl during fib lookup and suggestion
from Scott to add a per-interface sysctl.  Added iproute2 patch to
recognize and print linkdown flag.

v3: Code cleanups along with reverse-path checks suggested by Alex and
small fixes related to problems found when multipath was disabled.

v4: Drop binary sysctls

v5: Whitespace and variable declaration fixups suggested by Dave

v6: Style changes noticed by Dave and checkpath suggestions.

v7: Last checkpatch fixup.

Though there were some that preferred not to have a configuration option
and to make this behavior the default when it was discussed in Ottawa
earlier this year since "it was time to do this."  I wanted to propose
the config option to preserve the current behavior for those that desire
it.  I'll happily remove it if Dave and Linus approve.

An IPv6 implementation is also needed (DECnet too!), but I wanted to
start with the IPv4 implementation to get people comfortable with the
idea before moving forward.  If this is accepted the IPv6 implementation
can be posted shortly.

There was also a request for switchdev support for this, but that will
be posted as a followup as switchdev does not currently handle dead
next-hops in a multi-path case and I felt that infra needed to be added
first.

FWIW, we have been running the original version of this series with a
global sysctl and our customers have been happily using a backported
version for IPv4 and IPv6 for >6 months.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24 02:16:03 -07:00
Andy Gospodarek 0eeb075fad net: ipv4 sysctl option to ignore routes when nexthop link is down
This feature is only enabled with the new per-interface or ipv4 global
sysctls called 'ignore_routes_with_linkdown'.

net.ipv4.conf.all.ignore_routes_with_linkdown = 0
net.ipv4.conf.default.ignore_routes_with_linkdown = 0
net.ipv4.conf.lo.ignore_routes_with_linkdown = 0
...

When the above sysctls are set, will report to userspace that a route is
dead and will no longer resolve to this nexthop when performing a fib
lookup.  This will signal to userspace that the route will not be
selected.  The signalling of a RTNH_F_DEAD is only passed to userspace
if the sysctl is enabled and link is down.  This was done as without it
the netlink listeners would have no idea whether or not a nexthop would
be selected.   The kernel only sets RTNH_F_DEAD internally if the
interface has IFF_UP cleared.

With the new sysctl set, the following behavior can be observed
(interface p8p1 is link-down):

default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15
70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1
80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1 dead linkdown
90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1 dead linkdown
90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2
90.0.0.1 via 70.0.0.2 dev p7p1  src 70.0.0.1
    cache
local 80.0.0.1 dev lo  src 80.0.0.1
    cache <local>
80.0.0.2 via 10.0.5.2 dev p9p1  src 10.0.5.15
    cache

While the route does remain in the table (so it can be modified if
needed rather than being wiped away as it would be if IFF_UP was
cleared), the proper next-hop is chosen automatically when the link is
down.  Now interface p8p1 is linked-up:

default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15
70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1
80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1
90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1
90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2
192.168.56.0/24 dev p2p1  proto kernel  scope link  src 192.168.56.2
90.0.0.1 via 80.0.0.2 dev p8p1  src 80.0.0.1
    cache
local 80.0.0.1 dev lo  src 80.0.0.1
    cache <local>
80.0.0.2 dev p8p1  src 80.0.0.1
    cache

and the output changes to what one would expect.

If the sysctl is not set, the following output would be expected when
p8p1 is down:

default via 10.0.5.2 dev p9p1
10.0.5.0/24 dev p9p1  proto kernel  scope link  src 10.0.5.15
70.0.0.0/24 dev p7p1  proto kernel  scope link  src 70.0.0.1
80.0.0.0/24 dev p8p1  proto kernel  scope link  src 80.0.0.1 linkdown
90.0.0.0/24 via 80.0.0.2 dev p8p1  metric 1 linkdown
90.0.0.0/24 via 70.0.0.2 dev p7p1  metric 2

Since the dead flag does not appear, there should be no expectation that
the kernel would skip using this route due to link being down.

v2: Split kernel changes into 2 patches, this actually makes a
behavioral change if the sysctl is set.  Also took suggestion from Alex
to simplify code by only checking sysctl during fib lookup and
suggestion from Scott to add a per-interface sysctl.

v3: Code clean-ups to make it more readable and efficient as well as a
reverse path check fix.

v4: Drop binary sysctl

v5: Whitespace fixups from Dave

v6: Style changes from Dave and checkpatch suggestions

v7: One more checkpatch fixup

Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-06-24 02:15:54 -07:00