Commit Graph

706934 Commits

Author SHA1 Message Date
Kees Cook
686fef928b timer: Prepare to change timer callback argument type
Modern kernel callback systems pass the structure associated with a
given callback to the callback function. The timer callback remains one
of the legacy cases where an arbitrary unsigned long argument continues
to be passed as the callback argument. This has several problems:

- This bloats the timer_list structure with a normally redundant
  .data field.

- No type checking is being performed, forcing callbacks to do
  explicit type casts of the unsigned long argument into the object
  that was passed, rather than using container_of(), as done in most
  of the other callback infrastructure.

- Neighboring buffer overflows can overwrite both the .function and
  the .data field, providing attackers with a way to elevate from a buffer
  overflow into a simplistic ROP-like mechanism that allows calling
  arbitrary functions with a controlled first argument.

- For future Control Flow Integrity work, this creates a unique function
  prototype for timer callbacks, instead of allowing them to continue to
  be clustered with other void functions that take a single unsigned long
  argument.

This adds a new timer initialization API, which will ultimately replace
the existing setup_timer(), setup_{deferrable,pinned,etc}_timer() family,
named timer_setup() (to mirror hrtimer_setup(), making instances of its
use much easier to grep for).

In order to support the migration of existing timers into the new
callback arguments, timer_setup() casts its arguments to the existing
legacy types, and explicitly passes the timer pointer as the legacy
data argument. Once all setup_*timer() callers have been replaced with
timer_setup(), the casts can be removed, and the data argument can be
dropped with the timer expiration code changed to just pass the timer
to the callback directly.

Since the regular pattern of using container_of() during local variable
declaration repeats the need for the variable type declaration
to be included, this adds a helper modeled after other from_*()
helpers that wrap container_of(), named from_timer(). This helper uses
typeof(*variable), removing the type redundancy and minimizing the need
for line wraps in forthcoming conversions from "unsigned data long" to
"struct timer_list *" in the timer callbacks:

-void callback(unsigned long data)
+void callback(struct timer_list *t)
{
-   struct some_data_structure *local = (struct some_data_structure *)data;
+   struct some_data_structure *local = from_timer(local, t, timer);

Finally, in order to support the handful of timer users that perform
open-coded assignments of the .function (and .data) fields, provide
cast macros (TIMER_FUNC_TYPE and TIMER_DATA_TYPE) that can be used
temporarily. Once conversion has been completed, these can be globally
trivially removed.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20170928133817.GA113410@beast
2017-09-28 16:30:36 +02:00
Zhenzhong Duan
0d805ee70a xen/mmu: Call xen_cleanhighmap() with 4MB aligned for page tables mapping
When bootup a PVM guest with large memory(Ex.240GB), XEN provided initial
mapping overlaps with kernel module virtual space. When mapping in this space
is cleared by xen_cleanhighmap(), in certain case there could be an 2MB mapping
left. This is due to XEN initialize 4MB aligned mapping but xen_cleanhighmap()
finish at 2MB boundary.

When module loading is just on top of the 2MB space, got below warning:

WARNING: at mm/vmalloc.c:106 vmap_pte_range+0x14e/0x190()
Call Trace:
 [<ffffffff81117083>] warn_alloc_failed+0xf3/0x160
 [<ffffffff81146022>] __vmalloc_area_node+0x182/0x1c0
 [<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80
 [<ffffffff81145df7>] __vmalloc_node_range+0xa7/0x110
 [<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80
 [<ffffffff8103ca54>] module_alloc+0x64/0x70
 [<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80
 [<ffffffff810ac91e>] module_alloc_update_bounds+0x1e/0x80
 [<ffffffff810ac9a7>] move_module+0x27/0x150
 [<ffffffff810aefa0>] layout_and_allocate+0x120/0x1b0
 [<ffffffff810af0a8>] load_module+0x78/0x640
 [<ffffffff811ff90b>] ? security_file_permission+0x8b/0x90
 [<ffffffff810af6d2>] sys_init_module+0x62/0x1e0
 [<ffffffff815154c2>] system_call_fastpath+0x16/0x1b

Then the mapping of 2MB is cleared, finally oops when the page in that space is
accessed.

BUG: unable to handle kernel paging request at ffff880022600000
IP: [<ffffffff81260877>] clear_page_c_e+0x7/0x10
PGD 1788067 PUD 178c067 PMD 22434067 PTE 0
Oops: 0002 [#1] SMP
Call Trace:
 [<ffffffff81116ef7>] ? prep_new_page+0x127/0x1c0
 [<ffffffff81117d42>] get_page_from_freelist+0x1e2/0x550
 [<ffffffff81133010>] ? ii_iovec_copy_to_user+0x90/0x140
 [<ffffffff81119c9d>] __alloc_pages_nodemask+0x12d/0x230
 [<ffffffff81155516>] alloc_pages_vma+0xc6/0x1a0
 [<ffffffff81006ffd>] ? pte_mfn_to_pfn+0x7d/0x100
 [<ffffffff81134cfb>] do_anonymous_page+0x16b/0x350
 [<ffffffff81139c34>] handle_pte_fault+0x1e4/0x200
 [<ffffffff8100712e>] ? xen_pmd_val+0xe/0x10
 [<ffffffff810052c9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
 [<ffffffff81139dab>] handle_mm_fault+0x15b/0x270
 [<ffffffff81510c10>] do_page_fault+0x140/0x470
 [<ffffffff8150d7d5>] page_fault+0x25/0x30

Call xen_cleanhighmap() with 4MB aligned for page tables mapping to fix it.
The unnecessory call of xen_cleanhighmap() in DEBUG mode is also removed.

-v2: add comment about XEN alignment from Juergen.

References: https://lists.xen.org/archives/html/xen-devel/2012-07/msg01562.html
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>

[boris: added 'xen/mmu' tag to commit subject]
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-09-28 08:26:30 -04:00
Jan Beulich
8c28ef3f1c xen-pciback: relax BAR sizing write value check
Just like done in d2bd05d88d ("xen-pciback: return proper values during
BAR sizing") for the ROM BAR, ordinary ones also shouldn't compare the
written value directly against ~0, but consider the r/o bits at the
bottom (if any).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-09-28 08:26:25 -04:00
Jeffy Chen
72364d3206 irq/generic-chip: Don't replace domain's name
When generic irq chips are allocated for an irq domain the domain name is
set to the irq chip name. That was done to have named domains before the
recent changes which enforce domain naming were done.

Since then the overwrite causes a memory leak when the domain name is
dynamically allocated and even worse it would cause the domain free code to
free the wrong name pointer, which might point to a constant.

Remove the name assignment to prevent this.

Fixes: d59f6617ee ("genirq: Allow fwnode to carry name information only")
Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20170928043731.4764-1-jeffy.chen@rock-chips.com
2017-09-28 12:18:59 +02:00
Baolin Wang
c3cdce45f8 usb: dwc3: of-simple: Add compatible for Spreadtrum SC9860 platform
Add compatible string to use this generic glue layer to support
Spreadtrum SC9860 platform's dwc3 controller.

Signed-off-by: Baolin Wang <baolin.wang@spreadtrum.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-09-28 12:41:56 +03:00
Nicolas Ferre
6baeda120d usb: gadget: udc: atmel: set vbus irqflags explicitly
The driver triggers actions on both edges of the vbus signal.

The former PIO controller was triggering IRQs on both falling and rising edges
by default. Newer PIO controller don't, so it's better to set it explicitly to
IRQF_TRIGGER_FALLING | IRQF_TRIGGER_RISING.

Without this patch we may trigger the connection with host but only on some
bouncing signal conditions and thus lose connecting events.

Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Cc: stable <stable@vger.kernel.org> # v4.4+
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-09-28 12:39:22 +03:00
John Keeping
addfc5823d usb: gadget: ffs: handle I/O completion in-order
By submitting completed transfers to the system workqueue there is no
guarantee that completion events will be queued up in the correct order,
as in multi-processor systems there is a thread running for each
processor and the work items are not bound to a particular core.

This means that several completions are in the queue at the same time,
they may be processed in parallel and complete out of order, resulting
in data appearing corrupt when read by userspace.

Create a single-threaded workqueue for FunctionFS so that data completed
requests is passed to userspace in the order in which they complete.

Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: John Keeping <john@metanate.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-09-28 12:37:24 +03:00
Yoshihiro Shimoda
0a2ce62b61 usb: renesas_usbhs: fix usbhsf_fifo_clear() for RX direction
This patch fixes an issue that the usbhsf_fifo_clear() is possible
to cause 10 msec delay if the pipe is RX direction and empty because
the FRDY bit will never be set to 1 in such case.

Fixes: e8d548d549 ("usb: renesas_usbhs: fifo became independent from pipe.")
Cc: <stable@vger.kernel.org> # v3.1+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-09-28 12:35:01 +03:00
Yoshihiro Shimoda
6124607acc usb: renesas_usbhs: fix the BCLR setting condition for non-DCP pipe
This patch fixes an issue that the driver sets the BCLR bit of
{C,Dn}FIFOCTR register to 1 even when it's non-DCP pipe and
the FRDY bit of {C,Dn}FIFOCTR register is set to 1.

Fixes: e8d548d549 ("usb: renesas_usbhs: fifo became independent from pipe.")
Cc: <stable@vger.kernel.org> # v3.1+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-09-28 12:34:59 +03:00
Yoshihiro Shimoda
447b8a01b8 usb: gadget: udc: renesas_usb3: Fix return value of usb3_write_pipe()
This patch fixes an issue that this driver cannot go status stage
in control read when the req.zero is set to 1 and the len in
usb3_write_pipe() is set to 0. Otherwise, if we use g_ncm driver,
usb enumeration takes long time (5 seconds or more).

Fixes: 746bfe63bb ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller")
Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-09-28 12:31:51 +03:00
Yoshihiro Shimoda
73f2f5745f usb: gadget: udc: renesas_usb3: fix Pn_RAMMAP.Pn_MPKT value
According to the datasheet of R-Car Gen3, the Pn_RAMMAP.Pn_MPKT should
be set to one of 8, 16, 32, 64, 512 and 1024. Otherwise, when a gadget
driver uses an interrupt endpoint, unexpected behavior happens. So,
this patch fixes it.

Fixes: 746bfe63bb ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller")
Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-09-28 12:31:47 +03:00
Yoshihiro Shimoda
4dcf4bab4a usb: gadget: udc: renesas_usb3: fix for no-data control transfer
When bRequestType & USB_DIR_IN is false and req.length is 0 in control
transfer, since it means non-data, this driver should not set the mode
as control write. So, this patch fixes it.

Fixes: 746bfe63bb ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller")
Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-09-28 12:31:26 +03:00
Alan Stern
7dbd8f4cab USB: dummy-hcd: Fix erroneous synchronization change
A recent change to the synchronization in dummy-hcd was incorrect.
The issue was that dummy_udc_stop() contained no locking and therefore
could race with various gadget driver callbacks, and the fix was to
add locking and issue the callbacks with the private spinlock held.

UDC drivers aren't supposed to do this.  Gadget driver callback
routines are allowed to invoke functions in the UDC driver, and these
functions will generally try to acquire the private spinlock.  This
would deadlock the driver.

The correct solution is to drop the spinlock before issuing callbacks,
and avoid races by emulating the synchronize_irq() call that all real
UDC drivers must perform in their ->udc_stop() routines after
disabling interrupts.  This involves adding a flag to dummy-hcd's
private structure to keep track of whether interrupts are supposed to
be enabled, and adding a counter to keep track of ongoing callbacks so
that dummy_udc_stop() can wait for them all to finish.

A real UDC driver won't receive disconnect, reset, suspend, resume, or
setup events once it has disabled interrupts.  dummy-hcd will receive
them but won't try to issue any gadget driver callbacks, which should
be just as good.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Fixes: f16443a034 ("USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks")
CC: <stable@vger.kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-09-28 12:30:59 +03:00
Alan Stern
0173a68bfb USB: dummy-hcd: fix infinite-loop resubmission bug
The dummy-hcd HCD/UDC emulator tries not to do too much work during
each timer interrupt.  But it doesn't try very hard; currently all
it does is limit the total amount of bulk data transferred.  Other
transfer types aren't limited, and URBs that transfer no data (because
of an error, perhaps) don't count toward the limit, even though on a
real USB bus they would consume at least a minimum overhead.

This means it's possible to get the driver stuck in an infinite loop,
for example, if the host class driver resubmits an URB every time it
completes (which is common for interrupt URBs).  Each time the URB is
resubmitted it gets added to the end of the pending-URBs list, and
dummy-hcd doesn't stop until that list is empty.  Andrey Konovalov was
able to trigger this failure mode using the syzkaller fuzzer.

This patch fixes the infinite-loop problem by restricting the URBs
handled during each timer interrupt to those that were already on the
pending list when the interrupt routine started.  Newly added URBs
won't be processed until the next timer interrupt.  The problem of
properly accounting for non-bulk bandwidth (as well as packet and
transaction overhead) is not addressed here.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-09-28 12:30:56 +03:00
Alan Stern
fe659bcc9b USB: dummy-hcd: fix connection failures (wrong speed)
The dummy-hcd UDC driver is not careful about the way it handles
connection speeds.  It ignores the module parameter that is supposed
to govern the maximum connection speed and it doesn't set the HCD
flags properly for the case where it ends up running at full speed.

The result is that in many cases, gadget enumeration over dummy-hcd
fails because the bMaxPacketSize byte in the device descriptor is set
incorrectly.  For example, the default settings call for a high-speed
connection, but the maxpacket value for ep0 ends up being set for a
Super-Speed connection.

This patch fixes the problem by initializing the gadget's max_speed
and the HCD flags correctly.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: <stable@vger.kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-09-28 12:30:51 +03:00
Oleg Nesterov
66a733ea6b seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter()
As Chris explains, get_seccomp_filter() and put_seccomp_filter() can end
up using different filters. Once we drop ->siglock it is possible for
task->seccomp.filter to have been replaced by SECCOMP_FILTER_FLAG_TSYNC.

Fixes: f8e529ed94 ("seccomp, ptrace: add support for dumping seccomp filters")
Reported-by: Chris Salls <chrissalls5@gmail.com>
Cc: stable@vger.kernel.org # needs s/refcount_/atomic_/ for v4.12 and earlier
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
[tycho: add __get_seccomp_filter vs. open coding refcount_inc()]
Signed-off-by: Tycho Andersen <tycho@docker.com>
[kees: tweak commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
2017-09-27 22:51:12 -07:00
Josh Poimboeuf
607a4029d4 objtool: Support unoptimized frame pointer setup
Arnd Bergmann reported a bunch of warnings like:

  crypto/jitterentropy.o: warning: objtool: jent_fold_time()+0x3b: call without frame pointer save/setup
  crypto/jitterentropy.o: warning: objtool: jent_stuck()+0x1d: call without frame pointer save/setup
  crypto/jitterentropy.o: warning: objtool: jent_unbiased_bit()+0x15: call without frame pointer save/setup
  crypto/jitterentropy.o: warning: objtool: jent_read_entropy()+0x32: call without frame pointer save/setup
  crypto/jitterentropy.o: warning: objtool: jent_entropy_collector_free()+0x19: call without frame pointer save/setup

and

  arch/x86/events/core.o: warning: objtool: collect_events uses BP as a scratch register
  arch/x86/events/core.o: warning: objtool: events_ht_sysfs_show()+0x22: call without frame pointer save/setup

With certain rare configurations, GCC sometimes sets up the frame
pointer with:

  lea    (%rsp),%rbp

instead of:

  mov    %rsp,%rbp

The instructions are equivalent, so treat the former like the latter.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a468af8b28a69b83fffc6d7668be9b6fcc873699.1506526584.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-28 07:25:54 +02:00
Josh Poimboeuf
da541b2002 objtool: Skip unreachable warnings for GCC 4.4 and older
The kbuild bot occasionally reports warnings like:

  drivers/scsi/pcmcia/aha152x_core.o: warning: objtool: seldo_run()+0x130: unreachable instruction

These warnings are always with GCC 4.4.  That version of GCC sometimes
places unreachable instructions after calls to noreturn functions.

The unreachable warnings aren't very important anyway.  Just ignore them
for old versions of GCC.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/bc89b807d965b98ec18a0bb94f96a594bd58f2f2.1506551639.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-28 07:23:02 +02:00
Or Gerlitz
353f59f4d4 net/mlx5: Fix wrong indentation in enable SRIOV code
Smatch is screaming:

drivers/net/ethernet/mellanox/mlx5/core/sriov.c:112
	mlx5_device_enable_sriov() warn: inconsistent indenting

fix that.

Fixes: 7ecf6d8ff1 ('IB/mlx5: Restore IB guid/policy for virtual functions')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-28 07:23:10 +03:00
Matan Barak
480df991b8 net/mlx5: Fix static checker warning on steering tracepoints code
Fix this sparse complaint:

drivers/net/ethernet/mellanox/mlx5/core/./diag/fs_tracepoint.h:172:1:
	warning: odd constant _Bool cast (ffffffffffffffff becomes 1)

Fixes: d9fea79171ee ('net/mlx5: Add tracepoints')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-28 07:23:10 +03:00
Gal Pressman
603e1f5bd3 net/mlx5e: Fix calculated checksum offloads counters
Instead of calculating the offloads counters, count them explicitly.
The calculations done for these counters would result in bugs in some
cases, for example:
When running TCP traffic over a VXLAN tunnel with TSO enabled the following
counters would increase:
       tx_csum_partial: 1,333,284
       tx_csum_partial_inner: 29,286
       tx4_csum_partial_inner: 384
       tx7_csum_partial_inner: 8
       tx9_csum_partial_inner: 34
       tx10_csum_partial_inner: 26,807
       tx11_csum_partial_inner: 287
       tx12_csum_partial_inner: 27
       tx16_csum_partial_inner: 6
       tx25_csum_partial_inner: 1,733

Seems like tx_csum_partial increased out of nowhere.
The issue is in the following calculation in mlx5e_update_sw_counters:
s->tx_csum_partial = s->tx_packets - tx_offload_none - s->tx_csum_partial_inner;

While tx_packets increases by the number of GSO segments for each SKB,
tx_csum_partial_inner will only increase by one, resulting in wrong
tx_csum_partial counter.

Fixes: bfe6d8d1d4 ("net/mlx5e: Reorganize ethtool statistics")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-28 07:23:10 +03:00
Gal Pressman
1456f69ff5 net/mlx5e: Don't add/remove 802.1ad rules when changing 802.1Q VLAN filter
Toggling of C-tag VLAN filter should not affect the "any S-tag" steering rule.

Fixes: 8a271746a2 ("net/mlx5e: Receive s-tagged packets in promiscuous mode")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-28 07:23:10 +03:00
Gal Pressman
b20eab15a1 net/mlx5e: Print netdev features correctly in error message
Use the correct formatting for netdev features.

Fixes: 0e405443e8 ("net/mlx5e: Improve set features ndo resiliency")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-28 07:23:10 +03:00
Vlad Buslov
b281208911 net/mlx5e: Check encap entry state when offloading tunneled flows
Encap entries cached by the driver could be invalidated due to
tunnel destination neighbour state changes.
When attempting to offload a flow that uses a cached encap entry,
we must check the entry validity and defer the offloading
if the entry exists but not valid.

When EAGAIN is returned, the flow offloading to hardware takes place
by the neigh update code when the tunnel destination neighbour
becomes connected.

Fixes: 232c001398 ("net/mlx5e: Add support to neighbour update flow")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-28 07:23:10 +03:00
Or Gerlitz
bdd66ac0ae net/mlx5e: Disallow TC offloading of unsupported match/action combinations
When offloading header re-write, the HW may need to adjust checksums along
the packet. For IP traffic, and a case where we are asked to modify fields in
the IP header, current HW supports that only for TCP and UDP. Enforce it, in
this case fail the offloading attempt for non TCP/UDP packets.

Fixes: d7e75a325c ('net/mlx5e: Add offloading of E-Switch TC pedit (header re-write) actions')
Fixes: 2f4fe4cab0 ('net/mlx5e: Add offloading of NIC TC pedit (header re-write) actions')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-28 07:23:09 +03:00
Paul Blakey
ace743214e net/mlx5e: Fix erroneous freeing of encap header buffer
In case the neighbour for the tunnel destination isn't valid,
we send a neighbour update request but we free the encap
header buffer. This is wrong, because we still need it for
allocating a HW encap entry once the neighbour is available.

Fix that by skipping freeing it if we wait for neighbour.

Fixes: 232c001398 ('net/mlx5e: Add support to neighbour update flow')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-28 07:23:09 +03:00
Raed Salem
16f1c5bb3e net/mlx5: Check device capability for maximum flow counters
Added check for the maximal number of flow counters attached
to rule (FTE).

Fixes: bd5251dbf1 ('net/mlx5_core: Introduce flow steering destination of type counter')
Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-28 07:23:09 +03:00
Inbar Karmy
99d3cd27f7 net/mlx5: Fix FPGA capability location
Currently, FPGA capability is located in (mdev)->caps.hca_cur,
change the location to be (mdev)->caps.fpga,
since hca_cur is reserved for HCA device capabilities.

Fixes: e29341fb3a ("net/mlx5: FPGA, Add basic support for Innova")
Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-28 07:23:09 +03:00
Roi Dayan
38e8a5c040 net/mlx5e: IPoIB, Fix access to invalid memory address
When cleaning rdma netdevice we need to save the mdev pointer
because priv is released when we release netdev.

This bug was found using the kernel address sanitizer (KASAN).
use-after-free in mlx5_rdma_netdev_free+0xe3/0x100 [mlx5_core]

Fixes: 48935bbb7a ("net/mlx5e: IPoIB, Add netdevice profile skeleton")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-28 07:23:09 +03:00
Shaohua Li
7d5d7b5058 md/raid5: cap worker count
static checker reports a potential integer overflow. Cap the worker count to
avoid the overflow.

Reported:-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-09-27 20:08:44 -07:00
Shaohua Li
c4d6a1b8e8 dm-raid: fix a race condition in request handling
raid_map calls pers->make_request, which missed the suspend check. Fix it with
the new md_handle_request API.

Fix: cc27b0c78c79(md: fix deadlock between mddev_suspend() and md_write_start())
Cc: Heinz Mauelshagen <heinzm@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-09-27 20:08:29 -07:00
Shaohua Li
79bf31a3b2 md: fix a race condition for flush request handling
md_submit_flush_data calls pers->make_request, which missed the suspend check.
Fix it with the new md_handle_request API.

Reported-by: Nate Dailey <nate.dailey@stratus.com>
Tested-by: Nate Dailey <nate.dailey@stratus.com>
Fix: cc27b0c78c79(md: fix deadlock between mddev_suspend() and md_write_start())
Cc: stable@vger.kernel.org
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-09-27 20:08:18 -07:00
Shaohua Li
393debc23c md: separate request handling
With commit cc27b0c78c, pers->make_request could bail out without handling
the bio. If that happens, we should retry.  The commit fixes md_make_request
but not other call sites. Separate the request handling part, so other call
sites can use it.

Reported-by: Nate Dailey <nate.dailey@stratus.com>
Fix: cc27b0c78c79(md: fix deadlock between mddev_suspend() and md_write_start())
Cc: stable@vger.kernel.org
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-09-27 20:07:40 -07:00
Martin Wilck
d0b7a9095c scsi: ILLEGAL REQUEST + ASC==27 => target failure
ASC 0x27 is "WRITE PROTECTED". This error code is returned e.g.  by
Fujitsu ETERNUS systems under certain conditions for WRITE SAME 16
commands with UNMAP bit set. It should not be treated as a path
error. In general, it makes sense to assume that being write protected
is a target rather than a path property.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Acked-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-27 21:55:21 -04:00
Guilherme G. Piccoli
d1b490939d scsi: aacraid: Add a small delay after IOP reset
Commit 0e9973ed33 ("scsi: aacraid: Add periodic checks to see IOP reset
status") changed the way driver checks if a reset succeeded. Now, after an
IOP reset, aacraid immediately start polling a register to verify the reset
is complete.

This behavior cause regressions on the reset path in PowerPC (at least).
Since the delay after the IOP reset was removed by the aforementioned patch,
the fact driver just starts to read a register instantly after the reset
was issued (by writing in another register) "corrupts" the reset procedure,
which ends up failing all the time.

The issue highly impacted kdump on PowerPC, since on kdump path we
proactively issue a reset in adapter (through the reset_devices kernel
parameter).

This patch (re-)adds a delay right after IOP reset is issued. Empirically
we measured that 3 seconds is enough, but for safety reasons we delay
for 5s (and since it was 30s before, 5s is still a small amount).

For reference, without this patch we observe the following messages
on kdump kernel boot process:

  [ 76.294] aacraid 0003:01:00.0: IOP reset failed
  [ 76.294] aacraid 0003:01:00.0: ARC Reset attempt failed
  [ 86.524] aacraid 0003:01:00.0: adapter kernel panic'd ff.
  [ 86.524] aacraid 0003:01:00.0: Controller reset type is 3
  [ 86.524] aacraid 0003:01:00.0: Issuing IOP reset
  [146.534] aacraid 0003:01:00.0: IOP reset failed
  [146.534] aacraid 0003:01:00.0: ARC Reset attempt failed

Fixes: 0e9973ed33 ("scsi: aacraid: Add periodic checks to see IOP reset status")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-27 21:42:14 -04:00
Rafael J. Wysocki
8aba233390 cpufreq: docs: Drop intel-pstate.txt from index.txt
Commit 33fc30b470 (cpufreq: intel_pstate: Document the current
behavior and user interface) dropped the intel-pstate.txt file
from Documentation/cpu-freq/, but it did not update the index.txt
file in there accordingly, so do that now.

Fixes: 33fc30b470 (cpufreq: intel_pstate: Document the current behavior and user interface)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-09-28 02:08:43 +02:00
James Morris
2569e7e1d6 Merge commit 'keys-fixes-20170927' into fixes-v4.14-rc3
From David Howells:

"There are two sets of patches here:
 (1) A bunch of core keyrings bug fixes from Eric Biggers.

 (2) Fixing big_key to use safe crypto from Jason A. Donenfeld."
2017-09-28 09:11:28 +10:00
Dennis Zhou
2e08d20d77 percpu: fix starting offset for chunk statistics traversal
This patch fixes the starting offset used when scanning chunks to
compute the chunk statistics. The value start_offset (and end_offset)
are managed in bytes while the traversal occurs over bits. Thus for the
reserved and dynamic chunk, it may incorrectly skip over the initial
allocations.

Signed-off-by: Dennis Zhou <dennisszhou@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2017-09-27 14:45:57 -07:00
Tyler Baicar
aaf2c2fb0f ACPI / APEI: clear error status before acknowledging the error
Currently we acknowledge errors before clearing the error status.
This could cause a new error to be populated by firmware in-between
the error acknowledgment and the error status clearing which would
cause the second error's status to be cleared without being handled.
So, clear the error status before acknowledging the errors.

Also, make sure to acknowledge the error if the error status read
fails.

Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-09-27 23:13:06 +02:00
Dave Airlie
2726e15e54 Merge branch 'drm-fixes-4.14' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
A few fixes for 4.14.  Nothing too major.

* 'drm-fixes-4.14' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: disable hard reset in hibernate for APUs
  drm/amdgpu: revert tile table update for oland
2017-09-28 05:49:38 +10:00
Dave Airlie
ffa34d8547 Merge branch 'etnaviv/fixes' of https://git.pengutronix.de/git/lst/linux into drm-fixes
Just two small etnaviv fixes, one fixing a list corruption, the other
fixing a NULL ptr deref in an error path.

* 'etnaviv/fixes' of https://git.pengutronix.de/git/lst/linux:
  etnaviv: fix gem object list corruption
  etnaviv: fix submit error path
2017-09-28 05:48:53 +10:00
Dave Airlie
f2e295342e Merge tag 'drm-amdkfd-fixes-2017-09-24' of git://people.freedesktop.org/~gabbayo/linux into drm-fixes
It contains the following fixes:
- correct checking of return value
- send correct parameter to function (According to the parameter type)
- avoid spamming of dmesg log
- fix queue wrapping calculations

* tag 'drm-amdkfd-fixes-2017-09-24' of git://people.freedesktop.org/~gabbayo/linux:
  drm/amdkfd: Print event limit messages only once per process
  drm/amdkfd: Fix kernel-queue wrapping bugs
  drm/amdkfd: Fix incorrect destroy_mqd parameter
  drm/amdkfd: check for null dev to avoid a null pointer dereference
2017-09-28 05:47:39 +10:00
Linus Torvalds
9cd6681cb1 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota and isofs fixes from Jan Kara:
 "Two quota fixes (fallout of the quota locking changes) and an isofs
  build fix"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  quota: Fix quota corruption with generic/232 test
  isofs: fix build regression
  quota: add missing lock into __dquot_transfer()
2017-09-27 12:22:12 -07:00
Linus Torvalds
225d3b6748 linux-kselftest-4.14-rc3-fixes
This update consists of:
 
 - fixes to several existing tests
 - a test for regression introduced by
   b9470c2760 ("inet: kill smallest_size and smallest_port")
 - seccomp support for glibc 2.26 siginfo_t.h
 - fixes to kselftest framework and tests to run make O=dir use-case
 - fixes to silence unnecessary test output to de-clutter test results
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABCAAGBQJZy7S7AAoJEAsCRMQNDUMcAt0P/iuR279yaBF3RVqHTyXsmr/t
 RO6k4uj4XLYKTrVnV/YTu5hLCGO9fPDhprMmrTqlAGclioEyMDtRTOWDDln4TNFh
 gehbXiOTVVHlLPCOXXRwvU+RsMppgi4O2WRTBK0dnTkBdl+sTLOl4iywGyqFPB11
 O3oj1nNc8ruaxYoUMYwxiGCm1OATrngoSu/Y4mMhZPgT9MnCtZWDlg//kkrxQDHO
 UTD11zk17nBAOw2q4nw3I4un00tgN8RzIOfg9g47Az40LjWSG5c5oAgd/hArqeBv
 7pCUR1PnNKTf0RujX0nfaoQQ+bOEXqpV9GmM67HLo8Q/5e4lYxWdmSdhItPS5qtS
 ZLo1lEMOuRH7+FCQuD236llhwKVMm/+R3jnXgdJcc+SupdGCmpzZ9P8rscX1g11R
 ZDZ9+k8XOA2p7ufxSIGFEILSovn0FUMneOd3Nhwk40R7cIvSiZh+V+Xzdb6Q1K9T
 NBVtH8qvRi5TyHSNwQCDF45fC6bCM80JxGcPToOguFsQTcUL6B0pG6xhxZG73+Ut
 br+Z5y+g+JLWLeGzaBjo4LnqFpeP6w4Jb8CCrqu8BussV3BToIFCJkGX6aOggow/
 D3g03tGDeMjqFMYwn0ZCH5s5u9cicWUUC8CBvoCJp2UZaE/prsNNfRjZjfwYlrVj
 TvWPdPJtwjA/sdq/n2Hl
 =FUuY
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-4.14-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:
 "This update consists of:

   - fixes to several existing tests

   - a test for regression introduced by b9470c2760 ("inet: kill
     smallest_size and smallest_port")

   - seccomp support for glibc 2.26 siginfo_t.h

   - fixes to kselftest framework and tests to run make O=dir use-case

   - fixes to silence unnecessary test output to de-clutter test results"

* tag 'linux-kselftest-4.14-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (28 commits)
  selftests: timers: set-timer-lat: Fix hang when testing unsupported alarms
  selftests: timers: set-timer-lat: fix hang when std out/err are redirected
  selftests/memfd: correct run_tests.sh permission
  selftests/seccomp: Support glibc 2.26 siginfo_t.h
  selftests: futex: Makefile: fix for loops in targets to run silently
  selftests: Makefile: fix for loops in targets to run silently
  selftests: mqueue: Use full path to run tests from Makefile
  selftests: futex: copy sub-dir test scripts for make O=dir run
  selftests: lib.mk: copy test scripts and test files for make O=dir run
  selftests: sync: kselftest and kselftest-clean fail for make O=dir case
  selftests: sync: use TEST_CUSTOM_PROGS instead of TEST_PROGS
  selftests: lib.mk: add TEST_CUSTOM_PROGS to allow custom test run/install
  selftests: watchdog: fix to use TEST_GEN_PROGS and remove clean
  selftests: lib.mk: fix test executable status check to use full path
  selftests: Makefile: clear LDFLAGS for make O=dir use-case
  selftests: lib.mk: kselftest and kselftest-clean fail for make O=dir case
  Makefile: kselftest and kselftest-clean fail for make O=dir case
  selftests/net: msg_zerocopy enable build with older kernel headers
  selftests: actually run the various net selftests
  selftest: add a reuseaddr test
  ...
2017-09-27 10:51:08 -07:00
Richard Genoud
36de807400 mtd: nand: atmel: fix buffer overflow in atmel_pmecc_user
When calculating the size needed by struct atmel_pmecc_user *user,
the dmu and delta buffer sizes were forgotten.
This lead to a memory corruption (especially with a large ecc_strength).

Link: http://lkml.kernel.org/r/1506503157.3016.5.camel@gmail.com
Fixes: f88fc122cc ("mtd: nand: Cleanup/rework the atmel_nand driver")
Cc: stable@vger.kernel.org
Reported-by: Richard Genoud <richard.genoud@gmail.com>
Pointed-at-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Reviewed-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2017-09-27 17:33:28 +02:00
Linus Torvalds
7031b64125 Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fpu fixes and cleanups from Ingo Molnar:
 "This is _way_ more cleanups than fixes, but the bugs were subtle and
  hard to hit, and the primary reason for them existing was the
  unnecessary historical complexity of some of the x86/fpu interfaces.

  The first bunch of commits clean up and simplify the xstate user copy
  handling functions, in reaction to the collective head-scratching
  about the xstate user-copy handling code that leads up to the fix for
  this SkyLake xstate handling bug:

     0852b37417: x86/fpu: Add FPU state copying quirk to handle XRSTOR failure on Intel Skylake CPUs

  The cleanups don't change any functionality, they just (hopefully)
  make it all clearer, more consistent, more debuggable and more robust.

  Note that most of the linecount increase comes from these commits,
  where we better split the user/kernel copy logic by having more
  variants, instead repeated fragile patterns of:

               if (kbuf) {
                       memcpy(kbuf + pos, data, copy);
               } else {
                       if (__copy_to_user(ubuf + pos, data, copy))
                               return -EFAULT;
               }

  The next bunch of commits simplify the FPU state-machine to get rid of
  old lazy-FPU idiosyncrasies - a defensive simplification to make all
  the code easier to review and fix. No change in functionality.

  Then there's a couple of additional debugging tweaks: static checker
  warning fix and move an FPU related warning to under WARN_ON_FPU(),
  followed by another bunch of commits that represent a finegrained
  split-up of the fixes from Eric Biggers to handle weird xstate bits
  properly.

  I did this finegrained split-up because some of these fixes also
  impact the ABI for weird xstate handling, for which we'd like to have
  good bisection results, should they cause any problems. (We also had
  one regression with the more monolithic fixes, so splitting it all up
  sounded prudent for robustness reasons as well.)

  About the whole series: the commits up to 03eaec81ac have been in
  -next for months - but I've recently rebased them to remove a state
  machine clean-up commit that was objected to, and to make it more
  bisectable - so technically it's a new, rebased tree.

  Robustness history: this series had some regressions along the way,
  and all reported regressions have been fixed. All but one of the
  regressions manifested itself as easy to report warnings. The previous
  version of this latest series was also in linux-next, with one
  (warning-only) regression reported which is fixed in the latest
  version.

  Barring last minute brown paper bag bugs (and the commits are now
  older by a day which I'd hope helps paperbag reduction), I'm
  reasonably confident about its general robustness.

  Famous last words ..."

* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
  x86/fpu: Use using_compacted_format() instead of open coded X86_FEATURE_XSAVES
  x86/fpu: Use validate_xstate_header() to validate the xstate_header in copy_user_to_xstate()
  x86/fpu: Eliminate the 'xfeatures' local variable in copy_user_to_xstate()
  x86/fpu: Copy the full header in copy_user_to_xstate()
  x86/fpu: Use validate_xstate_header() to validate the xstate_header in copy_kernel_to_xstate()
  x86/fpu: Eliminate the 'xfeatures' local variable in copy_kernel_to_xstate()
  x86/fpu: Copy the full state_header in copy_kernel_to_xstate()
  x86/fpu: Use validate_xstate_header() to validate the xstate_header in __fpu__restore_sig()
  x86/fpu: Use validate_xstate_header() to validate the xstate_header in xstateregs_set()
  x86/fpu: Introduce validate_xstate_header()
  x86/fpu: Rename fpu__activate_fpstate_read/write() to fpu__prepare_[read|write]()
  x86/fpu: Rename fpu__activate_curr() to fpu__initialize()
  x86/fpu: Simplify and speed up fpu__copy()
  x86/fpu: Fix stale comments about lazy FPU logic
  x86/fpu: Rename fpu::fpstate_active to fpu::initialized
  x86/fpu: Remove fpu__current_fpstate_write_begin/end()
  x86/fpu: Fix fpu__activate_fpstate_read() and update comments
  x86/fpu: Reinitialize FPU registers if restoring FPU state fails
  x86/fpu: Don't let userspace set bogus xcomp_bv
  x86/fpu: Turn WARN_ON() in context switch into WARN_ON_FPU()
  ...
2017-09-27 08:33:26 -07:00
Harish Chegondi
828bcbdc97 IB/hfi1: Unsuccessful PCIe caps tuning should not fail driver load
Failure to tune PCIe capabilities should not fail driver load. This can
cause the driver load to fail on systems with any of the following:
1. HFI's parent is not root. Example: HFI card is behind a PCIe bridge.
2. HFI's parent is not PCI Express capable.
In these situations, failure to tune PCIe capabilities should be logged
in the system message logs but not cause the driver load to fail.

This patch also ensures pcie capability word DevCtl is written only
after a successful read and the capability tuning process continues
even if read/write of the pcie capability word DevCtl fails.

Fixes: c53df62c7a ("IB/hfi1: Check return values from PCI config API calls")
Fixes: bf70a77577 ("staging/rdma/hfi1: Enable WFR PCIe extended tags from the driver")
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-27 11:10:36 -04:00
Michael J. Ruhl
b8f42738ac IB/hfi1: On error, fix use after free during user context setup
During base context setup, if setup_base_ctxt() fails, the context is
deallocated. This is incorrect because the context is referenced on
return, to notify any waiting subcontext.  If there are no subcontexts
the pointer will be invalid.

Reorganize the error path so that deallocate_ctxt() is called after all
the possible subcontexts have been notified.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-27 11:10:36 -04:00
Alex Estrin
612601d001 Revert "IB/ipoib: Update broadcast object if PKey value was changed in index 0"
commit 9a9b811269 will cause core to fail UD QP from being destroyed
on ipoib unload, therefore cause resources leakage.
On pkey change event above patch modifies mgid before calling underlying
driver to detach it from QP. Drivers' detach_mcast() will fail to find
modified mgid it was never given to attach in a first place.
Core qp->usecnt will never go down, so ib_destroy_qp() will fail.

IPoIB driver actually does take care of new broadcast mgid based on new
pkey by destroying an old mcast object in ipoib_mcast_dev_flush())
....
	if (priv->broadcast) {
		rb_erase(&priv->broadcast->rb_node, &priv->multicast_tree);
		list_add_tail(&priv->broadcast->list, &remove_list);
		priv->broadcast = NULL;
	}
...

then in restarted ipoib_macst_join_task() creating a new broadcast mcast
object, sending join request and on completion tells the driver to attach
to reinitialized QP:
...
if (!priv->broadcast) {
...
	broadcast = ipoib_mcast_alloc(dev, 0);
...
	memcpy(broadcast->mcmember.mgid.raw, priv->dev->broadcast + 4,
	       sizeof (union ib_gid));
	priv->broadcast = broadcast;
...

Fixes: 9a9b811269 ("IB/ipoib: Update broadcast object if PKey value was changed in index 0")
Cc: stable@vger.kernel.org
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-27 11:10:36 -04:00
Kamenee Arumugam
09592af5fd IB/hfi1: Return correct value in general interrupt handler
The general interrupt handler returns IRQ_HANDLED whether an IRQ
was handled or not.
Determine if an IRQ was handled and return the correct value.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-09-27 11:10:36 -04:00