Fixing a concurrency issue with creq handling. Each caller
was given a globally managed crsq element, which was
accessed outside a lock. This could result in corruption,
if lot of applications are simultaneously issuing Control Path
commands. Now, each caller will provide its own response buffer
and the responses will be copied under a lock.
Also, Fixing the queue full condition check for the CMDQ.
As a part of these changes, the control path code is refactored
to remove the code replication in the response status checking.
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Commit eea40b8f62 ("infiniband: call ipv6 route lookup via the stub
interface") introduced a regression in address resolution when connecting
to IPv6 destination addresses. The old code called ip6_route_output(),
while the new code calls ipv6_stub->ipv6_dst_lookup(). The two are almost
the same, except that ipv6_dst_lookup() also calls ip6_route_get_saddr()
if the source address is in6addr_any.
This means that the test of ipv6_addr_any(&fl6.saddr) now never succeeds,
and so we never copy the source address out. This ends up causing
rdma_resolve_addr() to fail, because without a resolved source address,
cma_acquire_dev() will fail to find an RDMA device to use. For me, this
causes connecting to an NVMe over Fabrics target via RoCE / IPv6 to fail.
Fix this by copying out fl6.saddr if ipv6_addr_any() is true for the original
source address passed into addr6_resolve(). We can drop our call to
ipv6_dev_get_saddr() because ipv6_dst_lookup() already does that work.
Fixes: eea40b8f62 ("infiniband: call ipv6 route lookup via the stub interface")
Cc: <stable@vger.kernel.org> # 3.12+
Signed-off-by: Roland Dreier <roland@purestorage.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Commit 9fdca4da4d (IB/SA: Split struct sa_path_rec based on IB and
ROCE specific fields) moved the service_id to be specific attribute
for IB and OPA SA Path Record, and thus wasn't assigned for RoCE.
This caused to the following kernel panic in the CMA request handler flow:
[ 27.074594] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 27.074731] IP: __radix_tree_lookup+0x1d/0xe0
...
[ 27.075356] Workqueue: ib_cm cm_work_handler [ib_cm]
[ 27.075401] task: ffff88022e3b8000 task.stack: ffffc90001298000
[ 27.075449] RIP: 0010:__radix_tree_lookup+0x1d/0xe0
...
[ 27.075979] Call Trace:
[ 27.076015] radix_tree_lookup+0xd/0x10
[ 27.076055] cma_ps_find+0x59/0x70 [rdma_cm]
[ 27.076097] cma_id_from_event+0xd2/0x470 [rdma_cm]
[ 27.076144] ? ib_init_ah_from_path+0x39a/0x590 [ib_core]
[ 27.076193] cma_req_handler+0x25/0x480 [rdma_cm]
[ 27.076237] cm_process_work+0x25/0x120 [ib_cm]
[ 27.076280] ? cm_get_bth_pkey.isra.62+0x3c/0xa0 [ib_cm]
[ 27.076350] cm_req_handler+0xb03/0xd40 [ib_cm]
[ 27.076430] ? sched_clock_cpu+0x11/0xb0
[ 27.076478] cm_work_handler+0x194/0x1588 [ib_cm]
[ 27.076525] process_one_work+0x160/0x410
[ 27.076565] worker_thread+0x137/0x4a0
[ 27.076614] kthread+0x112/0x150
[ 27.076684] ? max_active_store+0x60/0x60
[ 27.077642] ? kthread_park+0x90/0x90
[ 27.078530] ret_from_fork+0x2c/0x40
This patch moves it back to the common SA Path Record structure
and removes the redundant setter and getter.
Tested on Connect-IB and Connect-X4 in Infiniband and RoCE respectively.
Fixes: 9fdca4da4d (IB/SA: Split struct sa_path_rec based on IB ands
ROCE specific fields)
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This change will optimize kernel memory deregistration operations.
__ib_umem_release() used to call set_page_dirty_lock() against every
writable page in its memory region. Its purpose is to keep data
synced between CPU and DMA device when swapping happens after mem
deregistration ops. Now we choose not to set page dirty bit if it's
already set by kernel prior to calling __ib_umem_release(). This
reduces memory deregistration time by half or even more when we ran
application simulation test program.
Signed-off-by: Qing Huang <qing.huang@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Commit 5752075144 ("IB/SA: Add OPA path record type") introduced
new local function __ib_copy_path_rec_to_user, but didn't limit its
scope. This produces the following sparse warning:
drivers/infiniband/core/uverbs_marshall.c:99:6: warning:
symbol '__ib_copy_path_rec_to_user' was not declared. Should it be
static?
In addition, it used sizeof ... notations instead of sizeof(...), which
is correct in C, but a little bit misleading. Let's change it too.
Fixes: 5752075144 ("IB/SA: Add OPA path record type")
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
RDMA netlink is part of ib_core, hence ibnl_chk_listeners(),
ibnl_init() and ibnl_cleanup() don't need to be published
in public header file.
Let's remove EXPORT_SYMBOL from ibnl_chk_listeners() and move all these
functions to private header file.
CC: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
If srp_init_qp() fails at srp_create_ch_ib() then ch->send_cq
may be NULL.
Calling directly to ib_destroy_qp() is sufficient because
no work requests were posted on the created qp.
Fixes: 9294000d6d ("IB/srp: Drain the send queue before destroying a QP")
Cc: <stable@vger.kernel.org>
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Bart van Assche <bart.vanassche@sandisk.com>--
Signed-off-by: Doug Ledford <dledford@redhat.com>
ipoib_dev_uninit_default() call is used in ipoib_main.c file only
and it generates the following warning from smatch tool:
drivers/infiniband/ulp/ipoib/ipoib_main.c:1593:6: warning:
symbol 'ipoib_dev_uninit_default' was not declared. Should it
be static?
so let's declare that function as static.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Cache the needed umr_fence and set the wqe ctrl segmennt
accordingly.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
HW can implement UMR wqe re-transmission in various ways.
Thus, add HCA cap to distinguish the needed fence for UMR to make
sure that the wqe wouldn't fail on mkey checks.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The cited patch added a type field to structures ib_ah and rdma_ah_attr.
Function mlx4_ib_query_ah() builds an rdma_ah_attr structure from the
data in an mlx4_ib_ah structure (which contains both an ib_ah structure
and an address vector).
For mlx4_ib_query_ah() to work properly, the type field in the contained
ib_ah structure must be set correctly.
In the outgoing MAD tunneling flow, procedure mlx4_ib_multiplex_mad()
paravirtualizes a MAD received from a slave and sends the processed
mad out over the wire. During this processing, it populates an
mlx4_ib_ah structure and calls mlx4_ib_query_ah().
The cited commit overlooked setting the type field in the contained
ib_ah structure before invoking mlx4_ib_query_ah(). As a result, the
type field remained uninitialized, and the rdma_ah_attr structure was
incorrectly built. This resulted in improperly built MADs being sent out
over the wire.
This patch properly initializes the type field in the contained ib_ah
structure before calling mlx4_ib_query_ah(). The rdma_ah_attr structure
is then generated correctly.
Fixes: 44c58487d5 ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The handling of IB_RDMA_WRITE_ONLY_WITH_IMMEDIATE will leak a memory
reference when a buffer cannot be allocated for returning the immediate
data.
The issue is that the rkey validation has already occurred and the RNR
nak fails to release the reference that was fruitlessly gotten. The
the peer will send the identical single packet request when its RNR
timer pops.
The fix is to release the held reference prior to the rnr nak exit.
This is the only sequence the requires both rkey validation and the
buffer allocation on the same packet.
Cc: Stable <stable@vger.kernel.org> # 4.7+
Tested-by: Tadeusz Struk <tadeusz.struk@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Keep VL15 credits at 0 during LNI, before link-up. Store
VL15 credits value during verify cap interrupt and set
in after link-up. This addresses an issue where VL15 MAD
packets could be sent by one side of the link before
the other side is ready to receive them.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The Omni-Path adapter driver fails to load on the ppc64le platform
due to invalid PCI setup.
This patch makes the PCI configuration more robust and will
fix 64 bit addressing for ppc64le.
Signed-off-by: Steven L Roberts <robers97@gmail.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This fixes a kernel panic when loading the hfi driver as a dynamic module.
Signed-off-by: Steven L Roberts <robers97@gmail.com>
Reviewed-by: Leon Romanovsky <leon@kernel.org>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Take care of ipv6 checks while computing header length for deducing mtu
size of ipv6 servers. Due to the incorrect header length computation for
ipv6 servers, wrong mss is reported to the peer (client).
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The patch 761e19a504 (RDMA/iw_cxgb4: Handle return value of
c4iw_ofld_send() in abort_arp_failure()) from May 6, 2016
leads to the following static checker warning:
drivers/infiniband/hw/cxgb4/cm.c:575 abort_arp_failure()
warn: passing freed memory 'skb'
Also fixes skb leak when l2t resolution fails
Fixes: 761e19a504 (RDMA/iw_cxgb4: Handle return value of
c4iw_ofld_send() in abort_arp_failure())
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Explicitly ACK the MPA Reply frame so the peer
does not retransmit the frame.
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Don't set control flag for 0-length FULPDU (Send) RTR indication
in the enhanced MPA Request/Reply frames, because it isn't supported.
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Some error paths in i40iw_initialize_dev are doing
additional and unnecessary work before exiting.
Correctly free resources allocated prior to error
and return with correct status code.
Signed-off-by: Mustafa Ismail <mustafa.ismail@intelcom>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Explicitly ACK the MPA Reject frame so the peer does
not retransmit the frame.
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Don't set control flag for 0-length FULPDU (Send)
RTR indication in the enhanced MPA Request/Reply
frames, because it isn't supported.
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Pull thermal SoC management fixes from Eduardo Valentin:
- fixes to TI SoC driver, Broadcom, qoriq
- small sparse warning fix on thermal core
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
thermal: broadcom: ns-thermal: default on iProc SoCs
ti-soc-thermal: Fix a typo in a comment line
ti-soc-thermal: Delete error messages for failed memory allocations in ti_bandgap_build()
ti-soc-thermal: Use devm_kcalloc() in ti_bandgap_build()
thermal: core: make thermal_emergency_poweroff static
thermal: qoriq: remove useless call for of_thermal_get_trip_points()
Here are some serial and tty fixes for 4.12-rc3. They are a bit
"bigger" than normal, which is why I had them "bake" in linux-next for a
few weeks and didn't send them to you for -rc2.
They revert a few of the serdev patches from 4.12-rc1, and bring things
back to how they were in 4.11, to try to make things a bit more stable
there. Rob and Johan both agree that this is the way forward, so this
isn't people squabbling over semantics. Other than that, just a few
minor serial driver fixes that people have had problems with.
All of these have been in linux-next for a few weeks with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWSlOHA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylDCACgn7RHT16JUASggJmRUBeadxQcFQAAnjtxX2kc
0AQLqXxqGyFxVZClAYMy
=Y6+X
-----END PGP SIGNATURE-----
Merge tag 'tty-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are some serial and tty fixes for 4.12-rc3. They are a bit bigger
than normal, which is why I had them bake in linux-next for a few
weeks and didn't send them to you for -rc2.
They revert a few of the serdev patches from 4.12-rc1, and bring
things back to how they were in 4.11, to try to make things a bit more
stable there. Rob and Johan both agree that this is the way forward,
so this isn't people squabbling over semantics. Other than that, just
a few minor serial driver fixes that people have had problems with.
All of these have been in linux-next for a few weeks with no reported
issues"
* tag 'tty-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: altera_uart: call iounmap() at driver remove
serial: imx: ensure UCR3 and UFCR are setup correctly
MAINTAINERS/serial: Change maintainer of jsm driver
serial: enable serdev support
tty/serdev: add serdev registration interface
serdev: Restore serdev_device_write_buf for atomic context
serial: core: fix crash in uart_suspend_port
tty: fix port buffer locking
tty: ehv_bytechan: clean up init error handling
serial: ifx6x60: fix use-after-free on module unload
serial: altera_jtaguart: adding iounmap()
serial: exar: Fix stuck MSIs
serial: efm32: Fix parity management in 'efm32_uart_console_get_options()'
serdev: fix tty-port client deregistration
Revert "tty_port: register tty ports with serdev bus"
drivers/tty: 8250: only call fintek_8250_probe when doing port I/O
Fix running SPU programs on Cell, and a few other minor fixes.
Thanks to:
Alistair Popple, Jeremy Kerr, Michael Neuling, Nicholas Piggin.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZKURPAAoJEFHr6jzI4aWAhckQAJxZZHt2OMbdNu0PxHdhZgxo
+eSODIF0jvzBnYs/Pe9aqqrxuONxW+ioclyUIVYLUlHwLjCGf7x2Y5tJe0OmEff6
ZOaUUcwECKw4cy2UJY6NCGv0nw/8INTDN5xPcQq9M8gExmX6plTmbnQg8Y10ONdQ
LYu36GWyXF4ygblvLo7kXs8tuZYKozO6kPRqxiQ3zML2dOAyqWqPwpnoWSc6c7oR
W+z/Vuxe3lTR+QHbfvnSpQhmdVi+WEnwFvgNmIise5R9Jd30Q1f1vES5E089ifmN
b0Qi5/gkb6YWBkROvxTARFRdmU0/YPNDFWUsuyHJB/Nz1MnqqXx5X+5KpqqinPya
azVoYW010x2zawm0aX+BF/WeH5ymsl++R84/aO8UR0fA2AIwEOQeLGWZvaZb8CMl
9vd3NqCJ+diBhgCHiHp80pjD978bqt7Ls1nfbHhYTJ31HRT8Yz/ympWOhLE6rp+t
kGR+UOHduaZWK3KHoE2WIoUFJuQMvRgFmjoy2G+YaK/PcUc8OA+90v1665fnbk+N
wmZyAirP39gveHkHXDywqbEjN4CSMgsqrRW/KwPo0b4mf2m3rsIAshO9pBbZRv+P
evhrAkCYRv5zGbGIYJ/TiEyball+8NQzxzoYmMzq62pjE27gyIe94Sqy80U4zyOC
RqqUxflOBgMDC8Ufc30u
=EM32
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Fix running SPU programs on Cell, and a few other minor fixes.
Thanks to Alistair Popple, Jeremy Kerr, Michael Neuling, Nicholas
Piggin"
* tag 'powerpc-4.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: Add PPC_FEATURE userspace bits for SCV and DARN instructions
powerpc/spufs: Fix hash faults for kernel regions
powerpc: Fix booting P9 hash with CONFIG_PPC_RADIX_MMU=N
powerpc/powernv/npu-dma.c: Fix opal_npu_destroy_context() call
selftests/powerpc: Fix TM resched DSCR test with some compilers
Pull x86 fixes from Thomas Gleixner:
"A series of fixes for X86:
- The final fix for the end-of-stack issue in the unwinder
- Handle non PAT systems gracefully
- Prevent access to uninitiliazed memory
- Move early delay calaibration after basic init
- Fix Kconfig help text
- Fix a cross compile issue
- Unbreak older make versions"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/timers: Move simple_udelay_calibration past init_hypervisor_platform
x86/alternatives: Prevent uninitialized stack byte read in apply_alternatives()
x86/PAT: Fix Xorg regression on CPUs that don't support PAT
x86/watchdog: Fix Kconfig help text file path reference to lockup watchdog documentation
x86/build: Permit building with old make versions
x86/unwind: Add end-of-stack check for ftrace handlers
Revert "x86/entry: Fix the end of the stack for newly forked tasks"
x86/boot: Use CROSS_COMPILE prefix for readelf
Pull timer fixlet from Thomas Gleixner:
"Silence dmesg spam by making the posix cpu timer printks depend on
print_fatal_signals"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
posix-timers: Make signal printks conditional
Pull RAS fixes from Thomas Gleixner:
"Two fixlets for RAS:
- Export memory_error() so the NFIT module can utilize it
- Handle memory errors in NFIT correctly"
* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
acpi, nfit: Fix the memory error check in nfit_handle_mce()
x86/MCE: Export memory_error()
Pull perf tooling fixes from Thomas Gleixner:
- Synchronization of tools and kernel headers
- A series of fixes for perf report addressing various failures:
* Handle invalid maps proper
* Plug a memory leak
* Handle frames and callchain order correctly
- Fixes for handling inlines and children mode
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools/include: Sync kernel ABI headers with tooling headers
perf tools: Put caller above callee in --children mode
perf report: Do not drop last inlined frame
perf report: Always honor callchain order for inlined nodes
perf script: Add --inline option for debugging
perf report: Fix off-by-one for non-activation frames
perf report: Fix memory leak in addr2line when called by addr2inlines
perf report: Don't crash on invalid maps in `-g srcline` mode
Pull locking fix from Thomas Gleixner:
"A fix for a state leak which was introduced in the recent rework of
futex/rtmutex interaction"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock()
Pull kthread fix from Thomas Gleixner:
"A single fix which prevents a use after free when kthread fork fails"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kthread: Fix use-after-free if kthread fork fails
One was simply a memory leak where not all was being freed that should
have been in releasing a file pointer on set_graph_function.
Then Thomas found that the ftrace trampolines were marked for read/write
as well as execute. To shrink the possible attack surface, he added
calls to set them to ro. Which also uncovered some other issues with
freeing module allocated memory that had its permissions changed.
Kprobes had a similar issue which is fixed and a selftest was added
to trigger that issue again.
-----BEGIN PGP SIGNATURE-----
iQExBAABCAAbBQJZKOiVFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
vBoH/jxVozuAEVCv+Nbj6fhRxe4emjo0lZZb32EbEaSV/nUQGqHIZFdDQtbt+ld+
sn06/BSMBI+L4BqLj1BCAW0e/zIn/4birIg53SX5jQwc3AlhUG7HS2d+RJZZCrp9
Zofq9L6xZ4Hl2XjkPXqwEgtrwxQtkIPLlJqeYDJ6BVrlPfOPEwB7bfR7B684wiYT
6h2Qo7f/ZQzgJ1sK8N2IjHEnAgE08KCYcj4IB4WHJk6SqQz3bv1Y00WBg2UQihVT
TPPSVhYLnrSw53fxyALqZbHo2DvnQf1TnNadWxvSIpbvgm/T5GG60FDtvHgNfbwz
yKuKAog+P9xBLkoAcfvODLY9O5s=
=75TZ
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull ftrace fixes from Steven Rostedt:
"There's been a few memory issues found with ftrace.
One was simply a memory leak where not all was being freed that should
have been in releasing a file pointer on set_graph_function.
Then Thomas found that the ftrace trampolines were marked for
read/write as well as execute. To shrink the possible attack surface,
he added calls to set them to ro. Which also uncovered some other
issues with freeing module allocated memory that had its permissions
changed.
Kprobes had a similar issue which is fixed and a selftest was added to
trigger that issue again"
* tag 'trace-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
x86/ftrace: Make sure that ftrace trampolines are not RWX
x86/mm/ftrace: Do not bug in early boot on irqs_disabled in cpu_flush_range()
selftests/ftrace: Add a testcase for many kprobe events
kprobes/x86: Fix to set RWX bits correctly before releasing trampoline
ftrace: Fix memory leak in ftrace_graph_release()
ftrace use module_alloc() to allocate trampoline pages. The mapping of
module_alloc() is RWX, which makes sense as the memory is written to right
after allocation. But nothing makes these pages RO after writing to them.
Add proper set_memory_rw/ro() calls to protect the trampolines after
modification.
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1705251056410.1862@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
With function tracing starting in early bootup and having its trampoline
pages being read only, a bug triggered with the following:
kernel BUG at arch/x86/mm/pageattr.c:189!
invalid opcode: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 4.12.0-rc2-test+ #3
Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
task: ffffffffb4222500 task.stack: ffffffffb4200000
RIP: 0010:change_page_attr_set_clr+0x269/0x302
RSP: 0000:ffffffffb4203c88 EFLAGS: 00010046
RAX: 0000000000000046 RBX: 0000000000000000 RCX: 00000001b6000000
RDX: ffffffffb4203d40 RSI: 0000000000000000 RDI: ffffffffb4240d60
RBP: ffffffffb4203d18 R08: 00000001b6000000 R09: 0000000000000001
R10: ffffffffb4203aa8 R11: 0000000000000003 R12: ffffffffc029b000
R13: ffffffffb4203d40 R14: 0000000000000001 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff9a639ea00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff9a636b384000 CR3: 00000001ea21d000 CR4: 00000000000406b0
Call Trace:
change_page_attr_clear+0x1f/0x21
set_memory_ro+0x1e/0x20
arch_ftrace_update_trampoline+0x207/0x21c
? ftrace_caller+0x64/0x64
? 0xffffffffc029b000
ftrace_startup+0xf4/0x198
register_ftrace_function+0x26/0x3c
function_trace_init+0x5e/0x73
tracer_init+0x1e/0x23
tracing_set_tracer+0x127/0x15a
register_tracer+0x19b/0x1bc
init_function_trace+0x90/0x92
early_trace_init+0x236/0x2b3
start_kernel+0x200/0x3f5
x86_64_start_reservations+0x29/0x2b
x86_64_start_kernel+0x17c/0x18f
secondary_startup_64+0x9f/0x9f
? secondary_startup_64+0x9f/0x9f
Interrupts should not be enabled at this early in the boot process. It is
also fine to leave interrupts enabled during this time as there's only one
CPU running, and on_each_cpu() means to only run on the current CPU.
If early_boot_irqs_disabled is set, it is safe to run cpu_flush_range() with
interrupts disabled. Don't trigger a BUG_ON() in that case.
Link: http://lkml.kernel.org/r/20170526093717.0be3b849@gandalf.local.home
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Add a testcase to test kprobes via ftrace interface
with many concurrent kprobe events.
This tries to add many kprobe events (up to 256) on
kernel functions. To avoid making ftrace-based
kprobes (kprobes on fentry), it skips first N bytes
(on x86 N=5, on ppc or arm N=4) of function entry.
After that, it enables all those events, disable it,
and remove it.
Since the unoptimization buffer reclaiming will
be delayed, after removing events, it will wait
enough time.
Link: http://lkml.kernel.org/r/149577388470.11702.11832460851769204511.stgit@devbox
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fix kprobes to set(recover) RWX bits correctly on trampoline
buffer before releasing it. Releasing readonly page to
module_memfree() crash the kernel.
Without this fix, if kprobes user register a bunch of kprobes
in function body (since kprobes on function entry usually
use ftrace) and unregister it, kernel hits a BUG and crash.
Link: http://lkml.kernel.org/r/149570868652.3518.14120169373590420503.stgit@devbox
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: d0381c81c2 ("kprobes/x86: Set kprobes pages read-only")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Pull input layer fixes from Dmitry Torokhov:
"Just a few fixups to a couple of drivers"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elan_i2c - ignore signals when finishing updating firmware
Input: elan_i2c - clear INT before resetting controller
Input: atmel_mxt_ts - add T100 as a readable object
Input: edt-ft5x06 - increase allowed data range for threshold parameter
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJZKIBGAAoJEL1qUBy3i3wmVmgP/39LC2rvbxUIXK2V4qwuOmhM
1ScGhqCxMVBkb6/iBV7Ilk18LZbRXLCaO7fylpg/rK38HorblFIrp+5LOfS0qyjr
tfLEiWcNLHEO5kh+hQi1JyfxEjJT7xXISF07VdvZ/YUR6lyU7fZL93cC+bFSPaYI
vvpygbahkBrw/4+AyFunXBgSiF5F6ITg/9fSK/BsCheVRV7CHy1WBVwy5hYhQZVP
OIOznLds9bGUrUIHF8oqy8JB7D+qnXuGi+B3un6erw39KH6Z1cHnIfMk5GWgjsfN
OXAnrhZPgObyysw2m20+uSwtMVND88aXl1NWGZi1tlcy2aB067XfjM+Y1JiOGc+0
NEMOnSaelsshieQEAPFV1DUONTyKGmBlaTQ5fXXs9hoBjk9/kpCEGyzWWAYG3Kwa
zRowqBG9DlBBuwtMwji0KuzxaK+adRItkI/2F8Igf0pLFa3NQYZpCVfHRViX6Arb
MXiWvanmahvaZlFOmAS0hVRqLZzKXQMJC9PVVJ8qnIv+Qk1IX6FqAQNYTLbnBpje
a/tJMOaciKni7K6ft89vU+3cfNHmhvqRifV8k2D3ZlAmHx0gmDIIA3JCX1l0+xJA
oU5IfiMjKoIB+F13DUDlyPJK5QIC/heyuyM7UJuVgrYcaIPCE8tfWthGVZVWQ50w
v0pB9t2vmSEXizwhYpT7
=EtW3
-----END PGP SIGNATURE-----
Merge tag 'led_fixes_for_4-12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds
Pull LED fix from Jacek Anaszewski:
"A single LED fix for 4.12-rc3.
leds-pca955x driver uses only i2c_smbus API and thus it should pass
I2C_FUNC_SMBUS_BYTE_DATA flag to i2c_check_functionality"
* tag 'led_fixes_for_4-12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
leds: pca955x: Correct I2C Functionality
Pull networking fixes from David Miller:
1) Fix state pruning in bpf verifier wrt. alignment, from Daniel
Borkmann.
2) Handle non-linear SKBs properly in SCTP ICMP parsing, from Davide
Caratti.
3) Fix bit field definitions for rss_hash_type of descriptors in mlx5
driver, from Jesper Brouer.
4) Defer slave->link updates until bonding is ready to do a full commit
to the new settings, from Nithin Sujir.
5) Properly reference count ipv4 FIB metrics to avoid use after free
situations, from Eric Dumazet and several others including Cong Wang
and Julian Anastasov.
6) Fix races in llc_ui_bind(), from Lin Zhang.
7) Fix regression of ESP UDP encapsulation for TCP packets, from
Steffen Klassert.
8) Fix mdio-octeon driver Kconfig deps, from Randy Dunlap.
9) Fix regression in setting DSCP on ipv6/GRE encapsulation, from Peter
Dawson.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits)
ipv4: add reference counting to metrics
net: ethernet: ax88796: don't call free_irq without request_irq first
ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets
sctp: fix ICMP processing if skb is non-linear
net: llc: add lock_sock in llc_ui_bind to avoid a race condition
bonding: Don't update slave->link until ready to commit
test_bpf: Add a couple of tests for BPF_JSGE.
bpf: add various verifier test cases
bpf: fix wrong exposure of map_flags into fdinfo for lpm
bpf: add bpf_clone_redirect to bpf_helper_changes_pkt_data
bpf: properly reset caller saved regs after helper call and ld_abs/ind
bpf: fix incorrect pruning decision when alignment must be tracked
arp: fixed -Wuninitialized compiler warning
tcp: avoid fastopen API to be used on AF_UNSPEC
net: move somaxconn init from sysctl code
net: fix potential null pointer dereference
geneve: fix fill_info when using collect_metadata
virtio-net: enable TSO/checksum offloads for Q-in-Q vlans
be2net: Fix offload features for Q-in-Q packets
vlan: Fix tcp checksum offloads in Q-in-Q vlans
...
- Fix indlen block reservation accounting bug when splitting delalloc extent
- Fix warnings about unused variables that appeared in -rc1.
- Don't spew errors when bmapping a local format directory
- Fix an off-by-one error in a delalloc eof assertion
- Make fsmap only return inode information for CAP_SYS_ADMIN
- Fix a potential mount time deadlock recovering cow extents
- Fix unaligned memory access in _btree_visit_blocks
- Fix various SEEK_HOLE/SEEK_DATA bugs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCgAGBQJZJwnxAAoJEPh/dxk0SrTr/TMQAKP6OMsjYxpro+1Uif+oPTQ6
vvUfXJMWLKc07QI/czwLDY4A36h2TZjNxpBJypSfVumlD82ZPa8gp6XFWngwIUb4
3G+A9zq4Fviq8Vzz3G75C8Q49h8IpmU3SimTlhS1BIcxe+upu2qplzM3yc6/T4MB
WTTqtjL3SaW5D2v0ZdPL9ulQKKAlL1WfbZV9dDJ4UiRw5Jlwj2Udg6HnbRvfrcZF
IziYlidrTIt64ecA9GqR32soXqFBGPKo6Wp9Pk+iWLlsfM6qcCt1m+yfM1JonRGA
wycygcrrjfR/lFHMQCGonLs1ajC6isLeMZ804P6OP2q6kfdtersedvY7XSoYsEJ4
ok4J3fiyqYgMGhPz7x0Y8IH9+gdudn7+fHiC5/RNkolEy8AbPPe21XhFDVxeTkCs
4GAHNGQfOEK2PT69Ya81taVzT/TpuIGIkUAaDH8vsfxwcVunM08/OffsCiinLMJx
bt3G7fH3wJ+VuYJS92amj3k6n6EAeHYc0dAVGd5e8dtN25079nBm+EP0Wp+j8uVl
PwaJjde68wxWUvuYXVK1a8vietRS7xChyta34cYcStd4wWu1knccpN/mjQnK/ucB
4etZspB1rQQx08KBqHVq8t508PA7nWtFxjE91JYkpvbyYym1WEH8Mz7rbVBI6NjS
Y/8+uPhFq2BU1b9skj0U
=pDjl
-----END PGP SIGNATURE-----
Merge tag 'xfs-4.12-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull XFS fixes from Darrick Wong:
"A few miscellaneous bug fixes & cleanups:
- Fix indlen block reservation accounting bug when splitting delalloc
extent
- Fix warnings about unused variables that appeared in -rc1.
- Don't spew errors when bmapping a local format directory
- Fix an off-by-one error in a delalloc eof assertion
- Make fsmap only return inode information for CAP_SYS_ADMIN
- Fix a potential mount time deadlock recovering cow extents
- Fix unaligned memory access in _btree_visit_blocks
- Fix various SEEK_HOLE/SEEK_DATA bugs"
* tag 'xfs-4.12-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: Move handling of missing page into one place in xfs_find_get_desired_pgoff()
xfs: Fix off-by-in in loop termination in xfs_find_get_desired_pgoff()
xfs: Fix missed holes in SEEK_HOLE implementation
xfs: fix off-by-one on max nr_pages in xfs_find_get_desired_pgoff()
xfs: fix unaligned access in xfs_btree_visit_blocks
xfs: avoid mount-time deadlock in CoW extent recovery
xfs: only return detailed fsmap info if the caller has CAP_SYS_ADMIN
xfs: bad assertion for delalloc an extent that start at i_size
xfs: fix warnings about unused stack variables
xfs: BMAPX shouldn't barf on inline-format directories
xfs: fix indlen accounting error on partial delalloc conversion
Andrey Konovalov reported crashes in ipv4_mtu()
I could reproduce the issue with KASAN kernels, between
10.246.7.151 and 10.246.7.152 :
1) 20 concurrent netperf -t TCP_RR -H 10.246.7.152 -l 1000 &
2) At the same time run following loop :
while :
do
ip ro add 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500
ip ro del 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500
done
Cong Wang attempted to add back rt->fi in commit
82486aa6f1 ("ipv4: restore rt->fi for reference counting")
but this proved to add some issues that were complex to solve.
Instead, I suggested to add a refcount to the metrics themselves,
being a standalone object (in particular, no reference to other objects)
I tried to make this patch as small as possible to ease its backport,
instead of being super clean. Note that we believe that only ipv4 dst
need to take care of the metric refcount. But if this is wrong,
this patch adds the basic infrastructure to extend this to other
families.
Many thanks to Julian Anastasov for reviewing this patch, and Cong Wang
for his efforts on this problem.
Fixes: 2860583fe8 ("ipv4: Kill rt->fi")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function ax_init_dev (which is called only from the driver's .probe
function) calls free_irq in the error path without having requested the
irq in the first place. So drop the free_irq call in the error path.
Fixes: 825a2ff189 ("AX88796 network driver")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>