When netlink mmap on receive side is the consumer of nf queue data,
it can happen that in some edge cases, we write skb shared info into
the user space mmap buffer:
Assume a possible rx ring frame size of only 4096, and the network skb,
which is being zero-copied into the netlink skb, contains page frags
with an overall skb->len larger than the linear part of the netlink
skb.
skb_zerocopy(), which is generic and thus not aware of the fact that
shared info cannot be accessed for such skbs then tries to write and
fill frags, thus leaking kernel data/pointers and in some corner cases
possibly writing out of bounds of the mmap area (when filling the
last slot in the ring buffer this way).
I.e. the ring buffer slot is then of status NL_MMAP_STATUS_VALID, has
an advertised length larger than 4096, where the linear part is visible
at the slot beginning, and the leaked sizeof(struct skb_shared_info)
has been written to the beginning of the next slot (also corrupting
the struct nl_mmap_hdr slot header incl. status etc), since skb->end
points to skb->data + ring->frame_size - NL_MMAP_HDRLEN.
The fix adds and lets __netlink_alloc_skb() take the actual needed
linear room for the network skb + meta data into account. It's completely
irrelevant for non-mmaped netlink sockets, but in case mmap sockets
are used, it can be decided whether the available skb_tailroom() is
really large enough for the buffer, or whether it needs to internally
fallback to a normal alloc_skb().
>From nf queue side, the information whether the destination port is
an mmap RX ring is not really available without extra port-to-socket
lookup, thus it can only be determined in lower layers i.e. when
__netlink_alloc_skb() is called that checks internally for this. I
chose to add the extra ldiff parameter as mmap will then still work:
We have data_len and hlen in nfqnl_build_packet_message(), data_len
is the full length (capped at queue->copy_range) for skb_zerocopy()
and hlen some possible part of data_len that needs to be copied; the
rem_len variable indicates the needed remaining linear mmap space.
The only other workaround in nf queue internally would be after
allocation time by f.e. cap'ing the data_len to the skb_tailroom()
iff we deal with an mmap skb, but that would 1) expose the fact that
we use a mmap skb to upper layers, and 2) trim the skb where we
otherwise could just have moved the full skb into the normal receive
queue.
After the patch, in my test case the ring slot doesn't fit and therefore
shows NL_MMAP_STATUS_COPY, where a full skb carries all the data and
thus needs to be picked up via recv().
Fixes: 3ab1f683bf ("nfnetlink: add support for memory mapped netlink")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of netlink mmap, there can be situations where received frames
have to be placed into the normal receive queue. The ring buffer indicates
this through NL_MMAP_STATUS_COPY, so the user is asked to pick them up
via recvmsg(2) syscall, and to put the slot back to NL_MMAP_STATUS_UNUSED.
Commit 0ef707700f ("netlink: rx mmap: fix POLLIN condition") changed
polling, so that we walk in the worst case the whole ring through the
new netlink_has_valid_frame(), for example, when the ring would have no
NL_MMAP_STATUS_VALID, but at least one NL_MMAP_STATUS_COPY frame.
Since we do a datagram_poll() already earlier to pick up a mask that could
possibly contain POLLIN | POLLRDNORM already (due to NL_MMAP_STATUS_COPY),
we can skip checking the rx ring entirely.
In case the kernel is compiled with !CONFIG_NETLINK_MMAP, then all this is
irrelevant anyway as netlink_poll() is just defined as datagram_poll().
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Incorporate fw_ldst_cmd structure change for new firmware and also
update version string for the same
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There exist one issue by below case that case system hang:
ifconfig eth0 down
ifconfig eth0 hw ether 00:10:19:19:81:19
After eth0 down, all fec clocks are gated off. In the .fec_set_mac_address()
function, it will set new MAC address to registers, which causes system hang.
So it needs to add netif status check to avoid registers access when clocks are
gated off. Until eth0 up the new MAC address are wrote into related registers.
V2:
As Lucas Stach's suggestion, add a comment in the code to explain why it needed.
CC: Lucas Stach <l.stach@pengutronix.de>
CC: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hayes Wang says:
====================
r8152: fix the autoresume may fail
Fix the autosuspend issues which occur about linking change.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the runtime suspend issues result from the linking change.
Case 1:
a) link down occurs.
b) driver disable tx/rx.
c) autosuspend occurs.
d) hw linking up.
e) device suspends without enabling tx/rx.
f) couldn't wake up when receiving packets.
Case 2:
a) Nway results in linking down.
b) autosuspend occurs.
c) device suspends.
d) device may not wake up when linking up.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Split DRIVER_VERSION into NETNEXT_VERSION and NET_VERSION. Then,
according to the value of DRIVER_VERSION, we could know which
patches are used generally without comparing the source code.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/ipv6/route.c:2946:3-8: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values.
NULL check before some freeing functions is not needed.
Based on checkpatch warning
"kfree(NULL) is safe this check is probably not required"
and kfreeaddr.cocci by Julia Lawall.
Generated by: scripts/coccinelle/free/ifnullfree.cocci
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add Microchip LAN88XX phy driver for phylib.
Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current check of phydev with IS_ERR(phydev) may make not much sense
because of_phy_connect() returns NULL on failure instead of error value.
Still for checking result of phy_connect() IS_ERR() makes perfect sense.
So let's use combined check IS_ERR_OR_NULL() that covers both cases.
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In all cases, mbx->req.arg and mbx->rsp.arg have just been allocated
using kcalloc(), so these six memsets are redundant.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The double memset is a little ugly; using kzalloc avoids it altogether.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using kzalloc saves a tiny bit on .text.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
We save a little .text and get rid of the sizeof(...) style
inconsistency.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
PBIAS regulator is required for MMC module in OMAP2, OMAP3, OMAP4,
OMAP5 and DRA7 SoCs. Enable it here.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
This switches IPv6 policy routing to use the shared
fib_default_rule_pref() function of IPv4 and DECnet. It is also used in
multicast routing for IPv4 as well as IPv6.
The motivation for this patch is a complaint about iproute2 behaving
inconsistent between IPv4 and IPv6 when adding policy rules: Formerly,
IPv6 rules were assigned a fixed priority of 0x3FFF whereas for IPv4 the
assigned priority value was decreased with each rule added.
Since then all users of the default_pref field have been converted to
assign the generic function fib_default_rule_pref(), fib_nl_newrule()
may just use it directly instead. Therefore get rid of the function
pointer altogether and make fib_default_rule_pref() static, as it's not
used outside fib_rules.c anymore.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
For !CONFIG_OF of_get_property() is defined to always return NULL. Thus
there's no need to protect the call to of_get_property() with #ifdef
CONFIG_OF.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The macro to write 64-bits quantities to the 32-bits register swapped
the value and offsets arguments, we want to preserve the ordering of the
arguments with respect to how writel() is implemented for instance:
value first, offset/base second.
Fixes: 246d7f773c ("net: dsa: add Broadcom SF2 switch driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
when the verifier log is enabled the print_bpf_insn() is doing
bpf_alu_string[BPF_OP(insn->code) >> 4]
and
bpf_jmp_string[BPF_OP(insn->code) >> 4]
where BPF_OP is a 4-bit instruction opcode.
Malformed insns can cause out of bounds access.
Fix it by sizing arrays appropriately.
The bug was found by clang address sanitizer with libfuzzer.
Reported-by: Yonghong Song <yhs@plumgrid.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Problem:
The ecmp route replace support for ipv6 in the kernel, deletes the
existing ecmp route too early, ie when it installs the first nexthop.
If there is an error in installing the subsequent nexthops, its too late
to recover the already deleted existing route leaving the fib
in an inconsistent state.
This patch reduces the possibility of this by doing the following:
a) Changes the existing multipath route add code to a two stage process:
build rt6_infos + insert them
ip6_route_add rt6_info creation code is moved into
ip6_route_info_create.
b) This ensures that most errors are caught during building rt6_infos
and we fail early
c) Separates multipath add and del code. Because add needs the special
two stage mode in a) and delete essentially does not care.
d) In any event if the code fails during inserting a route again, a
warning is printed (This should be unlikely)
Before the patch:
$ip -6 route show
3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
/* Try replacing the route with a duplicate nexthop */
$ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
RTNETLINK answers: File exists
$ip -6 route show
/* previously added ecmp route 3000:1000:1000:1000::2 dissappears from
* kernel */
After the patch:
$ip -6 route show
3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
/* Try replacing the route with a duplicate nexthop */
$ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
RTNETLINK answers: File exists
$ip -6 route show
3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
Fixes: 2759647247 ("ipv6: fix ECMP route replacement")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
fBLOCKREADINTR is masking the notification from the remote and should
hence be cleared while we're waiting the tx fifo to drain. Also change
the reset state to mask the notification, as send is the only use case
where we're interested in it.
Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Signed-off-by: Andy Gross <agross@codeaurora.org>
This patch fixes SMEM addressing issues when remote processors need to use
secure SMEM partitions.
Signed-off-by: Andy Gross <agross@codeaurora.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
This patch corrects private partition item access. Instead of falling back to
global for instances where we have an actual host and remote partition existing,
return the results of the private lookup.
Signed-off-by: Andy Gross <agross@codeaurora.org>
PCT_TO_HWP does not take the actual range of pstates exported
by HWP_CAPABILITIES in account, and is broken on most platforms.
Remove the macro and set the min and max pstate for hwp by
determining the range and adjusting by the min and max percent
limits values.
Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In current code, max_perf_pct might be smaller than min_perf_pct
by improper user input:
$ grep . /sys/devices/system/cpu/intel_pstate/m*_perf_pct
/sys/devices/system/cpu/intel_pstate/max_perf_pct:100
/sys/devices/system/cpu/intel_pstate/min_perf_pct:100
$ echo 80 > /sys/devices/system/cpu/intel_pstate/max_perf_pct
$ grep . /sys/devices/system/cpu/intel_pstate/m*_perf_pct
/sys/devices/system/cpu/intel_pstate/max_perf_pct:80
/sys/devices/system/cpu/intel_pstate/min_perf_pct:100
Fix this problem by 2 steps:
1. Normalize the user input to [min_policy, max_policy].
2. Make sure max_perf_pct>=min_perf_pct, suggested by Seiichi Ikarashi.
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Acked-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There is no point returning suspend_opp, if it is disabled by the core.
As we can't use it at all. Fix it.
Fixes: 4eafbd15b6 ("PM / OPP: add dev_pm_opp_get_suspend_opp() helper")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Use stdout-path so that we don't have to put the console on the
kernel command line.
Cc: Tim Bird <tim.bird@sonymobile.com>
Cc: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Use stdout-path so that we don't have to put the console on the
kernel command line.
Cc: Mathieu Olivari <mathieu@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Use stdout-path so that we don't have to put the console on the
kernel command line.
Cc: Mike Rapoport <mike.rapoport@gmail.com>
Cc: Igor Grinberg <grinberg@compulab.co.il>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
* Switch to use pinctrl compatible for GPIOs
* Add RPM regulators for MSM8960
* Add SPI Ethernet support on MSM8960 CDP
* Add SMEM support along with dependencies
* Add PM8921 support for GPIO and MPP
* Fix GSBI cell index
* Switch to use real regulators on APQ8064 w/ SDCC
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJVuR9GAAoJEFKiBbHx2RXV9Q0P/2TXIsDmgFyyIA7edlqag3ch
m0coBGVv20bP5wIrYHjQywBJqfCuQPm70jVsl2Voxzw/69jCvzHlzia6Aqm6f+sL
jX941CpV7DVVACm0ltec3FqV33OZKRh9Eyz0Mq1jerrPsvwMnwGuAziH7zkzCNnr
b9aLdWqWr68HZve/8naRAFWpm5mvWJhADnbKh2UryH6ZWaT8ueWdqnhy4oteo6MK
sf0CWva9DdkH9GNE8F+N5ksOdV/a5IDdUVoujsaAaXJI/otqAoAn0PrzRTp1sV43
iAJKEAbAlhPKx/0YJIW1dpp8viyOcjAtNRpMLjA79HXF2yaYUlieLWYVhL851dKX
L4MNyYRauQf+JIR6fGwSN7Tg+/G8ZF5m70EZXkoya5JGBTnXbgZ/e0tytu8MXs3E
abiUeaYWFn4r1ta0V8glbk/iDlmNWcBUZz9NInrLzcjLOwq3+PJYdoq4mav5Qkrf
cpByhFRHEV2jToZ7t/uyz8NndwLKnioFHzKb+o8L+B9qzB0r/pEdF143gRWlWV6W
2WGIUf2wCOBEMmaWhbGvGdYP1RghtjSKRGoaN4d3mxSD2lU9OvkARYETKtg4hwo4
y8cWOhx3sNsrAxs0GdzN4bh06LcnKYGUnpuqQb3o3GAjxY3xTFlhK0FDpXULCU2W
c7A8yM5yN+csyAGl0r1/
=kyj6
-----END PGP SIGNATURE-----
Merge tag 'qcom-dt-for-4.3' into v4.2-rc2
Qualcomm ARM Based Device Tree Updates for v4.3
* Switch to use pinctrl compatible for GPIOs
* Add RPM regulators for MSM8960
* Add SPI Ethernet support on MSM8960 CDP
* Add SMEM support along with dependencies
* Add PM8921 support for GPIO and MPP
* Fix GSBI cell index
* Switch to use real regulators on APQ8064 w/ SDCC
We may already have gotten a proper fd struct through fdget(), so
whenever we return at the end of an map operation, we need to call
fdput(). However, each map operation from syscall side first probes
CHECK_ATTR() to verify that unused fields in the bpf_attr union are
zero.
In case of malformed input, we return with error, but the lookup to
the map_fd was already performed at that time, so that we return
without an corresponding fdput(). Fix it by performing an fdget()
only right before bpf_map_get(). The fdget() invocation on maps in
the verifier is not affected.
Fixes: db20fd2b01 ("bpf: add lookup/update/delete/iterate methods to BPF maps")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
/sys/class/watchdog/watchdogn/device/modalias can help to identify the
driver/module for a given watchdog node. However, many wdt devices do not
set their parent and so, we do not see an entry for device in sysfs for
such devices.
This patch fixes parent of watchdog_device so that
/sys/class/watchdog/watchdogn/device is populated.
Exceptions: booke, diag288, octeon, softdog and w83627hf -- They do not
have any parent. Not sure, how we can identify driver for these devices.
Signed-off-by: Pratyush Anand <panand@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Lubomir Rintel <lkundrak@v3.sk>
Acked-by: Maxime Coquelin <maxime.coquelin@st.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
syscon_node_to_regmap() returns a regmap or an ERR_PTR().
Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Commit dca1a4b5ff ("clk: at91: keep slow clk enabled to prevent system
hang") added a workaround for the slow clock as it is not properly handled
by its users.
Get and use the slow clock as it is necessary for the at91sam9 watchdog.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
The compatible "atmel,sama5d4-wdt" supports the SAMA5D4 watchdog driver
and the watchdog's WDT_MR register can be written more than once.
Signed-off-by: Wenyou Yang <wenyou.yang@atmel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
From SAMA5D4, the watchdog timer is upgrated with a new feature,
which is describled as in the datasheet, "WDT_MR can be written
until a LOCKMR command is issued in WDT_CR".
That is to say, as long as the bootstrap and u-boot don't issue
a LOCKMR command, WDT_MR can be written more than once in the driver.
So the SAMA5D4 watchdog driver's implementation is different from
the at91sam9260 watchdog driver implemented in file at91sam9_wdt.c.
The user application open the device file to enable the watchdog timer
hardware, and close to disable it, and set the watchdog timer timeout
by seting WDV and WDD fields of WDT_MR register, and ping the watchdog
by issuing WDRSTT command to WDT_CR register with hard-coded key.
Signed-off-by: Wenyou Yang <wenyou.yang@atmel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>