Use kmem_cache to allocate/free inet_frag_queue objects since they're
all the same size per inet_frags user and are alloced/freed in high volumes
thus making it a perfect case for kmem_cache.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the flags to an enum definion, swap FIRST_IN/LAST_IN to be in increasing
order and add comments explaining each flag and the inet_frag_queue struct
members.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The last_in field has been used to store various flags different from
first/last frag in so give it a more descriptive name: flags.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
clean up names related to socket filtering and bpf in the following way:
- everything that deals with sockets keeps 'sk_*' prefix
- everything that is pure BPF is changed to 'bpf_*' prefix
split 'struct sk_filter' into
struct sk_filter {
atomic_t refcnt;
struct rcu_head rcu;
struct bpf_prog *prog;
};
and
struct bpf_prog {
u32 jited:1,
len:31;
struct sock_fprog_kern *orig_prog;
unsigned int (*bpf_func)(const struct sk_buff *skb,
const struct bpf_insn *filter);
union {
struct sock_filter insns[0];
struct bpf_insn insnsi[0];
struct work_struct work;
};
};
so that 'struct bpf_prog' can be used independent of sockets and cleans up
'unattached' bpf use cases
split SK_RUN_FILTER macro into:
SK_RUN_FILTER to be used with 'struct sk_filter *' and
BPF_PROG_RUN to be used with 'struct bpf_prog *'
__sk_filter_release(struct sk_filter *) gains
__bpf_prog_release(struct bpf_prog *) helper function
also perform related renames for the functions that work
with 'struct bpf_prog *', since they're on the same lines:
sk_filter_size -> bpf_prog_size
sk_filter_select_runtime -> bpf_prog_select_runtime
sk_filter_free -> bpf_prog_free
sk_unattached_filter_create -> bpf_prog_create
sk_unattached_filter_destroy -> bpf_prog_destroy
sk_store_orig_filter -> bpf_prog_store_orig_filter
sk_release_orig_filter -> bpf_release_orig_filter
__sk_migrate_filter -> bpf_migrate_filter
__sk_prepare_filter -> bpf_prepare_filter
API for attaching classic BPF to a socket stays the same:
sk_attach_filter(prog, struct sock *)/sk_detach_filter(struct sock *)
and SK_RUN_FILTER(struct sk_filter *, ctx) to execute a program
which is used by sockets, tun, af_packet
API for 'unattached' BPF programs becomes:
bpf_prog_create(struct bpf_prog **)/bpf_prog_destroy(struct bpf_prog *)
and BPF_PROG_RUN(struct bpf_prog *, ctx) to execute a program
which is used by isdn, ppp, team, seccomp, ptp, xt_bpf, cls_bpf, test_bpf
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
to indicate that this function is converting classic BPF into eBPF
and not related to sockets
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
trivial rename to indicate that this functions performs classic BPF checking
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
trivial rename to better match semantics of macro
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
attaching bpf program to a socket involves multiple socket memory arithmetic,
since size of 'sk_filter' is changing when classic BPF is converted to eBPF.
Also common path of program creation has to deal with two ways of freeing
the memory.
Simplify the code by delaying socket charging until program is ready and
its size is known
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SCTP socket extensions API document describes the v4mapping option as
follows:
8.1.15. Set/Clear IPv4 Mapped Addresses (SCTP_I_WANT_MAPPED_V4_ADDR)
This socket option is a Boolean flag which turns on or off the
mapping of IPv4 addresses. If this option is turned on, then IPv4
addresses will be mapped to V6 representation. If this option is
turned off, then no mapping will be done of V4 addresses and a user
will receive both PF_INET6 and PF_INET type addresses on the socket.
See [RFC3542] for more details on mapped V6 addresses.
This description isn't really in line with what the code does though.
Introduce addr_to_user (renamed addr_v4map), which should be called
before any sockaddr is passed back to user space. The new function
places the sockaddr into the correct format depending on the
SCTP_I_WANT_MAPPED_V4_ADDR option.
Audit all places that touched v4mapped and either sanely construct
a v4 or v6 address then call addr_to_user, or drop the
unnecessary v4mapped check entirely.
Audit all places that call addr_to_user and verify they are on a sycall
return path.
Add a custom getname that formats the address properly.
Several bugs are addressed:
- SCTP_I_WANT_MAPPED_V4_ADDR=0 often returned garbage for
addresses to user space
- The addr_len returned from recvmsg was not correct when
returning AF_INET on a v6 socket
- flowlabel and scope_id were not zerod when promoting
a v4 to v6
- Some syscalls like bind and connect behaved differently
depending on v4mapped
Tested bind, getpeername, getsockname, connect, and recvmsg for proper
behaviour in v4mapped = 1 and 0 cases.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Net_device is a vast and important structure, but it has no kernel-doc
compliant documentation. This patch extracts the comments from the structure
to clean it up, and let the scripts extract documentation from it. I know that
the patch is big, but it's just reordering of comments into the appropriate
form, and adding a few more, for the missing members.
Signed-off-by: Karoly Kemeny <karoly.kemeny@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds and modifies code to support multiple Multicast and Unicast
Synopsys MAC filter configurations. The default configuration is defined to
support legacy driver behavior, which is 64 Multicast bins. The Unicast
filter code previously assumed all controllers support 32 or 16 Unicast
addresses based on controller version number, but this has been corrected
to support a default of 1 Unicast address. The filter configuration may
be specified through the devicetree using a Synopsys specific device tree
entry. This information was verified with Synopsys through
Synopsys Support Case #8000684337 and shared with the maintainer.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains netfilter updates for net-next, they are:
1) Add the reject expression for the nf_tables bridge family, this
allows us to send explicit reject (TCP RST / ICMP dest unrech) to
the packets matching a rule.
2) Simplify and consolidate the nf_tables set dumping logic. This uses
netlink control->data to filter out depending on the request.
3) Perform garbage collection in xt_hashlimit using a workqueue instead
of a timer, which is problematic when many entries are in place in
the tables, from Eric Dumazet.
4) Remove leftover code from the removed ulog target support, from
Paul Bolle.
5) Dump unmodified flags in the netfilter packet accounting when resetting
counters, so userspace knows that a counter was in overquota situation,
from Alexey Perevalov.
6) Fix wrong usage of the bitwise functions in nfnetlink_acct, also from
Alexey.
7) Fix a crash when adding new set element with an empty NFTA_SET_ELEM_LIST
attribute.
This patchset also includes a couple of cleanups for xt_LED from
Duan Jiong and for nf_conntrack_ipv4 (using coccinelle) from
Himangi Saraogi.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
libphy was originally written assuming all phy devices support clause 45
access extensions to the mmd registers through the indirection registers
located within the first 16 phy registers. This assumption is not true
in all cases, and one specific example is the Micrel ksz9021 10/100/1000
Mbps phy. Using the stmmac driver, accessing the mmd registers to query
and configure energy efficient Ethernet (EEE) features yielded unexpected
behavior.
This patch adds mmd access functions to the phy driver that can be
overriden by the phy specific driver if the phy does not support this
mechanism or uses it's own non-standard access mechanism. By default,
the IEEE Compatible clause 45 access mechanism described in clause 22
is used. With this patch, EEE query/configure functions as expected
using the stmmac and the Micrel ksz9021 phy.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This structure is not exposed to userspace, so fix this by defining
struct sk_filter; so we skip the casting in kernelspace. This is safe
since userspace has no way to lurk with that internal pointer.
Fixes: e6f30c7 ("netfilter: x_tables: add xt_bpf match")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current explanation of dcb_app->priority is wrong. It says priority is
expected to be a 3-bit unsigned integer which is only true when working with
DCBx-IEEE. Use of dcb_app->priority by DCBx-CEE expects it to be 802.1p user
priority bitmap. Updated accordingly
This affects the cxgb4 driver, but I will post those changes as part of a
larger changeset shortly.
Fixes: 3e29027af4 ("dcbnl: add support for ieee8021Qaz attributes")
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the event flow, we currently pass only a port number in the
void *data argument. Rather than pass a pointer to the event handlers,
we should use an "unsigned long" parameter, and pass the port number
value directly.
In the future, if necessary for some events, we can use the unsigned long
parameter to pass a pointer.
Based on a patch by Eli Cohen <eli@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There were many places where parameters which should be u8/u16 were
integer type.
Additionally, in 2 places, a check for a non-null pointer was added
before dereferencing the pointer (this is actually a bug fix).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for a new mlx5 device which is VPI (i.e., ports can be
either IB or ETH), move the pci device functionality from mlx5_ib
to mlx5_core.
This involves the following changes:
1. Move mlx5_core_dev struct out of mlx5_ib_dev. mlx5_core_dev
is now an independent structure maintained by mlx5_core.
mlx5_ib_dev now has a pointer to that struct.
This requires changing a lot of places where the core_dev
struct was accessed via mlx5_ib_dev (now, this needs to
be a pointer dereference).
2. All PCI initializations are now done in mlx5_core. Thus,
it is now mlx5_core which does pci_register_device (and not
mlx5_ib, as was previously).
3. mlx5_ib now registers itself with mlx5_core as an "interface"
driver. This is very similar to the mechanism employed for
the mlx4 (ConnectX) driver. Once the HCA is initialized
(by mlx5_core), it invokes the interface drivers to do
their initializations.
4. There is a new event handler which the core registers:
mlx5_core_event(). This event handler invokes the
event handlers registered by the interfaces.
Based on a patch by Eli Cohen <eli@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Exynos has buggy firmware that puts bad data into the memory node. Commit
1c2f87c2 (ARM: Get rid of meminfo) exposed the bug by dropping the artificial
upper bound on the number of memory banks that can be added. Exynos fails to
boot after that commit. This branch fixes it by splitting the early DT parse
function and inserting a fixup hook. Exynos uses the hook to correct the DT
before parsing memory regions.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJT2HI1AAoJEMWQL496c2LNvfAP/ifY6foyrO2MHGxlGdghL3Xe
fHY+MxoywBqWwLuXjfSh0rIt/5KE80JvtTjnssSOHOZokOPa/O3N39SrQPaLRqW8
1XC5A/Qocokeii69iXgXn0aQChBhyrRW708q9iU43ucKwcmWNvrzgdq838XdVB3q
BGHeV9ADn57PHAitsOrDCJei//jgs94NXDKPmCwrTn62aiedeiiMAWYUfsPXFtsn
gloL8wT8gcD8ojaSvKWpGJtUbkFBNe1DVQgsmIfG0hNUuolpsbNZo688OoWJUCaj
0qQ2LqHD2djDMqxxj0xFxOx7GoQPZjAG9NlLkca3QG5dc1S+Bf//g11uxRAHQ2qD
3l24i825fp4kGL1NUfR+OK4PIqGwBbEnXoIgrWnVjQxw/adMlH3iWFfuZqe/fBIq
4CTe9buc+JGCdJUAp+DS3YRYtFPdlovgaJjCAAwKWEd4GpjLEKrGGL/dAkhyRP/j
77byHy8XgSB5moh7qiR0u1M3lyRmU54f5EdDimPGaMUJ2PSzSxuYZk41hRRrstVn
JCzDmblvTF4wai3t4Z+laUP0dAym/gwX/87UiRsO+hyXKGiVCq9AmDkueL2xLUuV
c8rqjXLcVZ5qicLP2uCtWpz96WVzTCa3CzcMufT7t6cErMLueSSARrxq2RrETsFo
SpeBf3cc90Edv8LP7V9W
=lmyQ
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux
Pull Exynos platform DT fix from Grant Likely:
"Device tree Exynos bug fix for v3.16-rc7
This bug fix has been brewing for a while. I hate sending it to you
so late, but I only got confirmation that it solves the problem this
past weekend. The diff looks big for a bug fix, but the majority of
it is only executed in the Exynos quirk case. Unfortunately it
required splitting early_init_dt_scan() in two and adding quirk
handling in the middle of it on ARM.
Exynos has buggy firmware that puts bad data into the memory node.
Commit 1c2f87c225 ("ARM: Get rid of meminfo") exposed the bug by
dropping the artificial upper bound on the number of memory banks that
can be added. Exynos fails to boot after that commit. This branch
fixes it by splitting the early DT parse function and inserting a
fixup hook. Exynos uses the hook to correct the DT before parsing
memory regions"
* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
arm: Add devicetree fixup machine function
of: Add memory limiting function for flattened devicetrees
of: Split early_init_dt_scan into two parts
often during boot with Ubuntu 14.04 PV guests.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJT2PhgAAoJEFxbo/MsZsTRlzIH/1HjbkGZmRlOj5wcrYlWCUJ/
DGLBHc76so52xd9oP8COT5tuSVP6/usPPLFaOmVZ7fMiOpoyz9d3lc0g56otw3gJ
tTUFTyW0EoFtvmIl50OMC726p9azETjA3P2XJkV/D3GhBGGqgrP5uR+mRvisvq3y
eGZEx1UIHv1jov47TBFR1NcckXBWw+6J9m34y9h6an9VNDCuuGwYZ8dfGAFsLrVb
lGLTmgQQmyk4SexVINfOwL40KkVDVEq+X74HcPviyNHEIy66xLzMtKpL+Sf4xeuv
VG3JhqAUGuRGGK48rrbpxhBbpxGp35O9RV68YrGssxfuTejSYduw5zTzzt30QIA=
=cr8X
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.16-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen fix from David Vrabel:
"Fix BUG when trying to expand the grant table. This seems to occur
often during boot with Ubuntu 14.04 PV guests"
* tag 'stable/for-linus-3.16-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: safely map and unmap grant frames when in atomic context
This reverts commit 20fbe3ae99.
As reported by Stephen Rothwell, it causes compile failures in certain
configurations:
drivers/net/usb/cdc_subset.c:360:15: error: 'dummy_prereset' undeclared here (not in a function)
.pre_reset = dummy_prereset,
^
drivers/net/usb/cdc_subset.c:361:16: error: 'dummy_postreset' undeclared here (not in a function)
.post_reset = dummy_postreset,
^
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: David Miller <davem@davemloft.net>
Cc: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking fixes from David Miller:
1) Make fragmentation IDs less predictable, from Eric Dumazet.
2) TSO tunneling can crash in bnx2x driver, fix from Dmitry Kravkov.
3) Don't allow NULL msg->msg_name just because msg->msg_namelen is
non-zero, from Andrey Ryabinin.
4) ndm->ndm_type set using wrong macros, from Jun Zhao.
5) cdc-ether devices can come up with entries in their address filter,
so explicitly clear the filter after the device initializes. From
Oliver Neukum.
6) Forgotten refcount bump in xfrm_lookup(), from Steffen Klassert.
7) Short packets not padded properly, exposing random data, in bcmgenet
driver. Fix from Florian Fainelli.
8) xgbe_probe() doesn't return an error code, but rather zero, when
netif_set_real_num_tx_queues() fails. Fix from Wei Yongjun.
9) USB speed not probed properly in r8152 driver, from Hayes Wang.
10) Transmit logic choosing the outgoing port in the sunvnet driver
needs to consider a) is the port actually up and b) whether it is a
switch port. Fix from David L Stevens.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
net: phy: re-apply PHY fixups during phy_register_device
cdc-ether: clean packet filter upon probe
cdc_subset: deal with a device that needs reset for timeout
net: sendmsg: fix NULL pointer dereference
isdn/bas_gigaset: fix a leak on failure path in gigaset_probe()
ip: make IP identifiers less predictable
neighbour : fix ndm_type type error issue
sunvnet: only use connected ports when sending
can: c_can_platform: Fix raminit, use devm_ioremap() instead of devm_ioremap_resource()
bnx2x: fix crash during TSO tunneling
r8152: fix the checking of the usb speed
net: phy: Ensure the MDIO bus module is held
net: phy: Set the driver when registering an MDIO bus device
bnx2x: fix set_setting for some PHYs
hyperv: Fix error return code in netvsc_init_buf()
amd-xgbe: Fix error return code in xgbe_probe()
ath9k: fix aggregation session lockup
net: bcmgenet: correctly pad short packets
net: sctp: inherit auth_capable on INIT collisions
mac80211: fix crash on getting sta info with uninitialized rate control
...
arch_gnttab_map_frames() and arch_gnttab_unmap_frames() are called in
atomic context but were calling alloc_vm_area() which might sleep.
Also, if a driver attempts to allocate a grant ref from an interrupt
and the table needs expanding, then the CPU may already by in lazy MMU
mode and apply_to_page_range() will BUG when it tries to re-enable
lazy MMU mode.
These two functions are only used in PV guests.
Introduce arch_gnttab_init() to allocates the virtual address space in
advance.
Avoid the use of apply_to_page_range() by using saving and using the
array of PTE addresses from the alloc_vm_area() call (which ensures
that the required page tables are pre-allocated).
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Buggy bootloaders may pass bogus memory entries in the devicetree.
Add of_fdt_limit_memory to add an upper bound on the number of
entries that can be present in the devicetree.
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Tested-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Currently, early_init_dt_scan validates the header, sets the
boot params, and scans for chosen/memory all in one function.
Split this up into two separate functions (validation/setting
boot params in one, scanning in another) to allow for
additional setup between boot params and scanning the memory.
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Tested-by: Andreas Färber <afaerber@suse.de>
[glikely: s/early_init_dt_scan_all/early_init_dt_scan_nodes/]
Signed-off-by: Grant Likely <grant.likely@linaro.org>
This device needs to be reset to recover from a timeout.
Unfortunately this can be handled only at the level of
the subdrivers.
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
We create a proc dir for each network device, this will cause
conflicts when the devices have name "all" or "default".
Rather than emitting an ugly kernel warning, we could just
fail earlier by checking the device name.
Reported-by: Stephane Chazelas <stephane.chazelas@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SO_TIMESTAMPING API defines three types of timestamps: software,
hardware in raw format (hwtstamp) and hardware converted to system
format (syststamp). The last has been deprecated in favor of combining
hwtstamp with a PTP clock driver. There are no active users in the
kernel.
The option was device driver dependent. If set, but without hardware
support, the correct behavior is to return zero in the relevant field
in the SCM_TIMESTAMPING ancillary message. Without device drivers
implementing the option, this field is effectively always zero.
Remove the internal plumbing to dissuage new drivers from implementing
the feature. Keep the SOF_TIMESTAMPING_SYS_HARDWARE flag, however, to
avoid breaking existing applications that request the timestamp.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No device driver will ever return an skb_shared_info structure with
syststamp non-zero, so remove the branch that tests for this and
optionally marks the packet timestamp as TP_STATUS_TS_SYS_HARDWARE.
Do not remove the definition TP_STATUS_TS_SYS_HARDWARE, as processes
may refer to it.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A nice small set of bug fixes for arm-soc:
- two incorrect register addresses in DT files on shmobile and hisilicon
- one revert for a regression on omap
- one bug fix for a newly introduced pin controller binding
- one regression fix for the memory controller on omap
- one patch to avoid a harmless WARN_ON
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAU9fDBmCrR//JCVInAQIxCw/+IadEDDeP4WZHO0Bx9vm7Oj8XYlDg4xU8
O+SvqmJ3qDFNxbG7LEZ9B0dqcAaxkYPgF0LEy29uneQn+oKXykzRwhmXilB3akJR
Y/B3y7FJKch9dBZf+Kx+94NgHt1IdcaArWdSKBLgMN5/IZzRY3B8fo3AEjnHjt2P
c0kXasLOQ97aGiFobNHp5GLrR2uUjplzWjMDA7F9i6PQZ1grmDGJ2w67bZ8Uukwh
p2xYOmgHdyVRweFHrHlISNGWov8TPfGJpItM665ROMxJ+wREJ4rHp/VOA/74OMGf
heOEsUUhZOjEvNza8U4TCVroAqA26OCth8sd1mOOe+INPkt1IDAPK4zF0bxHt2it
PuxAVH43fyQ0oPerB9BfAwJOr+aSIQNYJRVpEDbwBU0d0/N/lERixPZxsmSDY4ES
cwzu9FTY2+tYfzS3WW/0fGDtIXXlEbcXnfxc3sSzjErV71GAq1UICxrBrUL5KoGY
YyBh4Ly6V6WzLC0dkRnYe+gEKIWn+SA95JGaYMYigQdIJHGKf7DoChWkDeWmrYwQ
cl34GZ5k79L6c2Az2YoON2R2vwByhP5kSZ5z6sNuyL0Z2TbRUeDw4qkjQcxFvfN0
NLqMidJhFZyKTjJtc0ttB+ah9kyZy+kyKoyKbIMDCk5zYTLAgh0PF85G0IJEEUU5
+qwzQP/ROjQ=
=Ny58
-----END PGP SIGNATURE-----
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"A nice small set of bug fixes for arm-soc:
- two incorrect register addresses in DT files on shmobile and hisilicon
- one revert for a regression on omap
- one bug fix for a newly introduced pin controller binding
- one regression fix for the memory controller on omap
- one patch to avoid a harmless WARN_ON"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: dts: Revert enabling of twl configuration for n900
ARM: dts: fix L2 address in Hi3620
ARM: OMAP2+: gpmc: fix gpmc_hwecc_bch_capable()
pinctrl: dra: dt-bindings: Fix pull enable/disable
ARM: shmobile: r8a7791: Fix SD2CKCR register address
ARM: OMAP2+: l2c: squelch warning dump on power control setting
In "Counting Packets Sent Between Arbitrary Internet Hosts", Jeffrey and
Jedidiah describe ways exploiting linux IP identifier generation to
infer whether two machines are exchanging packets.
With commit 73f156a6e8 ("inetpeer: get rid of ip_id_count"), we
changed IP id generation, but this does not really prevent this
side-channel technique.
This patch adds a random amount of perturbation so that IP identifiers
for a given destination [1] are no longer monotonically increasing after
an idle period.
Note that prandom_u32_max(1) returns 0, so if generator is used at most
once per jiffy, this patch inserts no hole in the ID suite and do not
increase collision probability.
This is jiffies based, so in the worst case (HZ=1000), the id can
rollover after ~65 seconds of idle time, which should be fine.
We also change the hash used in __ip_select_ident() to not only hash
on daddr, but also saddr and protocol, so that ICMP probes can not be
used to infer information for other protocols.
For IPv6, adds saddr into the hash as well, but not nexthdr.
If I ping the patched target, we can see ID are now hard to predict.
21:57:11.008086 IP (...)
A > target: ICMP echo request, seq 1, length 64
21:57:11.010752 IP (... id 2081 ...)
target > A: ICMP echo reply, seq 1, length 64
21:57:12.013133 IP (...)
A > target: ICMP echo request, seq 2, length 64
21:57:12.015737 IP (... id 3039 ...)
target > A: ICMP echo reply, seq 2, length 64
21:57:13.016580 IP (...)
A > target: ICMP echo request, seq 3, length 64
21:57:13.019251 IP (... id 3437 ...)
target > A: ICMP echo reply, seq 3, length 64
[1] TCP sessions uses a per flow ID generator not changed by this patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jeffrey Knockel <jeffk@cs.unm.edu>
Reported-by: Jedidiah R. Crandall <crandall@cs.unm.edu>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Hannes Frederic Sowa <hannes@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville says:
====================
pull request: wireless-next 2014-07-25
Please pull this batch of updates intended for the 3.17 stream!
For the mac80211 bits, Johannes says:
"We have a lot of TDLS patches, among them a fix that should make hwsim
tests happy again. The rest, this time, is mostly small fixes."
For the Bluetooth bits, Gustavo says:
"Some more patches for 3.17. The most important change here is the move of
the 6lowpan code to net/6lowpan. It has been agreed with Davem that this
change will go through the bluetooth tree. The rest are mostly clean up and
fixes."
and,
"Here follows some more patches for 3.17. These are mostly fixes to what
we've sent to you before for next merge window."
For the iwlwifi bits, Emmanuel says:
"I have the usual amount of BT Coex stuff. Arik continues to work
on TDLS and Ariej contributes a few things for HS2.0. I added a few
more things to the firmware debugging infrastructure. Eran fixes a
small bug - pretty normal content."
And for the Atheros bits, Kalle says:
"For ath6kl me and Jessica added support for ar6004 hw3.0, our latest
version of ar6004.
For ath10k Janusz added a printout so that it's easier to check what
ath10k kconfig options are enabled. He also added a debugfs file to
configure maximum amsdu and ampdu values. Also we had few fixes as
usual."
On top of that is the usual large batch of various driver updates --
brcmfmac, mwifiex, the TI drivers, and wil6210 all get some action.
Rafał has also been very busy with b43 and related updates.
Also, I pulled the wireless tree into this in order to resolve a
merge conflict...
P.S. The change to fs/compat_ioctl.c reflects a name change in a
Bluetooth header file...
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Change formal parameter name to not shadow the global jiffies.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rehash is rare operation, don't force readers to take
the read-side rwlock.
Instead, we only have to detect the (rare) case where
the secret was altered while we are trying to insert
a new inetfrag queue into the table.
If it was changed, drop the bucket lock and recompute
the hash to get the 'new' chain bucket that we have to
insert into.
Joint work with Nikolay Aleksandrov.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
merge functionality into the eviction workqueue.
Instead of rebuilding every n seconds, take advantage of the upper
hash chain length limit.
If we hit it, mark table for rebuild and schedule workqueue.
To prevent frequent rebuilds when we're completely overloaded,
don't rebuild more than once every 5 seconds.
ipfrag_secret_interval sysctl is now obsolete and has been marked as
deprecated, it still can be changed so scripts won't be broken but it
won't have any effect. A comment is left above each unused secret_timer
variable to avoid confusion.
Joint work with Nikolay Aleksandrov.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'nqueues' counter is protected by the lru list lock,
once thats removed this needs to be converted to atomic
counter. Given this isn't used for anything except for
reporting it to userspace via /proc, just remove it.
We still report the memory currently used by fragment
reassembly queues.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the high_thresh limit is reached we try to toss the 'oldest'
incomplete fragment queues until memory limits are below the low_thresh
value. This happens in softirq/packet processing context.
This has two drawbacks:
1) processors might evict a queue that was about to be completed
by another cpu, because they will compete wrt. resource usage and
resource reclaim.
2) LRU list maintenance is expensive.
But when constantly overloaded, even the 'least recently used' element is
recent, so removing 'lru' queue first is not 'fairer' than removing any
other fragment queue.
This moves eviction out of the fast path:
When the low threshold is reached, a work queue is scheduled
which then iterates over the table and removes the queues that exceed
the memory limits of the namespace. It sets a new flag called
INET_FRAG_EVICTED on the evicted queues so the proper counters will get
incremented when the queue is forcefully expired.
When the high threshold is reached, no more fragment queues are
created until we're below the limit again.
The LRU list is now unused and will be removed in a followup patch.
Joint work with Nikolay Aleksandrov.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
First step to move eviction handling into a work queue.
We lose two spots that accounted evicted fragments in MIB counters.
Accounting will be restored since the upcoming work-queue evictor
invokes the frag queue timer callbacks instead.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull fuse fixes from Miklos Szeredi:
"These two pathes fix issues with the kernel-userspace protocol changes
in v3.15"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: add FUSE_NO_OPEN_SUPPORT flag to INIT
fuse: s_time_gran fix
The ulog targets were recently killed. A few references to the Kconfig
macros CONFIG_IP_NF_TARGET_ULOG and CONFIG_BRIDGE_EBT_ULOG were left
untouched. Kill these too.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
eBPF is used by socket filtering, seccomp and soon by tracing and
exposed to userspace, therefore 'sock_filter_int' name is not accurate.
Rename it to 'bpf_insn'
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- L2 cache regression fix for a warning about trying to access
a read-only register
- GPMC ECC software fallback regression fix for omap3
- Fix for dra7 pinctrl pull-up direction that causes signal issues
for anybody trying to use the internal pull up or down
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJT0PDDAAoJEBvUPslcq6Vz2kkP/2XcaWEl5xWEP6vTFDwy1RDL
apjp2qURWpJ579bT5y5KlGP8vyBeSLfdXl+ccCuHhBrRtYCZfsUdRaii/AHDcsd/
N0p1ZaAQLwfMXUo1sVgW2grSOJEo8QZs8DEZ7eJfE8SnH2g/i4j+VFknYOO1t6vA
+QoHRWcY0CRLNVHSyGGFk235pfdq1ZAKskayzQ4wCjOXuH2tKILjFsiCxItPStih
CmKqZCoO+BQMz7dLTGZsdchDTqf0PceMh7w6PWO65QeJxr16nWmGbqnZRlUGeBqo
vTZO1Rsfb4DlRYRGBxJ1ybVJw2cgmsx8fWKv4eYVrulGNKUE2m3UYj+oYCst0g5i
VOPMwLiLapciCZi/4er2VWtb9sFWY6XaTAJHVRtrtX6RZgZC5c3cWGzkykXOkF7N
Ut7He/TT41uc5OIjuG6WGQNCIfKOmfBcSDeNRqyr9YzZpn6lbJ+U4kk0kco1pyda
IHrRUD+yHuXK0FjZvZMDlWmKqP3QLK+xjL7LRzsiJ4yfymikckjS2RH0iP9PGuCV
T68PPl+rzyHjmVUnbBCgLnatTAQ9uHPCb4Eb5rRZaiO+4nUyqeqikMWU/DsiS4jc
1igudgOx53/0zCCQC/VHVL9HGb4fSrbevMzq8YR3uyyS2b/R8FFbGs6Y6TCIDWdi
TUSv/ckpA4uUbe22SFTn
=/FMt
-----END PGP SIGNATURE-----
Merge tag 'omap-for-v3.16/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Merge "Two regression fixes for omaps and one fix for device
signaling" from Tony Lindgren:
- L2 cache regression fix for a warning about trying to access
a read-only register
- GPMC ECC software fallback regression fix for omap3
- Fix for dra7 pinctrl pull-up direction that causes signal issues
for anybody trying to use the internal pull up or down
* tag 'omap-for-v3.16/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP2+: gpmc: fix gpmc_hwecc_bch_capable()
pinctrl: dra: dt-bindings: Fix pull enable/disable
ARM: OMAP2+: l2c: squelch warning dump on power control setting
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Following patch enables all available tunnel GSO features for OVS
bridge device so that ovs can use hardware offloads available to
underling device.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
In order to allow handlers directly read upcalls from datapath,
we need to support per-handler netlink socket for each vport in
datapath. This commit makes this happen. Also, it is guaranteed
to be backward compatible with previous branch.
Signed-off-by: Alex Wang <alexw@nicira.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Pull libata regression fix from Tejun Heo:
"The last libata/for-3.16-fixes pull contained a regression introduced
by 1871ee134b ("libata: support the ata host which implements a
queue depth less than 32") which in turn was a fix for a regression
introduced earlier while changing queue tag order to accomodate hard
drives which perform poorly if tags are not allocated in circular
order (ugh...).
The regression happens only for SAS controllers making use of libata
to serve ATA devices. They don't fill an ata_host field which is used
by the new tag allocation function leading to NULL dereference.
This patch adds a new intermediate field ata_host->n_tags which is
initialized for both SAS and !SAS cases to fix the issue"
* 'for-3.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
libata: introduce ata_host->n_tags to avoid oops on SAS controllers
I triggered VM_BUG_ON() in vma_address() when I tried to migrate an
anonymous hugepage with mbind() in the kernel v3.16-rc3. This is
because pgoff's calculation in rmap_walk_anon() fails to consider
compound_order() only to have an incorrect value.
This patch introduces page_to_pgoff(), which gets the page's offset in
PAGE_CACHE_SIZE.
Kirill pointed out that page cache tree should natively handle
hugepages, and in order to make hugetlbfs fit it, page->index of
hugetlbfs page should be in PAGE_CACHE_SIZE. This is beyond this patch,
but page_to_pgoff() contains the point to be fixed in a single function.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It hasn't been used since commit 0fd7bac(net: relax rcvbuf limits).
Signed-off-by: Sorin Dumitru <sorin@returnze.ro>
Signed-off-by: David S. Miller <davem@davemloft.net>