This patch add the support of 6RD tunnels management via netlink.
Note that netdev_state_change() is now called when 6RD parameters are updated.
6RD parameters are updated only if there is at least one 6RD attribute.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch provides extensions to VXLAN for supporting Distributed
Overlay Virtual Ethernet (DOVE) networks. The patch includes:
+ a dove flag per VXLAN device to enable DOVE extensions
+ ARP reduction, whereby a bridge-connected VXLAN tunnel endpoint
answers ARP requests from the local bridge on behalf of
remote DOVE clients
+ route short-circuiting (aka L3 switching). Known destination IP
addresses use the corresponding destination MAC address for
switching rather than going to a (possibly remote) router first.
+ netlink notification messages for forwarding table and L3 switching
misses
Changes since v2
- combined bools into "u32 flags"
- replaced loop with !is_zero_ether_addr()
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Various drivers depend on INET because they used to select INET_LRO,
but they have all been converted to use GRO which has no such
dependency.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 2cb1deb56f ('ehea: Remove LRO
support') left behind the Kconfig depends/select and feature flag.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit fa37a9586f ('mlx4_en: Moving to
work with GRO') left behind the Kconfig depends/select, some dead
code and comments referring to LRO.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The wireless and wext includes in net-sysfs.c aren't
needed, so remove them.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If ndo_validate_addr is set to the generic eth_validate_addr
function there is no point in calling is_valid_ether_addr
from driver ndo_open if ndo_open is not used elsewhere in
the driver.
With this change is_valid_ether_addr will be called from the
generic eth_validate_addr function. So there should be no change
in the actual behavior.
Signed-off-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
flush_tasklet is a struct, not a pointer in percpu var.
so use this_cpu_ptr to get the member pointer.
Signed-off-by: Shan Wei <davidshan@tencent.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
linux/vhost.h was included twice.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move to circular buffers management macro and correct an error
with circular buffer initial condition.
Without this patch, the macb_tx_ring_avail() function was
not reporting the proper ring availability at startup:
macb macb: eth0: BUG! Tx Ring full when queue awake!
macb macb: eth0: tx_head = 0, tx_tail = 0
And hanginig forever...
I remove the macb_tx_ring_avail() function and use the
proven macros from circ_buf.h. CIRC_CNT() is used in the
"consumer" part of the driver: macb_tx_interrupt() to match
advice from Documentation/circular-buffers.txt.
Reported-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Tested-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In (464dc801c7 net: Don't export sysctls to unprivileged users)
I typoed and introduced a spurious backslash. Delete it.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove BUG_ONs or convert to WARN_ON_ONCE/WARN_ONs since a failure within a
networking device driver is no reason to shut down the entire machine.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reviewed-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trace all supported and enabled card features to s390dbf.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reviewed-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So far, virtual NICs whether attached to a VSWITCH or a guest LAN were always
displayed as guest LANs in the device driver attributes and messages, while
in fact it is a virtual NIC.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reviewed-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove BUG_ON's in claw driver, since the checked error conditions
are null pointer accesses.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove BUG_ON's in ctcm driver, since the checked error conditions
are null pointer accesses.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliminate a variable that is never modified.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reviewed-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow an unpriviled user who has created a user namespace, and then
created a network namespace to effectively use the new network
namespace, by reducing capable(CAP_NET_ADMIN) calls to
ns_capable(net->user_ns,CAP_NET_ADMIN) calls.
Allow setting of the tun iff flags.
Allow creating of tun devices.
Allow adding a new queue to a tun device.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow privileged users in any user namespace to bind to
privileged sockets in network namespaces they control.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow an unpriviled user who has created a user namespace, and then
created a network namespace to effectively use the new network
namespace, by reducing capable(CAP_NET_ADMIN) and
capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls.
Allow the vlan ioctls:
SET_VLAN_INGRESS_PRIORITY_CMD
SET_VLAN_EGRESS_PRIORITY_CMD
SET_VLAN_FLAG_CMD
SET_VLAN_NAME_TYPE_CMD
ADD_VLAN_CMD
DEL_VLAN_CMD
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow an unpriviled user who has created a user namespace, and then
created a network namespace to effectively use the new network
namespace, by reducing capable(CAP_NET_ADMIN) and
capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls.
Allow setting bridge paramters via sysfs.
Allow all of the bridge ioctls:
BRCTL_ADD_IF
BRCTL_DEL_IF
BRCTL_SET_BRDIGE_FORWARD_DELAY
BRCTL_SET_BRIDGE_HELLO_TIME
BRCTL_SET_BRIDGE_MAX_AGE
BRCTL_SET_BRIDGE_AGING_TIME
BRCTL_SET_BRIDGE_STP_STATE
BRCTL_SET_BRIDGE_PRIORITY
BRCTL_SET_PORT_PRIORITY
BRCTL_SET_PATH_COST
BRCTL_ADD_BRIDGE
BRCTL_DEL_BRDIGE
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow an unpriviled user who has created a user namespace, and then
created a network namespace to effectively use the new network
namespace, by reducing capable(CAP_NET_ADMIN) and
capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls.
Allow creation of af_key sockets.
Allow creation of llc sockets.
Allow creation of af_packet sockets.
Allow sending xfrm netlink control messages.
Allow binding to netlink multicast groups.
Allow sending to netlink multicast groups.
Allow adding and dropping netlink multicast groups.
Allow sending to all netlink multicast groups and port ids.
Allow reading the netfilter SO_IP_SET socket option.
Allow sending netfilter netlink messages.
Allow setting and getting ip_vs netfilter socket options.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow an unpriviled user who has created a user namespace, and then
created a network namespace to effectively use the new network
namespace, by reducing capable(CAP_NET_ADMIN) and
capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls.
Settings that merely control a single network device are allowed.
Either the network device is a logical network device where
restrictions make no difference or the network device is hardware NIC
that has been explicity moved from the initial network namespace.
In general policy and network stack state changes are allowed while
resource control is left unchanged.
Allow the SIOCSIFADDR ioctl to add ipv6 addresses.
Allow the SIOCDIFADDR ioctl to delete ipv6 addresses.
Allow the SIOCADDRT ioctl to add ipv6 routes.
Allow the SIOCDELRT ioctl to delete ipv6 routes.
Allow creation of ipv6 raw sockets.
Allow setting the IPV6_JOIN_ANYCAST socket option.
Allow setting the IPV6_FL_A_RENEW parameter of the IPV6_FLOWLABEL_MGR
socket option.
Allow setting the IPV6_TRANSPARENT socket option.
Allow setting the IPV6_HOPOPTS socket option.
Allow setting the IPV6_RTHDRDSTOPTS socket option.
Allow setting the IPV6_DSTOPTS socket option.
Allow setting the IPV6_IPSEC_POLICY socket option.
Allow setting the IPV6_XFRM_POLICY socket option.
Allow sending packets with the IPV6_2292HOPOPTS control message.
Allow sending packets with the IPV6_2292DSTOPTS control message.
Allow sending packets with the IPV6_RTHDRDSTOPTS control message.
Allow setting the multicast routing socket options on non multicast
routing sockets.
Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, and SIOCDELTUNNEL ioctls for
setting up, changing and deleting tunnels over ipv6.
Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, SIOCDELTUNNEL ioctls for
setting up, changing and deleting ipv6 over ipv4 tunnels.
Allow the SIOCADDPRL, SIOCDELPRL, SIOCCHGPRL ioctls for adding,
deleting, and changing the potential router list for ISATAP tunnels.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow an unpriviled user who has created a user namespace, and then
created a network namespace to effectively use the new network
namespace, by reducing capable(CAP_NET_ADMIN) and
capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls.
Settings that merely control a single network device are allowed.
Either the network device is a logical network device where
restrictions make no difference or the network device is hardware NIC
that has been explicity moved from the initial network namespace.
In general policy and network stack state changes are allowed
while resource control is left unchanged.
Allow creating raw sockets.
Allow the SIOCSARP ioctl to control the arp cache.
Allow the SIOCSIFFLAG ioctl to allow setting network device flags.
Allow the SIOCSIFADDR ioctl to allow setting a netdevice ipv4 address.
Allow the SIOCSIFBRDADDR ioctl to allow setting a netdevice ipv4 broadcast address.
Allow the SIOCSIFDSTADDR ioctl to allow setting a netdevice ipv4 destination address.
Allow the SIOCSIFNETMASK ioctl to allow setting a netdevice ipv4 netmask.
Allow the SIOCADDRT and SIOCDELRT ioctls to allow adding and deleting ipv4 routes.
Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
adding, changing and deleting gre tunnels.
Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
adding, changing and deleting ipip tunnels.
Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for
adding, changing and deleting ipsec virtual tunnel interfaces.
Allow setting the MRT_INIT, MRT_DONE, MRT_ADD_VIF, MRT_DEL_VIF, MRT_ADD_MFC,
MRT_DEL_MFC, MRT_ASSERT, MRT_PIM, MRT_TABLE socket options on multicast routing
sockets.
Allow setting and receiving IPOPT_CIPSO, IP_OPT_SEC, IP_OPT_SID and
arbitrary ip options.
Allow setting IP_SEC_POLICY/IP_XFRM_POLICY ipv4 socket option.
Allow setting the IP_TRANSPARENT ipv4 socket option.
Allow setting the TCP_REPAIR socket option.
Allow setting the TCP_CONGESTION socket option.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow an unpriviled user who has created a user namespace, and then
created a network namespace to effectively use the new network
namespace, by reducing capable(CAP_NET_ADMIN) and
capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls.
Settings that merely control a single network device are allowed.
Either the network device is a logical network device where
restrictions make no difference or the network device is hardware NIC
that has been explicity moved from the initial network namespace.
In general policy and network stack state changes are allowed
while resource control is left unchanged.
Allow ethtool ioctls.
Allow binding to network devices.
Allow setting the socket mark.
Allow setting the socket priority.
Allow setting the network device alias via sysfs.
Allow setting the mtu via sysfs.
Allow changing the network device flags via sysfs.
Allow setting the network device group via sysfs.
Allow the following network device ioctls.
SIOCGMIIPHY
SIOCGMIIREG
SIOCSIFNAME
SIOCSIFFLAGS
SIOCSIFMETRIC
SIOCSIFMTU
SIOCSIFHWADDR
SIOCSIFSLAVE
SIOCADDMULTI
SIOCDELMULTI
SIOCSIFHWBROADCAST
SIOCSMIIREG
SIOCBONDENSLAVE
SIOCBONDRELEASE
SIOCBONDSETHWADDR
SIOCBONDCHANGEACTIVE
SIOCBRADDIF
SIOCBRDELIF
SIOCSHWTSTAMP
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the user calling sendmsg has the appropriate privieleges
in their user namespace allow them to set the uid, gid, and
pid in the SCM_CREDENTIALS control message to any valid value.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of duplicate code in net_ctl_permissions and fix the comment.
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Allow anyone with CAP_NET_ADMIN rights in the user namespace of the
the netowrk namespace to change sysctls.
- Allow anyone the uid of the user namespace root the same
permissions over the network namespace sysctls as the global root.
- Allow anyone with gid of the user namespace root group the same
permissions over the network namespace sysctl as the global root group.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- In rtnetlink_rcv_msg convert the capable(CAP_NET_ADMIN) check
to ns_capable(net->user-ns, CAP_NET_ADMIN). Allowing unprivileged
users to make netlink calls to modify their local network
namespace.
- In the rtnetlink doit methods add capable(CAP_NET_ADMIN) so
that calls that are not safe for unprivileged users are still
protected.
Later patches will remove the extra capable calls from methods
that are safe for unprivilged users.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for supporting the creation of network namespaces
by unprivileged users, modify all of the per net sysctl exports
and refuse to allow them to unprivileged users.
This makes it safe for unprivileged users in general to access
per net sysctls, and allows sysctls to be exported to unprivileged
users on an individual basis as they are deemed safe.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Current is implicitly avaiable so passing current->nsproxy isn't useful.
- The ctl_table_header is needed to find how the sysctl table is connected
to the rest of sysctl.
- ctl_table_root is avaiable in the ctl_table_header so no need to it.
With these changes it becomes possible to write a version of
net_sysctl_permission that takes into account the network namespace of
the sysctl table, an important feature in extending the user namespace.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The user namespace which creates a new network namespace owns that
namespace and all resources created in it. This way we can target
capability checks for privileged operations against network resources to
the user_ns which created the network namespace in which the resource
lives. Privilege to the user namespace which owns the network
namespace, or any parent user namespace thereof, provides the same
privilege to the network resource.
This patch is reworked from a version originally by
Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The copy of copy_net_ns used when the network stack is not
built is broken as it does not return -EINVAL when attempting
to create a new network namespace. We don't even have
a previous network namespace.
Since we need a copy of copy_net_ns in net/net_namespace.h that is
available when the networking stack is not built at all move the
correct version of copy_net_ns from net_namespace.c into net_namespace.h
Leaving us with just 2 versions of copy_net_ns. One version for when
we compile in network namespace suport and another stub for all other
occasions.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some pieces of network use core pieces of IPv6 stack. Keep
them available while letting new GSO offload pieces depend
on CONFIG_INET.
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
qlcnic_hw.c:1337:17: warning: cast removes address space of expression
qlcnic_hw.c:1337:17: warning: incorrect type in argument 2 (different address spaces)
qlcnic_hw.c:1337:17: expected void volatile [noderef] <asn:2>*addr
qlcnic_hw.c:1337:17: got void *<noident>
qlcnic_hw.c:1337:17: warning: cast removes address space of expression
qlcnic_hw.c:1337:17: warning: incorrect type in argument 1 (different address spaces)
qlcnic_hw.c:1337:17: expected void const volatile [noderef] <asn:2>*addr
qlcnic_hw.c:1337:17: got void *<noident>
The above warnings are originating from the macros QLCNIC_RD_DUMP_REG and
QLCNIC_WR_DUMP_REG.
The warnings are fixed and macros are replaced with equivalent functions
in the only file from where it is called.
The following warnings are fixed by making the functions static.
qlcnic_hw.c:543:5: warning: symbol 'qlcnic_set_fw_loopback' was not declared. Should it be static?
qlcnic_init.c:1853:6: warning: symbol 'qlcnic_process_rcv_diag' was not declared. Should it be static?
Signed-off-by: Sony Chacko <sony.chacko@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following warnings:
qlcnic_main.c: In function 'qlcnic_update_cmd_producer':
qlcnic_main.c:119:51: warning: unused parameter 'adapter' [-Wunused-parameter]
qlcnic_main.c:119: warning: unused parameter adapter
qlcnic_init.c: In function qlcnic_process_lro
qlcnic_init.c:1586: warning: unused parameter sds_ring
qlcnic_init.c: In function qlcnic_process_rcv_diag
qlcnic_init.c:1854: warning: unused parameter sds_ring
qlcnic_init.c: In function qlcnic_fetch_mac
qlcnic_init.c:1938: warning: unused parameter adapter
warning: 'pci_using_dac' may be used uninitialized in this function [-Wmaybe-uninitialized]
qlcnic_main.c:1569:10: note: 'pci_using_dac' was declared here
Signed-off-by: Sony Chacko <sony.chacko@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is a follow-up for patch "net: filter: add vlan tag access"
to support the new VLAN_TAG/VLAN_TAG_PRESENT accessors in BPF JIT.
Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
Cc: Matt Evans <matt@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is a follow-up for patch "filter: add XOR instruction for use
with X/K" that implements BPF PowerPC JIT parts for the BPF XOR operation.
Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
Cc: Matt Evans <matt@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit a24006ed12 ('ptp: Enable clock
drivers along with associated net/PHY drivers') I wrongly made
PTP_1588_CLOCK_PCH depend on PCH_GBE. The dependency is really the
other way around. Therefore make PCH_GBE select PTP_1588_CLOCK_PCH
and remove the 'default y' from the latter.
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes addrexceeded member from vxlan_dev struct as it is unused.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use bitmap_weight to count the total number of bits set in bitmap.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Cc: linux-sctp@vger.kernel.org
Cc: netdev@vger.kernel.org
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull KVM fix from Marcelo Tosatti:
"A correction for oops on module init with older Intel hosts."
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Fix invalid secondary exec controls in vmx_cpuid_update()
Merge misc fixes from Andrew Morton.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (12 patches)
revert "mm: fix-up zone present pages"
tmpfs: change final i_blocks BUG to WARNING
tmpfs: fix shmem_getpage_gfp() VM_BUG_ON
mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address
mm: revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures"
rapidio: fix kernel-doc warnings
swapfile: fix name leak in swapoff
memcg: fix hotplugged memory zone oops
mips, arc: fix build failure
memcg: oom: fix totalpages calculation for memory.swappiness==0
mm: fix build warning for uninitialized value
mm: add anon_vma_lock to validate_mm()
Revert commit 7f1290f2f2 ("mm: fix-up zone present pages")
That patch tried to fix a issue when calculating zone->present_pages,
but it caused a regression on 32bit systems with HIGHMEM. With that
change, reset_zone_present_pages() resets all zone->present_pages to
zero, and fixup_zone_present_pages() is called to recalculate
zone->present_pages when the boot allocator frees core memory pages into
buddy allocator. Because highmem pages are not freed by bootmem
allocator, all highmem zones' present_pages becomes zero.
Various options for improving the situation are being discussed but for
now, let's return to the 3.6 code.
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David Rientjes <rientjes@google.com>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Under a particular load on one machine, I have hit shmem_evict_inode()'s
BUG_ON(inode->i_blocks), enough times to narrow it down to a particular
race between swapout and eviction.
It comes from the "if (freed > 0)" asymmetry in shmem_recalc_inode(),
and the lack of coherent locking between mapping's nrpages and shmem's
swapped count. There's a window in shmem_writepage(), between lowering
nrpages in shmem_delete_from_page_cache() and then raising swapped
count, when the freed count appears to be +1 when it should be 0, and
then the asymmetry stops it from being corrected with -1 before hitting
the BUG.
One answer is coherent locking: using tree_lock throughout, without
info->lock; reasonable, but the raw_spin_lock in percpu_counter_add() on
used_blocks makes that messier than expected. Another answer may be a
further effort to eliminate the weird shmem_recalc_inode() altogether,
but previous attempts at that failed.
So far undecided, but for now change the BUG_ON to WARN_ON: in usual
circumstances it remains a useful consistency check.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>