Commit Graph

66079 Commits

Author SHA1 Message Date
David Sterba
80a773fbfc btrfs: retrieve more info from FS_INFO ioctl
Provide the basic information about filesystem through the ioctl:
* b-tree node size (same as leaf size)
* sector size
* expected alignment of CLONE_RANGE and EXTENT_SAME ioctl arguments

Backward compatibility: if the values are 0, kernel does not provide
this information, the applications should ignore them.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:27 -07:00
David Sterba
7d824b6f9c btrfs: balance filter: add limit of processed chunks
This started as debugging helper, to watch the effects of converting
between raid levels on multiple devices, but could be useful standalone.

In my case the usage filter was not finegrained enough and led to
converting too many chunks at once. Another example use is in connection
with drange+devid or vrange filters that allow to work with a specific
chunk or even with a chunk on a given device.

The limit filter applies last, the value of 0 means no limiting.

CC: Ilya Dryomov <idryomov@gmail.com>
CC: Hugo Mills <hugo@carfax.org.uk>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:20:26 -07:00
Linus Torvalds
54539cd217 Merge branch 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu fix from Tejun Heo:
 "It is very late but this is an important percpu-refcount fix from
  Sebastian Ott.

  The problem is that percpu_ref_*() used __this_cpu_*() instead of
  this_cpu_*().  The difference between the two is that the latter is
  atomic on the local cpu while the former is not.  this_cpu_inc() is
  guaranteed to increment the percpu counter on the cpu that the
  operation is executed on without any synchronization; however,
  __this_cpu_inc() doesn't and if the local cpu invokes the function
  from different contexts (e.g.  process and irq) of the same CPU, it's
  not guaranteed to actually increment as it may be implemented as rmw.

  This bug existed from the get-go but it hasn't been noticed earlier
  probably because on x86 __this_cpu_inc() is equivalent to
  this_cpu_inc() as both get translated into single instruction;
  however, s390 uses the generic rmw implementation and gets affected by
  the bug.  Kudos to Sebastian and Heiko for diagnosing it.

  The change is very low risk and fixes a critical issue on the affected
  architectures, so I think it's a good candidate for inclusion although
  it's very late in the devel cycle.  On the other hand, this has been
  broken since v3.11, so backporting it through -stable post -rc1 won't
  be the end of the world.

  I'll ping Christoph whether __this_cpu_*() ops can be better annotated
  so that it can trigger lockdep warning when used from multiple
  contexts"

* 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu-refcount: fix usage of this_cpu_ops
2014-06-04 09:56:03 -07:00
Sebastian Ott
0c36b390a5 percpu-refcount: fix usage of this_cpu_ops
The percpu-refcount infrastructure uses the underscore variants of
this_cpu_ops in order to modify percpu reference counters.
(e.g. __this_cpu_inc()).

However the underscore variants do not atomically update the percpu
variable, instead they may be implemented using read-modify-write
semantics (more than one instruction).  Therefore it is only safe to
use the underscore variant if the context is always the same (process,
softirq, or hardirq). Otherwise it is possible to lose updates.

This problem is something that Sebastian has seen within the aio
subsystem which uses percpu refcounters both in process and softirq
context leading to reference counts that never dropped to zeroes; even
though the number of "get" and "put" calls matched.

Fix this by using the non-underscore this_cpu_ops variant which
provides correct per cpu atomic semantics and fixes the corrupted
reference counts.

Cc: Kent Overstreet <kmo@daterainc.com>
Cc: <stable@vger.kernel.org> # v3.11+
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
References: http://lkml.kernel.org/g/alpine.LFD.2.11.1406041540520.21183@denkbrett
2014-06-04 12:12:29 -04:00
Jianyu Zhan
c9482a5bdc kernfs: move the last knowledge of sysfs out from kernfs
There is still one residue of sysfs remaining: the sb_magic
SYSFS_MAGIC. However this should be kernfs user specific,
so this patch moves it out. Kerrnfs user should specify their
magic number while mouting.

Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-03 08:11:18 -07:00
Linus Torvalds
cae61ba37b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Unbreak zebra and other netlink apps, from Eric W Biederman.

 2) Some new qmi_wwan device IDs, from Aleksander Morgado.

 3) Fix info leak in DCB netlink handler of qlcnic driver, from Dan
    Carpenter.

 4) inet_getid() and ipv6_select_ident() do not generate monotonically
    increasing ID numbers, fix from Eric Dumazet.

 5) Fix memory leak in __sk_prepare_filter(), from Leon Yu.

 6) Netlink leftover bytes warning message is user triggerable, rate
    limit it.  From Michal Schmidt.

 7) Fix non-linear SKB panic in ipvs, from Peter Christensen.

 8) Congestion window undo needs to be performed even if only never
    retransmitted data is SACK'd, fix from Yuching Cheng.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (24 commits)
  net: filter: fix possible memory leak in __sk_prepare_filter()
  net: ec_bhf: Add runtime dependencies
  tcp: fix cwnd undo on DSACK in F-RTO
  netlink: Only check file credentials for implicit destinations
  ipheth: Add support for iPad 2 and iPad 3
  team: fix mtu setting
  net: fix inet_getid() and ipv6_select_ident() bugs
  net: qmi_wwan: interface #11 in Sierra Wireless MC73xx is not QMI
  net: qmi_wwan: add additional Sierra Wireless QMI devices
  bridge: Prevent insertion of FDB entry with disallowed vlan
  netlink: rate-limit leftover bytes warning and print process name
  bridge: notify user space after fdb update
  net: qmi_wwan: add Netgear AirCard 341U
  net: fix wrong mac_len calculation for vlans
  batman-adv: fix NULL pointer dereferences
  net/mlx4_core: Reset RoCE VF gids when guest driver goes down
  emac: aggregation of v1-2 PLB errors for IER register
  emac: add missing support of 10mbit in emac/rgmii
  can: only rename enabled led triggers when changing the netdev name
  ipvs: Fix panic due to non-linear skb
  ...
2014-06-02 18:16:41 -07:00
Linus Torvalds
8ee7a330fb USB fixes for 3.15-rc8
Here are some fixes for 3.15-rc8 that resolve a number of tiny USB
 issues that have been reported, and there are some new device ids as
 well.
 
 All have been tested in linux-next.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iEYEABECAAYFAlOM9RkACgkQMUfUDdst+ykhbACeKn9kmESO0QMC2ST0kQkQxHJX
 olYAoKbYZxvlZbdSJBmtYDbm1c5wrCfO
 =H5ae
 -----END PGP SIGNATURE-----

Merge tag 'usb-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some fixes for 3.15-rc8 that resolve a number of tiny USB
  issues that have been reported, and there are some new device ids as
  well.

  All have been tested in linux-next"

* tag 'usb-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  xhci: delete endpoints from bandwidth list before freeing whole device
  usb: pci-quirks: Prevent Sony VAIO t-series from switching usb ports
  USB: cdc-wdm: properly include types.h
  usb: cdc-wdm: export cdc-wdm uapi header
  USB: serial: option: add support for Novatel E371 PCIe card
  USB: ftdi_sio: add NovaTech OrionLXm product ID
  USB: io_ti: fix firmware download on big-endian machines (part 2)
  USB: Avoid runtime suspend loops for HCDs that can't handle suspend/resume
2014-06-02 16:56:42 -07:00
Eric W. Biederman
2d7a85f4b0 netlink: Only check file credentials for implicit destinations
It was possible to get a setuid root or setcap executable to write to
it's stdout or stderr (which has been set made a netlink socket) and
inadvertently reconfigure the networking stack.

To prevent this we check that both the creator of the socket and
the currentl applications has permission to reconfigure the network
stack.

Unfortunately this breaks Zebra which always uses sendto/sendmsg
and creates it's socket without any privileges.

To keep Zebra working don't bother checking if the creator of the
socket has privilege when a destination address is specified.  Instead
rely exclusively on the privileges of the sender of the socket.

Note from Andy: This is exactly Eric's code except for some comment
clarifications and formatting fixes.  Neither I nor, I think, anyone
else is thrilled with this approach, but I'm hesitant to wait on a
better fix since 3.15 is almost here.

Note to stable maintainers: This is a mess.  An earlier series of
patches in 3.15 fix a rather serious security issue (CVE-2014-0181),
but they did so in a way that breaks Zebra.  The offending series
includes:

    commit aa4cf9452f
    Author: Eric W. Biederman <ebiederm@xmission.com>
    Date:   Wed Apr 23 14:28:03 2014 -0700

        net: Add variants of capable for use on netlink messages

If a given kernel version is missing that series of fixes, it's
probably worth backporting it and this patch.  if that series is
present, then this fix is critical if you care about Zebra.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-02 16:34:09 -07:00
Jiri Pirko
9d0d68faea team: fix mtu setting
Now it is not possible to set mtu to team device which has a port
enslaved to it. The reason is that when team_change_mtu() calls
dev_set_mtu() for port device, notificator for NETDEV_PRECHANGEMTU
event is called and team_device_event() returns NOTIFY_BAD forbidding
the change. So fix this by returning NOTIFY_DONE here in case team is
changing mtu in team_change_mtu().

Introduced-by: 3d249d4c "net: introduce ethernet teaming device"
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-02 14:56:01 -07:00
Eric Dumazet
39c36094d7 net: fix inet_getid() and ipv6_select_ident() bugs
I noticed we were sending wrong IPv4 ID in TCP flows when MTU discovery
is disabled.
Note how GSO/TSO packets do not have monotonically incrementing ID.

06:37:41.575531 IP (id 14227, proto: TCP (6), length: 4396)
06:37:41.575534 IP (id 14272, proto: TCP (6), length: 65212)
06:37:41.575544 IP (id 14312, proto: TCP (6), length: 57972)
06:37:41.575678 IP (id 14317, proto: TCP (6), length: 7292)
06:37:41.575683 IP (id 14361, proto: TCP (6), length: 63764)

It appears I introduced this bug in linux-3.1.

inet_getid() must return the old value of peer->ip_id_count,
not the new one.

Lets revert this part, and remove the prevention of
a null identification field in IPv6 Fragment Extension Header,
which is dubious and not even done properly.

Fixes: 87c48fa3b4 ("ipv6: make fragment identifications less predictable")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-02 14:09:28 -07:00
Linus Torvalds
568180a517 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
 "A fair number of fixes across the field.  Nothing terribly
  complicated; the one liners in below changelog should be fairly
  descriptive.

  Noteworthy is the SB1 change which the result of changes to binutils
  resulting in one big gas warning for most files being assembled as
  well as the asid_cache and branch emulation fixes which fix corruption
  or possible uninteded behaviour of kernel or application code.  The
  remainder of fixes are more platforms or subsystem specific"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: R46000: Fix Micro-assembler field overflow for R4600 V2
  MIPS: ptrace: Avoid smp_processor_id() in preemptible code
  MIPS: Lemote 2F: cs5536: mfgpt: use raw locks
  MIPS: SB1: Fix excessive kernel warnings.
  MIPS: RC32434: fix broken PCI resource initialization
  MIPS: malta: memory.c: Initialize the 'memsize' variable
  MIPS: Fix typo when reporting cache and ftlb errors for ImgTec cores
  MIPS: Fix inconsistancy of __NR_Linux_syscalls value
  MIPS: Fix branch emulation of branch likely instructions.
  MIPS: Fix a typo error in AUDIT_ARCH definition
  MIPS: Change type of asid_cache to unsigned long
2014-06-01 18:28:58 -07:00
Linus Torvalds
fe45736f41 Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "The usual random collection of relatively small ARM fixes"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8063/1: bL_switcher: fix individual online status reporting of removed CPUs
  ARM: 8064/1: fix v7-M signal return
  ARM: 8057/1: amba: Add Qualcomm vendor ID.
  ARM: 8052/1: unwind: Fix handling of "Pop r4-r[4+nnn],r14" opcode
  ARM: 8051/1: put_user: fix possible data corruption in put_user
  ARM: 8048/1: fix v7-M setup stack location
2014-05-29 18:31:09 -07:00
Greg Kroah-Hartman
7ac3764fca USB: cdc-wdm: properly include types.h
The file include/uapi/linux/usb/cdc-wdm.h uses a __u16 so it needs to
include types.h as well to make the build system happy.

Fixes: 3edce1cf81 ("USB: cdc-wdm: implement IOCTL_WDM_MAX_COMMAND")
Cc: stable <stable@vger.kernel.org> # 3.10+
Cc: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-27 16:32:10 -07:00
Bjørn Mork
7d1896360f usb: cdc-wdm: export cdc-wdm uapi header
The include/uapi/linux/usb/cdc-wdm.h header defines cdc-wdm
userspace APIs and should be exported by make headers_install.

Cc: <stable@vger.kernel.org> # 3.10, 3.12, 3.14
Fixes: 3edce1cf81 ("USB: cdc-wdm: implement IOCTL_WDM_MAX_COMMAND")
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-27 16:29:39 -07:00
Linus Torvalds
c949ddf9eb Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave-dmaengine fixes from Vinod Koul:
 "We have three small fixes.

  First one from Andy reverts the devm_request irq as we need to ensure
  the tasklet is killed after irq is freed, so we need to do free irq in
  our code.  Other two from Arnd are fixing the compilation issue in
  omap and sa11x0 drivers with ARM randconfigs"

* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: sa11x0: remove broken #ifdef
  dmaengine: omap: hide filter_fn for built-in drivers
  dmaengine: dw: went back to plain {request,free}_irq() calls
2014-05-27 13:57:00 -07:00
srinik
fbebf59778 ARM: 8057/1: amba: Add Qualcomm vendor ID.
This patch adds Qualcomm amba vendor Id to the list. This ID is used in mmci driver. The ID selected in same lines like 0x41 is "A" for ARM, 0x51 is "Q" for Qualcomm.

As there are no physical register on Qcom SOC for amba vendor id, this is a fake ID assigned based on "Q" prefix from Qualcomm.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-05-25 23:44:27 +01:00
Linus Torvalds
037430078f Two fixes for -stable:
1/ async_mult() sometimes maps less buffers than initially requested.
    We end up freeing dmaengine_unmap_data on an invalid pool.
 
 2/ mv_xor: register write ordering fix
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTfoDyAAoJEB7SkWpmfYgC5sUP/3r0GU3071DcZ74rW7yW9UMC
 EW+GyYLISl2X4+zdTTV4iLhtOx4tnS+xock5LylrMi0WKD8ma4/6057dc+ham9t7
 1/5qeZyZ/lKUjOjhNBqBevzyKubWryafRklOnglP6KjNiDSKbVsvl7pgzPLTjhCE
 7tqv3/86Q6W7MIGFDUPQhrd2zTk46Gmu7GZixJ1jN3YhW0hqsqbk+QJALjT7rG5n
 0q7f7PM83e+f5LEqlM1Om4AEkCaYrf6P8Nkatcut6tecixplI2kDOgK/0JEIA5FT
 Fnr/3j6/WFYrpn/BVRR+33FeTA2ZUl2kWSVpyOKNiAHrzE2AGhlFN+uZelUERUGE
 cDeEw//dFdjf+JjotTdBtukzHlijhSLttX5cF2lK5XOUql1AK1WnWsPm4iZAQ1/x
 Hv3k6BnhwfIyyYRyo1qIr4QqYxpMjgW1U4Cn9WV03A+6OT4UK5Q/8dvIg6KtmF+o
 fXzHWCI/pRU6zLj5t1UmoNL+0EmT/gDAEkmHXdcbbKQUlB3tiGyPAzt1mCwlqcx8
 p3VQk9BqlNvDzYtGRCabCGaRxnxtlY0agBu+2AmXIwg6a+qd70BSoo1Nypb2bTMk
 qrHmWX7RvqssPiWa9PUfIsUYSoCG162n6Eh93cMgPNoxXl380o+kD8t5wzHoWSoB
 UdMsu58bfuBbDVIQM/ec
 =XFIL
 -----END PGP SIGNATURE-----

Merge tag 'dmaengine-fixes-3.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine

Pull dmaengine fixes from Dan Williams:
 "Two fixes for -stable:

   - async_mult() sometimes maps less buffers than initially requested.
      We end up freeing dmaengine_unmap_data on an invalid pool.

   - mv_xor: register write ordering fix"

* tag 'dmaengine-fixes-3.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine:
  dmaengine: fix dmaengine_unmap failure
  dma: mv_xor: Flush descriptors before activating a channel
2014-05-23 16:52:15 -07:00
Linus Torvalds
5fa6a683c0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "It looks like a sizeble collection but this is nearly 3 weeks of bug
  fixing while you were away.

   1) Fix crashes over IPSEC tunnels with NAT, the latter can reroute
      the packet through a non-IPSEC protected path and the code has to
      be able to handle SKBs attached to routes lacking an attached xfrm
      state.  From Steffen Klassert.

   2) Fix OOPSs in ipv4 and ipv6 ipsec layers for unsupported
      sub-protocols, also from Steffen Klassert.

   3) Set local_df on fragmented netfilter skbs otherwise we won't be
      able to forward successfully, from Florian Westphal.

   4) cdc_mbim ipv6 neighbour code does __vlan_find_dev_deep without
      holding RCU lock, from Bjorn Mork.

   5) local_df test in ip_may_fragment is inverted, from Florian
      Westphal.

   6) jme driver doesn't check for DMA mapping failures, from Neil
      Horman.

   7) qlogic driver doesn't calculate number of TX queues properly, from
      Shahed Shaikh.

   8) fib_info_cnt can drift irreversibly positive if we fail to
      allocate the fi->fib_metrics array, from Sergey Popovich.

   9) Fix use after free in ip6_route_me_harder(), also from Sergey
      Popovich.

  10) When SYSCTL is disabled, we don't handle local_port_range and
      ping_group_range defaults properly at all, from Cong Wang.

  11) Unaccelerated VLAN tagged frames improperly handled by cdc_mbim
      driver, fix from Bjorn Mork.

  12) cassini driver needs nested lock annotations for TX locking, from
      Emil Goode.

  13) On init error ipv6 VTI driver can unregister pernet ops twice,
      oops.  Fix from Mahtias Krause.

  14) If macvlan device is down, don't propagate IFF_ALLMULTI changes,
      from Peter Christensen.

  15) Missing NULL pointer check while parsing netlink config options in
      ip6_tnl_validate().  From Susant Sahani.

  16) Fix handling of neighbour entries during ipv6 router reachability
      probing, from Duan Jiong.

  17) x86 and s390 JIT address randomization has some address
      calculation bugs leading to crashes, from Alexei Starovoitov and
      Heiko Carstens.

  18) Clear up those uglies with nop patching and net_get_random_once(),
      from Hannes Frederic Sowa.

  19) Option length miscalculated in ip6_append_data(), fix also from
      Hannes Frederic Sowa.

  20) A while ago we fixed a race during device unregistry when a
      namespace went down, turns out there is a second place that needs
      similar protection.  From Cong Wang.

  21) In the new Altera TSE driver multicast filtering isn't working,
      disable it and just use promisc mode until the cause is found.
      From Vince Bridgers.

  22) When we disable router enabling in ipv6 we have to flush the
      cached routes explicitly, from Duan Jiong.

  23) NBMA tunnels should not cache routes on the tunnel object because
      the key is variable, from Timo Teräs.

  24) With stacked devices GRO information in skb->cb[] can be not setup
      properly, make sure it is in all code paths.  From Eric Dumazet.

  25) Really fix stacked vlan locking, multiple levels of nesting with
      intervening non-vlan devices are possible.  From Vlad Yasevich.

  26) Fallback ipip tunnel device's mtu is not setup properly, from
      Steffen Klassert.

  27) The packet scheduler's tcindex filter can crash because we
      structure copy objects with list_head's inside, oops.  From Cong
      Wang.

  28) Fix CHECKSUM_COMPLETE handling for ipv6 GRE tunnels, from Eric
      Dumazet.

  29) In some configurations 'itag' in __mkroute_input() can end up
      being used uninitialized because of how fib_validate_source()
      works.  Fix it by explitly initializing itag to zero like all the
      other fib_validate_source() callers do, from Li RongQing"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
  batman: fix a bogus warning from batadv_is_on_batman_iface()
  ipv4: initialise the itag variable in __mkroute_input
  bonding: Send ALB learning packets using the right source
  bonding: Don't assume 802.1Q when sending alb learning packets.
  net: doc: Update references to skb->rxhash
  stmmac: Remove unbalanced clk_disable call
  ipv6: gro: fix CHECKSUM_COMPLETE support
  net_sched: fix an oops in tcindex filter
  can: peak_pci: prevent use after free at netdev removal
  ip_tunnel: Initialize the fallback device properly
  vlan: Fix build error wth vlan_get_encap_level()
  can: c_can: remove obsolete STRICT_FRAME_ORDERING Kconfig option
  MAINTAINERS: Pravin Shelar is Open vSwitch maintainer.
  bnx2x: Convert return 0 to return rc
  bonding: Fix alb mode to only use first level vlans.
  bonding: Fix stacked device detection in arp monitoring
  macvlan: Fix lockdep warnings with stacked macvlan devices
  vlan: Fix lockdep warning with stacked vlan devices.
  net: Allow for more then a single subclass for netif_addr_lock
  net: Find the nesting level of a given device by type.
  ...
2014-05-23 15:29:43 -07:00
Linus Torvalds
f02f79dbcb Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
 "The biggest commit is an irqtime accounting loop latency fix, the rest
  are misc fixes all over the place: deadline scheduling, docs, numa,
  balancer and a bad to-idle latency fix"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/numa: Initialize newidle balance stats in sd_numa_init()
  sched: Fix updating rq->max_idle_balance_cost and rq->next_balance in idle_balance()
  sched: Skip double execution of pick_next_task_fair()
  sched: Use CPUPRI_NR_PRIORITIES instead of MAX_RT_PRIO in cpupri check
  sched/deadline: Fix memory leak
  sched/deadline: Fix sched_yield() behavior
  sched: Sanitize irq accounting madness
  sched/docbook: Fix 'make htmldocs' warnings caused by missing description
2014-05-23 10:04:04 -07:00
Linus Torvalds
e6a32c3ad1 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "The biggest changes are fixes for races that kept triggering Trinity
  crashes, plus liblockdep build fixes and smaller misc fixes.

  The liblockdep bits in perf/urgent are a pull mistake - they should
  have been in locking/urgent - but by the time I noticed other commits
  were added and testing was done :-/ Sorry about that"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Fix a race between ring_buffer_detach() and ring_buffer_attach()
  perf: Prevent false warning in perf_swevent_add
  perf: Limit perf_event_attr::sample_period to 63 bits
  tools/liblockdep: Remove all build files when doing make clean
  tools/liblockdep: Build liblockdep from tools/Makefile
  perf/x86/intel: Fix Silvermont's event constraints
  perf: Fix perf_event_init_context()
  perf: Fix race in removing an event
2014-05-23 10:02:34 -07:00
Masatake YAMATO
ad0f614e47 wait: swap EXIT_ZOMBIE(Z) and EXIT_DEAD(X) chars in TASK_STATE_TO_CHAR_STR
In commit ad86622b47 ("wait: swap EXIT_ZOMBIE and EXIT_DEAD to hide
EXIT_TRACE from user-space") the order of task state definitions were
changed: EXIT_DEAD and EXIT_ZOMBIE were swapped.  Though the charterers
for the states in TASK_STATE_TO_CHAR_STR string were not updated.  This
patch synchronizes the string to the order of definitions.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-05-23 09:37:29 -07:00
Huacai Chen
dc93f7b68a MIPS: Fix a typo error in AUDIT_ARCH definition
Missing a "|" in AUDIT_ARCH_MIPSEL64N32 macro definition.

Signed-off-by: Huacai Chen <chenhc@lemote.com>
Reviewed-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/6978/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-05-22 13:45:17 +02:00
Xuelin Shi
c1f43dd9c2 dmaengine: fix dmaengine_unmap failure
The count which is used to get_unmap_data maybe not the same as the
count computed in dmaengine_unmap which causes to free data in a
wrong pool.

This patch fixes this issue by keeping the map count with unmap_data
structure and use this count to get the pool.

Cc: <stable@vger.kernel.org>
Signed-off-by: Xuelin Shi <xuelin.shi@freescale.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2014-05-21 14:02:37 -07:00
Linus Torvalds
80932ec1c0 Merge branch 'renameat2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull renameat2 arch support from Miklos Szeredi:
 "I've collected architecture patches for the renameat2 syscall that
  maintainers acked and/or asked me to queue.

  This adds architecture support for the renameat2 syscall to m68k,
  parisc, ia64 and through asm-generic to arc, arm64, c6x, hexagon,
  metag, openrisc, score, tile, unicore32"

* 'renameat2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  scripts/checksyscalls.sh: Make renameat optional
  asm-generic: Add renameat2 syscall
  ia64: add renameat2 syscall
  parisc: add renameat2 syscall
  m68k: add renameat2 syscall
2014-05-22 05:34:57 +09:00
Linus Torvalds
439c610992 Driver core fixes for 3.15-rc6
Here are two driver core (well, sysfs) fixes for 3.15-rc6 that resolve
 some reported issues and a regression from 3.13.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iEUEABECAAYFAlN8LzgACgkQMUfUDdst+ynJnQCeKQt7KdEBlHAKI5/iP2IQVNNx
 KG8AmMepPCjpp9/MbrFQnx3miGgNEug=
 =813a
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here are two driver core (well, sysfs) fixes for 3.15-rc6 that resolve
  some reported issues and a regression from 3.13"

* tag 'driver-core-3.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  sysfs: make sure read buffer is zeroed
  kernfs, sysfs, cgroup: restrict extra perm check on open to sysfs
2014-05-21 18:59:25 +09:00
Linus Torvalds
06eb4cc2e7 Merge branch 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull more cgroup fixes from Tejun Heo:
 "Three more patches to fix cgroup_freezer breakage due to the recent
  cgroup internal locking changes - an operation cgroup_freezer was
  using now requires sleepable context and cgroup_freezer was invoking
  that while holding a spin lock.  cgroup_freezer was using an overly
  elaborate hierarchical locking scheme.

  While it's possible to convert the hierarchical spinlocks directly to
  mutexes, this patch simplifies the overall locking so that it uses a
  global mutex.  This has the added benefit of avoiding iterating
  potentially huge number of tasks under a spinlock.  While the patch is
  on the larger side in the devel cycle, the changes made are mostly
  straight-forward and the locking logic is a lot simpler afterwards"

* 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: fix rcu_read_lock() leak in update_if_frozen()
  cgroup_freezer: replace freezer->lock with freezer_mutex
  cgroup: introduce task_css_is_root()
2014-05-21 18:36:40 +09:00
Linus Torvalds
31a3fcab11 Drivercore bugfixes for v3.15
This branch contains bug fixes important to get into v3.15. There is a
 fix for modifying properties seen during early boot, a fix for an
 incorrect prototype when CONFIG_OF=n, and a couple of corrections to
 device tree memory nodes on  a few platforms.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTdh+wAAoJEMWQL496c2LNfOsQAI/zg5nrUcTEPMg9MXANCPvL
 4cGQTbN0bbWLY55wXTglyw/1qlPWmGc7nE5EOpeuVU/HO/EOzDckMcXcm/kUX6mk
 7oZjrTVmAySBgBt1XdHOpN6C6IMfiFtsyLvUnpxF0D/Vm9FsD1NyfHlhPmExm4Gg
 DSPXf5YmgT9AZL4f1NtCOCcsm/zNpGNDGQLvwDU5CrUKYNAivv+C42ysScQY0DkG
 VOfSt9mDmRzWL1+cBq0qMEmXWO+vRpV/pg/OZYfgT8TFsJCNv4bsQp1DI+fJucMn
 E48FGuJ2S3YnFBiWc3dCnyEF3J/5zqmu1pH7kXbjEvGWJ0I4c7J1oVqTRAdYDpfy
 PIfAob4X8N9rCELO1P1GPrS7/xBYKjD51RKkT6saowvdhLD1e8XbMhAS1emoc2fq
 l16yCu+mk6WKi7fPOQDLLt9Rp41sx+9tl7XuS27BxoHQdFpLhY4yq1EYRXozuYDb
 oXo3e7tgOJSWLNnoJDU/1v1GE53cpiPC/++hGVg1hHKDCVxz19sUUAsaneDoz74s
 5rvSzyWzM7y5FG6L7pIVT//fRuceY5itmYY91MrOuUVhdN8/1a1altGuT60eol7g
 XYShsFrxs4gemgDZ4tfpva6/fCep3Nqz3brAV/7j8cE51SkdhlyMUftaJFSZvy/S
 LVM/lVHY57KxeODngtXW
 =kY49
 -----END PGP SIGNATURE-----

Merge tag 'dt-for-linus' of git://git.secretlab.ca/git/linux

Pull device tree fixes from Grant Likely:
 "Drivercore bugfixes for v3.15

  This branch contains bug fixes important to get into v3.15.  There is
  a fix for modifying properties seen during early boot, a fix for an
  incorrect prototype when CONFIG_OF=n, and a couple of corrections to
  device tree memory nodes on a few platforms"

* tag 'dt-for-linus' of git://git.secretlab.ca/git/linux:
  mips: dts: Fix missing device_type="memory" property in memory nodes
  arm: dts: Fix missing device_type="memory" for ste-ccu8540
  of: fix CONFIG_OF=n prototype of of_node_full_name()
  of: make of_update_property() usable earlier in the boot process
2014-05-21 17:54:55 +09:00
Arnd Bergmann
a8246fedac dmaengine: omap: hide filter_fn for built-in drivers
It is not possible to reference the omap_dma_filter_fn filter
function from a built-in driver if the dmaengine driver itself
is a loadable module, which is a valid configuration otherwise.

This provides only the dummy alternative if the function
is referenced by a built-in driver to allow a successful
build. The filter function is only required by ATAGS based
platforms, which will continue to be broken after this change
for the bogus configuration. When booting from DT, with the
dma channels correctly listed there, it will work fine.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Tony Lindgren <tony@atomide.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: dmaengine@vger.kernel.org
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2014-05-21 11:40:49 +05:30
Vlad Yasevich
e1618d461c vlan: Fix build error wth vlan_get_encap_level()
The new function vlan_get_encap_level() uses vlan_dev_priv()
which is only conditionally avaialble when VLAN support is
enabled.  Make vlan_get_encap_level() conditionally available
as well.

Fixes: 44a4085538 ("bonding: Fix stacked device detection in arp monitoring")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-20 11:24:26 -04:00
James Hogan
63ba600028 asm-generic: Add renameat2 syscall
Add the renameat2 syscall to the generic syscall list, which is used by the
following architectures: arc, arm64, c6x, hexagon, metag, openrisc, score,
tile, unicore32.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: linux-arch@vger.kernel.org
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: linux-hexagon@vger.kernel.org
Cc: linux-metag@vger.kernel.org
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
2014-05-20 10:59:38 +02:00
Linus Torvalds
c7d6891a77 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
 "MIPS fixes for various loose ends:

   - Fix workarounds for R4000 erratum.
   - Patch up DEC, Siemens-Nixdorf and Loongson hardware support.
   - Wire up renameat2 syscall.
   - Delete unused file - it was causing false warnings from maintenance
     scripts.
   - Revert a patch because it's functionality is now implemented twice
     which causes superfluous /proc/cpuinfo output.
   - Fix a microMIPS regression"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: mm: Fix broken microMIPS kernel regression.
  MIPS: Add new AUDIT_ARCH token for the N32 ABI on MIPS64
  MIPS: Wire up renameat2 syscall.
  MIPS: inst.h: Rename BITFIELD_FIELD to __BITFIELD_FIELD.
  MIPS: Remove file missed when removing rm9k support a while ago.
  MIPS/loongson2_cpufreq: Fix CPU clock rate setting
  MIPS: Loongson: No need to select GENERIC_HARDIRQS_NO__DO_IRQ
  MIPS: csum_partial.S CPU_DADDI_WORKAROUNDS bug fix
  MIPS: __strncpy_from_user_asm CPU_DADDI_WORKAROUNDS bug fix
  MIPS: __delay CPU_DADDI_WORKAROUNDS bug fix
  MIPS: DEC/SNI: O32 wrapper stack switching fixes
  MIPS: DEC: Bus error handler <asm/cpu-type.h> fixes
  MAINTAINERS: TURBOchannel: Update entry
  Revert "MIPS: MT: proc: Add support for printing VPE and TC ids"
2014-05-20 16:47:33 +09:00
Linus Torvalds
41abc90228 Metag architecture and related fixes for v3.15
Mostly fixes for metag and parisc relating to upgrowing stacks.
 
 * Fix missing compiler barriers in metag memory barriers.
 * Fix BUG_ON on metag when RLIMIT_STACK hard limit is increased beyond
   safe value.
 * Make maximum stack size configurable. This reduces the default user
   stack size back to 80MB (especially on parisc after their removal of
   _STK_LIM_MAX override). This only affects metag and parisc.
 * Remove metag _STK_LIM_MAX override to match other arches and follow
   parisc, now that it is safe to do so (due to the BUG_ON fix mentioned
   above).
 * Finally now that both metag and parisc _STK_LIM_MAX overrides have
   been removed, it makes sense to remove _STK_LIM_MAX altogether.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJTdAc3AAoJEGwLaZPeOHZ6L2QP/ihJag44CyWKKpeu/5FUkjP6
 62wX4cYKCFR9pTkOZDViWs7c+xrmW6OtORfQKuXu1g68L3v2cwb0HmcvybQ75pIQ
 CbaN+d5OnGPjHGYCSVqQBKlJ0qbcgQfoNUuCVOZx9kZgnCYQhJlh4HYRwHdUv9WY
 1FA3wor/JTTAiKvPBv+/t4NzTpTafpSIhYLahjxZbtuU1WjEfmj8QgWQpzTEJSeZ
 AyNE/nDlcYcdq4lDxMz2pcQfmJ2MpE56wvXJ7IdXadLaLp4yzc+WTAvFzNJ1XnAN
 2IcyNBpgF/vMXCbErA9QQegYwKd9jpF0w3oQmNLkgr27Kv27iL2sLIEWVn3FAXCu
 p+I0ypMlkD/gSdofCUaWTiGGOQiKMqAWJMfjky8RjA7Qz5TdLCldpjjuZEMKl8hM
 SLjkmgZHMG2/rJ8MosOL+ARAXl88v25gfM6rNIPTtMzH72qevrHgjFPj6pWHejhE
 0E43yDS+zt215HrFCXYBhVbFY1NM7JeBS8NFd9Y/8LKTWc8QSi2h8Q1ZaobKJi4O
 0zlKxRRR4QmmtF7S5wL/qOQ0U95HBvYSx+Ssp3C0eh/PEkZYWm0jiXtaKBCYtnDx
 33wRutv+R9sSkKaiiURBh9/VPWFLQ1ak5z+ejqrv32+oBzt/zmxb7LgwsxdAbAms
 9r/8XaY3V+JBPw7UxfQN
 =aveq
 -----END PGP SIGNATURE-----

Merge tag 'metag-for-v3.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag

Pull Metag architecture and related fixes from James Hogan:
 "Mostly fixes for metag and parisc relating to upgrowing stacks.

   - Fix missing compiler barriers in metag memory barriers.
   - Fix BUG_ON on metag when RLIMIT_STACK hard limit is increased
     beyond safe value.
   - Make maximum stack size configurable.  This reduces the default
     user stack size back to 80MB (especially on parisc after their
     removal of _STK_LIM_MAX override).  This only affects metag and
     parisc.
   - Remove metag _STK_LIM_MAX override to match other arches and follow
     parisc, now that it is safe to do so (due to the BUG_ON fix
     mentioned above).
   - Finally now that both metag and parisc _STK_LIM_MAX overrides have
     been removed, it makes sense to remove _STK_LIM_MAX altogether"

* tag 'metag-for-v3.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
  asm-generic: remove _STK_LIM_MAX
  metag: Remove _STK_LIM_MAX override
  parisc,metag: Do not hardcode maximum userspace stack size
  metag: Reduce maximum stack size to 256MB
  metag: fix memory barriers
2014-05-20 14:30:34 +09:00
Linus Torvalds
3f017a4ca2 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "Two small updates from the irq departement:

   - Provide missing inline stub for a SMP only function

   - Add sub-maintainer for the drivers/irqchip/ part of the irq
     subsystem.  YAY!"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Add co-maintainer for drivers/irqchip
  genirq: Provide irq_force_affinity fallback for non-SMP
2014-05-20 14:18:04 +09:00
Peter Zijlstra
b69cf53640 perf: Fix a race between ring_buffer_detach() and ring_buffer_attach()
Alexander noticed that we use RCU iteration on rb->event_list but do
not use list_{add,del}_rcu() to add,remove entries to that list, nor
do we observe proper grace periods when re-using the entries.

Merge ring_buffer_detach() into ring_buffer_attach() such that
attaching to the NULL buffer is detaching.

Furthermore, ensure that between any 'detach' and 'attach' of the same
event we observe the required grace period, but only when strictly
required. In effect this means that only ioctl(.request =
PERF_EVENT_IOC_SET_OUTPUT) will wait for a grace period, while the
normal initial attach and final detach will not be delayed.

This patch should, I think, do the right thing under all
circumstances, the 'normal' cases all should never see the extra grace
period, but the two cases:

 1) PERF_EVENT_IOC_SET_OUTPUT on an event which already has a
    ring_buffer set, will now observe the required grace period between
    removing itself from the old and attaching itself to the new buffer.

    This case is 'simple' in that both buffers are present in
    perf_event_set_output() one could think an unconditional
    synchronize_rcu() would be sufficient; however...

 2) an event that has a buffer attached, the buffer is destroyed
    (munmap) and then the event is attached to a new/different buffer
    using PERF_EVENT_IOC_SET_OUTPUT.

    This case is more complex because the buffer destruction does:
      ring_buffer_attach(.rb = NULL)
    followed by the ioctl() doing:
      ring_buffer_attach(.rb = foo);

    and we still need to observe the grace period between these two
    calls due to us reusing the event->rb_entry list_head.

In order to make 2 happen we use Paul's latest cond_synchronize_rcu()
call.

Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140507123526.GD13658@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-05-19 21:44:56 +09:00
Vlad Yasevich
44a4085538 bonding: Fix stacked device detection in arp monitoring
Prior to commit fbd929f2dc
	bonding: support QinQ for bond arp interval

the arp monitoring code allowed for proper detection of devices
stacked on top of vlans.  Since the above commit, the
code can still detect a device stacked on top of single
vlan, but not a device stacked on top of Q-in-Q configuration.
The search will only set the inner vlan tag if the route
device is the vlan device.  However, this is not always the
case, as it is possible to extend the stacked configuration.

With this patch it is possible to provision devices on
top Q-in-Q vlan configuration that should be used as
a source of ARP monitoring information.

For example:
ip link add link bond0 vlan10 type vlan proto 802.1q id 10
ip link add link vlan10 vlan100 type vlan proto 802.1q id 100
ip link add link vlan100 type macvlan

Note:  This patch limites the number of stacked VLANs to 2,
just like before.  The original, however had another issue
in that if we had more then 2 levels of VLANs, we would end
up generating incorrectly tagged traffic.  This is no longer
possible.

Fixes: fbd929f2dc (bonding: support QinQ for bond arp interval)
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@redhat.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Ding Tianhong <dingtianhong@huawei.com>
CC: Patric McHardy <kaber@trash.net>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 22:29:05 -04:00
Vlad Yasevich
c674ac30c5 macvlan: Fix lockdep warnings with stacked macvlan devices
Macvlan devices try to avoid stacking, but that's not always
successfull or even desired.  As an example, the following
configuration is perefectly legal and valid:

eth0 <--- macvlan0 <---- vlan0.10 <--- macvlan1

However, this configuration produces the following lockdep
trace:
[  115.620418] ======================================================
[  115.620477] [ INFO: possible circular locking dependency detected ]
[  115.620516] 3.15.0-rc1+ #24 Not tainted
[  115.620540] -------------------------------------------------------
[  115.620577] ip/1704 is trying to acquire lock:
[  115.620604]  (&vlan_netdev_addr_lock_key/1){+.....}, at: [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
[  115.620686]
but task is already holding lock:
[  115.620723]  (&macvlan_netdev_addr_lock_key){+.....}, at: [<ffffffff815da5be>] dev_set_rx_mode+0x1e/0x40
[  115.620795]
which lock already depends on the new lock.

[  115.620853]
the existing dependency chain (in reverse order) is:
[  115.620894]
-> #1 (&macvlan_netdev_addr_lock_key){+.....}:
[  115.620935]        [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
[  115.620974]        [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
[  115.621019]        [<ffffffffa07296c3>] vlan_dev_set_rx_mode+0x53/0x110 [8021q]
[  115.621066]        [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
[  115.621105]        [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
[  115.621143]        [<ffffffff815da6be>] __dev_open+0xde/0x140
[  115.621174]        [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
[  115.621174]        [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
[  115.621174]        [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
[  115.621174]        [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
[  115.621174]        [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
[  115.621174]        [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
[  115.621174]        [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
[  115.621174]        [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
[  115.621174]        [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
[  115.621174]        [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
[  115.621174]        [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
[  115.621174]        [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
[  115.621174]        [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
[  115.621174]        [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b
[  115.621174]
-> #0 (&vlan_netdev_addr_lock_key/1){+.....}:
[  115.621174]        [<ffffffff810d4d43>] __lock_acquire+0x1773/0x1a60
[  115.621174]        [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
[  115.621174]        [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
[  115.621174]        [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
[  115.621174]        [<ffffffffa0696d2a>] macvlan_set_mac_lists+0xca/0x110 [macvlan]
[  115.621174]        [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
[  115.621174]        [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
[  115.621174]        [<ffffffff815da6be>] __dev_open+0xde/0x140
[  115.621174]        [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
[  115.621174]        [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
[  115.621174]        [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
[  115.621174]        [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
[  115.621174]        [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
[  115.621174]        [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
[  115.621174]        [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
[  115.621174]        [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
[  115.621174]        [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
[  115.621174]        [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
[  115.621174]        [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
[  115.621174]        [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
[  115.621174]        [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
[  115.621174]        [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b
[  115.621174]
other info that might help us debug this:

[  115.621174]  Possible unsafe locking scenario:

[  115.621174]        CPU0                    CPU1
[  115.621174]        ----                    ----
[  115.621174]   lock(&macvlan_netdev_addr_lock_key);
[  115.621174]                                lock(&vlan_netdev_addr_lock_key/1);
[  115.621174]                                lock(&macvlan_netdev_addr_lock_key);
[  115.621174]   lock(&vlan_netdev_addr_lock_key/1);
[  115.621174]
 *** DEADLOCK ***

[  115.621174] 2 locks held by ip/1704:
[  115.621174]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff815e6dbb>] rtnetlink_rcv+0x1b/0x40
[  115.621174]  #1:  (&macvlan_netdev_addr_lock_key){+.....}, at: [<ffffffff815da5be>] dev_set_rx_mode+0x1e/0x40
[  115.621174]
stack backtrace:
[  115.621174] CPU: 3 PID: 1704 Comm: ip Not tainted 3.15.0-rc1+ #24
[  115.621174] Hardware name: Hewlett-Packard HP xw8400 Workstation/0A08h, BIOS 786D5 v02.38 10/25/2010
[  115.621174]  ffffffff82339ae0 ffff880465f79568 ffffffff816ee20c ffffffff82339ae0
[  115.621174]  ffff880465f795a8 ffffffff816e9e1b ffff880465f79600 ffff880465b019c8
[  115.621174]  0000000000000001 0000000000000002 ffff880465b019c8 ffff880465b01230
[  115.621174] Call Trace:
[  115.621174]  [<ffffffff816ee20c>] dump_stack+0x4d/0x66
[  115.621174]  [<ffffffff816e9e1b>] print_circular_bug+0x200/0x20e
[  115.621174]  [<ffffffff810d4d43>] __lock_acquire+0x1773/0x1a60
[  115.621174]  [<ffffffff810d3172>] ? trace_hardirqs_on_caller+0xb2/0x1d0
[  115.621174]  [<ffffffff810d57f2>] lock_acquire+0xa2/0x130
[  115.621174]  [<ffffffff815df49c>] ? dev_uc_sync+0x3c/0x80
[  115.621174]  [<ffffffff816f62e7>] _raw_spin_lock_nested+0x37/0x50
[  115.621174]  [<ffffffff815df49c>] ? dev_uc_sync+0x3c/0x80
[  115.621174]  [<ffffffff815df49c>] dev_uc_sync+0x3c/0x80
[  115.621174]  [<ffffffffa0696d2a>] macvlan_set_mac_lists+0xca/0x110 [macvlan]
[  115.621174]  [<ffffffff815da557>] __dev_set_rx_mode+0x57/0xa0
[  115.621174]  [<ffffffff815da5c6>] dev_set_rx_mode+0x26/0x40
[  115.621174]  [<ffffffff815da6be>] __dev_open+0xde/0x140
[  115.621174]  [<ffffffff815da9ad>] __dev_change_flags+0x9d/0x170
[  115.621174]  [<ffffffff815daaa9>] dev_change_flags+0x29/0x60
[  115.621174]  [<ffffffff811e1db1>] ? mem_cgroup_bad_page_check+0x21/0x30
[  115.621174]  [<ffffffff815e7f11>] do_setlink+0x321/0x9a0
[  115.621174]  [<ffffffff810d394c>] ? __lock_acquire+0x37c/0x1a60
[  115.621174]  [<ffffffff815ea59f>] rtnl_newlink+0x51f/0x730
[  115.621174]  [<ffffffff815ea169>] ? rtnl_newlink+0xe9/0x730
[  115.621174]  [<ffffffff815e6e75>] rtnetlink_rcv_msg+0x95/0x250
[  115.621174]  [<ffffffff810d329d>] ? trace_hardirqs_on+0xd/0x10
[  115.621174]  [<ffffffff815e6dbb>] ? rtnetlink_rcv+0x1b/0x40
[  115.621174]  [<ffffffff815e6de0>] ? rtnetlink_rcv+0x40/0x40
[  115.621174]  [<ffffffff81608b19>] netlink_rcv_skb+0xa9/0xc0
[  115.621174]  [<ffffffff815e6dca>] rtnetlink_rcv+0x2a/0x40
[  115.621174]  [<ffffffff81608150>] netlink_unicast+0xf0/0x1c0
[  115.621174]  [<ffffffff8160851f>] netlink_sendmsg+0x2ff/0x740
[  115.621174]  [<ffffffff815bc9db>] sock_sendmsg+0x8b/0xc0
[  115.621174]  [<ffffffff8119d4af>] ? might_fault+0x5f/0xb0
[  115.621174]  [<ffffffff8119d4f8>] ? might_fault+0xa8/0xb0
[  115.621174]  [<ffffffff8119d4af>] ? might_fault+0x5f/0xb0
[  115.621174]  [<ffffffff815cb51e>] ? verify_iovec+0x5e/0xe0
[  115.621174]  [<ffffffff815bd4b9>] ___sys_sendmsg+0x369/0x380
[  115.621174]  [<ffffffff816faa0d>] ? __do_page_fault+0x11d/0x570
[  115.621174]  [<ffffffff810cfe9f>] ? up_read+0x1f/0x40
[  115.621174]  [<ffffffff816fab04>] ? __do_page_fault+0x214/0x570
[  115.621174]  [<ffffffff8120a10b>] ? mntput_no_expire+0x6b/0x1c0
[  115.621174]  [<ffffffff8120a0b7>] ? mntput_no_expire+0x17/0x1c0
[  115.621174]  [<ffffffff8120a284>] ? mntput+0x24/0x40
[  115.621174]  [<ffffffff815bdbb2>] __sys_sendmsg+0x42/0x80
[  115.621174]  [<ffffffff815bdc02>] SyS_sendmsg+0x12/0x20
[  115.621174]  [<ffffffff816ffd69>] system_call_fastpath+0x16/0x1b

Fix this by correctly providing macvlan lockdep class.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 22:14:49 -04:00
Vlad Yasevich
d38569ab2b vlan: Fix lockdep warning with stacked vlan devices.
This reverts commit dc8eaaa006.
	vlan: Fix lockdep warning when vlan dev handle notification

Instead we use the new new API to find the lock subclass of
our vlan device.  This way we can support configurations where
vlans are interspersed with other devices:
  bond -> vlan -> macvlan -> vlan

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 22:14:49 -04:00
Vlad Yasevich
25175ba5c9 net: Allow for more then a single subclass for netif_addr_lock
Currently netif_addr_lock_nested assumes that there can be only
a single nesting level between 2 devices.  However, if we
have multiple devices of the same type stacked, this fails.
For example:
 eth0 <-- vlan0.10 <-- vlan0.10.20

A more complicated configuration may stack more then one type of
device in different order.
Ex:
  eth0 <-- vlan0.10 <-- macvlan0 <-- vlan1.10.20 <-- macvlan1

This patch adds an ndo_* function that allows each stackable
device to report its nesting level.  If the device doesn't
provide this function default subclass of 1 is used.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 22:14:49 -04:00
Vlad Yasevich
4085ebe8c3 net: Find the nesting level of a given device by type.
Multiple devices in the kernel can be stacked/nested and they
need to know their nesting level for the purposes of lockdep.
This patch provides a generic function that determines a nesting
level of a particular device by its type (ex: vlan, macvlan, etc).
We only care about nesting of the same type of devices.

For example:
  eth0 <- vlan0.10 <- macvlan0 <- vlan1.20

The nesting level of vlan1.20 would be 1, since there is another vlan
in the stack under it.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 22:14:49 -04:00
David S. Miller
202630b445 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:

====================
pull request: wireless 2014-05-15

Please pull this batch of fixes for the 3.15 stream...

For the mac80211 bits, Johannes says:

"One fix is to get better VHT performance and the other fixes tracing
garbage or other potential issues with the interface name tracing."

And...

"This has a fix from Emmanuel for a problem I failed to fix - when
association is in progress then it needs to be cancelled while
suspending (I had fixed the same for authentication). Also included a
fix from myself for a userspace API problem that hit the iw tool and a
fix to the remain-on-channel framework."

For the iwlwifi bits, Emmanuel says:

"Alex fixes the scan by disabling the fragmented scan. David prevents
scan offload while associated, the firmware seems not to like it. I
fix a stupid bug I made in BT Coex, and fix a bad #ifdef clause in rate
scaling.  Along with that there is a fix for a NULL pointer exception
that can happen if we load the driver and our ISR gets called because
the interrupt line is shared. The fix has been tested by the reporter."

And...

"We have here a fix from David Spinadel that makes a previous fix more
complete, and an off-by-one issue fixed by Eliad in the same area.
I fix the monitor that broke on the way."

Beyond that...

Daniel Kim's one-liner fixes a brcmfmac regression caused by a typo
in an earlier commit..

Rajkumar Manoharan fixes an ath9k oops reported by David Herrmann.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 15:45:56 -04:00
Matan Barak
ce8d9e0d67 net/mlx4_core: Add UPDATE_QP SRIOV wrapper support
This patch adds UPDATE_QP SRIOV wrapper support.

The mechanism is a general one, but currently only source MAC
index changes are allowed for VFs.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-16 15:12:45 -04:00
Duan Jiong
be7a010d6f ipv6: update Destination Cache entries when gateway turn into host
RFC 4861 states in 7.2.5:

	The IsRouter flag in the cache entry MUST be set based on the
         Router flag in the received advertisement.  In those cases
         where the IsRouter flag changes from TRUE to FALSE as a result
         of this update, the node MUST remove that router from the
         Default Router List and update the Destination Cache entries
         for all destinations using that neighbor as a router as
         specified in Section 7.3.3.  This is needed to detect when a
         node that is used as a router stops forwarding packets due to
         being configured as a host.

Currently, when dealing with NA Message which IsRouter flag changes from
TRUE to FALSE, the kernel only removes router from the Default Router List,
and don't update the Destination Cache entries.

Now in order to update those Destination Cache entries, i introduce
function rt6_clean_tohost().

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-15 23:26:27 -04:00
Cong Wang
200b916f35 rtnetlink: wait for unregistering devices in rtnl_link_unregister()
From: Cong Wang <cwang@twopensource.com>

commit 50624c934d (net: Delay default_device_exit_batch until no
devices are unregistering) introduced rtnl_lock_unregistering() for
default_device_exit_batch(). Same race could happen we when rmmod a driver
which calls rtnl_link_unregister() as we call dev->destructor without rtnl
lock.

For long term, I think we should clean up the mess of netdev_run_todo()
and net namespce exit code.

Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-15 15:30:33 -04:00
John W. Linville
025a58fd9d Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2014-05-15 10:24:28 -04:00
Stephen Rothwell
4c358e1555 of: fix CONFIG_OF=n prototype of of_node_full_name()
Make the CONFIG_OF=n prototpe of of_node_full_name() mateh the CONFIG_OF=y
version.

Fixes compile warnings like this:

sound/soc/soc-core.c: In function 'soc_check_aux_dev':
sound/soc/soc-core.c:1667:3: warning: passing argument 1 of 'of_node_full_name' discards 'const' qualifier from pointer target type [enabled by default]
   codecname = of_node_full_name(aux_dev->codec_of_node);

when CONFIG_OF is not defined.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
2014-05-15 15:19:12 +01:00
James Hogan
ffe6902b66 asm-generic: remove _STK_LIM_MAX
_STK_LIM_MAX could be used to override the RLIMIT_STACK hard limit from
an arch's include/uapi/asm-generic/resource.h file, but is no longer
used since both parisc and metag removed the override. Therefore remove
it entirely, setting the hard RLIMIT_STACK limit to RLIM_INFINITY
directly in include/asm-generic/resource.h.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-arch@vger.kernel.org
Cc: Helge Deller <deller@gmx.de>
Cc: John David Anglin <dave.anglin@bell.net>
2014-05-15 00:32:09 +01:00
John W. Linville
eac94da8b4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 2014-05-14 15:39:45 -04:00
Hannes Frederic Sowa
3d4405226d net: avoid dependency of net_get_random_once on nop patching
net_get_random_once depends on the static keys infrastructure to patch up
the branch to the slow path during boot. This was realized by abusing the
static keys api and defining a new initializer to not enable the call
site while still indicating that the branch point should get patched
up. This was needed to have the fast path considered likely by gcc.

The static key initialization during boot up normally walks through all
the registered keys and either patches in ideal nops or enables the jump
site but omitted that step on x86 if ideal nops where already placed at
static_key branch points. Thus net_get_random_once branches not always
became active.

This patch switches net_get_random_once to the ordinary static_key
api and thus places the kernel fast path in the - by gcc considered -
unlikely path.  Microbenchmarks on Intel and AMD x86-64 showed that
the unlikely path actually beats the likely path in terms of cycle cost
and that different nop patterns did not make much difference, thus this
switch should not be noticeable.

Fixes: a48e42920f ("net: introduce new macro net_get_random_once")
Reported-by: Tuomas Räsänen <tuomasjjrasanen@tjjr.fi>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-14 00:37:34 -04:00
Markos Chandras
f2d0801f00 MIPS: Add new AUDIT_ARCH token for the N32 ABI on MIPS64
A MIPS64 kernel may support ELF files for all 3 MIPS ABIs
(O32, N32, N64). Furthermore, the AUDIT_ARCH_MIPS{,EL}64 token
does not provide enough information about the ABI for the 64-bit
process. As a result of which, userland needs to use complex
seccomp filters to decide whether a syscall belongs to the o32 or n32
or n64 ABI. Therefore, a new arch token for MIPS64/n32 is added so it
can be used by seccomp to explicitely set syscall filters for this ABI.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: Paul Moore <pmoore@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: linux-mips@linux-mips.org
Link: http://sourceforge.net/p/libseccomp/mailman/message/32239040/
Patchwork: https://patchwork.linux-mips.org/patch/6818/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-05-14 01:39:54 +02:00
Tejun Heo
5024ae29cd cgroup: introduce task_css_is_root()
Determining the css of a task usually requires RCU read lock as that's
the only thing which keeps the returned css accessible till its
reference is acquired; however, testing whether a task belongs to the
root can be performed without dereferencing the returned css by
comparing the returned pointer against the root one in init_css_set[]
which never changes.

Implement task_css_is_root() which can be invoked in any context.
This will be used by the scheduled cgroup_freezer change.

v2: cgroup no longer supports modular controllers.  No need to export
    init_css_set.  Pointed out by Li.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-05-13 11:26:27 -04:00