Commit Graph

468037 Commits

Author SHA1 Message Date
David S. Miller
4afba24e5f sparc64: Skip bogus PCI bridge ranges.
It seems that when a PCI Express bridge is not in use and has no devices
behind it, the ranges property is bogus.  Specifically the size property
is of the form [0xffffffff:...], and if you add this size to the resource
start address the 64-bit calculation will overflow.

Just check specifically for this size value signature and skip them.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 21:17:49 -07:00
David S. Miller
93a6423bd8 sparc64: Expand PCI bridge probing debug logging.
Dump the various aspects of the PCI bridge probed at boot time, most
importantly the bridge number ranges, and the ranges property.

This helps diagnose PCI resource issues and other problems by giving
ofpci_debug=1 on the boot command line.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 21:17:49 -07:00
David S. Miller
db8d457acc Merge git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2014-08-12

This series contains updates to i40e and e1000e.

Lucas provides a fix for i40e to resolve a compile issue where a header
was missing in the #includes.

Wei Yongjun provides a fix for i40e to resolve a sparse warning, where
a non-static function should be static.

Julia Lawall provides a fix for i40e which was found using Coccinelle,
where there was a typo in the name of the type given to sizeof().

Rickard Strandqvist provides a fix for i40e to replace the use of
strncpy() with strlcpy() to avoid strings that lack null termination.

Jean Sacren provides two e1000e fixes, first is a comment fix and second
removes an excessive space character in a debug message.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:08:55 -07:00
David S. Miller
a4688132a1 Merge branch 'xen-netback-synchronization'
Wei Liu says:

====================
xen-netback: synchronisation between core driver and netback

The zero-copy netback has far more interactions with core network driver than
the old copying backend. One significant thing is that netback now relies on
a callback from core driver to correctly release resources.

However correct synchronisation between core driver and netback is missing.
Currently netback relies on a loop to wait for core driver to release
resources. This is proven not enough and erroneous recently, partly due to code
structure, partly due to missing synchronisation. Short-live domains like
OpenMirage unikernels can easily trigger race in backend, rendering backend
unresponsive.

This patch series aims to slove this issue by introducing proper
synchronisation between core driver and netback.

Chagges in v4:
* avoid using wait queue
* remove dedicated loop for netif_napi_del
* remove unnecessary check on callback

Change in v3: improve commit message in patch 1

Change in v2: fix Zoltan's email address in commit message
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:07:52 -07:00
Wei Liu
b125285821 xen-netback: remove loop waiting function
The original implementation relies on a loop to check if all inflight
packets are freed. Now we have proper reference counting, there's no
need to use loop anymore.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:07:44 -07:00
Wei Liu
a64bd93452 xen-netback: don't stop dealloc kthread too early
Reference count the number of packets in host stack, so that we don't
stop the deallocation thread too early. If not, we can end up with
xenvif_free permanently waiting for deallocation thread to unmap grefs.

Reported-by: Thomas Leonard <talex5@gmail.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:07:44 -07:00
Wei Liu
ea2c5e1342 xen-netback: move NAPI add/remove calls
Originally netif_napi_add was in xenvif_init_queue and netif_napi_del
was in xenvif_deinit_queue, while kthreads were handled in
xenvif_connect and xenvif_disconnect. Move netif_napi_add and
netif_napi_del to xenvif_connect and xenvif_disconnect so that they
reside together with kthread operations.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:07:44 -07:00
David S. Miller
6880995871 Merge branch 'xen-netback-debugfs'
Wei Liu says:

====================
xen-netback: fix debugfs code

This small series fixes two problems in xen-netback debugfs code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:06:51 -07:00
Wei Liu
628fa76b09 xen-netback: fix debugfs entry creation
The original code is bogus. The function gets called in a loop which
leaks entries created in previous rounds.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:06:43 -07:00
Wei Liu
5c807005fa xen-netback: fix debugfs write length check
Enlarge buffer size and check input length properly, so that we don't
misuse -ENOSPC.

Note that command like "kickXXXX" is still allowed, that's one patch for
another day if we really want to be very strict on this.

Reported-by: SeeChen Ng <seechen81@gmail.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Zoltan Kiss <zoltan.kiss@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:06:43 -07:00
Willem de Bruijn
490cc7d03c net-timestamp: fix missing tcp fragmentation cases
Bytestream timestamps are correlated with a single byte in the skbuff,
recorded in skb_shinfo(skb)->tskey. When fragmenting skbuffs, ensure
that the tskey is set for the fragment in which the tskey falls
(seqno <= tskey < end_seqno).

The original implementation did not address fragmentation in
tcp_fragment or tso_fragment. Add code to inspect the sequence numbers
and move both tskey and the relevant tx_flags if necessary.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:06:06 -07:00
Willem de Bruijn
712a72213f net-timestamp: fix missing ACK timestamp
ACK timestamps are generated in tcp_clean_rtx_queue. The TSO datapath
can break out early, causing the timestamp code to be skipped. Move
the code up before the break.

Reported-by: David S. Miller <davem@davemloft.net>

Also fix a boundary condition: tp->snd_una is the next unacknowledged
byte and between tests inclusive (a <= b <= c), so generate a an ACK
timestamp if (prior_snd_una <= tskey <= tp->snd_una - 1).

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:06:06 -07:00
Libo Chen
cd094927c6 drivers/net/irda/donauboe.c: convert to module_pci_driver
Signed-off-by: Libo Chen <libo.chen@huawei.com>
Cc: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:05:52 -07:00
Maks Naumov
efd5029010 irda: Fix rd_frame control field initialization in irlap_send_rd_frame()
Signed-off-by: Maks Naumov <maksqwe1@ukr.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:05:52 -07:00
Anish Bhatt
8d21797df9 libcxgbi/cxgb4i : Fix ipv6 build failure caught with randconfig
Previous guard of IS_ENABLED(CONFIG_IPV6) is not sufficient when cxgbi drivers
are built into kernel but ipv6 is not.

v2: Use Kconfig to disable compiling cxgbi built into kernel when ipv6 is
compiled as a module

Fixes: e81fbf6cd6 ("libcxgbi:cxgb4i Guard ipv6 code with a config check")
Fixes: fc8d0590d9 ("libcxgbi: Add ipv6 api to driver")
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:04:47 -07:00
Govindarajulu Varadarajan
7b31b4deda tg3: fix return value in tg3_get_stats64
When tp->hw_stats is 0, tg3_get_stats64 should display previously
recorded stats. So it returns &tp->net_stats_prev. But the caller,
dev_get_stats, ignores the return value.

Fix this by assigning tp->net_stats_prev to stats and returning stats.

Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Acked-by: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:04:47 -07:00
Sowmini Varadhan
1d311ad2f9 sunvnet: Schedule maybe_tx_wakeup() as a tasklet from ldc_rx path
At the tail of vnet_event(), if we hit the maybe_tx_wakeup()
condition, we try to take the netif_tx_lock() in the
recv-interrupt-context and can deadlock with dev_watchdog().
vnet_event() should schedule maybe_tx_wakeup() as a tasklet
to avoid this deadlock

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:04:46 -07:00
Sowmini Varadhan
adddc32d6f sunvnet: Do not spin in an infinite loop when vio_ldc_send() returns EAGAIN
ldc_rx -> vnet_rx -> .. -> vnet_walk_rx->vnet_send_ack should not
spin into an infinite loop waiting  EAGAIN to lift.

The sender could have sent us a burst, and gone to lunch without
doing any more ldc_read()'s. That should not cause the receiver to
loop infinitely till soft-lockup kicks in.

Similarly __vnet_tx_trigger should only loop on EAGAIN a finite
number of times. The caller (vnet_start_xmit()) already has code
to reset the dring state and bail on errors from __vnet_tx_trigger

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Raghuram Kothakota <raghuram.kothakota@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:04:46 -07:00
Sowmini Varadhan
1f6394e382 sunvnet: Do not ask for an ACK for every dring transmit
No need to ask for an ack with every vnet_start_xmit()- the single
ACK with DRING_STOPPED is sufficient for the protocol, and we free
the sk_buff in vnet_start_xmit itself, so we dont need an ACK back.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Raghuram Kothakota <raghuram.kothakota@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:04:46 -07:00
chas williams - CONTRACTOR
8356f9d564 lec: Fix bug introduced by b67bfe0d42
b67bfe0d42 (hlist: drop the node
parameter from iterators) dropped the node parameter from
iterators which lec_tbl_walk() was using to iterate the list.

Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:04:46 -07:00
chas williams - CONTRACTOR
de713b5794 atm/svc: Fix blocking in wait loop
One should not call blocking primitives inside a wait loop, since both
require task_struct::state to sleep, so the inner will destroy the
outer state.

sigd_enq() will possibly sleep for alloc_skb().  Move sigd_enq() before
prepare_to_wait() to avoid sleeping while waiting interruptibly.  You do
not actually need to call sigd_enq() after the initial prepare_to_wait()
because we test the termination condition before calling schedule().

Based on suggestions from Peter Zijlstra.

Signed-off-by: Chas Williams <chas@cmf.n4rl.navy.mil>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:04:46 -07:00
Stanislaw Gruszka
10545937e8 myri10ge: check for DMA mapping errors
On IOMMU systems DMA mapping can fail, we need to check for
that possibility.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:04:46 -07:00
Christoph Jaeger
3791b3f6fb openvswitch: Fix memory leak in ovs_vport_alloc() error path
ovs_vport_alloc() bails out without freeing the memory 'vport' points to.

Picked up by Coverity - CID 1230503.

Fixes: 5cd667b0a4 ("openvswitch: Allow each vport to have an array of 'port_id's.")
Signed-off-by: Christoph Jaeger <cj@linux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-13 20:04:46 -07:00
Linus Torvalds
f0094b28f3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Several networking final fixes and tidies for the merge window:

   1) Changes during the merge window unintentionally took away the
      ability to build bluetooth modular, fix from Geert Uytterhoeven.

   2) Several phy_node reference count bug fixes from Uwe Kleine-König.

   3) Fix ucc_geth build failures, also from Uwe Kleine-König.

   4) Fix klog false positivies when netlink messages go to network
      taps, by properly resetting the network header.  Fix from Daniel
      Borkmann.

   5) Sizing estimate of VF netlink messages is too small, from Jiri
      Benc.

   6) New APM X-Gene SoC ethernet driver, from Iyappan Subramanian.

   7) VLAN untagging is erroneously dependent upon whether the VLAN
      module is loaded or not, but there are generic dependencies that
      matter wrt what can be expected as the SKB enters the stack.
      Make the basic untagging generic code, and do it unconditionally.
      From Vlad Yasevich.

   8) xen-netfront only has so many slots in it's transmit queue so
      linearize packets that have too many frags.  From Zoltan Kiss.

   9) Fix suspend/resume PHY handling in bcmgenet driver, from Florian
      Fainelli"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (55 commits)
  net: bcmgenet: correctly resume adapter from Wake-on-LAN
  net: bcmgenet: update UMAC_CMD only when link is detected
  net: bcmgenet: correctly suspend and resume PHY device
  net: bcmgenet: request and enable main clock earlier
  net: ethernet: myricom: myri10ge: myri10ge.c: Cleaning up missing null-terminate after strncpy call
  xen-netfront: Fix handling packets on compound pages with skb_linearize
  net: fec: Support phys probed from devicetree and fixed-link
  smsc: replace WARN_ON() with WARN_ON_SMP()
  xen-netback: Don't deschedule NAPI when carrier off
  net: ethernet: qlogic: qlcnic: Remove duplicate object file from Makefile
  wan: wanxl: Remove typedefs from struct names
  m68k/atari: EtherNEC - ethernet support (ne)
  net: ethernet: ti: cpmac.c: Cleaning up missing null-terminate after strncpy call
  hdlc: Remove typedefs from struct names
  airo_cs: Remove typedef local_info_t
  atmel: Remove typedef atmel_priv_ioctl
  com20020_cs: Remove typedef com20020_dev_t
  ethernet: amd: Remove typedef local_info_t
  net: Always untag vlan-tagged traffic on input.
  drivers: net: Add APM X-Gene SoC ethernet driver support.
  ...
2014-08-13 18:27:40 -06:00
Linus Torvalds
13b102bf48 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull Sparc fixes from David Miller:
 "Sparc bug fixes, one of which was preventing successful SMP boots with
  mainline"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc64: Fix pcr_ops initialization and usage bugs.
  sparc64: Do not disable interrupts in nmi_cpu_busy()
  sparc: Hook up seccomp and getrandom system calls.
  sparc: fix decimal printf format specifiers prefixed with 0x
2014-08-13 18:26:50 -06:00
Linus Torvalds
81c02a21b2 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/apic updates from Thomas Gleixner:
 "This is a major overhaul to the x86 apic subsystem consisting of the
  following parts:

   - Remove obsolete APIC driver abstractions (David Rientjes)

   - Use the irqdomain facilities to dynamically allocate IRQs for
     IOAPICs.  This is a prerequisite to enable IOAPIC hotplug support,
     and it also frees up wasted vectors (Jiang Liu)

   - Misc fixlets.

  Despite the hickup in Ingos previous pull request - caused by the
  missing fixup for the suspend/resume issue reported by Borislav - I
  strongly recommend that this update finds its way into 3.17.  Some
  history for you:

  This is preparatory work for physical IOAPIC hotplug.  The first
  attempt to support this was done by Yinghai and I shot it down because
  it just added another layer of obscurity and complexity to the already
  existing mess without tackling the underlying shortcomings of the
  current implementation.

  After quite some on- and offlist discussions, I requested that the
  design of this functionality must use generic infrastructure, i.e.
  irq domains, which provide all the mechanisms to dynamically map linux
  interrupt numbers to physical interrupts.

  Jiang picked up the idea and did a great job of consolidating the
  existing interfaces to manage the x86 (IOAPIC) interrupt system by
  utilizing irq domains.

  The testing in tip, Linux-next and inside of Intel on various machines
  did not unearth any oddities until Borislav exposed it to one of his
  oddball machines.  The issue was resolved quickly, but unfortunately
  the fix fell through the cracks and did not hit the tip tree before
  Ingo sent the pull request.  Not entirely Ingos fault, I also assumed
  that the fix was already merged when Ingo asked me whether he could
  send it.

  Nevertheless this work has a proper design, has undergone several
  rounds of review and the final fallout after applying it to tip and
  integrating it into Linux-next has been more than moderate.  It's the
  ground work not only for IOAPIC hotplug, it will also allow us to move
  the lowlevel vector allocation into the irqdomain hierarchy, which
  will benefit other architectures as well.  Patches are posted already,
  but they are on hold for two weeks, see below.

  I really appreciate the competence and responsiveness Jiang has shown
  in course of this endavour.  So I'm sure that any fallout of this will
  be addressed in a timely manner.

  FYI, I'm vanishing for 2 weeks into my annual kids summer camp kitchen
  duty^Wvacation, while you folks are drooling at KS/LinuxCon :) But HPA
  will have a look at the hopefully zero fallout until I'm back"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
  x86, irq, PCI: Keep IRQ assignment for PCI devices during suspend/hibernation
  x86/apic/vsmp: Make is_vsmp_box() static
  x86, apic: Remove enable_apic_mode callback
  x86, apic: Remove setup_portio_remap callback
  x86, apic: Remove multi_timer_check callback
  x86, apic: Replace noop_check_apicid_used
  x86, apic: Remove check_apicid_present callback
  x86, apic: Remove mps_oem_check callback
  x86, apic: Remove smp_callin_clear_local_apic callback
  x86, apic: Replace trampoline physical addresses with defaults
  x86, apic: Remove x86_32_numa_cpu_node callback
  x86: intel-mid: Use the new io_apic interfaces
  x86, vsmp: Remove is_vsmp_box() from apic_is_clustered_box()
  x86, irq: Clean up irqdomain transition code
  x86, irq, devicetree: Release IOAPIC pin when PCI device is disabled
  x86, irq, SFI: Release IOAPIC pin when PCI device is disabled
  x86, irq, mpparse: Release IOAPIC pin when PCI device is disabled
  x86, irq, ACPI: Release IOAPIC pin when PCI device is disabled
  x86, irq: Introduce helper functions to release IOAPIC pin
  x86, irq: Simplify the way to handle ISA IRQ
  ...
2014-08-13 18:23:32 -06:00
Linus Torvalds
d27c0d9018 Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/efix fixes from Peter Anvin:
 "Two EFI-related Kconfig changes, which happen to touch immediately
  adjacent lines in Kconfig and thus collapse to a single patch"

* 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Enforce CONFIG_RELOCATABLE for EFI boot stub
  x86/efi: Fix 3DNow optimization build failure in EFI stub
2014-08-13 18:21:35 -06:00
Linus Torvalds
7453f33b2e Merge branch 'x86-xsave-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/xsave changes from Peter Anvin:
 "This is a patchset to support the XSAVES instruction required to
  support context switch of supervisor-only features in upcoming
  silicon.

  This patchset missed the 3.16 merge window, which is why it is based
  on 3.15-rc7"

* 'x86-xsave-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, xsave: Add forgotten inline annotation
  x86/xsaves: Clean up code in xstate offsets computation in xsave area
  x86/xsave: Make it clear that the XSAVE macros use (%edi)/(%rdi)
  Define kernel API to get address of each state in xsave area
  x86/xsaves: Enable xsaves/xrstors
  x86/xsaves: Call booting time xsaves and xrstors in setup_init_fpu_buf
  x86/xsaves: Save xstate to task's xsave area in __save_fpu during booting time
  x86/xsaves: Add xsaves and xrstors support for booting time
  x86/xsaves: Clear reserved bits in xsave header
  x86/xsaves: Use xsave/xrstor for saving and restoring user space context
  x86/xsaves: Use xsaves/xrstors for context switch
  x86/xsaves: Use xsaves/xrstors to save and restore xsave area
  x86/xsaves: Define a macro for handling xsave/xrstor instruction fault
  x86/xsaves: Define macros for xsave instructions
  x86/xsaves: Change compacted format xsave area header
  x86/alternative: Add alternative_input_2 to support alternative with two features and input
  x86/xsaves: Add a kernel parameter noxsaves to disable xsaves/xrstors
2014-08-13 18:20:04 -06:00
Linus Torvalds
fd1cf90580 Metag architecture changes for v3.17
Just a couple of minor static analysis fixes, removal of a NULL check
 that should never happen, and fix an error check where an unsigned value
 was being checked to see if it was negative.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJT6JHIAAoJEGwLaZPeOHZ6dfoP/0NrVlO+ldKALXEaPXvLyiWN
 MAhWt8xMHoTsgGlnbnr4Gi6mDDuSnH9CvCoelE/Htn4UOnNbH1bZ/XKJ9gyfnoIv
 NM0xlInWd9dN1PObgcLONBG5KF9vSb8yMhzCZ+GJYZY7mWmjp7641csLL+ksC5uD
 +qkl6+/RsEAXNHC9Hnzib0zpUf/q+u1ZWa5LmvoUTtCuOfdCEeAN+oCuM7+/d9vd
 iyrzXTzdLOn8OA6biNWHH9OVzBEvmAvkJbhccBN1VOUi85OWnmCMrBkB11T9hUSp
 +O2lNfAJYZfqf51OMh8OmdBhcN+GxJf8YIgJqlfzbEzKDRCqhUvvYIDI0tXmTKIC
 CSkifBSWfyi8/+Nrijs+8Z/q7EJJ3VHX2AFwQlZM1eH4cZR4MM0AmSAGsu+vwGsA
 QwkHRcKKqGJUXGusw5+ezPYy7Esn23KkWIBjucZm2gRc1zb+KqB8+TnypJMIpl5U
 Kf8yWahptEH9fTFjqu557E29Oe0lb9fKRGPovdLfcQ9jMNHaZHoix19+kMO+dol2
 LgKhvt1OEC/NCZf/FjFwOvkdRWmkbOLyn2NJinmCGpAmT88LUI/vcpGHpjWgdxhd
 svDd/hFxIUnv10E9wPf27UMrS5zVKJ1loPfQ8N6m89/q3Mx1Q+zErgeq6TWfWBAm
 UJG8U2dv4JeIWUHmcSQ+
 =3kq/
 -----END PGP SIGNATURE-----

Merge tag 'metag-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag

Pull metag architecture updates from James Hogan:
 "Just a couple of minor static analysis fixes, removal of a NULL check
  that should never happen, and fix an error check where an unsigned
  value was being checked to see if it was negative"

* tag 'metag-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
  metag: cachepart: Fix failure check
  metag: hugetlbpage: Remove null pointer checks that could never happen
2014-08-13 18:18:09 -06:00
Linus Torvalds
06b8ab5528 NFS client updates for Linux 3.17
Highlights include:
 
 - Stable fix for a bug in nfs3_list_one_acl()
 - Speed up NFS path walks by supporting LOOKUP_RCU
 - More read/write code cleanups
 - pNFS fixes for layout return on close
 - Fixes for the RCU handling in the rpcsec_gss code
 - More NFS/RDMA fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJT65zoAAoJEGcL54qWCgDyvq8QAJ+OKuC5dpngrZ13i4ZJIcK1
 TJSkWCr44FhYPlrmkLCntsGX6C0376oFEtJ5uqloqK0+/QtvwRNVSQMKaJopKIVY
 mR4En0WwpigxVQdW2lgto6bfOhzMVO+llVdmicEVrU8eeSThATxGNv7rxRzWorvL
 RX3TwBkWSc0kLtPi66VRFQ1z+gg5I0kngyyhsKnLOaHHtpTYP2JDZlRPRkokXPUg
 nmNedmC3JrFFkarroFIfYr54Qit2GW/eI2zVhOwHGCb45j4b2wntZ6wr7LpUdv3A
 OGDBzw59cTpcx3Hij9CFvLYVV9IJJHBNd2MJqdQRtgWFfs+aTkZdk4uilUJCIzZh
 f4BujQAlm/4X1HbPxsSvkCRKga7mesGM7e0sBDPHC1vu0mSaY1cakcj2kQLTpbQ7
 gqa1cR3pZ+4shCq37cLwWU0w1yElYe1c4otjSCttPCrAjXbXJZSFzYnHm8DwKROR
 t+yEDRL5BIXPu1nEtSnD2+xTQ3vUIYXooZWEmqLKgRtBTtPmgSn9Vd8P1OQXmMNo
 VJyFXyjNx5WH06Wbc/jLzQ1/cyhuPmJWWyWMJlVROyv+FXk9DJUFBZuTkpMrIPcF
 NlBXLV1GnA7PzMD9Xt9bwqteERZl6fOUDJLWS9P74kTk5c2kD+m+GaqC/rBTKKXc
 ivr2s7aIDV48jhnwBSVL
 =KE07
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
 "Highlights include:

   - stable fix for a bug in nfs3_list_one_acl()
   - speed up NFS path walks by supporting LOOKUP_RCU
   - more read/write code cleanups
   - pNFS fixes for layout return on close
   - fixes for the RCU handling in the rpcsec_gss code
   - more NFS/RDMA fixes"

* tag 'nfs-for-3.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (79 commits)
  nfs: reject changes to resvport and sharecache during remount
  NFS: Avoid infinite loop when RELEASE_LOCKOWNER getting expired error
  SUNRPC: remove all refcounting of groupinfo from rpcauth_lookupcred
  NFS: fix two problems in lookup_revalidate in RCU-walk
  NFS: allow lockless access to access_cache
  NFS: teach nfs_lookup_verify_inode to handle LOOKUP_RCU
  NFS: teach nfs_neg_need_reval to understand LOOKUP_RCU
  NFS: support RCU_WALK in nfs_permission()
  sunrpc/auth: allow lockless (rcu) lookup of credential cache.
  NFS: prepare for RCU-walk support but pushing tests later in code.
  NFS: nfs4_lookup_revalidate: only evaluate parent if it will be used.
  NFS: add checks for returned value of try_module_get()
  nfs: clear_request_commit while holding i_lock
  pnfs: add pnfs_put_lseg_async
  pnfs: find swapped pages on pnfs commit lists too
  nfs: fix comment and add warn_on for PG_INODE_REF
  nfs: check wait_on_bit_lock err in page_group_lock
  sunrpc: remove "ec" argument from encrypt_v2 operation
  sunrpc: clean up sparse endianness warnings in gss_krb5_wrap.c
  sunrpc: clean up sparse endianness warnings in gss_krb5_seal.c
  ...
2014-08-13 18:13:19 -06:00
Linus Torvalds
dc1cc85133 xfs: update for 3.17-rc1
This update contains:
 o conversion of the XFS core to pass negative error numbers
 o restructing of core XFS code that is shared with userspace to fs/xfs/libxfs
 o introduction of sysfs interface for XFS
 o bulkstat refactoring
 o demand driven speculative preallocation removal
 o XFS now always requires 64 bit sectors to be configured
 o metadata verifier changes to ensure CRCs are calculated during log recovery
 o various minor code cleanups
 o miscellaneous bug fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJT6dJvAAoJEK3oKUf0dfodYOYP/jhWGv29OrX958WC+U1kCItb
 WmNV6dSENWgtgYRgdtcxa5ZE0HeEndvN63LOCzAB9VegZK7XjpLob9J2f07gMvP1
 u4vv9LvgLdky/tYxttBYHuyen0hQO0r+esRg/xrawpklKQ3Efofi4MUSbb5ZBDzP
 QvVeNhIDI04A0CoDngWDkV1PpbdwDUjVZFZon36XVDTcSCf/h2B+nekjOVJQpEDC
 qJ9nZWRgAcm4IzZZBqwGt0LJ62Z+Ww8jksevWEnjcXm1xfsltL8gf8o6WOX5iXDR
 PSeoJwVRAxLfiQxnQSENEOeLvKWVXNyC/B10w1H20qZkvQJMsX/Fq0MraBL0Ediu
 0qyZgJsOyoOAvJYXWfUyykUe03dF5/cgrB+RpOLmp4B9lCugCEsQHtjRXEHE1EK2
 +4sHc13I7+SLIAlgySmJdrLe8Kq1vamo0/WD2wH+pdOvNmQ7HJzBUFEj1D83b55a
 DWfBjbz/N5lJW82Vfek2coURJGi/1B87koAtODXWfeM+nbBs47z0dv31G235Boq3
 +Qy2qQdCv05xV+CZn/RAmaudAFFnJEV8L4ADpwqSGfVM72ug0KQF8M5DblG1Q8sb
 xRAQq+krG7K4ui1kPNKcxoHoUIYQUJm0p1aU4V2OAQlikYCCaSgRYGF2xzSVC7j9
 uZd4e4bllsBdPF/Se+8W
 =EUjq
 -----END PGP SIGNATURE-----

Merge tag 'xfs-for-linus-3.17-rc1' of git://oss.sgi.com/xfs/xfs

Pull xfs update from Dave Chinner:
 "This update contains:
   - conversion of the XFS core to pass negative error numbers
   - restructing of core XFS code that is shared with userspace to
     fs/xfs/libxfs
   - introduction of sysfs interface for XFS
   - bulkstat refactoring
   - demand driven speculative preallocation removal
   - XFS now always requires 64 bit sectors to be configured
   - metadata verifier changes to ensure CRCs are calculated during log
     recovery
   - various minor code cleanups
   - miscellaneous bug fixes

  The diffstat is kind of noisy because of the restructuring of the code
  to make kernel/userspace code sharing simpler, along with the XFS wide
  change to use the standard negative error return convention (at last!)"

* tag 'xfs-for-linus-3.17-rc1' of git://oss.sgi.com/xfs/xfs: (45 commits)
  xfs: fix coccinelle warnings
  xfs: flush both inodes in xfs_swap_extents
  xfs: fix swapext ilock deadlock
  xfs: kill xfs_vnode.h
  xfs: kill VN_MAPPED
  xfs: kill VN_CACHED
  xfs: kill VN_DIRTY()
  xfs: dquot recovery needs verifiers
  xfs: quotacheck leaves dquot buffers without verifiers
  xfs: ensure verifiers are attached to recovered buffers
  xfs: catch buffers written without verifiers attached
  xfs: avoid false quotacheck after unclean shutdown
  xfs: fix rounding error of fiemap length parameter
  xfs: introduce xfs_bulkstat_ag_ichunk
  xfs: require 64-bit sector_t
  xfs: fix uflags detection at xfs_fs_rm_xquota
  xfs: remove XFS_IS_OQUOTA_ON macros
  xfs: tidy up xfs_set_inode32
  xfs: allow inode allocations in post-growfs disk space
  xfs: mark xfs_qm_quotacheck as static
  ...
2014-08-13 17:49:53 -06:00
Linus Torvalds
cec997093b Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota, reiserfs, UDF updates from Jan Kara:
 "Scalability improvements for quota, a few reiserfs fixes, and couple
  of misc cleanups (udf, ext2)"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  reiserfs: Fix use after free in journal teardown
  reiserfs: fix corruption introduced by balance_leaf refactor
  udf: avoid redundant memcpy when writing data in ICB
  fs/udf: re-use hex_asc_upper_{hi,lo} macros
  fs/quota: kernel-doc warning fixes
  udf: use linux/uaccess.h
  fs/ext2/super.c: Drop memory allocation cast
  quota: remove dqptr_sem
  quota: simplify remove_inode_dquot_ref()
  quota: avoid unnecessary dqget()/dqput() calls
  quota: protect Q_GETFMT by dqonoff_mutex
2014-08-13 17:45:40 -06:00
Linus Torvalds
8d2d441ac4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph updates from Sage Weil:
 "There is a lot of refactoring and hardening of the libceph and rbd
  code here from Ilya that fix various smaller bugs, and a few more
  important fixes with clone overlap.  The main fix is a critical change
  to the request_fn handling to not sleep that was exposed by the recent
  mutex changes (which will also go to the 3.16 stable series).

  Yan Zheng has several fixes in here for CephFS fixing ACL handling,
  time stamps, and request resends when the MDS restarts.

  Finally, there are a few cleanups from Himangi Saraogi based on
  Coccinelle"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (39 commits)
  libceph: set last_piece in ceph_msg_data_pages_cursor_init() correctly
  rbd: remove extra newlines from rbd_warn() messages
  rbd: allocate img_request with GFP_NOIO instead GFP_ATOMIC
  rbd: rework rbd_request_fn()
  ceph: fix kick_requests()
  ceph: fix append mode write
  ceph: fix sizeof(struct tYpO *) typo
  ceph: remove redundant memset(0)
  rbd: take snap_id into account when reading in parent info
  rbd: do not read in parent info before snap context
  rbd: update mapping size only on refresh
  rbd: harden rbd_dev_refresh() and callers a bit
  rbd: split rbd_dev_spec_update() into two functions
  rbd: remove unnecessary asserts in rbd_dev_image_probe()
  rbd: introduce rbd_dev_header_info()
  rbd: show the entire chain of parent images
  ceph: replace comma with a semicolon
  rbd: use rbd_segment_name_free() instead of kfree()
  ceph: check zero length in ceph_sync_read()
  ceph: reset r_resend_mds after receiving -ESTALE
  ...
2014-08-13 17:43:29 -06:00
Linus Torvalds
89838b80bb No significant changes, mostly small fixes here and there. The more important
fixes are:
 
 * UBI deleted list items while iterating the list with 'list_for_each_entry'
 * The UBI block driver did not work properly with very large UBI volumes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJT6IALAAoJECmIfjd9wqK0yRwP/R4XpNyWZOrK72fvb3M6ucK0
 SK3pu6FNfvOSYibmhOtZWjFbiMkgUd7o8jy+4uJa6PIMRQbKGrH/xvujPzON6GQT
 ZZxeJ7n1O8IAZL/dibqFpEv3NQFjgz1IaLnpktQkkp6rWsXCBYNbvFIywWYgkndt
 1w11QG/TAhkJcA/0Xy/+zmkHeXjOQrVeVMbBN2MMVekWg8JgMNTd9mkWGa1A1oGI
 nuq30nIkUP7LX9T/olFGhjFIeeb/bkmWgpzS4K9PI+hOGGCZvIWj/pu8Jj8DAR54
 UGA7qtMkTjxldvHPnuACsSKNS/N7UGY64kTwcucEcJaN2MXmaBFP8ipI/wtb9bCV
 fUoBt+p470E7iDoQymbM5zNw/HPNh4XWVd2X1oYVftmO70PZ6sWeOKWC+tJn/Aj8
 xsrgd1PcWmuyu399EFW/Il5ItOqfYsNIkVFNzIb1O6Pd8Ylt5eYtuyoPXk5aRA4p
 xZ3ohO3OihvWXwwtCUjforwPy37iaX8vwnuI9xdgnEisTHJWR6PYITvcrwqAIH44
 6PEqINU4zwGTGxOTOWKZxdtPVIuChuwqDyY+7+xQD+LIPQ1/BodozExGiQ0lhAmA
 DDC2Te8q5lpZTz03E8ot9zZHmjQoG+uo10DhEzT5U1ycv9p7dISGfbVucyuKDTw9
 JdUnI8cZ2k6mZi8q6dfA
 =zgqa
 -----END PGP SIGNATURE-----

Merge tag 'upstream-3.17-rc1' of git://git.infradead.org/linux-ubifs

Pull UBI/UBIFS changes from Artem Bityutskiy:
 "No significant changes, mostly small fixes here and there.  The more
  important fixes are:

   - UBI deleted list items while iterating the list with
     'list_for_each_entry'
   - The UBI block driver did not work properly with very large UBI
     volumes"

* tag 'upstream-3.17-rc1' of git://git.infradead.org/linux-ubifs: (21 commits)
  UBIFS: Add log overlap assertions
  Revert "UBIFS: add a log overlap assertion"
  UBI: bugfix in ubi_wl_flush()
  UBI: block: Avoid disk size integer overflow
  UBI: block: Set disk_capacity out of the mutex
  UBI: block: Make ubiblock_resize return something
  UBIFS: add a log overlap assertion
  UBIFS: remove unnecessary check
  UBIFS: remove mst_mutex
  UBIFS: kernel-doc warning fix
  UBI: init_volumes: Ignore volumes with no LEBs
  UBIFS: replace seq_printf by seq_puts
  UBIFS: replace count*size kzalloc by kcalloc
  UBIFS: kernel-doc warning fix
  UBIFS: fix error path in create_default_filesystem()
  UBIFS: fix spelling of "scanned"
  UBIFS: fix some comments
  UBIFS: remove useless @ecc in struct ubifs_scan_leb
  UBIFS: remove useless statements
  UBIFS: Add missing break statements in dbg_chk_pnode()
  ...
2014-08-13 17:42:11 -06:00
Aneesh Kumar K.V
9e813308a5 powerpc/thp: Add tracepoints to track hugepage invalidate
Add tracepoint to track hugepage invalidate. This help us
in debugging difficult to track bugs.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 18:20:42 +10:00
Aneesh Kumar K.V
85c1fafd72 powerpc/mm: Use read barrier when creating real_pte
On ppc64 we support 4K hash pte with 64K page size. That requires
us to track the hash pte slot information on a per 4k basis. We do that
by storing the slot details in the second half of pte page. The pte bit
_PAGE_COMBO is used to indicate whether the second half need to be
looked while building real_pte. We need to use read memory barrier while
doing that so that load of hidx is not reordered w.r.t _PAGE_COMBO
check. On the store side we already do a lwsync in __hash_page_4K

CC: <stable@vger.kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 18:20:41 +10:00
Aneesh Kumar K.V
7e467245bf powerpc/thp: Use ACCESS_ONCE when loading pmdp
We would get wrong results in compiler recomputed old_pmd. Avoid
that by using ACCESS_ONCE

CC: <stable@vger.kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 18:20:41 +10:00
Aneesh Kumar K.V
969b7b208f powerpc/thp: Invalidate with vpn in loop
As per ISA, for 4k base page size we compare 14..65 bits of VA specified
with the entry_VA in tlb. That implies we need to make sure we do a
tlbie with all the possible 4k va we used to access the 16MB hugepage.
With 64k base page size we compare 14..57 bits of VA. Hence we cannot
ignore the lower 24 bits of va while tlbie .We also cannot tlb
invalidate a 16MB entry with just one tlbie instruction because
we don't track which va was used to instantiate the tlb entry.

CC: <stable@vger.kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 18:20:40 +10:00
Aneesh Kumar K.V
fc04795575 powerpc/thp: Handle combo pages in invalidate
If we changed base page size of the segment, either via sub_page_protect
or via remap_4k_pfn, we do a demote_segment which doesn't flush the hash
table entries. We do a lazy hash page table flush for all mapped pages
in the demoted segment. This happens when we handle hash page fault for
these pages.

We use _PAGE_COMBO bit along with _PAGE_HASHPTE to indicate whether a
pte is backed by 4K hash pte. If we find _PAGE_COMBO not set on the pte,
that implies that we could possibly have older 64K hash pte entries in
the hash page table and we need to invalidate those entries.

Use _PAGE_COMBO to determine the page size with which we should
invalidate the hash table entries on unmap.

CC: <stable@vger.kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 18:20:39 +10:00
Aneesh Kumar K.V
629149fae4 powerpc/thp: Invalidate old 64K based hash page mapping before insert of 4k pte
If we changed base page size of the segment, either via sub_page_protect
or via remap_4k_pfn, we do a demote_segment which doesn't flush the hash
table entries. We do a lazy hash page table flush for all mapped pages
in the demoted segment. This happens when we handle hash page fault
for these pages.

We use _PAGE_COMBO bit along with _PAGE_HASHPTE to indicate whether a
pte is backed by 4K hash pte. If we find _PAGE_COMBO not set on the pte,
that implies that we could possibly have older 64K hash pte entries in
the hash page table and we need to invalidate those entries.

Handle this correctly for 16M pages

CC: <stable@vger.kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 18:20:39 +10:00
Aneesh Kumar K.V
fa1f8ae80f powerpc/thp: Don't recompute vsid and ssize in loop on invalidate
The segment identifier and segment size will remain the same in
the loop, So we can compute it outside. We also change the
hugepage_invalidate interface so that we can use it the later patch

CC: <stable@vger.kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 18:20:38 +10:00
Aneesh Kumar K.V
b0aa44a3df powerpc/thp: Add write barrier after updating the valid bit
With hugepages, we store the hpte valid information in the pte page
whose address is stored in the second half of the PMD. Use a
write barrier to make sure clearing pmd busy bit and updating
hpte valid info are ordered properly.

CC: <stable@vger.kernel.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 18:20:37 +10:00
Nishanth Aravamudan
2fabf084b6 powerpc: reorder per-cpu NUMA information's initialization
There is an issue currently where NUMA information is used on powerpc
(and possibly ia64) before it has been read from the device-tree, which
leads to large slab consumption with CONFIG_SLUB and memoryless nodes.

NUMA powerpc non-boot CPU's cpu_to_node/cpu_to_mem is only accurate
after start_secondary(), similar to ia64, which is invoked via
smp_init().

Commit 6ee0578b4d ("workqueue: mark init_workqueues() as
early_initcall()") made init_workqueues() be invoked via
do_pre_smp_initcalls(), which is obviously before the secondary
processors are online.

Additionally, the following commits changed init_workqueues() to use
cpu_to_node to determine the node to use for kthread_create_on_node:

bce903809a ("workqueue: add wq_numa_tbl_len and
wq_numa_possible_cpumask[]")
f3f90ad469 ("workqueue: determine NUMA node of workers accourding to
the allowed cpumask")

Therefore, when init_workqueues() runs, it sees all CPUs as being on
Node 0. On LPARs or KVM guests where Node 0 is memoryless, this leads to
a high number of slab deactivations
(http://www.spinics.net/lists/linux-mm/msg67489.html).

Fix this by initializing the powerpc-specific CPU<->node/local memory
node mapping as early as possible, which on powerpc is
do_init_bootmem(). Currently that function initializes the mapping for
the boot CPU, but we extend it to setup the mapping for all possible
CPUs. Then, in smp_prepare_cpus(), we can correspondingly set the
per-cpu values for all possible CPUs. That ensures that before the
early_initcalls run (and really as early as possible), the per-cpu NUMA
mapping is accurate.

While testing memoryless nodes on PowerKVM guests with a fix to the
workqueue logic to use cpu_to_mem() instead of cpu_to_node(), with a
guest topology of:

available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
node 0 size: 0 MB
node 0 free: 0 MB
node 1 cpus: 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
node 1 size: 16336 MB
node 1 free: 15329 MB
node distances:
node   0   1
  0:  10  40
  1:  40  10

the slab consumption decreases from

Slab:             932416 kB
SUnreclaim:       902336 kB

to

Slab:             395264 kB
SUnreclaim:       359424 kB

And we a corresponding increase in the slab efficiency from

slab                                   mem     objs    slabs
                                      used   active   active
------------------------------------------------------------
kmalloc-16384                       337 MB   11.28%  100.00%
task_struct                         288 MB    9.93%  100.00%

to

slab                                   mem     objs    slabs
                                      used   active   active
------------------------------------------------------------
kmalloc-16384                        37 MB  100.00%  100.00%
task_struct                          31 MB  100.00%  100.00%

Powerpc didn't support memoryless nodes until recently (64bb80d87f
"powerpc/numa: Enable CONFIG_HAVE_MEMORYLESS_NODES" and 8c27226119
"powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"). Those commits also
helped improve memory consumption with these kind of environments.

Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 15:14:05 +10:00
Himangi Saraogi
d658972284 powerpc/perf/hv-24x7: Use kmem_cache_free
Free memory allocated using kmem_cache_zalloc using kmem_cache_free
rather than kfree.

The Coccinelle semantic patch that makes this change is as follows:

// <smpl>
@@
expression x,E,c;
@@

 x = \(kmem_cache_alloc\|kmem_cache_zalloc\|kmem_cache_alloc_node\)(c,...)
 ... when != x = E
     when != &x
?-kfree(x)
+kmem_cache_free(c,x)
// </smpl>

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 15:14:04 +10:00
Thomas Falcon
587870e865 powerpc/pseries/hvcserver: Fix endian issue in hvcs_get_partner_info
A buffer returned by H_VTERM_PARTNER_INFO contains device information
in big endian format, causing problems for little endian architectures.
This patch ensures that they are in cpu endian.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 15:14:04 +10:00
Anton Blanchard
a71d64b4dc powerpc: Hard disable interrupts in xmon
xmon only soft disables interrupts. This seems like a bad idea - we
certainly don't want decrementer and PMU exceptions going off when
we are debugging something inside xmon.

This issue was uncovered when the hard lockup detector went off
inside xmon. To ensure we wont get a spurious hard lockup warning,
I also call touch_nmi_watchdog() when exiting xmon.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 15:13:48 +10:00
Nishanth Aravamudan
56758e3c3c powerpc: remove duplicate definition of TEXASR_FS
It appears that commits 7f06f21d40 ("powerpc/tm: Add checking to
treclaim/trechkpt") and e4e3812150 ("KVM: PPC: Book3S HV: Add
transactional memory support") both added definitions of TEXASR_FS.
Remove one of them. At the same time, fix the alignment of the remaining
definition (should be tab-separated like the rest of the #defines).

Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 15:13:47 +10:00
Gavin Shan
5efbabe09d powerpc/pseries: Avoid deadlock on removing ddw
Function remove_ddw() could be called in of_reconfig_notifier and
we potentially remove the dynamic DMA window property, which invokes
of_reconfig_notifier again. Eventually, it leads to the deadlock as
following backtrace shows.

The patch fixes the above issue by deferring releasing the dynamic
DMA window property while releasing the device node.

=============================================
[ INFO: possible recursive locking detected ]
3.16.0+ #428 Tainted: G        W
---------------------------------------------
drmgr/2273 is trying to acquire lock:
 ((of_reconfig_chain).rwsem){.+.+..}, at: [<c000000000091890>] \
 .__blocking_notifier_call_chain+0x40/0x78

but task is already holding lock:
 ((of_reconfig_chain).rwsem){.+.+..}, at: [<c000000000091890>] \
 .__blocking_notifier_call_chain+0x40/0x78

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock((of_reconfig_chain).rwsem);
  lock((of_reconfig_chain).rwsem);
 *** DEADLOCK ***

 May be due to missing lock nesting notation

2 locks held by drmgr/2273:
 #0:  (sb_writers#4){.+.+.+}, at: [<c0000000001cbe70>] \
      .vfs_write+0xb0/0x1f8
 #1:  ((of_reconfig_chain).rwsem){.+.+..}, at: [<c000000000091890>] \
      .__blocking_notifier_call_chain+0x40/0x78

stack backtrace:
CPU: 17 PID: 2273 Comm: drmgr Tainted: G        W     3.16.0+ #428
Call Trace:
[c0000000137e7000] [c000000000013d9c] .show_stack+0x88/0x148 (unreliable)
[c0000000137e70b0] [c00000000083cd34] .dump_stack+0x7c/0x9c
[c0000000137e7130] [c0000000000b8afc] .__lock_acquire+0x128c/0x1c68
[c0000000137e7280] [c0000000000b9a4c] .lock_acquire+0xe8/0x104
[c0000000137e7350] [c00000000083588c] .down_read+0x4c/0x90
[c0000000137e73e0] [c000000000091890] .__blocking_notifier_call_chain+0x40/0x78
[c0000000137e7490] [c000000000091900] .blocking_notifier_call_chain+0x38/0x48
[c0000000137e7520] [c000000000682a28] .of_reconfig_notify+0x34/0x5c
[c0000000137e75b0] [c000000000682a9c] .of_property_notify+0x4c/0x54
[c0000000137e7650] [c000000000682bf0] .of_remove_property+0x30/0xd4
[c0000000137e76f0] [c000000000052a44] .remove_ddw+0x144/0x168
[c0000000137e7790] [c000000000053204] .iommu_reconfig_notifier+0x30/0xe0
[c0000000137e7820] [c00000000009137c] .notifier_call_chain+0x6c/0xb4
[c0000000137e78c0] [c0000000000918ac] .__blocking_notifier_call_chain+0x5c/0x78
[c0000000137e7970] [c000000000091900] .blocking_notifier_call_chain+0x38/0x48
[c0000000137e7a00] [c000000000682a28] .of_reconfig_notify+0x34/0x5c
[c0000000137e7a90] [c000000000682e14] .of_detach_node+0x44/0x1fc
[c0000000137e7b40] [c0000000000518e4] .ofdt_write+0x3ac/0x688
[c0000000137e7c20] [c000000000238430] .proc_reg_write+0xb8/0xd4
[c0000000137e7cd0] [c0000000001cbeac] .vfs_write+0xec/0x1f8
[c0000000137e7d70] [c0000000001cc3b0] .SyS_write+0x58/0xa0
[c0000000137e7e30] [c00000000000a064] syscall_exit+0x0/0x98

Cc: stable@vger.kernel.org
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 15:13:47 +10:00
Gavin Shan
f1b3929c23 powerpc/pseries: Failure on removing device node
While running command "drmgr -c phb -r -s 'PHB 528'", following
backtrace jumped out because the target device node isn't marked
with OF_DETACHED by of_detach_node(), which caused by error
returned from memory hotplug related reconfig notifier when
disabling CONFIG_MEMORY_HOTREMOVE. The patch fixes it.

ERROR: Bad of_node_put() on /pci@800000020000210/ethernet@0
CPU: 14 PID: 2252 Comm: drmgr Tainted: G        W     3.16.0+ #427
Call Trace:
[c000000012a776a0] [c000000000013d9c] .show_stack+0x88/0x148 (unreliable)
[c000000012a77750] [c00000000083cd34] .dump_stack+0x7c/0x9c
[c000000012a777d0] [c0000000006807c4] .of_node_release+0x58/0xe0
[c000000012a77860] [c00000000038a7d0] .kobject_release+0x174/0x1b8
[c000000012a77900] [c00000000038a884] .kobject_put+0x70/0x78
[c000000012a77980] [c000000000681680] .of_node_put+0x28/0x34
[c000000012a77a00] [c000000000681ea8] .__of_get_next_child+0x64/0x70
[c000000012a77a90] [c000000000682138] .of_find_node_by_path+0x1b8/0x20c
[c000000012a77b40] [c000000000051840] .ofdt_write+0x308/0x688
[c000000012a77c20] [c000000000238430] .proc_reg_write+0xb8/0xd4
[c000000012a77cd0] [c0000000001cbeac] .vfs_write+0xec/0x1f8
[c000000012a77d70] [c0000000001cc3b0] .SyS_write+0x58/0xa0
[c000000012a77e30] [c00000000000a064] syscall_exit+0x0/0x98

Cc: stable@vger.kernel.org
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 15:13:46 +10:00
Benjamin Herrenschmidt
ce8f150a17 powerpc/boot: Use correct zlib types for comparison
Avoids this warning:

arch/powerpc/boot/gunzip_util.c:118:9: warning: comparison of distinct pointer types lacks a cast

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-13 15:13:45 +10:00