Commit Graph

55461 Commits

Author SHA1 Message Date
Andrew Morton
05b0afd73d mm: add a reminder comment for __GFP_BITS_SHIFT
Cc: Glauber Costa <glommer@parallels.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:34 -08:00
Greg Thelen
44e33e8f95 res_counter: delete res_counter_write()
Since commit 628f423553 ("memcg: limit change shrink usage") both
res_counter_write() and write_strategy_fn have been unused.  This patch
deletes them both.

Signed-off-by: Greg Thelen <gthelen@google.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:33 -08:00
Lai Jiangshan
6715ddf945 hotplug: update nodemasks management
Update nodemasks management for N_MEMORY.

[lliubbo@gmail.com: fix build]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Lin Feng <linfeng@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Bob Liu <lliubbo@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:33 -08:00
Lai Jiangshan
38d7bee9d2 cpuset: use N_MEMORY instead N_HIGH_MEMORY
N_HIGH_MEMORY stands for the nodes that has normal or high memory.
N_MEMORY stands for the nodes that has any memory.

The code here need to handle with the nodes which have memory, we should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Lin Feng <linfeng@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:32 -08:00
Lai Jiangshan
8219fc48ad mm: node_states: introduce N_MEMORY
We have N_NORMAL_MEMORY for standing for the nodes that have normal memory
with zone_type <= ZONE_NORMAL.

And we have N_HIGH_MEMORY for standing for the nodes that have normal or
high memory.

But we don't have any word to stand for the nodes that have *any* memory.

And we have N_CPU but without N_MEMORY.

Current code reuse the N_HIGH_MEMORY for this purpose because any node
which has memory must have high memory or normal memory currently.

A)	But this reusing is bad for *readability*. Because the name
	N_HIGH_MEMORY just stands for high or normal:

A.example 1)
	mem_cgroup_nr_lru_pages():
		for_each_node_state(nid, N_HIGH_MEMORY)

	The user will be confused(why this function just counts for high or
	normal memory node? does it counts for ZONE_MOVABLE's lru pages?)
	until someone else tell them N_HIGH_MEMORY is reused to stand for
	nodes that have any memory.

A.cont) If we introduce N_MEMORY, we can reduce this confusing
	AND make the code more clearly:

A.example 2) mm/page_cgroup.c use N_HIGH_MEMORY twice:

	One is in page_cgroup_init(void):
		for_each_node_state(nid, N_HIGH_MEMORY) {

	It means if the node have memory, we will allocate page_cgroup map for
	the node. We should use N_MEMORY instead here to gaim more clearly.

	The second using is in alloc_page_cgroup():
		if (node_state(nid, N_HIGH_MEMORY))
			addr = vzalloc_node(size, nid);

	It means if the node has high or normal memory that can be allocated
	from kernel. We should keep N_HIGH_MEMORY here, and it will be better
	if the "any memory" semantic of N_HIGH_MEMORY is removed.

B)	This reusing is out-dated if we introduce MOVABLE-dedicated node.
	The MOVABLE-dedicated node should not appear in
	node_stats[N_HIGH_MEMORY] nor node_stats[N_NORMAL_MEMORY],
	because MOVABLE-dedicated node has no high or normal memory.

	In x86_64, N_HIGH_MEMORY=N_NORMAL_MEMORY, if a MOVABLE-dedicated node
	is in node_stats[N_HIGH_MEMORY], it is also means it is in
	node_stats[N_NORMAL_MEMORY], it causes SLUB wrong.

	The slub uses
		for_each_node_state(nid, N_NORMAL_MEMORY)
	and creates kmem_cache_node for MOVABLE-dedicated node and cause problem.

In one word, we need a N_MEMORY.  We just intrude it as an alias to
N_HIGH_MEMORY and fix all im-proper usages of N_HIGH_MEMORY in late
patches.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Lin Feng <linfeng@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:32 -08:00
Kirill A. Shutemov
79da5407ee thp: introduce sysfs knob to disable huge zero page
By default kernel tries to use huge zero page on read page fault.  It's
possible to disable huge zero page by writing 0 or enable it back by
writing 1:

echo 0 >/sys/kernel/mm/transparent_hugepage/khugepaged/use_zero_page
echo 1 >/sys/kernel/mm/transparent_hugepage/khugepaged/use_zero_page

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:32 -08:00
Kirill A. Shutemov
d8a8e1f0da thp, vmstat: implement HZP_ALLOC and HZP_ALLOC_FAILED events
hzp_alloc is incremented every time a huge zero page is successfully
	allocated. It includes allocations which where dropped due
	race with other allocation. Note, it doesn't count every map
	of the huge zero page, only its allocation.

hzp_alloc_failed is incremented if kernel fails to allocate huge zero
	page and falls back to using small pages.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:32 -08:00
Kirill A. Shutemov
e180377f1a thp: change split_huge_page_pmd() interface
Pass vma instead of mm and add address parameter.

In most cases we already have vma on the stack. We provides
split_huge_page_pmd_mm() for few cases when we have mm, but not vma.

This change is preparation to huge zero pmd splitting implementation.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:31 -08:00
Kirill A. Shutemov
93b4796ded thp: do_huge_pmd_wp_page(): handle huge zero page
On write access to huge zero page we alloc a new huge page and clear it.

If ENOMEM, graceful fallback: we create a new pmd table and set pte around
fault address to newly allocated normal (4k) page.  All other ptes in the
pmd set to normal zero page.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-12 17:38:31 -08:00
Jiri Kosina
818b930bc1 Merge branches 'for-3.7/upstream-fixes', 'for-3.8/hidraw', 'for-3.8/i2c-hid', 'for-3.8/multitouch', 'for-3.8/roccat', 'for-3.8/sensors' and 'for-3.8/upstream' into for-linus
Conflicts:
	drivers/hid/hid-core.c
2012-12-12 21:41:55 +01:00
Linus Torvalds
9977d9b379 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull big execve/kernel_thread/fork unification series from Al Viro:
 "All architectures are converted to new model.  Quite a bit of that
  stuff is actually shared with architecture trees; in such cases it's
  literally shared branch pulled by both, not a cherry-pick.

  A lot of ugliness and black magic is gone (-3KLoC total in this one):

   - kernel_thread()/kernel_execve()/sys_execve() redesign.

     We don't do syscalls from kernel anymore for either kernel_thread()
     or kernel_execve():

     kernel_thread() is essentially clone(2) with callback run before we
     return to userland, the callbacks either never return or do
     successful do_execve() before returning.

     kernel_execve() is a wrapper for do_execve() - it doesn't need to
     do transition to user mode anymore.

     As a result kernel_thread() and kernel_execve() are
     arch-independent now - they live in kernel/fork.c and fs/exec.c
     resp.  sys_execve() is also in fs/exec.c and it's completely
     architecture-independent.

   - daemonize() is gone, along with its parts in fs/*.c

   - struct pt_regs * is no longer passed to do_fork/copy_process/
     copy_thread/do_execve/search_binary_handler/->load_binary/do_coredump.

   - sys_fork()/sys_vfork()/sys_clone() unified; some architectures
     still need wrappers (ones with callee-saved registers not saved in
     pt_regs on syscall entry), but the main part of those suckers is in
     kernel/fork.c now."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (113 commits)
  do_coredump(): get rid of pt_regs argument
  print_fatal_signal(): get rid of pt_regs argument
  ptrace_signal(): get rid of unused arguments
  get rid of ptrace_signal_deliver() arguments
  new helper: signal_pt_regs()
  unify default ptrace_signal_deliver
  flagday: kill pt_regs argument of do_fork()
  death to idle_regs()
  don't pass regs to copy_process()
  flagday: don't pass regs to copy_thread()
  bfin: switch to generic vfork, get rid of pointless wrappers
  xtensa: switch to generic clone()
  openrisc: switch to use of generic fork and clone
  unicore32: switch to generic clone(2)
  score: switch to generic fork/vfork/clone
  c6x: sanitize copy_thread(), get rid of clone(2) wrapper, switch to generic clone()
  take sys_fork/sys_vfork/sys_clone prototypes to linux/syscalls.h
  mn10300: switch to generic fork/vfork/clone
  h8300: switch to generic fork/vfork/clone
  tile: switch to generic clone()
  ...

Conflicts:
	arch/microblaze/include/asm/Kbuild
2012-12-12 12:22:13 -08:00
Linus Torvalds
d027db132b ARM: arm-soc: SoC updates for 3.8
This contains the bulk of new SoC development for this merge window.
 
 Two new platforms have been added, the sunxi platforms (Allwinner A1x
 SoCs) by Maxime Ripard, and a generic Broadcom platform for a new
 series of ARMv7 platforms from them, where the hope is that we can
 keep the platform code generic enough to have them all share one mach
 directory. The new Broadcom platform is contributed by Christian Daudt.
 
 Highbank has grown support for Calxeda's next generation of hardware,
 ECX-2000.
 
 clps711x has seen a lot of cleanup from Alexander Shiyan, and he's also
 taken on maintainership of the platform.
 
 Beyond this there has been a bunch of work from a number of people on
 converting more platforms to IRQ domains, pinctrl conversion, cleanup
 and general feature enablement across most of the active platforms.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQyLCjAAoJEIwa5zzehBx3AdQP/R+L3+EQMjiEWt/p7g/ql5Em
 0SnP92CcGzrjgLTg9z1FeOazfOsGnkZAYUlDRkqfKobH3VqkhYFFtt1/0x0KMahm
 xcowHgMBOyimFdWT9vLK3J8U6DLui5XrEG9LGH2VL+lqmfjIyP/OOF3mVc0/+pV9
 WTLAsYswdBRSeiNuF43kqlfrOwF6xsPLgiNMlc82w6BzHqoHu6dOif5M9MqWaApS
 V74DPmwLD371Tyit6aHqt3JOqpgiPSHlmxkzomK+5idcW3Pa7HnzzFYmx85dk/eN
 J2siqIkoOu7tEfjIbNZTL2MYoX4tUUKv4qZZ3IOl3YSWaV3P5ilMApF01XVrkk8E
 DWOMhzte9hC7L90W+/kCPLF1VyeAhCem2KQWUitO71fKur3r+3ZaUokNVvWzkJIL
 7aduxAJOV2hfLgEqbjbjF3o4S8p63OV3kzivFJM1And15zDJo4+qqOh67+bPo4jj
 +R4du+SqzXriw4i3tDLGVpdjDffk4D41tbLzgkWAtvGyoP45yeYfHAzAh0pDFPRv
 ASfZVmZ5PhwAUAkIMnpC2sjgmxMYff3SYqmDgnsqXES7rbDH/hG+teymtHFTyUQp
 m+f60DNotSMcMvkLdvruLSB4aeTiwbfOqPn/g+aXYUlPuNMq1fVWgN7EJKWkamK4
 nRwaJmLwx1/ojcVbpy2G
 =YMKB
 -----END PGP SIGNATURE-----

Merge tag 'soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC updates from Olof Johansson:
 "This contains the bulk of new SoC development for this merge window.

  Two new platforms have been added, the sunxi platforms (Allwinner A1x
  SoCs) by Maxime Ripard, and a generic Broadcom platform for a new
  series of ARMv7 platforms from them, where the hope is that we can
  keep the platform code generic enough to have them all share one mach
  directory.  The new Broadcom platform is contributed by Christian
  Daudt.

  Highbank has grown support for Calxeda's next generation of hardware,
  ECX-2000.

  clps711x has seen a lot of cleanup from Alexander Shiyan, and he's
  also taken on maintainership of the platform.

  Beyond this there has been a bunch of work from a number of people on
  converting more platforms to IRQ domains, pinctrl conversion, cleanup
  and general feature enablement across most of the active platforms."

Fix up trivial conflicts as per Olof.

* tag 'soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (174 commits)
  mfd: vexpress-sysreg: Remove LEDs code
  irqchip: irq-sunxi: Add terminating entry for sunxi_irq_dt_ids
  clocksource: sunxi_timer: Add terminating entry for sunxi_timer_dt_ids
  irq: versatile: delete dangling variable
  ARM: sunxi: add missing include for mdelay()
  ARM: EXYNOS: Avoid early use of of_machine_is_compatible()
  ARM: dts: add node for PL330 MDMA1 controller for exynos4
  ARM: EXYNOS: Add support for secondary CPU bring-up on Exynos4412
  ARM: EXYNOS: add UART3 to DEBUG_LL ports
  ARM: S3C24XX: Add clkdev entry for camif-upll clock
  ARM: SAMSUNG: Add s3c24xx/s3c64xx CAMIF GPIO setup helpers
  ARM: sunxi: Add missing sun4i.dtsi file
  pinctrl: samsung: Do not initialise statics to 0
  ARM i.MX6: remove gate_mask from pllv3
  ARM i.MX6: Fix ethernet PLL clocks
  ARM i.MX6: rename PLLs according to datasheet
  ARM i.MX6: Add pwm support
  ARM i.MX51: Add pwm support
  ARM i.MX53: Add pwm support
  ARM: mx5: Replace clk_register_clkdev with clock DT lookup
  ...
2012-12-12 12:05:15 -08:00
Linus Torvalds
d01e4afdbb ARM: arm-soc: Cleanups on various subarchitectures
Cleanup patches for various ARM platforms and some of their associated
 drivers. There's also a branch in here that enables Freescale i.MX to be
 part of the multiplatform support -- the first "big" SoC that is moved
 over (more multiplatform work comes in a separate branch later during
 the merge window).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQx2p9AAoJEIwa5zzehBx3aPUQAIjV3VDf/ACkA4KUQu0BFg5U
 57OIkl6RCZvfKhYgq5+6OJ2AK6VkGh9PqTmXkDS7Nj3QMS/uWcb3U419aPJsd3Z/
 vNGpTl+J/YcAcFrKMqTyNv98TAiAOJlpm70CqmRbkhpMfoJb7//1JKqGTJPBO+tj
 8ZEwNGC0WbRNOSQTY/TTAhbZE1sqXwKy9mDLGmcwqKBY8H1TFHyPB6yWYFSxMHxS
 JAegbYhYO9FawOOLoi9ovT+2vUR9vDu0xxV4zUK9f5DqKcCb/wYuN0QkusjnEutm
 RfIt7iXHHzi35YPxtlrGgSz9EIYXKAafSzkgf3Ydpjci5DH/vbVexm/CT+V+SwOT
 SvucYJMALI/aOEFJWN/50L6B9zipSrWb51tK7WFXz/sUCrMQrXH3Mu99mjHZXSoL
 1cylsvs3DFQC7vHFLSjRpX6eJdfE+Hb0LZ878eXSbDVCOnU8odAQrofugqfmeVDk
 eN0+BWmchJgvljOiKVUQMC3PCquCaAAO1lm/HU7bWPlVigTuHSW0uisDyCYAtlt1
 dGxnbbhoFJvSH7CMOoMO7hIFnoNJEe6+uVUuwA/+iJouMXMJLoY6Da4L72h1Mp81
 o4Hr6Kxly/SMtURZ/6pCycx5ahb5TaahstYAoe7Qp1dMj5U2m6fUVfKkG7tUx1CW
 MIuvN3qJeW2qKWKmZRVM
 =zfPd
 -----END PGP SIGNATURE-----

Merge tag 'cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC cleanups on various subarchitectures from Olof Johansson:
 "Cleanup patches for various ARM platforms and some of their associated
  drivers.  There's also a branch in here that enables Freescale i.MX to
  be part of the multiplatform support -- the first "big" SoC that is
  moved over (more multiplatform work comes in a separate branch later
  during the merge window)."

Conflicts fixed as per Olof, including a silent semantic one in
arch/arm/mach-omap2/board-generic.c (omap_prcm_restart() was renamed to
omap3xxx_restart(), and a new user of the old name was added).

* tag 'cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (189 commits)
  ARM: omap: fix typo on timer cleanup
  ARM: EXYNOS: Remove unused regs-mem.h file
  ARM: EXYNOS: Remove unused non-dt support for dwmci controller
  ARM: Kirkwood: Use hw_pci.ops instead of hw_pci.scan
  ARM: OMAP3: cm-t3517: use GPTIMER for system clock
  ARM: OMAP2+: timer: remove CONFIG_OMAP_32K_TIMER
  ARM: SAMSUNG: use devm_ functions for ADC driver
  ARM: EXYNOS: no duplicate mask/unmask in eint0_15
  ARM: S3C24XX: SPI clock channel setup is fixed for S3C2443
  ARM: EXYNOS: Remove i2c0 resource information and setting of device names
  ARM: Kirkwood: checkpatch cleanups
  ARM: Kirkwood: Fix sparse warnings.
  ARM: Kirkwood: Remove unused includes
  ARM: kirkwood: cleanup lsxl board includes
  ARM: integrator: use BUG_ON where possible
  ARM: integrator: push down SC dependencies
  ARM: integrator: delete static UART1 mapping
  ARM: integrator: delete SC mapping on the CP
  ARM: integrator: remove static CP syscon mapping
  ARM: integrator: remove static AP syscon mapping
  ...
2012-12-12 11:51:39 -08:00
Linus Torvalds
8287361abc ARM: arm-soc: Header cleanups
This is a collection of header file cleanups, mostly for OMAP and AT91,
 that keeps moving the platforms in the direction of multiplatform by
 removing the need for mach-dependent header files used in drivers and
 other places.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQx2reAAoJEIwa5zzehBx3Z/oQAIHp8aXbUD0GvbBPQ2vIydZ8
 Qfdh2ypbyFB9GSLmPP6BhurEFUcwmW3MH2r9Kq68omPt4XxrFEt1SdEn3TgZlhTY
 YYzEv2DLbWfmKCDBFAEFullsfVkrpggyou5ty30YVp/u2lVLvChpkciE5/9HADy5
 uoh/W2KZcLE51tvnbRy/1DBaOtxEdCjJ9hTaef7FCB9ugOG+yMDYwIPRW/b4fNA6
 o/1NuZTWp0H4ePRG2oqLxYnjSiF63DVTJZjSJ+c5gW34bWgh8+/xzaLFA9U++/ig
 meGUD1Oe65+ctr+Wk2mV29eb02jauUe6qvkwt+iFIDDopgc2mn5BcW+ENpgpjhe3
 6l3ClRd94qh4cMWzqDa5PZCdetshiKqSH2IGpKS/dHNB1h5sBP7CCbvbalwzWd5n
 qjfkPC7kSFyU3XV+9SEAXE6HLKsiMQy9kRcKOMdTE5BA4FEAZk0wfiG9WcVCvS2D
 9tDC3X0aScyXx4Mkd4wIAZVhY68QuI17fPft9zZSj691YiUW5cR2yF1qbCka0yd5
 pHu/QSZg1+dUitMUvMPQmWJ15jnAtEYUtV/8zQFFVchgkxlrW+UtvW3zxghB6D+J
 ZwcjAMfQQz9fLoMQXlVovjQIhYsjw3SlV62ReBH7eyPs3+KfPIBn7Ks6mVQs3dS5
 AMUWORsB5u92XHl9EL9C
 =aggA
 -----END PGP SIGNATURE-----

Merge tag 'headers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC Header cleanups from Olof Johansson:
 "This is a collection of header file cleanups, mostly for OMAP and
  AT91, that keeps moving the platforms in the direction of
  multiplatform by removing the need for mach-dependent header files
  used in drivers and other places."

Fix up mostly trivial conflicts as per Olof.

* tag 'headers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (106 commits)
  ARM: OMAP2+: Move iommu/iovmm headers to platform_data
  ARM: OMAP2+: Make some definitions local
  ARM: OMAP2+: Move iommu2 to drivers/iommu/omap-iommu2.c
  ARM: OMAP2+: Move plat/iovmm.h to include/linux/omap-iommu.h
  ARM: OMAP2+: Move iopgtable header to drivers/iommu/
  ARM: OMAP: Merge iommu2.h into iommu.h
  atmel: move ATMEL_MAX_UART to platform_data/atmel.h
  ARM: OMAP: Remove omap_init_consistent_dma_size()
  arm: at91: move at91rm9200 rtc header in drivers/rtc
  arm: at91: move reset controller header to arm/arm/mach-at91
  arm: at91: move pit define to the driver
  arm: at91: move at91_shdwc.h to arch/arm/mach-at91
  arm: at91: move board header to arch/arm/mach-at91
  arn: at91: move at91_tc.h to arch/arm/mach-at91
  arm: at91 move at91_aic.h to arch/arm/mach-at91
  arm: at91 move board.h to arch/arm/mach-at91
  arm: at91: move platfarm_data to include/linux/platform_data/atmel.h
  arm: at91: drop machine defconfig
  ARM: OMAP: Remove NEED_MACH_GPIO_H
  ARM: OMAP: Remove unnecessary mach and plat includes
  ...
2012-12-12 11:45:16 -08:00
Linus Torvalds
b1286f4e9a Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM updates from Russell King:
 "Here's the updates for ARM for this merge window, which cover quite a
  variety of areas.

  There's a bunch of patch series from Will tackling various bugs like
  the PROT_NONE handling, ASID allocation, cluster boot protocol and
  ASID TLB tagging updates.

  We move to a build-time sorted exception table rather than doing the
  sorting at run-time, add support for the secure computing filter, and
  some updates to the perf code.  We also have sorted out the placement
  of some headers, fixed some build warnings, fixed some hotplug
  problems with the per-cpu TWD code."

* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (73 commits)
  ARM: 7594/1: Add .smp entry for REALVIEW_EB
  ARM: 7599/1: head: Remove boot-time HYP mode check for v5 and below
  ARM: 7598/1: net: bpf_jit_32: fix sp-relative load/stores offsets.
  ARM: 7595/1: syscall: rework ordering in syscall_trace_exit
  ARM: 7596/1: mmci: replace readsl/writesl with ioread32_rep/iowrite32_rep
  ARM: 7597/1: net: bpf_jit_32: fix kzalloc gfp/size mismatch.
  ARM: 7593/1: nommu: do not enable DCACHE_WORD_ACCESS when !CONFIG_MMU
  ARM: 7592/1: nommu: prevent generation of kernel unaligned memory accesses
  ARM: 7591/1: nommu: Enable the strict alignment (CR_A) bit only if ARCH < v6
  ARM: 7590/1: /proc/interrupts: limit the display of IPIs to online CPUs only
  ARM: 7587/1: implement optimized percpu variable access
  ARM: 7589/1: integrator: pass the lm resource to amba
  ARM: 7588/1: amba: create a resource parent registrator
  ARM: 7582/2: rename kvm_seq to vmalloc_seq so to avoid confusion with KVM
  ARM: 7585/1: kernel: fix nr_cpu_ids check in DT logical map init
  ARM: 7584/1: perf: fix link error when CONFIG_HW_PERF_EVENTS is not selected
  ARM: gic: use a private mapping for CPU target interfaces
  ARM: kernel: add logical mappings look-up
  ARM: kernel: add cpu logical map DT init in setup_arch
  ARM: kernel: add device tree init map function
  ...
2012-12-12 11:30:02 -08:00
Yan Burman
d4676eac0d net: ethtool: Add destination MAC address to flow steering API
Add ability to specify destination MAC address for L3/L4 flow spec
in order to be able to specify action for different VM's under vSwitch
configuration. This change is transparent to older userspace.

Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-12 13:02:30 -05:00
Cong Wang
cfd5675435 bridge: add support of adding and deleting mdb entries
This patch implents adding/deleting mdb entries via netlink.
Currently all entries are temp, we probably need a flag to distinguish
permanent entries too.

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-12 13:02:30 -05:00
Cong Wang
37a393bc49 bridge: notify mdb changes via netlink
As Stephen mentioned, we need to monitor the mdb
changes in user-space, so add notifications via netlink too.

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-12 13:02:30 -05:00
YOSHIFUJI Hideaki
fd0ea7dbfa ndisc: Unexport ndisc_{build,send}_skb().
These symbols were exported for bonding device by commit 305d552a
("bonding: send IPv6 neighbor advertisement on failover").

It bacame obsolete by commit 7c899432 ("bonding, ipv4, ipv6, vlan: Handle
NETDEV_BONDING_FAILOVER like NETDEV_NOTIFY_PEERS") and removed by
commit 4f5762ec ("bonding: Remove obsolete source file 'bond_ipv6.c'").

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-12 12:42:29 -05:00
stephen hemminger
895464fa4b uapi: add missing netconf.h to export list
Add netconf.h for use by iproute2.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-12 12:40:23 -05:00
Linus Torvalds
d206e09036 Merge branch 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup changes from Tejun Heo:
 "A lot of activities on cgroup side.  The big changes are focused on
  making cgroup hierarchy handling saner.

   - cgroup_rmdir() had peculiar semantics - it allowed cgroup
     destruction to be vetoed by individual controllers and tried to
     drain refcnt synchronously.  The vetoing never worked properly and
     caused good deal of contortions in cgroup.  memcg was the last
     reamining user.  Michal Hocko removed the usage and cgroup_rmdir()
     path has been simplified significantly.  This was done in a
     separate branch so that the memcg people can base further memcg
     changes on top.

   - The above allowed cleaning up cgroup lifecycle management and
     implementation of generic cgroup iterators which are used to
     improve hierarchy support.

   - cgroup_freezer updated to allow migration in and out of a frozen
     cgroup and handle hierarchy.  If a cgroup is frozen, all descendant
     cgroups are frozen.

   - netcls_cgroup and netprio_cgroup updated to handle hierarchy
     properly.

   - Various fixes and cleanups.

   - Two merge commits.  One to pull in memcg and rmdir cleanups (needed
     to build iterators).  The other pulled in cgroup/for-3.7-fixes for
     device_cgroup fixes so that further device_cgroup patches can be
     stacked on top."

Fixed up a trivial conflict in mm/memcontrol.c as per Tejun (due to
commit bea8c150a7 ("memcg: fix hotplugged memory zone oops") in master
touching code close to commit 2ef37d3fe4 ("memcg: Simplify
mem_cgroup_force_empty_list error handling") in for-3.8)

* 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (65 commits)
  cgroup: update Documentation/cgroups/00-INDEX
  cgroup_rm_file: don't delete the uncreated files
  cgroup: remove subsystem files when remounting cgroup
  cgroup: use cgroup_addrm_files() in cgroup_clear_directory()
  cgroup: warn about broken hierarchies only after css_online
  cgroup: list_del_init() on removed events
  cgroup: fix lockdep warning for event_control
  cgroup: move list add after list head initilization
  netprio_cgroup: allow nesting and inherit config on cgroup creation
  netprio_cgroup: implement netprio[_set]_prio() helpers
  netprio_cgroup: use cgroup->id instead of cgroup_netprio_state->prioidx
  netprio_cgroup: reimplement priomap expansion
  netprio_cgroup: shorten variable names in extend_netdev_table()
  netprio_cgroup: simplify write_priomap()
  netcls_cgroup: move config inheritance to ->css_online() and remove .broken_hierarchy marking
  cgroup: remove obsolete guarantee from cgroup_task_migrate.
  cgroup: add cgroup->id
  cgroup, cpuset: remove cgroup_subsys->post_clone()
  cgroup: s/CGRP_CLONE_CHILDREN/CGRP_CPUSET_CLONE_CHILDREN/
  cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free()
  ...
2012-12-12 08:18:24 -08:00
Linus Torvalds
50851c6248 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management update from Zhang Rui:
 "Highlights:

   - Introduction of thermal policy support, together with three new
     thermal governors, including step_wise, user_space, fire_share.

   - Introduction of ST-Ericsson db8500_thermal driver and ST-Ericsson
     db8500_cpufreq_cooling driver.

   - Thermal Kconfig file and Makefile refactor.

   - Fixes for generic thermal layer, generic cpucooling, rcar thermal
     driver and Exynos thermal driver."

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (36 commits)
  Thermal: Fix DEFAULT_THERMAL_GOVERNOR
  Thermal: fix a NULL pointer dereference when generic thermal layer is built as a module
  thermal: rcar: add rcar_zone_to_priv() macro
  thermal: rcar: fixup the unit of temperature
  thermal: cpu cooling: allow module builds
  thermal: cpu cooling: use const parameter while registering
  Thermal: Add ST-Ericsson DB8500 thermal properties and platform data.
  Thermal: Add ST-Ericsson DB8500 thermal driver.
  drivers/thermal/Makefile refactor
  Exynos: Add missing dependency
  Refactor drivers/thermal/Kconfig
  thermal: cpu_cooling: Make 'notify_device' static
  Thermal: Remove the cooling_cpufreq_list.
  Thermal: fix bug of counting cpu frequencies.
  Thermal: add indent for code alignment.
  thermal: rcar_thermal: remove explicitly used devm_kfree/iounap()
  thermal: user_space: Add missing static storage class specifiers
  thermal: fair_share: Add missing static storage class specifiers
  thermal: step_wise: Add missing static storage class specifiers
  Thermal: Fix oops and unlocking in thermal_sys.c
  ...
2012-12-12 07:57:13 -08:00
Linus Torvalds
99b8f42ee2 regmap: Updates for v3.8
Quite a few enhancements this time around, helpers and diagnostics for
 the most part which is good to see:
 
 - Addition of table based lookups for the register access checks from
   Davide Ciminaghi, making life easier for drivers with big blocks of
   similar registers.
 - Allow drivers to get the irqdomain for regmap irq_chips, allowing the
   domain to be used with other APIs.
 - Debug improvements for paged register maps.
 - Performance improvments for some of the diagnostic infrastructure,
   very helpful for devices with large register maps.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQxy2mAAoJELSic+t+oim9s+QP/0sXqx5MtY5C6xMhY64P3OOB
 A5zCHHl12eXFc9D8MnGzhS87AfVFsqMe7I/Opdq4uajqUXUe4qQcYYAo0p2lFWWo
 nD1R4K9TU2X8d4eF3S38anH8ADBR38tRwYeSzsbZqsLYuolbs92/TJJZKqsQjxdo
 TipVEEqWXHntc43fOhrvVqfmD+2n/9vVKBBC5vtx6IW2o1JSfcHoKe7+qnOeRUsc
 f+UWOhPM6NrNQnICzBrRQ6xTFT55TpfSyas7tHiMVaoWGe+p5aqsXbudzOjFlfny
 vEnRv2h5CJuqZdrv7g/vxV7i2KayOJq9rFS/1jQ4xcIOCIlNN14RBONKR7VuicGv
 I6GjXQPRjgIPnVxMhzujn25zZ2bFDpx1AZLrp/I1PZi3582/ZfF9twFxJ4T8OFOH
 uZESk/jMxaOZzEN0h/TNgcrmJnbpGh1SCBUihLDJi/eSglGtPkd7GtqutL/kHm01
 boWWv3nwAztNrh2HDQkTUi4+bqNtxh3ZTOPFHBu31SLcATUlEDCQ8zuwhfCPp60U
 jP9VYx1tMtK3sLg7wUKLZTCYQF9ezaquBlAOcMINbRisN2dByj/ZAfCMSTZ3ZJqQ
 iVoThJR3llKfWzS7X9PdUtNPv94Nb+YAfH7HH3wUYGJzgJHMgGek1x/S6D4VAMTu
 Wnad4By9rpFEnzT0x32i
 =Jy7K
 -----END PGP SIGNATURE-----

Merge tag 'regmap-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap updates from Mark Brown:
 "Quite a few enhancements this time around, helpers and diagnostics for
  the most part which is good to see:

   - Addition of table based lookups for the register access checks from
     Davide Ciminaghi, making life easier for drivers with big blocks of
     similar registers.
   - Allow drivers to get the irqdomain for regmap irq_chips, allowing
     the domain to be used with other APIs.
   - Debug improvements for paged register maps.
   - Performance improvments for some of the diagnostic infrastructure,
     very helpful for devices with large register maps."

* tag 'regmap-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: debugfs: Cache offsets of valid regions for dump
  regmap: debugfs: Factor out initial seek
  regmap: debugfs: Avoid overflows for very small reads
  regmap: Cache register and value sizes for debugfs
  regmap: introduce tables for readable/writeable/volatile/precious checks
  regmap: core: Report registers in hex when we can't cache
  regmap: Fix printing of size_t variable
  regmap: make lock/unlock functions customizable
  regmap: silence GCC warning
  regmap: Split raw writes that cross window boundaries
  regmap: Make return code checks consistent
  regmap: Factor range lookup out of page selection
  regmap: Provide debugfs read of register ranges
  regmap: Factor out debugfs register read
  regmap: Allow ranges to be named
  regmap: When we sanity check during range adds say what errors we find
  regmap: Rename n_ranges to num_ranges
  regmap: irq: Allow users to retrieve the irq_domain
2012-12-12 07:55:48 -08:00
Linus Torvalds
251a8cfeda Patch series to allow EFI variable backend to pstore
to hold multiple records.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQxjFJAAoJEKurIx+X31iBNssQAKCjiXe7dT8DigoKSPj3w1sJ
 8xcuJKmlZ5ykgElfQ1Lcfuz0KuMT8SZfBHrRCL389Yvtxq71NL9pMX1OyCqyfVyO
 uuKWD/+jVc5d11F15hC2l8zIU7v2S3TXQ64rZEgDDFXIPlTbEcEBy92arpq6Avj3
 6ioR+rf4jvOeC66keIcDOIl6r4tVn2f16B9Z6ffSOit0dVeRAOc7bB9LLaohETSY
 rCSKHgZLHBZRQv2yGKS3LB6KSL2DVm+lgNhzAVw4kEJNTJl04clK854L8EM7WfGj
 qREsvSUB4H4I2XMBW+HCU/6+4c4alSEr/2c8+xBeynL4+I2y+VLwySkwE+8DPgEV
 BjCuyjSmpZVuVz6R28hcJIbkK0sN1gntXitb2V157+89f3gVEMYwKaY787TuNl9g
 oWYjjeOk66r5xshdVvCnFIgZxgFItznn+kgRkNq0clGH+nWVsZfuYw/q2PLm9xoM
 TJR+ZZwMJC25TE99rh9CqVc0kuQi5SXfqjpd+fW8oJ2kSxj4/7SqmOpmK7oWLFNi
 kf0xS1yVsQB5mRx3nvERvA36lzNSiQRFXnMjh84be4NX8dYqiGoJQjOh/laemjxd
 8hpoMVuUMtXLC4roejyyLwddtTLshWm2ByULAHJRJpuUKzichkaqhuvvkMUjKfDj
 NunMg/OfNs78sTlNSATf
 =NAdA
 -----END PGP SIGNATURE-----

Merge tag 'please-pull-pstore_mevent' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux

Pull pstore fixes from Tony Luck:
 "Patch series to allow EFI variable backend to pstore to hold multiple
  records."

* tag 'please-pull-pstore_mevent' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
  efi_pstore: Add a format check for an existing variable name at erasing time
  efi_pstore: Add a format check for an existing variable name at reading time
  efi_pstore: Add a sequence counter to a variable name
  efi_pstore: Add ctime to argument of erase callback
  efi_pstore: Remove a logic erasing entries from a write callback to hold multiple logs
  efi_pstore: Add a logic erasing entries to an erase callback
  efi_pstore: Check remaining space with QueryVariableInfo() before writing data
2012-12-12 07:50:59 -08:00
Alexander Holler
83499b52c6 HID: sensors: autodetect USB HID sensor hubs
It should not be necessary to add IDs for HID sensor hubs to lists in
hid-core.c and hid-sensor-hub.c. So instead of a whitelist, autodetect such USB
HID sensor hubs, based on a collection of type physical inside a useage page of
type sensor. If some sensor hubs stil must be usable as raw devices, a
blacklist might be created.

Signed-off-by: Alexander Holler <holler@ahsoftware.de>
Acked-by: "Pandruvada, Srinivas" <srinivas.pandruvada@intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-12-12 16:49:10 +01:00
Linus Torvalds
d07e43d70e Merge branch 'omap-serial' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM OMAP serial updates from Russell King:
 "This series is a major reworking of the OMAP serial driver code fixing
  various bugs in the hardware-assisted flow control, extending up into
  serial_core for a couple of issues.  These fixes have been done as a
  set of progressive changes and transformations in the hope that no new
  bugs will be introduced by this series.

  The problems are many-fold, from the driver not being informed about
  updated settings, to the driver not knowing what the intentions of the
  upper layers are.

  The first four patches tackle the serial_core layer, allowing it to
  provide the necessary information to drivers, and the remaining
  patches allow the OMAP serial driver to take advantage of this.

  This brings hardware assisted RTS/CTS and XON/OFF flow control into a
  useful state.

  These patches have been in linux-next for most of the last cycle;
  indeed they predate the previous merge window.  They've also been
  posted to the OMAP people."

* 'omap-serial' of git://git.linaro.org/people/rmk/linux-arm: (21 commits)
  SERIAL: omap: fix hardware assisted flow control
  SERIAL: omap: simplify (2)
  SERIAL: omap: move xon/xoff setting earlier
  SERIAL: omap: always set TCR
  SERIAL: omap: simplify
  SERIAL: omap: don't read back LCR/MCR/EFR
  SERIAL: omap: serial_omap_configure_xonxoff() contents into set_termios
  SERIAL: omap: configure xon/xoff before setting modem control lines
  SERIAL: omap: remove OMAP_UART_SYSC_RESET and OMAP_UART_FIFO_CLR
  SERIAL: omap: move driver private definitions and structures to driver
  SERIAL: omap: remove 'irq_pending' bitfield
  SERIAL: omap: fix MCR TCRTLR bit handling
  SERIAL: omap: fix set_mctrl() breakage
  SERIAL: omap: no need to re-read EFR
  SERIAL: omap: remove setting of EFR SCD bit
  SERIAL: omap: allow hardware assisted IXANY mode to be disabled
  SERIAL: omap: allow hardware assisted rts/cts modes to be disabled
  SERIAL: core: add throttle/unthrottle callbacks for hardware assisted flow control
  SERIAL: core: add hardware assisted h/w flow control support
  SERIAL: core: add hardware assisted s/w flow control support
  ...

Conflicts:
	drivers/tty/serial/omap-serial.c
2012-12-12 07:45:16 -08:00
Zhang Rui
1f53ef17d3 Thermal: Fix DEFAULT_THERMAL_GOVERNOR
Fix DEFAULT_THERMAL_GOVERNOR to be consistant with the
default governor selected in kernel config file.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2012-12-12 15:34:48 +08:00
Anton Vorontsov
76d8a23b12 Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux
The merge is merely to fix conflicts before sending a pull request.

Conflicts:
	drivers/power/ab8500_btemp.c
	drivers/power/ab8500_charger.c
	drivers/power/ab8500_fg.c
	drivers/power/abx500_chargalg.c
	drivers/power/max8925_power.c

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
2012-12-11 22:15:57 -08:00
Eric Dumazet
1abbe1394a pkt_sched: avoid requeues if possible
With BQL being deployed, we can more likely have following behavior :

We dequeue a packet from qdisc in dequeue_skb(), then we realize target
tx queue is in XOFF state in sch_direct_xmit(), and we have to hold the
skb into gso_skb for later.

This shows in stats (tc -s qdisc dev eth0) as requeues.

Problem of these requeues is that high priority packets can not be
dequeued as long as this (possibly low prio and big TSO packet) is not
removed from gso_skb.

At 1Gbps speed, a full size TSO packet is 500 us of extra latency.

In some cases, we know that all packets dequeued from a qdisc are
for a particular and known txq :

- If device is non multi queue
- For all MQ/MQPRIO slave qdiscs

This patch introduces a new qdisc flag, TCQ_F_ONETXQUEUE to mark
this capability, so that dequeue_skb() is allowed to dequeue a packet
only if the associated txq is not stopped.

This indeed reduce latencies for high prio packets (or improve fairness
with sfq/fq_codel), and almost remove qdisc 'requeues'.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-12 00:16:47 -05:00
Linus Torvalds
b64c5fda38 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core timer changes from Ingo Molnar:
 "It contains continued generic-NOHZ work by Frederic and smaller
  cleanups."

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time: Kill xtime_lock, replacing it with jiffies_lock
  clocksource: arm_generic: use this_cpu_ptr per-cpu helper
  clocksource: arm_generic: use integer math helpers
  time/jiffies: Make clocksource_jiffies static
  clocksource: clean up parse_pmtmr()
  tick: Correct the comments for tick_sched_timer()
  tick: Conditionally build nohz specific code in tick handler
  tick: Consolidate tick handling for high and low res handlers
  tick: Consolidate timekeeping handling code
2012-12-11 18:22:46 -08:00
Linus Torvalds
f57d54bab6 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The biggest change affects group scheduling: we now track the runnable
  average on a per-task entity basis, allowing a smoother, exponential
  decay average based load/weight estimation instead of the previous
  binary on-the-runqueue/off-the-runqueue load weight method.

  This will inevitably disturb workloads that were in some sort of
  borderline balancing state or unstable equilibrium, so an eye has to
  be kept on regressions.

  For that reason the new load average is only limited to group
  scheduling (shares distribution) at the moment (which was also hurting
  the most from the prior, crude weight calculation and whose scheduling
  quality wins most from this change) - but we plan to extend this to
  regular SMP balancing as well in the future, which will simplify and
  speed up things a bit.

  Other changes involve ongoing preparatory work to extend NOHZ to the
  scheduler as well, eventually allowing completely irq-free user-space
  execution."

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  Revert "sched/autogroup: Fix crash on reboot when autogroup is disabled"
  cputime: Comment cputime's adjusting code
  cputime: Consolidate cputime adjustment code
  cputime: Rename thread_group_times to thread_group_cputime_adjusted
  cputime: Move thread_group_cputime() to sched code
  vtime: Warn if irqs aren't disabled on system time accounting APIs
  vtime: No need to disable irqs on vtime_account()
  vtime: Consolidate a bit the ctx switch code
  vtime: Explicitly account pending user time on process tick
  vtime: Remove the underscore prefix invasion
  sched/autogroup: Fix crash on reboot when autogroup is disabled
  cputime: Separate irqtime accounting from generic vtime
  cputime: Specialize irq vtime hooks
  kvm: Directly account vtime to system on guest switch
  vtime: Make vtime_account_system() irqsafe
  vtime: Gather vtime declarations to their own header file
  sched: Describe CFS load-balancer
  sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking
  sched: Make __update_entity_runnable_avg() fast
  sched: Update_cfs_shares at period edge
  ...
2012-12-11 18:21:38 -08:00
Linus Torvalds
090f8ccba3 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Lots of activity:

   211 files changed, 8328 insertions(+), 4116 deletions(-)

  most of it on the tooling side.

  Main changes:

   * ftrace enhancements and fixes from Steve Rostedt.

   * uprobes fixes, cleanups and preparation for the ARM port from Oleg
     Nesterov.

   * UAPI fixes, from David Howels - prepares the arch/x86 UAPI
     transition

   * Separate perf tests into multiple objects, one per test, from Jiri
     Olsa.

   * Make hardware event translations available in sysfs, from Jiri
     Olsa.

   * Fixes to /proc/pid/maps parsing, preparatory to supporting data
     maps, from Namhyung Kim

   * Implement ui_progress for GTK, from Namhyung Kim

   * Add framework for automated perf_event_attr tests, where tools with
     different command line options will be run from a 'perf test', via
     python glue, and the perf syscall will be intercepted to verify
     that the perf_event_attr fields set by the tool are those expected,
     from Jiri Olsa

   * Add a 'link' method for hists, so that we can have the leader with
     buckets for all the entries in all the hists.  This new method is
     now used in the default 'diff' output, making the sum of the
     'baseline' column be 100%, eliminating blind spots.

   * libtraceevent fixes for compiler warnings trying to make perf it
     build on some distros, like fedora 14, 32-bit, some of the warnings
     really pointed to real bugs.

   * Add a browser for 'perf script' and make it available from the
     report and annotate browsers.  It does filtering to find the
     scripts that handle events found in the perf.data file used.  From
     Feng Tang

   * perf inject changes to allow showing where a task sleeps, from
     Andrew Vagin.

   * Makefile improvements from Namhyung Kim.

   * Add --pre and --post command hooks in 'stat', from Peter Zijlstra.

   * Don't stop synthesizing threads when one vanishes, this is for the
     existing threads when we start a tool like trace.

   * Use sched:sched_stat_runtime to provide a thread summary, this
     produces the same output as the 'trace summary' subcommand of
     tglx's original "trace" tool.

   * Support interrupted syscalls in 'trace'

   * Add an event duration column and filter in 'trace'.

   * There are references to the man pages in some tools, so try to
     build Documentation when installing, warning the user if that is
     not possible, from Borislav Petkov.

   * Give user better message if precise is not supported, from David
     Ahern.

   * Try to find cross-built objdump path by using the session
     environment information in the perf.data file header, from Irina
     Tirdea, original patch and idea by Namhyung Kim.

   * Diplays more output on features check for make V=1, so that one can
     figure out what is happening by looking at gcc output, etc.  From
     Jiri Olsa.

   * Add on_exit implementation for systems without one, e.g.  Android,
     from Bernhard Rosenkraenzer.

   * Only process events for vcpus of interest, helps handling large
     number of events, from David Ahern.

   * Cross compilation fixes for Android, from Irina Tirdea.

   * Add documentation on compiling for Android, from Irina Tirdea.

   * perf diff improvements from Jiri Olsa.

   * Target (task/user/cpu/syswide) handling improvements, from Namhyung
     Kim.

   * Add support in 'trace' for tracing workload given by command line,
     from Namhyung Kim.

   * ... and much more."

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (194 commits)
  uprobes: Use percpu_rw_semaphore to fix register/unregister vs dup_mmap() race
  perf evsel: Introduce is_group_member method
  perf powerpc: Use uapi/unistd.h to fix build error
  tools: Pass the target in descend
  tools: Honour the O= flag when tool build called from a higher Makefile
  tools: Define a Makefile function to do subdir processing
  perf ui: Always compile browser setup code
  perf ui: Add ui_progress__finish()
  perf ui gtk: Implement ui_progress functions
  perf ui: Introduce generic ui_progress helper
  perf ui tui: Move progress.c under ui/tui directory
  perf tools: Add basic event modifier sanity check
  perf tools: Omit group members from perf_evlist__disable/enable
  perf tools: Ensure single disable call per event in record comand
  perf tools: Fix 'disabled' attribute config for record command
  perf tools: Fix attributes for '{}' defined event groups
  perf tools: Use sscanf for parsing /proc/pid/maps
  perf tools: Add gtk.<command> config option for launching GTK browser
  perf tools: Fix compile error on NO_NEWT=1 build
  perf hists: Initialize all of he->stat with zeroes
  ...
2012-12-11 18:14:31 -08:00
Linus Torvalds
aefb058b0c Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
 "Affinity fixes and a nested threaded IRQ handling fix."

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Always force thread affinity
  irq: Set CPU affinity right on thread creation
  genirq: Provide means to retrigger parent
2012-12-11 18:12:06 -08:00
Linus Torvalds
37ea95a959 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU update from Ingo Molnar:
 "The major features of this tree are:

     1. A first version of no-callbacks CPUs.  This version prohibits
        offlining CPU 0, but only when enabled via CONFIG_RCU_NOCB_CPU=y.
        Relaxing this constraint is in progress, but not yet ready
        for prime time.  These commits were posted to LKML at
        https://lkml.org/lkml/2012/10/30/724.

     2. Changes to SRCU that allows statically initialized srcu_struct
        structures.  These commits were posted to LKML at
        https://lkml.org/lkml/2012/10/30/296.

     3. Restructuring of RCU's debugfs output.  These commits were posted
        to LKML at https://lkml.org/lkml/2012/10/30/341.

     4. Additional CPU-hotplug/RCU improvements, posted to LKML at
        https://lkml.org/lkml/2012/10/30/327.
        Note that the commit eliminating __stop_machine() was judged to
        be too-high of risk, so is deferred to 3.9.

     5. Changes to RCU's idle interface, most notably a new module
        parameter that redirects normal grace-period operations to
        their expedited equivalents.  These were posted to LKML at
        https://lkml.org/lkml/2012/10/30/739.

     6. Additional diagnostics for RCU's CPU stall warning facility,
        posted to LKML at https://lkml.org/lkml/2012/10/30/315.
        The most notable change reduces the
        default RCU CPU stall-warning time from 60 seconds to 21 seconds,
        so that it once again happens sooner than the softlockup timeout.

     7. Documentation updates, which were posted to LKML at
        https://lkml.org/lkml/2012/10/30/280.
        A couple of late-breaking changes were posted at
        https://lkml.org/lkml/2012/11/16/634 and
        https://lkml.org/lkml/2012/11/16/547.

     8. Miscellaneous fixes, which were posted to LKML at
        https://lkml.org/lkml/2012/10/30/309.

     9. Finally, a fix for an lockdep-RCU splat was posted to LKML
        at https://lkml.org/lkml/2012/11/7/486."

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (49 commits)
  context_tracking: New context tracking susbsystem
  sched: Mark RCU reader in sched_show_task()
  rcu: Separate accounting of callbacks from callback-free CPUs
  rcu: Add callback-free CPUs
  rcu: Add documentation for the new rcuexp debugfs trace file
  rcu: Update documentation for TREE_RCU debugfs tracing
  rcu: Reduce default RCU CPU stall warning timeout
  rcu: Fix TINY_RCU rcu_is_cpu_rrupt_from_idle check
  rcu: Clarify memory-ordering properties of grace-period primitives
  rcu: Add new rcutorture module parameters to start/end test messages
  rcu: Remove list_for_each_continue_rcu()
  rcu: Fix batch-limit size problem
  rcu: Add tracing for synchronize_sched_expedited()
  rcu: Remove old debugfs interfaces and also RCU flavor name
  rcu: split 'rcuhier' to each flavor
  rcu: split 'rcugp' to each flavor
  rcu: split 'rcuboost' to each flavor
  rcu: split 'rcubarrier' to each flavor
  rcu: Fix tracing formatting
  rcu: Remove the interface "rcudata.csv"
  ...
2012-12-11 18:10:49 -08:00
Linus Torvalds
608ff1a210 Merge branch 'akpm' (Andrew's patchbomb)
Merge misc updates from Andrew Morton:
 "About half of most of MM.  Going very early this time due to
  uncertainty over the coreautounifiednumasched things.  I'll send the
  other half of most of MM tomorrow.  The rest of MM awaits a slab merge
  from Pekka."

* emailed patches from Andrew Morton: (71 commits)
  memory_hotplug: ensure every online node has NORMAL memory
  memory_hotplug: handle empty zone when online_movable/online_kernel
  mm, memory-hotplug: dynamic configure movable memory and portion memory
  drivers/base/node.c: cleanup node_state_attr[]
  bootmem: fix wrong call parameter for free_bootmem()
  avr32, kconfig: remove HAVE_ARCH_BOOTMEM
  mm: cma: remove watermark hacks
  mm: cma: skip watermarks check for already isolated blocks in split_free_page()
  mm, oom: fix race when specifying a thread as the oom origin
  mm, oom: change type of oom_score_adj to short
  mm: cleanup register_node()
  mm, mempolicy: remove duplicate code
  mm/vmscan.c: try_to_freeze() returns boolean
  mm: introduce putback_movable_pages()
  virtio_balloon: introduce migration primitives to balloon pages
  mm: introduce compaction and migration for ballooned pages
  mm: introduce a common interface for balloon pages mobility
  mm: redefine address_space.assoc_mapping
  mm: adjust address_space_operations.migratepage() return code
  arch/sparc/kernel/sys_sparc_64.c: s/COLOUR/COLOR/
  ...
2012-12-11 18:05:37 -08:00
Lai Jiangshan
511c2aba8f mm, memory-hotplug: dynamic configure movable memory and portion memory
Add online_movable and online_kernel for logic memory hotplug.  This is
the dynamic version of "movablecore" & "kernelcore".

We have the same reason to introduce it as to introduce "movablecore" &
"kernelcore".  It has the same motive as "movablecore" & "kernelcore", but
it is dynamic/running-time:

o We can configure memory as kernelcore or movablecore after boot.

  Userspace workload is increased, we need more hugepage, we can't use
  "online_movable" to add memory and allow the system use more
  THP(transparent-huge-page), vice-verse when kernel workload is increase.

  Also help for virtualization to dynamic configure host/guest's memory,
  to save/(reduce waste) memory.

  Memory capacity on Demand

o When a new node is physically online after boot, we need to use
  "online_movable" or "online_kernel" to configure/portion it as we
  expected when we logic-online it.

  This configuration also helps for physically-memory-migrate.

o all benefit as the same as existed "movablecore" & "kernelcore".

o Preparing for movable-node, which is very important for power-saving,
  hardware partitioning and high-available-system(hardware fault
  management).

(Note, we don't introduce movable-node here.)

Action behavior:
When a memoryblock/memorysection is onlined by "online_movable", the kernel
will not have directly reference to the page of the memoryblock,
thus we can remove that memory any time when needed.

When it is online by "online_kernel", the kernel can use it.
When it is online by "online", the zone type doesn't changed.

Current constraints:
Only the memoryblock which is adjacent to the ZONE_MOVABLE
can be online from ZONE_NORMAL to ZONE_MOVABLE.

[akpm@linux-foundation.org: use min_t, cleanups]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:28 -08:00
Joonsoo Kim
81df9bff26 bootmem: fix wrong call parameter for free_bootmem()
It is strange that alloc_bootmem() returns a virtual address and
free_bootmem() requires a physical address.  Anyway, free_bootmem()'s
first parameter should be physical address.

There are some call sites for free_bootmem() with virtual address.  So fix
them.

[akpm@linux-foundation.org: improve free_bootmem() and free_bootmem_pate() documentation]
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:28 -08:00
Marek Szyprowski
bc357f431c mm: cma: remove watermark hacks
Commits 2139cbe627 ("cma: fix counting of isolated pages") and
d95ea5d18e ("cma: fix watermark checking") introduced a reliable
method of free page accounting when memory is being allocated from CMA
regions, so the workaround introduced earlier by commit 49f223a9cd
("mm: trigger page reclaim in alloc_contig_range() to stabilise
watermarks") can be finally removed.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Mel Gorman <mel@csn.ul.ie>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:27 -08:00
David Rientjes
e1e12d2f31 mm, oom: fix race when specifying a thread as the oom origin
test_set_oom_score_adj() and compare_swap_oom_score_adj() are used to
specify that current should be killed first if an oom condition occurs in
between the two calls.

The usage is

	short oom_score_adj = test_set_oom_score_adj(OOM_SCORE_ADJ_MAX);
	...
	compare_swap_oom_score_adj(OOM_SCORE_ADJ_MAX, oom_score_adj);

to store the thread's oom_score_adj, temporarily change it to the maximum
score possible, and then restore the old value if it is still the same.

This happens to still be racy, however, if the user writes
OOM_SCORE_ADJ_MAX to /proc/pid/oom_score_adj in between the two calls.
The compare_swap_oom_score_adj() will then incorrectly reset the old value
prior to the write of OOM_SCORE_ADJ_MAX.

To fix this, introduce a new oom_flags_t member in struct signal_struct
that will be used for per-thread oom killer flags.  KSM and swapoff can
now use a bit in this member to specify that threads should be killed
first in oom conditions without playing around with oom_score_adj.

This also allows the correct oom_score_adj to always be shown when reading
/proc/pid/oom_score.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:27 -08:00
David Rientjes
a9c58b907d mm, oom: change type of oom_score_adj to short
The maximum oom_score_adj is 1000 and the minimum oom_score_adj is -1000,
so this range can be represented by the signed short type with no
functional change.  The extra space this frees up in struct signal_struct
will be used for per-thread oom kill flags in the next patch.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Anton Vorontsov <anton.vorontsov@linaro.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:27 -08:00
Yasuaki Ishimatsu
fa26437517 mm: cleanup register_node()
register_node() is defined as extern in include/linux/node.h.  But the
function is only called from register_one_node() in driver/base/node.c.

So the patch defines register_node() as static.

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:27 -08:00
Rafael Aquini
5733c7d11d mm: introduce putback_movable_pages()
The PATCH "mm: introduce compaction and migration for virtio ballooned pages"
hacks around putback_lru_pages() in order to allow ballooned pages to be
re-inserted on balloon page list as if a ballooned page was like a LRU page.

As ballooned pages are not legitimate LRU pages, this patch introduces
putback_movable_pages() to properly cope with cases where the isolated
pageset contains ballooned pages and LRU pages, thus fixing the mentioned
inelegant hack around putback_lru_pages().

Signed-off-by: Rafael Aquini <aquini@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:27 -08:00
Rafael Aquini
18468d93e5 mm: introduce a common interface for balloon pages mobility
Memory fragmentation introduced by ballooning might reduce significantly
the number of 2MB contiguous memory blocks that can be used within a guest,
thus imposing performance penalties associated with the reduced number of
transparent huge pages that could be used by the guest workload.

This patch introduces a common interface to help a balloon driver on
making its page set movable to compaction, and thus allowing the system
to better leverage the compation efforts on memory defragmentation.

[akpm@linux-foundation.org: use PAGE_FLAGS_CHECK_AT_PREP, s/__balloon_page_flags/page_flags_cleared/, small cleanups]
[rientjes@google.com: allow balloon compaction for any system with memory compaction enabled, which is the defconfig]
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:26 -08:00
Rafael Aquini
252aa6f5be mm: redefine address_space.assoc_mapping
Overhaul struct address_space.assoc_mapping renaming it to
address_space.private_data and its type is redefined to void*.  By this
approach we consistently name the .private_* elements from struct
address_space as well as allow extended usage for address_space
association with other data structures through ->private_data.

Also, all users of old ->assoc_mapping element are converted to reflect
its new name and type change (->private_data).

Signed-off-by: Rafael Aquini <aquini@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:26 -08:00
Rafael Aquini
78bd52097d mm: adjust address_space_operations.migratepage() return code
Memory fragmentation introduced by ballooning might reduce significantly
the number of 2MB contiguous memory blocks that can be used within a
guest, thus imposing performance penalties associated with the reduced
number of transparent huge pages that could be used by the guest workload.

This patch-set follows the main idea discussed at 2012 LSFMMS session:
"Ballooning for transparent huge pages" -- http://lwn.net/Articles/490114/
to introduce the required changes to the virtio_balloon driver, as well as
the changes to the core compaction & migration bits, in order to make
those subsystems aware of ballooned pages and allow memory balloon pages
become movable within a guest, thus avoiding the aforementioned
fragmentation issue

Following are numbers that prove this patch benefits on allowing
compaction to be more effective at memory ballooned guests.

Results for STRESS-HIGHALLOC benchmark, from Mel Gorman's mmtests suite,
running on a 4gB RAM KVM guest which was ballooning 512mB RAM in 64mB
chunks, at every minute (inflating/deflating), while test was running:

===BEGIN stress-highalloc

STRESS-HIGHALLOC
                 highalloc-3.7     highalloc-3.7
                     rc4-clean         rc4-patch
Pass 1          55.00 ( 0.00%)    62.00 ( 7.00%)
Pass 2          54.00 ( 0.00%)    62.00 ( 8.00%)
while Rested    75.00 ( 0.00%)    80.00 ( 5.00%)

MMTests Statistics: duration
                 3.7         3.7
           rc4-clean   rc4-patch
User         1207.59     1207.46
System       1300.55     1299.61
Elapsed      2273.72     2157.06

MMTests Statistics: vmstat
                                3.7         3.7
                          rc4-clean   rc4-patch
Page Ins                    3581516     2374368
Page Outs                  11148692    10410332
Swap Ins                         80          47
Swap Outs                      3641         476
Direct pages scanned          37978       33826
Kswapd pages scanned        1828245     1342869
Kswapd pages reclaimed      1710236     1304099
Direct pages reclaimed        32207       31005
Kswapd efficiency               93%         97%
Kswapd velocity             804.077     622.546
Direct efficiency               84%         91%
Direct velocity              16.703      15.682
Percentage direct scans          2%          2%
Page writes by reclaim        79252        9704
Page writes file              75611        9228
Page writes anon               3641         476
Page reclaim immediate        16764       11014
Page rescued immediate            0           0
Slabs scanned               2171904     2152448
Direct inode steals             385        2261
Kswapd inode steals          659137      609670
Kswapd skipped wait               1          69
THP fault alloc                 546         631
THP collapse alloc              361         339
THP splits                      259         263
THP fault fallback               98          50
THP collapse fail                20          17
Compaction stalls               747         499
Compaction success              244         145
Compaction failures             503         354
Compaction pages moved       370888      474837
Compaction move failure       77378       65259

===END stress-highalloc

This patch:

Introduce MIGRATEPAGE_SUCCESS as the default return code for
address_space_operations.migratepage() method and documents the expected
return code for the same method in failure cases.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:26 -08:00
Michel Lespinasse
db4fbfb952 mm: vm_unmapped_area() lookup function
Implement vm_unmapped_area() using the rb_subtree_gap and highest_vm_end
information to look up for suitable virtual address space gaps.

struct vm_unmapped_area_info is used to define the desired allocation
request:
 - lowest or highest possible address matching the remaining constraints
 - desired gap length
 - low/high address limits that the gap must fit into
 - alignment mask and offset

Also update the generic arch_get_unmapped_area[_topdown] functions to make
use of vm_unmapped_area() instead of implementing a brute force search.

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:25 -08:00
Rik van Riel
e4c6bfd2d7 mm: rearrange vm_area_struct for fewer cache misses
The kernel walks the VMA rbtree in various places, including the page
fault path.  However, the vm_rb node spanned two cache lines, on 64 bit
systems with 64 byte cache lines (most x86 systems).

Rearrange vm_area_struct a little, so all the information we need to do a
VMA tree walk is in the first cache line.

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:25 -08:00
Michel Lespinasse
d37371870c mm: augment vma rbtree with rb_subtree_gap
Define vma->rb_subtree_gap as the largest gap between any vma in the
subtree rooted at that vma, and their predecessor.  Or, for a recursive
definition, vma->rb_subtree_gap is the max of:

 - vma->vm_start - vma->vm_prev->vm_end
 - rb_subtree_gap fields of the vmas pointed by vma->rb.rb_left and
   vma->rb.rb_right

This will allow get_unmapped_area_* to find a free area of the right
size in O(log(N)) time, instead of potentially having to do a linear
walk across all the VMAs.

Also define mm->highest_vm_end as the vm_end field of the highest vma,
so that we can easily check if the following gap is suitable.

This does have the potential to make unmapping VMAs more expensive,
especially for processes with very large numbers of VMAs, where the VMA
rbtree can grow quite deep.

Signed-off-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:25 -08:00
Andi Kleen
42d7395feb mm: support more pagesizes for MAP_HUGETLB/SHM_HUGETLB
There was some desire in large applications using MAP_HUGETLB or
SHM_HUGETLB to use 1GB huge pages on some mappings, and stay with 2MB on
others.  This is useful together with NUMA policy: use 2MB interleaving
on some mappings, but 1GB on local mappings.

This patch extends the IPC/SHM syscall interfaces slightly to allow
specifying the page size.

It borrows some upper bits in the existing flag arguments and allows
encoding the log of the desired page size in addition to the *_HUGETLB
flag.  When 0 is specified the default size is used, this makes the
change fully compatible.

Extending the internal hugetlb code to handle this is straight forward.
Instead of a single mount it just keeps an array of them and selects the
right mount based on the specified page size.  When no page size is
specified it uses the mount of the default page size.

The change is not visible in /proc/mounts because internal mounts don't
appear there.  It also has very little overhead: the additional mounts
just consume a super block, but not more memory when not used.

I also exported the new flags to the user headers (they were previously
under __KERNEL__).  Right now only symbols for x86 and some other
architecture for 1GB and 2MB are defined.  The interface should already
work for all other architectures though.  Only architectures that define
multiple hugetlb sizes actually need it (that is currently x86, tile,
powerpc).  However tile and powerpc have user configurable hugetlb
sizes, so it's not easy to add defines.  A program on those
architectures would need to query sysfs and use the appropiate log2.

[akpm@linux-foundation.org: cleanups]
[rientjes@google.com: fix build]
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hillf Danton <dhillf@gmail.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:25 -08:00
Will Deacon
a1dd450bcb mm: thp: set the accessed flag for old pages on access fault
On x86 memory accesses to pages without the ACCESSED flag set result in
the ACCESSED flag being set automatically.  With the ARM architecture a
page access fault is raised instead (and it will continue to be raised
until the ACCESSED flag is set for the appropriate PTE/PMD).

For normal memory pages, handle_pte_fault will call pte_mkyoung
(effectively setting the ACCESSED flag).  For transparent huge pages,
pmd_mkyoung will only be called for a write fault.

This patch ensures that faults on transparent hugepages which do not
result in a CoW update the access flags for the faulting pmd.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Ni zhan Chen <nizhan.chen@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:24 -08:00
Lai Jiangshan
d9713679db memory_hotplug: fix possible incorrect node_states[N_NORMAL_MEMORY]
Currently memory_hotplug only manages the node_states[N_HIGH_MEMORY], it
forgets to manage node_states[N_NORMAL_MEMORY].  This may cause
node_states[N_NORMAL_MEMORY] to become incorrect.

Example, if a node is empty before online, and we online a memory which is
in ZONE_NORMAL.  And after online, node_states[N_HIGH_MEMORY] is correct,
but node_states[N_NORMAL_MEMORY] is incorrect, the online code doesn't set
the new online node to node_states[N_NORMAL_MEMORY].

The same thing will happen when offlining (the offline code doesn't clear
the node from node_states[N_NORMAL_MEMORY] when needed).  Some memory
managment code depends node_states[N_NORMAL_MEMORY], so we have to fix up
the node_states[N_NORMAL_MEMORY].

We add node_states_check_changes_online() and
node_states_check_changes_offline() to detect whether
node_states[N_HIGH_MEMORY] and node_states[N_NORMAL_MEMORY] are changed
while hotpluging.

Also add @status_change_nid_normal to struct memory_notify, thus the
memory hotplug callbacks know whether the node_states[N_NORMAL_MEMORY] are
changed.  (We can add a @flags and reuse @status_change_nid instead of
introducing @status_change_nid_normal, but it will add much more
complexity in memory hotplug callback in every subsystem.  So introducing
@status_change_nid_normal is better and it doesn't change the sematics of
@status_change_nid)

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Rob Landley <rob@landley.net>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:23 -08:00
Wen Congyang
8732794b16 numa: convert static memory to dynamically allocated memory for per node device
We use a static array to store struct node.  In many cases, we don't have
too many nodes, and some memory will be unused.  Convert it to per-device
dynamically allocated memory.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:23 -08:00
Wen Congyang
b023f46813 memory-hotplug: skip HWPoisoned page when offlining pages
hwpoisoned may be set when we offline a page by the sysfs interface
/sys/devices/system/memory/soft_offline_page or
/sys/devices/system/memory/hard_offline_page. If we don't clear
this flag when onlining pages, this page can't be freed, and will
not in free list. So we can't offline these pages again. So we
should skip such page when offlining pages.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:22 -08:00
Kirill A. Shutemov
d84da3f9e4 mm: use IS_ENABLED(CONFIG_COMPACTION) instead of COMPACTION_BUILD
We don't need custom COMPACTION_BUILD anymore, since we have handy
IS_ENABLED().

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:22 -08:00
Kirill A. Shutemov
e5adfffc85 mm: use IS_ENABLED(CONFIG_NUMA) instead of NUMA_BUILD
We don't need custom NUMA_BUILD anymore, since we have handy
IS_ENABLED().

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:22 -08:00
David Rientjes
19965460e3 mm, memcg: make mem_cgroup_out_of_memory() static
mem_cgroup_out_of_memory() is only referenced from within file scope, so
it can be marked static.

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:22 -08:00
Namjae Jeon
d0e1d66b5a writeback: remove nr_pages_dirtied arg from balance_dirty_pages_ratelimited_nr()
There is no reason to pass the nr_pages_dirtied argument, because
nr_pages_dirtied value from the caller is unused in
balance_dirty_pages_ratelimited_nr().

Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
Signed-off-by: Vivek Trivedi <vtrivedi018@gmail.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:21 -08:00
Linus Torvalds
414a6750e5 USB patches for 3.8-rc1
Here's the big set of USB patches for 3.8-rc1.
 
 Lots of USB host driver cleanups in here, and a bit of a reorg of the
 EHCI driver to make it easier for the different EHCI platform drivers to
 all work together nicer, which was a reduction in overall code.  We also
 deleted some unused firmware files, and got rid of the very old
 file_storage usb gadget driver that had been broken for a long time.
 This means we ended up removing way more code than added, always a nice
 thing to see:
 	 310 files changed, 3028 insertions(+), 10754 deletions(-)
 
 Other than that, the usual set of new device ids, driver fixes, gadget
 driver and controller updates and the like.
 
 All of these have been in the linux-next tree for a number of weeks.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlDHhIYACgkQMUfUDdst+yk9gACgsZYgMtbrqdbhbjbYkfO7nYkB
 iegAoIvj3qwe33TbZ8OJnIkaWcMiT8CW
 =u6Bf
 -----END PGP SIGNATURE-----

Merge tag 'usb-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB patches from Greg Kroah-Hartman:
 "Here's the big set of USB patches for 3.8-rc1.

  Lots of USB host driver cleanups in here, and a bit of a reorg of the
  EHCI driver to make it easier for the different EHCI platform drivers
  to all work together nicer, which was a reduction in overall code.  We
  also deleted some unused firmware files, and got rid of the very old
  file_storage usb gadget driver that had been broken for a long time.
  This means we ended up removing way more code than added, always a
  nice thing to see:
	 310 files changed, 3028 insertions(+), 10754 deletions(-)

  Other than that, the usual set of new device ids, driver fixes, gadget
  driver and controller updates and the like.

  All of these have been in the linux-next tree for a number of weeks.

  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

* tag 'usb-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (228 commits)
  USB: mark uas driver as BROKEN
  xhci: Add Lynx Point LP to list of Intel switchable hosts
  uwb: fix uwb_dev_unlock() missed at an error path in uwb_rc_cmd_async()
  USB: ftdi_sio: Add support for Newport AGILIS motor drivers
  MAINTAINERS: remove drivers/block/ub.c
  USB: chipidea: fix use after free bug
  ezusb: add dependency to USB
  usb: ftdi_sio: fixup BeagleBone A5+ quirk
  USB: cp210x: add Virtenio Preon32 device id
  usb: storage: remove redundant memset() in usb_probe_stor1()
  USB: option: blacklist network interface on Huawei E173
  USB: OHCI: workaround for hardware bug: retired TDs not added to the Done Queue
  USB: add new zte 3g-dongle's pid to option.c
  USB: opticon: switch to generic read implementation
  USB: opticon: refactor reab-urb processing
  USB: opticon: use usb-serial bulk-in urb
  USB: opticon: increase bulk-in size
  USB: opticon: use port as urb context
  USB: opticon: pass port to get_serial_info
  USB: opticon: make private data port specific
  ...
2012-12-11 14:48:20 -08:00
Linus Torvalds
c6bd5bcc49 TTY/Serial merge for 3.8-rc1
Here's the big tty/serial tree set of changes for 3.8-rc1.
 
 Contained in here is a bunch more reworks of the tty port layer from Jiri and
 bugfixes from Alan, along with a number of other tty and serial driver updates
 by the various driver authors.
 
 Also, Jiri has been coerced^Wconvinced to be the co-maintainer of the TTY
 layer, which is much appreciated by me.
 
 All of these have been in the linux-next tree for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlDHhgwACgkQMUfUDdst+ynI6wCcC+YeBwncnoWHvwLAJOwAZpUL
 bysAn28o780/lOsTzp3P1Qcjvo69nldo
 =hN/g
 -----END PGP SIGNATURE-----

Merge tag 'tty-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull TTY/Serial merge from Greg Kroah-Hartman:
 "Here's the big tty/serial tree set of changes for 3.8-rc1.

  Contained in here is a bunch more reworks of the tty port layer from
  Jiri and bugfixes from Alan, along with a number of other tty and
  serial driver updates by the various driver authors.

  Also, Jiri has been coerced^Wconvinced to be the co-maintainer of the
  TTY layer, which is much appreciated by me.

  All of these have been in the linux-next tree for a while.

  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

Fixed up some trivial conflicts in the staging tree, due to the fwserial
driver having come in both ways (but fixed up a bit in the serial tree),
and the ioctl handling in the dgrp driver having been done slightly
differently (staging tree got that one right, and removed both
TIOCGSOFTCAR and TIOCSSOFTCAR).

* tag 'tty-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (146 commits)
  staging: sb105x: fix potential NULL pointer dereference in mp_chars_in_buffer()
  staging/fwserial: Remove superfluous free
  staging/fwserial: Use WARN_ONCE when port table is corrupted
  staging/fwserial: Destruct embedded tty_port on teardown
  staging/fwserial: Fix build breakage when !CONFIG_BUG
  staging: fwserial: Add TTY-over-Firewire serial driver
  drivers/tty/serial/serial_core.c: clean up HIGH_BITS_OFFSET usage
  staging: dgrp: dgrp_tty.c: Audit the return values of get/put_user()
  staging: dgrp: dgrp_tty.c: Remove the TIOCSSOFTCAR ioctl handler from dgrp driver
  serial: ifx6x60: Add modem power off function in the platform reboot process
  serial: mxs-auart: unmap the scatter list before we copy the data
  serial: mxs-auart: disable the Receive Timeout Interrupt when DMA is enabled
  serial: max310x: Setup missing "can_sleep" field for GPIO
  tty/serial: fix ifx6x60.c declaration warning
  serial: samsung: add devicetree properties for non-Exynos SoCs
  serial: samsung: fix potential soft lockup during uart write
  tty: vt: Remove redundant null check before kfree.
  tty/8250 Add check for pci_ioremap_bar failure
  tty/8250 Add support for Commtech's Fastcom Async-335 and Fastcom Async-PCIe cards
  tty/8250 Add XR17D15x devices to the exar_handle_irq override
  ...
2012-12-11 14:08:47 -08:00
Linus Torvalds
8966961b31 Staging driver tree merge for 3.8-rc1
Here's the big staging tree merge for 3.8-rc1
 
 There's a lot of patches in here, the majority being the comedi rework/cleanup
 that has been ongoing and is causing a huge reduction in overall code size,
 which is amazing to watch.  We also removed some older drivers (telephony and
 rts_pstor), and added a new one (fwserial which also came in through the tty
 tree due to tty api changes, take that one if you get merge conflicts.)
 
 The iio and ipack drivers are moving out of the staging area into their
 own part of the kernel as they have been cleaned up sufficiently and are
 working well.
 
 Overall, again a reduction of code:
  768 files changed, 31887 insertions(+), 82166 deletions(-)
 
 All of this has been in the linux-next tree for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlDHiRkACgkQMUfUDdst+ykyCgCglEbrWzevtCOl+C2KAmIR+Jnt
 qUsAoLOXYeQq1aadxzqh3tK+ytlMrWd1
 =Ync9
 -----END PGP SIGNATURE-----

Merge tag 'staging-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging driver tree merge from Greg Kroah-Hartman:
 "Here's the big staging tree merge for 3.8-rc1

  There's a lot of patches in here, the majority being the comedi
  rework/cleanup that has been ongoing and is causing a huge reduction
  in overall code size, which is amazing to watch.  We also removed some
  older drivers (telephony and rts_pstor), and added a new one (fwserial
  which also came in through the tty tree due to tty api changes, take
  that one if you get merge conflicts.)

  The iio and ipack drivers are moving out of the staging area into
  their own part of the kernel as they have been cleaned up sufficiently
  and are working well.

  Overall, again a reduction of code:
   768 files changed, 31887 insertions(+), 82166 deletions(-)

  All of this has been in the linux-next tree for a while.

  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

* tag 'staging-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (1298 commits)
  iio: imu: adis16480: remove duplicated include from adis16480.c
  iio: gyro: adis16136: remove duplicated include from adis16136.c
  iio:imu: adis16480: show_firmware() buffer too small
  iio:gyro: adis16136: divide by zero in write_frequency()
  iio: adc: Add Texas Instruments ADC081C021/027 support
  iio:ad7793: Add support for the ad7796 and ad7797
  iio:ad7793: Add support for the ad7798 and ad7799
  staging:iio: Move ad7793 driver out of staging
  staging:iio:ad7793: Implement stricter id checking
  staging:iio:ad7793: Move register definitions from header to source
  staging:iio:ad7793: Rework regulator handling
  staging:iio:ad7793: Rework platform data
  staging:iio:ad7793: Use kstrtol instead of strict_strtol
  staging:iio:ad7793: Use usleep_range instead of msleep
  staging:iio:ad7793: Fix temperature scale
  staging:iio:ad7793: Fix VDD monitor scale
  staging: gdm72xx: unlock on error in init_usb()
  staging: panel: pass correct lengths to keypad_send_key()
  staging: comedi: addi_apci_2032: fix interrupt support
  staging: comedi: addi_apci_2032: move i_APCI2032_ConfigDigitalOutput()
  ...
2012-12-11 13:59:44 -08:00
Linus Torvalds
6a5971d8fe Char/Misc driver merge for 3.8-rc1
Here is the "big" char/misc driver patches for 3.8-rc1.  I'm starting to
 put random driver subsystems that I had previously sent you through the
 driver-core tree in this tree, as it makes more sense to do so.
 
 Nothing major here, the various __dev* removals, some mei driver
 updates, and other random driver-specific things from the different
 maintainers and developers.
 
 Note, some MFD drivers got added through this tree, and they are also
 coming in through the "real" MFD tree as well, due to some major
 mis-communication between me and the different developers.  If you have
 any merge conflicts, take the ones from the MFD tree, not these ones,
 sorry about that.
 
 All of this has been in linux-next for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlDHj7AACgkQMUfUDdst+ym7pQCgxhFDGQRJimG+Ddag+ghfLhQh
 Ql0AoJsWVFvQjb7q1NO7OvOABaxjEJdu
 =na5b
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull Char/Misc driver merge from Greg Kroah-Hartman:
 "Here is the "big" char/misc driver patches for 3.8-rc1.  I'm starting
  to put random driver subsystems that I had previously sent you through
  the driver-core tree in this tree, as it makes more sense to do so.

  Nothing major here, the various __dev* removals, some mei driver
  updates, and other random driver-specific things from the different
  maintainers and developers.

  Note, some MFD drivers got added through this tree, and they are also
  coming in through the "real" MFD tree as well, due to some major
  mis-communication between me and the different developers.  If you
  have any merge conflicts, take the ones from the MFD tree, not these
  ones, sorry about that.

  All of this has been in linux-next for a while.

  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

Fix up trivial conflict in drivers/mmc/host/Kconfig due to new drivers
having been added (both at the end, as usual..)

* tag 'char-misc-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (84 commits)
  MAINTAINERS: remove drivers/staging/hv/
  misc/st_kim: Free resources in the error path of probe()
  drivers/char: for hpet, add count checking, and ~0UL instead of -1
  w1-gpio: Simplify & get rid of defines
  w1-gpio: Pinctrl-fy
  extcon: remove use of __devexit_p
  extcon: remove use of __devinit
  extcon: remove use of __devexit
  drivers: uio: Only allocate new private data when probing device tree node
  drivers: uio_dmem_genirq: Allow partial success when opening device
  drivers: uio_dmem_genirq: Don't use DMA_ERROR_CODE to indicate unmapped regions
  drivers: uio_dmem_genirq: Don't mix address spaces for dynamic region vaddr
  uio: remove use of __devexit
  uio: remove use of __devinitdata
  uio: remove use of __devinit
  uio: remove use of __devexit_p
  char: remove use of __devexit
  char: remove use of __devinitconst
  char: remove use of __devinitdata
  char: remove use of __devinit
  ...
2012-12-11 13:56:38 -08:00
John W. Linville
f9c4d420c1 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2012-12-11 16:24:55 -05:00
Linus Torvalds
cff2f741b8 Driver core updates for 3.8-rc1
Here's the large driver core updates for 3.8-rc1.
 
 The biggest thing here is the various __dev* marking removals.  This is
 going to be a pain for the merge with different subsystem trees, I know,
 but all of the patches included here have been ACKed by their various
 subsystem maintainers, as they wanted them to go through here.
 
 If this is too much of a pain, I can pull all of them out of this tree
 and just send you one with the other fixes/updates and then, after
 3.8-rc1 is out, do the rest of the removals to ensure we catch them all,
 it's up to you.  The merges should all be trivial, and Stephen has been
 doing them all in linux-next for a few weeks now quite easily.
 
 Other than the __dev* marking removals, there's nothing major here, some
 firmware loading updates and other minor things in the driver core.
 
 All of these have (much to Stephen's annoyance), been in linux-next for
 a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlDHkPkACgkQMUfUDdst+ykaWgCfW7AM30cv0nzoVO08ax6KjlG1
 KVYAn3z/KYazvp4B6LMvrW9y0G34Wmad
 =yvVr
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg Kroah-Hartman:
 "Here's the large driver core updates for 3.8-rc1.

  The biggest thing here is the various __dev* marking removals.  This
  is going to be a pain for the merge with different subsystem trees, I
  know, but all of the patches included here have been ACKed by their
  various subsystem maintainers, as they wanted them to go through here.

  If this is too much of a pain, I can pull all of them out of this tree
  and just send you one with the other fixes/updates and then, after
  3.8-rc1 is out, do the rest of the removals to ensure we catch them
  all, it's up to you.  The merges should all be trivial, and Stephen
  has been doing them all in linux-next for a few weeks now quite
  easily.

  Other than the __dev* marking removals, there's nothing major here,
  some firmware loading updates and other minor things in the driver
  core.

  All of these have (much to Stephen's annoyance), been in linux-next
  for a while.

  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

Fixed up trivial conflicts in drivers/gpio/gpio-{em,stmpe}.c due to gpio
update.

* tag 'driver-core-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (93 commits)
  modpost.c: Stop checking __dev* section mismatches
  init.h: Remove __dev* sections from the kernel
  acpi: remove use of __devinit
  PCI: Remove __dev* markings
  PCI: Always build setup-bus when PCI is enabled
  PCI: Move pci_uevent into pci-driver.c
  PCI: Remove CONFIG_HOTPLUG ifdefs
  unicore32/PCI: Remove CONFIG_HOTPLUG ifdefs
  sh/PCI: Remove CONFIG_HOTPLUG ifdefs
  powerpc/PCI: Remove CONFIG_HOTPLUG ifdefs
  mips/PCI: Remove CONFIG_HOTPLUG ifdefs
  microblaze/PCI: Remove CONFIG_HOTPLUG ifdefs
  dma: remove use of __devinit
  dma: remove use of __devexit_p
  firewire: remove use of __devinitdata
  firewire: remove use of __devinit
  leds: remove use of __devexit
  leds: remove use of __devinit
  leds: remove use of __devexit_p
  mmc: remove use of __devexit
  ...
2012-12-11 13:13:55 -08:00
John W. Linville
c66cfd5325 Merge branch 'for-john' of git://git.sipsolutions.net/mac80211-next 2012-12-11 16:04:03 -05:00
Linus Torvalds
b0885d01f9 GPIO follow up patch and type change for v3.5 merge window
Primarily device driver additions, features and bug fixes. Not much
 touching gpio common subsystem support. Should not be scary.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQx2nHAAoJEEFnBt12D9kBKnIP/0y/0LMUUMVp1LN/UeCyPdq+
 rqWdI57a+ToAiGcdSG2INR5fjC8duPP1UOSgQXQFsAdF1qy7vN7ejPqmoAGcVoRW
 P9O64s8eYQprabx3fSiqAOhav6ZUpxyfri9z/sz8JaTlpJrbiqf1MrxFQ/0oXZa9
 KqOFAJvKn+iqWjcpFkmeIvNsFT2lTeURyXhvYWUFig/VVuS335+FZYX0Ic1C69YM
 tf0Z+a6XO8JnAKgC13GsyJ6ctXA1kg1oKLnLHEekr3Qhkic3MTFKS2dPExzGjnbi
 NY4ev2SxAq74CFwSJDuhPiPk20FpveHKHLsptFdNpCR9lMG038oRnqAnYyw3gV/w
 z4GufpIZGK/xemIgHqNHejxS+tcH4Ax1wU++TkmIvsPJDq7uZPX/Gu9/+BMpeKxJ
 oJJV+mRCKDcjxXcxtrybF9+t8WdVZfW2qSt1K7LRO3eRV2n9Y+20R6iGKXhYxHaj
 TaQTtXIbc4q5ANg72O+c8htBhy0a2H1O5CtrXwwxBBHHsRachyHT6V9AD+7AKZ6e
 YElRV+v8dOviuUcj+nbf2riA7KnwtBLYfwdVQzTfbD1Fq8RUvMEjq2XQXYKhrMSw
 r8gp1sUnFmAVOikJFqVgYN8NyToVEyw1i2LH8skzCUnE0PPi+kT8CtaY+tTMuF3v
 mBixcMEKKhzsFtYmAqU+
 =2XeO
 -----END PGP SIGNATURE-----

Merge tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6

Pull GPIO updates from Grant Likely:
 "GPIO follow up patch and type change for v3.5 merge window

  Primarily device driver additions, features and bug fixes.  Not much
  touching gpio common subsystem support.  Should not be scary."

* tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux-2.6: (34 commits)
  gpio: Provide the STMPE GPIO driver with its own IRQ Domain
  gpio: add TS-5500 DIO blocks support
  gpio: pcf857x: use client->irq for gpio_to_irq()
  gpio: stmpe: Add DT support for stmpe gpio
  gpio: pl061 depends on ARM
  gpio/pl061: remove old comment
  gpio: SPEAr: add spi chipselect control driver
  gpio: gpio-max710x: Support device tree probing
  gpio: twl4030: Use only TWL4030_MODULE_LED for LED configuration
  gpio: tegra: read output value when gpio is set in direction_out
  gpio: pca953x: Add compatible strings to gpio-pca953x driver
  gpio: pca953x: Register an IRQ domain
  gpio: mvebu: Set free callback for gpio_chip
  gpio: tegra: Drop exporting static functions
  gpio: tegra: Staticize non-exported symbols
  gpio: tegra: fix suspend/resume apis
  gpio-pch: Set parent dev for gpio chip
  gpio: em: Fix build errors
  GPIO: clps711x: use platform_device_unregister in gpio_clps711x_init()
  gpio/tc3589x: convert to use the simple irqdomain
  ...
2012-12-11 13:00:56 -08:00
Linus Torvalds
bad73c5aa0 ACPI and power management updates for 3.8-rc1
* Introduction of device PM QoS flags.
 
 * ACPI device power management update allowing subsystems other than
   PCI to use it more easily.
 
 * ACPI device enumeration rework allowing additional kinds of devices
   to be enumerated via ACPI.  From Mika Westerberg, Adrian Hunter,
   Mathias Nyman, Andy Shevchenko, and Rafael J. Wysocki.
 
 * ACPICA update to version 20121018 from Bob Moore and Lv Zheng.
 
 * ACPI memory hotplug update from Wen Congyang and Yasuaki Ishimatsu.
 
 * Introduction of acpi_handle_<level>() messaging macros and ACPI-based CPU
   hot-remove support from Toshi Kani.
 
 * ACPI EC updates from Feng Tang.
 
 * cpufreq updates from Viresh Kumar, Fabio Baltieri and others.
 
 * cpuidle changes to quickly notice governor prediction failure from
   Youquan Song.
 
 * Support for using multiple cpuidle drivers at the same time and cpuidle
   cleanups from Daniel Lezcano.
 
 * devfreq updates from Nishanth Menon and others.
 
 * cpupower update from Thomas Renninger.
 
 * Fixes and small cleanups all over the place.
 
 --
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJQxxevAAoJEKhOf7ml8uNsHacQAK2xoQozDddPBAaTCf1OWW/G
 J4E2qMn+gy4AtFmQ0/xeZVvvylOCn9eu/kv3QJ/bFlVoUsqTgfPwQBjT6HjF1Acn
 7yVFdr9H/tn2wi9nW2Gv6Tl2Hr/kFLpo0f5Kd40Z+P8SV5sKX5ktJcVpJ/I/P4Vh
 Qw2nWtj7hQktZDERzgG4ZpMbQToNhbLhbKWB9ad3/XQSSA9JkfgvBFgrTEGHcZD5
 bwsggjNdOAWNGeDdzRsQSDn0Alld5ILVdSJ5xKimO1O70WvKc7fqA1IdYRIeKL90
 yz4bcoYKzl9iktlw8+x5o1U9mrc8TFV5p4+zV+t5Z6pzS/J3kWvnsW4zu9sCrxFv
 Wic3SKyiem7s2dxrYyj4ZXAci3GK4ouRTrCLqk7/00tEGdwAQD1ZNfsUJp6jKayz
 FvtZUgItcOyrlQ6B4nh951OY6dI3AUYJ2NuWWNr5NZkgVAvQGV8zTGOImbeVeL2+
 pMiw14zScO3ylYilVcjTKDDMj2sDZ68mw5PIcbmksvWsCLo26jDBVDtLVmtYWyd4
 ek3WnOrQZr0R3agvOLLssMKXompvpP+N4Klf4rV+GtqGsWtHryYKys2Laju9FwFj
 yYLchxYlxhGTzqq8LjF90HDL0TWpPe6cPi+B5ow9g/SXLexbMKNQGhv3Jovm2yR3
 j54tKBWy7e9AAYEDPirX
 =6OEP
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-for-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management updates from Rafael Wysocki:

 - Introduction of device PM QoS flags.

 - ACPI device power management update allowing subsystems other than
   PCI to use it more easily.

 - ACPI device enumeration rework allowing additional kinds of devices
   to be enumerated via ACPI.  From Mika Westerberg, Adrian Hunter,
   Mathias Nyman, Andy Shevchenko, and Rafael J. Wysocki.

 - ACPICA update to version 20121018 from Bob Moore and Lv Zheng.

 - ACPI memory hotplug update from Wen Congyang and Yasuaki Ishimatsu.

 - Introduction of acpi_handle_<level>() messaging macros and ACPI-based
   CPU hot-remove support from Toshi Kani.

 - ACPI EC updates from Feng Tang.

 - cpufreq updates from Viresh Kumar, Fabio Baltieri and others.

 - cpuidle changes to quickly notice governor prediction failure from
   Youquan Song.

 - Support for using multiple cpuidle drivers at the same time and
   cpuidle cleanups from Daniel Lezcano.

 - devfreq updates from Nishanth Menon and others.

 - cpupower update from Thomas Renninger.

 - Fixes and small cleanups all over the place.

* tag 'pm+acpi-for-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (196 commits)
  mmc: sdhci-acpi: enable runtime-pm for device HID INT33C6
  ACPI: add Haswell LPSS devices to acpi_platform_device_ids list
  ACPI: add documentation about ACPI 5 enumeration
  pnpacpi: fix incorrect TEST_ALPHA() test
  ACPI / PM: Fix header of acpi_dev_pm_detach() in acpi.h
  ACPI / video: ignore BIOS initial backlight value for HP Folio 13-2000
  ACPI : do not use Lid and Sleep button for S5 wakeup
  ACPI / PNP: Do not crash due to stale pointer use during system resume
  ACPI / video: Add "Asus UL30VT" to ACPI video detect blacklist
  ACPI: do acpisleep dmi check when CONFIG_ACPI_SLEEP is set
  spi / ACPI: add ACPI enumeration support
  gpio / ACPI: add ACPI support
  PM / devfreq: remove compiler error with module governors (2)
  cpupower: IvyBridge (0x3a and 0x3e models) support
  cpupower: Provide -c param for cpupower monitor to schedule process on all cores
  cpupower tools: Fix warning and a bug with the cpu package count
  cpupower tools: Fix malloc of cpu_info structure
  cpupower tools: Fix issues with sysfs_topology_read_file
  cpupower tools: Fix minor warnings
  cpupower tools: Update .gitignore for files created in the debug directories
  ...
2012-12-11 12:45:35 -08:00
Linus Torvalds
b58ed041a3 Device tree changes for v3.8
Bug fixes, little cleanups, and documentation changes. The most invasive
 thing here touches a bunch of the arch directories to use a common build
 rule for .dtb files. There are no major changes to functionality here
 other than a ew new helper functions.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQx2/sAAoJEEFnBt12D9kB8aYP/iInsIXDi6c0sGRNgYoFBZUH
 pGZH/PGcZjDD2WPUOos56KTDsWt2kIZm/4ck1mBmXMD5pKM4znIQyMTkhKFiyO0l
 CARwCjyRuaHQJHt88Pcaqhk34HGLgV7ntImMHwpIn0mNbjx7hruk9aU9Jqdcd32j
 2ANbUYU9ikgq0e9s/+xQfB6QZxH1zecQkMalhQWvihtVybpG3xW67IROZv/69uL0
 OtHZZJ7nCwXcEb84rcEHU781tO0CoQhr/pId7DirQEKaD8oNLHsejFEgGEFfFrvZ
 eFmJP/YmZh7Ll+NNTMHnQwyDNa2LFuLazbjaqCqUHQgcw9daFFeP5ga3Uqp7WZbU
 4kIxBQJ3byELnttFVQbsMQag+IAgmgfp8YdJFk3VTJDJKwpMJAO1sQeoEUHnnWAr
 cyyCzROaFt1hUkhRiXso+IYNLvb60o0/2NJRTiObPdljy9OSXW6O3wFEGk9F/6u0
 jgl9tisfLUL0orKnJ5tR3kbHaJKQd+6HgBRJNiuB+9FYO43j1cY/sYggSNCkiCMy
 RvlGYDCJ+lfmZdZ4T+QYLpjsOenFosV0JZGU6MVlZmEGzTtLyhALZQKNHuxLw1Nc
 oTe8Fx5aJrWEYe/HGqm4C43DqEiQUCmZ9NPae0JyfocmjuwARSnepvWe1XZNa/8t
 aNfK6UkTO/IL3zOtOHfd
 =LB3n
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6

Pull device tree changes from Grant Likely:
 "Here are the DT changes I've got queued up for v3.8.  As described
  below, there are a lot of bug fixes here and documentation updates but
  nothing major:

  Bug fixes, little cleanups, and documentation changes.  The most
  invasive thing here touches a bunch of the arch directories to use a
  common build rule for .dtb files.  There are no major changes to
  functionality here other than a few new helper functions."

* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6: (34 commits)
  arm64: Fix the dtbs target building
  mtd: nand: davinci: fix the binding documentation
  rtc: rtc-mv: Add the device tree binding documentation
  devicetree/bindings: Move gpio-leds binding into leds directory
  of/vendor-prefixes: add Imagination Technologies
  microblaze: use new common dtc rule
  c6x: use new common dtc rule
  openrisc: use new common dtc rule
  arm64: Add dtbs target for building all the enabled dtb files
  arm64: use new common dtc rule
  ARM: dt: change .dtb build rules to build in dts directory
  kbuild: centralize .dts->.dtb rule
  Fix build when CONFIG_W1_MASTER_GPIO=m b exporting "allnodes"
  of/spi: Honour "status=disabled" property of device
  of_mdio: Honour "status=disabled" property of device
  of_i2c: Honour "status=disabled" property of device
  powerpc: Fix fallout from device_node->name constification
  of: add 'const' for of_parse_phandle parameter *np
  Documentation: correct of_platform_populate() argument list
  script: dtc: clean generated files
  ...
2012-12-11 11:30:41 -08:00
Linus Torvalds
9ada9fd5df Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
Pull EDAC fixes from Borislav Petkov:

 - EDAC core error path fix, from Denis Kirjanov.

 - Generalization of AMD MCE bank names and some minor error reporting
   improvements.

 - EDAC core cleanups and simplifications, from Wei Yongjun.

 - amd64_edac fixes for sysfs-reported values, from Josh Hunt.

 - some heavy amd64_edac error reporting path shaving, leading to
   removing a bunch of code.

 - amd64_edac error injection method improvements.

 - EDAC core cleanups and fixes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (24 commits)
  EDAC, pci_sysfs: Use for_each_pci_dev to simplify the code
  EDAC: Handle error path in edac_mc_sysfs_init() properly
  MCE, AMD: Dump error status
  MCE, AMD: Report decoded error type first
  MCE, AMD: Dump CPU f/m/s triple with the error
  MCE, AMD: Remove functional unit references
  EDAC: Convert to use simple_open()
  EDAC, Calxeda highbank: Convert to use simple_open()
  EDAC: Fix mc size reported in sysfs
  EDAC: Fix csrow size reported in sysfs
  EDAC: Pass mci parent
  EDAC: Add memory controller flags
  amd64_edac: Fix csrows size and pages computation
  amd64_edac: Use DBAM_DIMM macro
  amd64_edac: Fix K8 chip select reporting
  amd64_edac: Reorganize error reporting path
  amd64_edac: Do not check whether error address is valid
  amd64_edac: Improve error injection
  amd64_edac: Cleanup error injection code
  amd64_edac: Small fixlets and cleanups
  ...
2012-12-11 11:28:43 -08:00
Linus Torvalds
c45564e916 Merge branch 'for-v3.8' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull CMA and DMA-mapping update from Marek Szyprowski:
 "Another set of Contiguous Memory Allocator and DMA-mapping framework
  updates for v3.8.

  This pull request consists only of two patches.  The first fixes a
  long standing issue with dmapools (the code predates current GIT
  history), which forced all allocations to use GFP_ATOMIC flag,
  ignoring the flags passed by the caller.  The second patch changes CMA
  code to correctly use phys_addr_t type what enables support for LPAE
  systems."

* 'for-v3.8' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  drivers: cma: represent physical addresses as phys_addr_t
  mm: dmapool: use provided gfp flags for all dma_alloc_coherent() calls
2012-12-11 11:27:10 -08:00
Linus Torvalds
93874681aa The common clock framework changes for 3.8 are comprised of lots of
fixes for existing platforms as well as new ports for some ARM
 platforms.  In addition there are new clk drivers for audio devices and
 MFDs.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQxtdeAAoJEDqPOy9afJhJwzUP/2/oaBAGXakQf+TTOsRo2IMh
 ejwgOxFsBcspR0OrJ73TAPDqbgY3xZ+BeVdvbIiYikcZLqT9dZsoN7oa9udcu6aL
 1OxBT6F/CFnxUR4EVkpUdQ+vVIR8svxsAAv71zvaVGCeie0D7MDL2JgK8TvgRxHF
 DKxFYJ935CJC64JHJBYhW/1b4T/Tt94z/nYMijcQxkjmpEimTm/qLHpbK6OCQFUU
 fmvs3VmSA4p7hBmgXu3zp6NkOF3JJa7NWb+3kJh1UmqM7xh/CijxZP2YHhLkIdU1
 g2qhYVKIIxmAFa7xJjXY05VrjMKvAkXGNJGVwCFQHnP17By4Pni3BDsQ+61u30Nj
 B/bIRrzAC17EOh6c6pAZIbNLTHHaQGe0XQMDuHGsjgmVpn2CTRmIduEVJPiq9wAk
 lNkwqh6Dftq72Xepy1RieqFuDOO8kHSsOPqS2e9A9yDuh5bzLsKlhKWKUahhxrML
 TnRBd7NfwctoEsKy42HtrXA2+iQsQDmHXNlec3ARNgWS3Hhre7qb1d0Q00y28OTA
 RWyPoxOn1O+wQsV2cu3I1LKVo9CmNU55evHG5zSDPIA3GsrMcPZmP/4KM9Vbs3Ye
 5BIMtptUrOeZQ2PRxcTHnCbWvch5bQyvDkDiK/xR7XsiQIheE/0Ak8wGgVZ7TW4d
 0zLm7UmmkmFu4xTwf2Nk
 =GoXf
 -----END PGP SIGNATURE-----

Merge tag 'clk-for-linus' of git://git.linaro.org/people/mturquette/linux

Pull clock framework changes from Mike Turquette:
 "The common clock framework changes for 3.8 are comprised of lots of
  fixes for existing platforms as well as new ports for some ARM
  platforms.  In addition there are new clk drivers for audio devices
  and MFDs."

Fix up trivial conflict in <linux/clk-provider.h> (removal of 'inline'
clashing with return type fixes)

* tag 'clk-for-linus' of git://git.linaro.org/people/mturquette/linux: (51 commits)
  MAINTAINERS: bad email address for Mike Turquette
  clk: introduce optional disable_unused callback
  clk: ux500: fix bit error
  clk: clock multiplexers may register out of order
  clk: ux500: Initial support for abx500 clock driver
  CLK: SPEAr: Remove unused dummy apb_pclk
  CLK: SPEAr: Correct index scanning done for clock synths
  CLK: SPEAr: Update clock rate table
  CLK: SPEAr: Add missing clocks
  CLK: SPEAr: Set CLK_SET_RATE_PARENT for few clocks
  CLK: SPEAr13xx: fix parent names of multiple clocks
  CLK: SPEAr13xx: Fix mux clock names
  CLK: SPEAr: Fix dev_id & con_id for multiple clocks
  clk: move IM-PD1 clocks to drivers/clk
  clk: make ICST driver handle the VCO registers
  clk: add GPLv2 headers to the Versatile clock files
  clk: mxs: Use a better name for the USB PHY clock
  clk: spear: Add stub functions for spear3[0|1|2]0_clk_init()
  CLK: clk-twl6040: fix return value check in twl6040_clk_probe()
  clk: ux500: Register nomadik keypad clock lookups for u8500
  ...
2012-12-11 11:25:08 -08:00
Linus Torvalds
505cbedab9 This is the pinctrl big pull request for v3.8.
As can be seen from the diffstat the major changes
 are:
 
 - A big conversion of the AT91 pinctrl driver and
   the associated ACKed platform changes under
   arch/arm/max-at91 and its device trees. This
   has been coordinated with the AT91 maintainers
   to go in through the pinctrl tree.
 
 - A larger chunk of changes to the SPEAr drivers
   and the addition of the "plgpio" driver for the
   SPEAr as well.
 
 - The removal of the remnants of the Nomadik driver
   from the arch/arm tree and fusion of that into
   the Nomadik driver and platform data header files.
 
 - Some local movement in the Marvell MVEBU drivers,
   these now have their own subdirectory.
 
 - The addition of a chunk of code to gpiolib under
   drivers/gpio to register gpio-to-pin range mappings
   from the GPIO side of things. This has been
   requested by Grant Likely and is now implemented,
   it is particularly useful for device tree work.
 
 Then we have incremental updates all over the place,
 many of these are cleanups and fixes from Axel Lin
 who has done a great job of removing minor mistakes
 and compilation annoyances.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJQupLkAAoJEEEQszewGV1z8ykP/3yLi5hb3QstajrL3jvrHcqN
 7sc4uW1/9pCa6802nBw7qOfIGxgTAriGtAIePdtGhIkij6TLyyWfvlmUz3iJkeZ5
 4nYy69yOdNMeBGvhXBkBD4K4lL3NGZ9eR+S1rgWY0J3Y+a5upibJeaXxmYBayjBH
 bN/OiK77zaKv91zKSZ4YW9WCzrjn2E0w1mDRcWdffcyrNplY8qm/G2iXBT+UoCLa
 UoR1zxG9nqF+nQ8mL+dVtVjlHsUcj0NEp34HQrUQ8ACOaEIiSI/zDe7afOC38Iy7
 EUTV4IwKeKJyTnAN/QSzbTXF41CR/Qbihubo6sUrbAmyJXLnybVotd4Inh4ca7II
 c2TPV89tSnJWwDSizHwbY3sXIVw8ojmjYMr1ib0Z9GBGyoij1va5WqCJ4iIzTzuc
 imvDSz8ctuuxo6iOQs3smUaHXGz1V+3zvQ5v+Ioc1h9mN2LVKNa6NjmFNZmeFHLa
 44zIes51DUXizaRobOffjoTIlUkAdwYQUpRtq0hvQtgYTyUIeXzfzCNzDoT6bhK3
 VhLn4c4apETER6KtYCPu8PtxM/yyopwUj95WvnPK2fu/m+1B26jUVawomWfRtCQF
 kuovLCTTemn04jWWl3r0JovE/tVcgBrpxTYi6Z4RPY7PuD4sQ477DeM2x3DWZPQQ
 MHveLGA87735XKZkqQRR
 =rUOP
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-for-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pinctrl changes from Linus Walleij:
 "These are the first and major pinctrl changes for the v3.8 merge
  cycle.  Some of this is used as merge base for other trees so I better
  be early on the trigger.

  As can be seen from the diffstat the major changes are:

  - A big conversion of the AT91 pinctrl driver and the associated ACKed
    platform changes under arch/arm/max-at91 and its device trees.  This
    has been coordinated with the AT91 maintainers to go in through the
    pinctrl tree.

  - A larger chunk of changes to the SPEAr drivers and the addition of
    the "plgpio" driver for the SPEAr as well.

  - The removal of the remnants of the Nomadik driver from the arch/arm
    tree and fusion of that into the Nomadik driver and platform data
    header files.

  - Some local movement in the Marvell MVEBU drivers, these now have
    their own subdirectory.

  - The addition of a chunk of code to gpiolib under drivers/gpio to
    register gpio-to-pin range mappings from the GPIO side of things.
    This has been requested by Grant Likely and is now implemented, it
    is particularly useful for device tree work.

  Then we have incremental updates all over the place, many of these are
  cleanups and fixes from Axel Lin who has done a great job of removing
  minor mistakes and compilation annoyances."

* tag 'pinctrl-for-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (114 commits)
  ARM: mmp: select PINCTRL for ARCH_MMP
  pinctrl: Drop selecting PINCONF for MMP2, PXA168 and PXA910
  pinctrl: pinctrl-single: Fix error check condition
  pinctrl: SPEAr: Update error check for unsigned variables
  gpiolib: Fix use after free in gpiochip_add_pin_range
  gpiolib: rename pin range arguments
  pinctrl: single: support gpio request and free
  pinctrl: generic: add input schmitt disable parameter
  pinctrl/u300/coh901: stop spawning pinctrl from GPIO
  pinctrl/u300/coh901: let the gpio_chip register the range
  pinctrl: add function to retrieve range from pin
  gpiolib: return any error code from range creation
  pinctrl: make range registration defer properly
  gpiolib: rename find_pinctrl_*
  gpiolib: let gpiochip_add_pin_range() specify offset
  ARM: at91: pm9g45: add mmc support
  ARM: at91: Animeo IP: add mmc support
  ARM: at91: dt: add mmc pinctrl for Atmel reference boards
  ARM: at91: dt: at91sam9: add mmc pinctrl support
  ARM: at91/dts: add nodes for atmel hsmci controllers for atmel boards
  ...
2012-12-11 11:21:33 -08:00
Linus Torvalds
a8936db7c2 New driver: DA9055
Added/improved support for new chips in existing drivers: Z650/670, N550/570,
 ADS7830, AMD 16h family
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQxslVAAoJEMsfJm/On5mBCCIP/2zEvtwW2nEjlqlBIuIaKsW5
 +8nGOZvyd9Td0ApuC/b4UgzG4BYkOuiHFyUJGwJBcEMetNJqr650TRfS9RrLsi1w
 vOrCsIgjioWpzCEwD8EkgHLXFNk9lXYK9jlkpl434WuPvSUKwN4QE46XtHoSfAye
 NUJBj9hwDv87P68lcg7MhEyZxEPzcInPTtXMGFOiPprAKHQHeDJReSnqx8oKbmBb
 5ZD+zy+Byx1UyX1lqc77xjaInvGGW1OfUkljJb7iFdlXTqyNe0g2Chn07PMF5dJJ
 7stNCQ/z1l+Mo4mmgvGTgHcjooeZL+HpD3XQG0rERCV0X8prKHV2ctGPEQUoCt6c
 OpV08OgsaLii+2x7+otL6svyD22FoFFxS5UP1EkYZPyGekf/C5I9Wh9f77xJ+hes
 PNki3TrnNUzVAZKYOH9JgW2MJ9oYjvKkskZoSytnEoPAs1fS4rJ0a5bQf2sryXdT
 xsDw57fdjqK7A9hPzc/lIae6TYK0iMnQ4wO9nWD7ftQlH6AOT5O4vYAbwoCEks/Q
 6Bv3XCaBkt1GbWyCC+aXUWjvSdU0A3N/H7jaocwR9vcgoJmfwinJi10hVcgE1Kfu
 JVKQMvDFe4jV1j8hiQhGPSjxe+9suoaVBrWFB51W4DyDfJZQPOFd6+3om9/vGDb3
 Y3opAqGVV7IsenVtPif8
 =HQlU
 -----END PGP SIGNATURE-----

Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon updates from Guenter Roeck:
 "New driver: DA9055

  Added/improved support for new chips in existing drivers: Z650/670,
  N550/570, ADS7830, AMD 16h family"

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (da9055) Fix chan_mux[DA9055_ADC_ADCIN3] setting
  hwmon: DA9055 HWMON driver
  hwmon: (coretemp) List TjMax for Z650/670 and N550/570
  hwmon: (coretemp) Drop N4xx, N5xx, D4xx, D5xx CPUs from tjmax table
  hwmon: (coretemp) Use model table instead of if/else to identify CPU models
  hwmon: da9052: Use da9052_reg_update for rmw operations
  hwmon: (coretemp) Drop dependency on PCI for TjMax detection on Atom CPUs
  hwmon: (ina2xx) use module_i2c_driver to simplify the code
  hwmon: (ads7828) add support for ADS7830
  hwmon: (ads7828) driver cleanup
  x86,AMD: Power driver support for AMD's family 16h processors
2012-12-11 11:20:34 -08:00
Linus Torvalds
11b84c5857 MMC highlights for 3.8:
Core:
  - Expose access to the eMMC RPMB ("Replay Protected Memory Block") area
    by extending the existing mmc_block ioctl.
  - Add SDIO powered-suspend DT properties to the core MMC DT binding.
  - Add no-1-8-v DT flag for boards where the SD controller reports that it
    supports 1.8V but the board itself has no way to switch to 1.8V.
  - More work on switching to 1.8V UHS support using a vqmmc regulator.
  - Fix up a case where the slot-gpio helper may fail to reset the host
    controller properly if a card was removed during a transfer.
  - Fix several cases where a broken device could cause an infinite loop
    while we wait for a register to update.
 
 Drivers:
  - at91-mci: Remove obsolete driver, atmel-mci handles these devices now.
  - sdhci-dove: Allow using GPIOs for card-detect notifications.
  - sdhci-esdhc: Fix for recovering from ADMA errors on broken silicon.
  - sdhci-s3c: Add pinctrl support.
  - wmt-sdmmc: New driver for WonderMedia SD/MMC controllers.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQxpwoAAoJEHNBYZ7TNxYMir0P/in5xv6vVq+06hMb9JTA2+M0
 +eSgNAoSQt3Zn3dTxGtcaMbrnVvegMD2sRmnHncSKoi6iUgWh7RhuT7o8vp2zgJe
 KbuOF/OF5+RSnLwgCMcOCjauhBtHO7s+NPNbUH4DMN2/xQ7fpiY+QIsxrKWtImyO
 E4v62zRm/6aa1IfGDgYTAlX8Y1QfEQhH55OYrF9UzU1m7TKsSGxYTXwx+LeLQZQ0
 j6HO+FVlmu5jbZ4V69z3qNIdiklRGQBE7D+7cJqW6btv7x4oLmyygEoNA8M8ncxX
 g5BD5oEt9oxW5OW8hYcEJ/flDErgekZekH8n+fZoNVDVq8gMJxHOV73mnN+wxRnd
 lRGLhniVOzl4cU+ObcYTBzdUV+2yR3sRHSK8nwtb+HFATetM70gBlpoWni1+sGr5
 qvFhnpzNrj0xCwoQpLerwrvCwPpyko9uxZQwO51HqhugFwWlJuTqQsnCgj+Q4MDE
 CRUq4R6TCH7oMGeGin0qyD7jyigDnh7g4pPNQmuIYJSLsDwvinisz98MQ0XC8szc
 Z1H9YXKch2woLHLPFWzNyAIlJWHXakjk5vb9uSQmkUIto/y7q+LiGVCc5uvny3rU
 pJmda+U/8R8YmSWS7pUi6OmPRfcnPuAIllEwo3lBypVpdP/fcZ6aribrbGEo6DLb
 XLFIHnl3O9/sjfUcMl+L
 =oSX3
 -----END PGP SIGNATURE-----

Merge tag 'mmc-updates-for-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc

Pull MMC updates from Chris Ball:
 "MMC highlights for 3.8:

  Core:
   - Expose access to the eMMC RPMB ("Replay Protected Memory Block")
     area by extending the existing mmc_block ioctl.
   - Add SDIO powered-suspend DT properties to the core MMC DT binding.
   - Add no-1-8-v DT flag for boards where the SD controller reports
     that it supports 1.8V but the board itself has no way to switch to
     1.8V.
   - More work on switching to 1.8V UHS support using a vqmmc regulator.
   - Fix up a case where the slot-gpio helper may fail to reset the host
     controller properly if a card was removed during a transfer.
   - Fix several cases where a broken device could cause an infinite
     loop while we wait for a register to update.

  Drivers:
   - at91-mci: Remove obsolete driver, atmel-mci handles these devices
     now.
   - sdhci-dove: Allow using GPIOs for card-detect notifications.
   - sdhci-esdhc: Fix for recovering from ADMA errors on broken silicon.
   - sdhci-s3c: Add pinctrl support.
   - wmt-sdmmc: New driver for WonderMedia SD/MMC controllers."

* tag 'mmc-updates-for-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (65 commits)
  mmc: sdhci: implement the .card_event() method
  mmc: extend the slot-gpio card-detection to use host's .card_event() method
  mmc: add a card-event host operation
  mmc: sdhci-s3c: Fix compilation warning
  mmc: sdhci-pci: Enable SDHCI_CAN_DO_HISPD for Ricoh SDHCI controller
  mmc: sdhci-dove: allow GPIOs to be used for card detection on Dove
  mmc: sdhci-dove: use two-stage initialization for sdhci-pltfm
  mmc: sdhci-dove: use devm_clk_get()
  mmc: eSDHC: Recover from ADMA errors
  mmc: dw_mmc: remove duplicated buswidth code
  mmc: dw_mmc: relocate where dw_mci_setup_bus() is called from
  mmc: Limit MMC speed to 52MHz if not HS200
  mmc: dw_mmc: use devres functions in dw_mmc
  mmc: sh_mmcif: remove unneeded clock connection ID
  mmc: sh_mobile_sdhi: remove unneeded clock connection ID
  mmc: sh_mobile_sdhi: fix clock frequency printing
  mmc: Remove redundant null check before kfree in bus.c
  mmc: Remove redundant null check before kfree in sdio_bus.c
  mmc: sdhci-imx-esdhc: use more devm_* functions
  mmc: dt: add no-1-8-v device tree flag
  ...
2012-12-11 11:19:09 -08:00
Eric Paris
0a6b6bd591 fsnotify: make fasync generic for both inotify and fanotify
inotify is supposed to support async signal notification when information
is available on the inotify fd.  This patch moves that support to generic
fsnotify functions so it can be used by all notification mechanisms.

Signed-off-by: Eric Paris <eparis@redhat.com>
2012-12-11 13:44:36 -05:00
Lino Sanfilippo
6960b0d909 fsnotify: change locking order
On Mon, Aug 01, 2011 at 04:38:22PM -0400, Eric Paris wrote:
>
> I finally built and tested a v3.0 kernel with these patches (I know I'm
> SOOOOOO far behind).  Not what I hoped for:
>
> > [  150.937798] VFS: Busy inodes after unmount of tmpfs. Self-destruct in 5 seconds.  Have a nice day...
> > [  150.945290] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
> > [  150.946012] IP: [<ffffffff810ffd58>] shmem_free_inode+0x18/0x50
> > [  150.946012] PGD 2bf9e067 PUD 2bf9f067 PMD 0
> > [  150.946012] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> > [  150.946012] CPU 0
> > [  150.946012] Modules linked in: nfs lockd fscache auth_rpcgss nfs_acl sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables ext4 jbd2 crc16 joydev ata_piix i2c_piix4 pcspkr uinput ipv6 autofs4 usbhid [last unloaded: scsi_wait_scan]
> > [  150.946012]
> > [  150.946012] Pid: 2764, comm: syscall_thrash Not tainted 3.0.0+ #1 Red Hat KVM
> > [  150.946012] RIP: 0010:[<ffffffff810ffd58>]  [<ffffffff810ffd58>] shmem_free_inode+0x18/0x50
> > [  150.946012] RSP: 0018:ffff88002c2e5df8  EFLAGS: 00010282
> > [  150.946012] RAX: 000000004e370d9f RBX: 0000000000000000 RCX: ffff88003a029438
> > [  150.946012] RDX: 0000000033630a5f RSI: 0000000000000000 RDI: ffff88003491c240
> > [  150.946012] RBP: ffff88002c2e5e08 R08: 0000000000000000 R09: 0000000000000000
> > [  150.946012] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003a029428
> > [  150.946012] R13: ffff88003a029428 R14: ffff88003a029428 R15: ffff88003499a610
> > [  150.946012] FS:  00007f5a05420700(0000) GS:ffff88003f600000(0000) knlGS:0000000000000000
> > [  150.946012] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > [  150.946012] CR2: 0000000000000070 CR3: 000000002a662000 CR4: 00000000000006f0
> > [  150.946012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [  150.946012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > [  150.946012] Process syscall_thrash (pid: 2764, threadinfo ffff88002c2e4000, task ffff88002bfbc760)
> > [  150.946012] Stack:
> > [  150.946012]  ffff88003a029438 ffff88003a029428 ffff88002c2e5e38 ffffffff81102f76
> > [  150.946012]  ffff88003a029438 ffff88003a029598 ffffffff8160f9c0 ffff88002c221250
> > [  150.946012]  ffff88002c2e5e68 ffffffff8115e9be ffff88002c2e5e68 ffff88003a029438
> > [  150.946012] Call Trace:
> > [  150.946012]  [<ffffffff81102f76>] shmem_evict_inode+0x76/0x130
> > [  150.946012]  [<ffffffff8115e9be>] evict+0x7e/0x170
> > [  150.946012]  [<ffffffff8115ee40>] iput_final+0xd0/0x190
> > [  150.946012]  [<ffffffff8115ef33>] iput+0x33/0x40
> > [  150.946012]  [<ffffffff81180205>] fsnotify_destroy_mark_locked+0x145/0x160
> > [  150.946012]  [<ffffffff81180316>] fsnotify_destroy_mark+0x36/0x50
> > [  150.946012]  [<ffffffff81181937>] sys_inotify_rm_watch+0x77/0xd0
> > [  150.946012]  [<ffffffff815aca52>] system_call_fastpath+0x16/0x1b
> > [  150.946012] Code: 67 4a 00 b8 e4 ff ff ff eb aa 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 10 48 89 1c 24 4c 89 64 24 08 48 8b 9f 40 05 00 00
> > [  150.946012]  83 7b 70 00 74 1c 4c 8d a3 80 00 00 00 4c 89 e7 e8 d2 5d 4a
> > [  150.946012] RIP  [<ffffffff810ffd58>] shmem_free_inode+0x18/0x50
> > [  150.946012]  RSP <ffff88002c2e5df8>
> > [  150.946012] CR2: 0000000000000070
>
> Looks at aweful lot like the problem from:
> http://www.spinics.net/lists/linux-fsdevel/msg46101.html
>

I tried to reproduce this bug with your test program, but without success.
However, if I understand correctly, this occurs since we dont hold any locks when
we call iput() in mark_destroy(), right?
With the patches you tested, iput() is also not called within any lock, since the
groups mark_mutex is released temporarily before iput() is called.  This is, since
the original codes behaviour is similar.
However since we now have a mutex as the biggest lock, we can do what you
suggested (http://www.spinics.net/lists/linux-fsdevel/msg46107.html) and
call iput() with the mutex held to avoid the race.
The patch below implements this. It uses nested locking to avoid deadlock in case
we do the final iput() on an inode which still holds marks and thus would take
the mutex again when calling fsnotify_inode_delete() in destroy_inode().

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2012-12-11 13:44:36 -05:00
Lino Sanfilippo
64c20d2a20 fsnotify: dont put marks on temporary list when clearing marks by group
In clear_marks_by_group_flags() the mark list of a group is iterated and the
marks are put on a temporary list.
Since we introduced fsnotify_destroy_mark_locked() we dont need the temp list
any more and are able to remove the marks while the mark list is iterated and
the mark list mutex is held.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2012-12-11 13:44:36 -05:00
Lino Sanfilippo
d5a335b845 fsnotify: introduce locked versions of fsnotify_add_mark() and fsnotify_remove_mark()
This patch introduces fsnotify_add_mark_locked() and fsnotify_remove_mark_locked()
which are essentially the same as fsnotify_add_mark() and fsnotify_remove_mark() but
assume that the caller has already taken the groups mark mutex.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2012-12-11 13:44:36 -05:00
Lino Sanfilippo
e2a29943e9 fsnotify: pass group to fsnotify_destroy_mark()
In fsnotify_destroy_mark() dont get the group from the passed mark anymore,
but pass the group itself as an additional parameter to the function.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2012-12-11 13:44:36 -05:00
Lino Sanfilippo
986ab09807 fsnotify: use a mutex instead of a spinlock to protect a groups mark list
Replaces the groups mark_lock spinlock with a mutex. Using a mutex instead
of a spinlock results in more flexibility (i.e it allows to sleep while the
lock is held).

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2012-12-11 13:29:46 -05:00
Lino Sanfilippo
9861295204 fsnotify: introduce fsnotify_get_group()
Introduce fsnotify_get_group() which increments the reference counter of a group.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2012-12-11 13:29:44 -05:00
Lino Sanfilippo
d8153d4d8b inotify, fanotify: replace fsnotify_put_group() with fsnotify_destroy_group()
Currently in fsnotify_put_group() the ref count of a group is decremented and if
it becomes 0 fsnotify_destroy_group() is called. Since a groups ref count is only
at group creation set to 1 and never increased after that a call to fsnotify_put_group()
always results in a call to fsnotify_destroy_group().
With this patch fsnotify_destroy_group() is called directly.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2012-12-11 13:29:43 -05:00
Eric Dumazet
f8e8f97c11 net: fix a race in gro_cell_poll()
Dmitry Kravkov reported packet drops for GRE packets since GRO support
was added.

There is a race in gro_cell_poll() because we call napi_complete()
without any synchronization with a concurrent gro_cells_receive()

Once bug was triggered, we queued packets but did not schedule NAPI
poll.

We can fix this issue using the spinlock protected the napi_skbs queue,
as we have to hold it to perform skb dequeue anyway.

As we open-code skb_dequeue(), we no longer need to mask IRQS, as both
producer and consumer run under BH context.

Bug added in commit c9e6bc644e (net: add gro_cells infrastructure)

Reported-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-11 12:49:53 -05:00
Guennadi Liakhovetski
93c667ca25 of: *node argument to of_parse_phandle_with_args should be const
The "struct device_node *" argument of of_parse_phandle_with_args() can
be const. Making this change makes it explicit that the function will
not modify a node.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
[grant.likely: Resolved conflict with previous patch modifying of_parse_phandle()]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2012-12-11 17:30:16 +00:00
Guennadi Liakhovetski
1daa8a0b20 of/i2c: add dummy inline functions for when CONFIG_OF_I2C(_MODULE) isn't defined
If CONFIG_OF_I2C and CONFIG_OF_I2C_MODULE are undefined no declaration of
of_find_i2c_device_by_node and of_find_i2c_adapter_by_node will be
available. Add dummy inline functions to avoid compilation breakage.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2012-12-11 17:30:16 +00:00
Ingo Molnar
4fc3f1d66b mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable
rmap_walk_anon() and try_to_unmap_anon() appears to be too
careful about locking the anon vma: while it needs protection
against anon vma list modifications, it does not need exclusive
access to the list itself.

Transforming this exclusive lock to a read-locked rwsem removes
a global lock from the hot path of page-migration intense
threaded workloads which can cause pathological performance like
this:

    96.43%        process 0  [kernel.kallsyms]  [k] perf_trace_sched_switch
                  |
                  --- perf_trace_sched_switch
                      __schedule
                      schedule
                      schedule_preempt_disabled
                      __mutex_lock_common.isra.6
                      __mutex_lock_slowpath
                      mutex_lock
                     |
                     |--50.61%-- rmap_walk
                     |          move_to_new_page
                     |          migrate_pages
                     |          migrate_misplaced_page
                     |          __do_numa_page.isra.69
                     |          handle_pte_fault
                     |          handle_mm_fault
                     |          __do_page_fault
                     |          do_page_fault
                     |          page_fault
                     |          __memset_sse2
                     |          |
                     |           --100.00%-- worker_thread
                     |                     |
                     |                      --100.00%-- start_thread
                     |
                      --49.39%-- page_lock_anon_vma
                                try_to_unmap_anon
                                try_to_unmap
                                migrate_pages
                                migrate_misplaced_page
                                __do_numa_page.isra.69
                                handle_pte_fault
                                handle_mm_fault
                                __do_page_fault
                                do_page_fault
                                page_fault
                                __memset_sse2
                                |
                                 --100.00%-- worker_thread
                                           start_thread

With this change applied the profile is now nicely flat
and there's no anon-vma related scheduling/blocking.

Rename anon_vma_[un]lock() => anon_vma_[un]lock_write(),
to make it clearer that it's an exclusive write-lock in
that case - suggested by Rik van Riel.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Turner <pjt@google.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:43:00 +00:00
Ingo Molnar
5a505085f0 mm/rmap: Convert the struct anon_vma::mutex to an rwsem
Convert the struct anon_vma::mutex to an rwsem, which will help
in solving a page-migration scalability problem. (Addressed in
a separate patch.)

The conversion is simple and straightforward: in every case
where we mutex_lock()ed we'll now down_write().

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Turner <pjt@google.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:43:00 +00:00
Mel Gorman
220018d388 mm: numa: Add THP migration for the NUMA working set scanning fault case build fix
Commit "Add THP migration for the NUMA working set scanning fault case"
breaks the build because HPAGE_PMD_SHIFT and HPAGE_PMD_MASK defined to
explode without CONFIG_TRANSPARENT_HUGEPAGE:

mm/migrate.c: In function 'migrate_misplaced_transhuge_page_put':
mm/migrate.c:1549: error: call to '__build_bug_failed' declared with attribute error: BUILD_BUG failed
mm/migrate.c:1564: error: call to '__build_bug_failed' declared with attribute error: BUILD_BUG failed
mm/migrate.c:1566: error: call to '__build_bug_failed' declared with attribute error: BUILD_BUG failed
mm/migrate.c:1573: error: call to '__build_bug_failed' declared with attribute error: BUILD_BUG failed
mm/migrate.c:1606: error: call to '__build_bug_failed' declared with attribute error: BUILD_BUG failed
mm/migrate.c:1648: error: call to '__build_bug_failed' declared with attribute error: BUILD_BUG failed

CONFIG_NUMA_BALANCING allows compilation without enabling transparent
hugepages, so define the dummy function for such a configuration and only
define migrate_misplaced_transhuge_page_put() when transparent hugepages
are enabled.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:58 +00:00
Mel Gorman
b32967ff10 mm: numa: Add THP migration for the NUMA working set scanning fault case.
Note: This is very heavily based on a patch from Peter Zijlstra with
	fixes from Ingo Molnar, Hugh Dickins and Johannes Weiner.  That patch
	put a lot of migration logic into mm/huge_memory.c where it does
	not belong. This version puts tries to share some of the migration
	logic with migrate_misplaced_page.  However, it should be noted
	that now migrate.c is doing more with the pagetable manipulation
	than is preferred. The end result is barely recognisable so as
	before, the signed-offs had to be removed but will be re-added if
	the original authors are ok with it.

Add THP migration for the NUMA working set scanning fault case.

It uses the page lock to serialize. No migration pte dance is
necessary because the pte is already unmapped when we decide
to migrate.

[dhillf@gmail.com: Fix memory leak on isolation failure]
[dhillf@gmail.com: Fix transfer of last_nid information]
Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:57 +00:00
Mel Gorman
5bca230353 mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
Due to the fact that migrations are driven by the CPU a task is running
on there is no point tracking NUMA faults until one task runs on a new
node. This patch tracks the first node used by an address space. Until
it changes, PTE scanning is disabled and no NUMA hinting faults are
trapped. This should help workloads that are short-lived, do not care
about NUMA placement or have bound themselves to a single node.

This takes advantage of the logic in "mm: sched: numa: Implement slow
start for working set sampling" to delay when the checks are made. This
will take advantage of processes that set their CPU and node bindings
early in their lifetime. It will also potentially allow any initial load
balancing to take place.

Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:56 +00:00
Mel Gorman
1a687c2e9a mm: sched: numa: Control enabling and disabling of NUMA balancing
This patch adds Kconfig options and kernel parameters to allow the
enabling and disabling of automatic NUMA balancing. The existance
of such a switch was and is very important when debugging problems
related to transparent hugepages and we should have the same for
automatic NUMA placement.

Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:55 +00:00
Mel Gorman
b8593bfda1 mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate
The PTE scanning rate and fault rates are two of the biggest sources of
system CPU overhead with automatic NUMA placement.  Ideally a proper policy
would detect if a workload was properly placed, schedule and adjust the
PTE scanning rate accordingly. We do not track the necessary information
to do that but we at least know if we migrated or not.

This patch scans slower if a page was not migrated as the result of a
NUMA hinting fault up to sysctl_numa_balancing_scan_period_max which is
now higher than the previous default. Once every minute it will reset
the scanner in case of phase changes.

This is hilariously crude and the numbers are arbitrary. Workloads will
converge quite slowly in comparison to what a proper policy should be able
to do. On the plus side, we will chew up less CPU for workloads that have
no need for automatic balancing.

Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:55 +00:00
Mel Gorman
57e0a03091 mm: numa: Introduce last_nid to the page frame
This patch introduces a last_nid field to the page struct. This is used
to build a two-stage filter in the next patch that is aimed at
mitigating a problem whereby pages migrate to the wrong node when
referenced by a process that was running off its home node.

Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:52 +00:00
Mel Gorman
e14808b49f mm: numa: Rate limit setting of pte_numa if node is saturated
If there are a large number of NUMA hinting faults and all of them
are resulting in migrations it may indicate that memory is just
bouncing uselessly around. NUMA balancing cost is likely exceeding
any benefit from locality. Rate limit the PTE updates if the node
is migration rate-limited. As noted in the comments, this distorts
the NUMA faulting statistics.

Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:51 +00:00
Andrea Arcangeli
8177a420ed mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting
This defines the per-node data used by Migrate On Fault in order to
rate limit the migration. The rate limiting is applied independently
to each destination node.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:50 +00:00
Mel Gorman
5606e3877a mm: numa: Migrate on reference policy
This is the simplest possible policy that still does something of note.
When a pte_numa is faulted, it is moved immediately. Any replacement
policy must at least do better than this and in all likelihood this
policy regresses normal workloads.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:42:48 +00:00
Mel Gorman
03c5a6e163 mm: numa: Add pte updates, hinting and migration stats
It is tricky to quantify the basic cost of automatic NUMA placement in a
meaningful manner. This patch adds some vmstats that can be used as part
of a basic costing model.

u    = basic unit = sizeof(void *)
Ca   = cost of struct page access = sizeof(struct page) / u
Cpte = Cost PTE access = Ca
Cupdate = Cost PTE update = (2 * Cpte) + (2 * Wlock)
	where Cpte is incurred twice for a read and a write and Wlock
	is a constant representing the cost of taking or releasing a
	lock
Cnumahint = Cost of a minor page fault = some high constant e.g. 1000
Cpagerw = Cost to read or write a full page = Ca + PAGE_SIZE/u
Ci = Cost of page isolation = Ca + Wi
	where Wi is a constant that should reflect the approximate cost
	of the locking operation
Cpagecopy = Cpagerw + (Cpagerw * Wnuma) + Ci + (Ci * Wnuma)
	where Wnuma is the approximate NUMA factor. 1 is local. 1.2
	would imply that remote accesses are 20% more expensive

Balancing cost = Cpte * numa_pte_updates +
		Cnumahint * numa_hint_faults +
		Ci * numa_pages_migrated +
		Cpagecopy * numa_pages_migrated

Note that numa_pages_migrated is used as a measure of how many pages
were isolated even though it would miss pages that failed to migrate. A
vmstat counter could have been added for it but the isolation cost is
pretty marginal in comparison to the overall cost so it seemed overkill.

The ideal way to measure automatic placement benefit would be to count
the number of remote accesses versus local accesses and do something like

	benefit = (remote_accesses_before - remove_access_after) * Wnuma

but the information is not readily available. As a workload converges, the
expection would be that the number of remote numa hints would reduce to 0.

	convergence = numa_hint_faults_local / numa_hint_faults
		where this is measured for the last N number of
		numa hints recorded. When the workload is fully
		converged the value is 1.

This can measure if the placement policy is converging and how fast it is
doing it.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:42:48 +00:00
Peter Zijlstra
4b96a29ba8 mm: sched: numa: Implement slow start for working set sampling
Add a 1 second delay before starting to scan the working set of
a task and starting to balance it amongst nodes.

[ note that before the constant per task WSS sampling rate patch
  the initial scan would happen much later still, in effect that
  patch caused this regression. ]

The theory is that short-run tasks benefit very little from NUMA
placement: they come and go, and they better stick to the node
they were started on. As tasks mature and rebalance to other CPUs
and nodes, so does their NUMA placement have to change and so
does it start to matter more and more.

In practice this change fixes an observable kbuild regression:

   # [ a perf stat --null --repeat 10 test of ten bzImage builds to /dev/shm ]

   !NUMA:
   45.291088843 seconds time elapsed                                          ( +-  0.40% )
   45.154231752 seconds time elapsed                                          ( +-  0.36% )

   +NUMA, no slow start:
   46.172308123 seconds time elapsed                                          ( +-  0.30% )
   46.343168745 seconds time elapsed                                          ( +-  0.25% )

   +NUMA, 1 sec slow start:
   45.224189155 seconds time elapsed                                          ( +-  0.25% )
   45.160866532 seconds time elapsed                                          ( +-  0.17% )

and it also fixes an observable perf bench (hackbench) regression:

   # perf stat --null --repeat 10 perf bench sched messaging

   -NUMA:

   -NUMA:                  0.246225691 seconds time elapsed                   ( +-  1.31% )
   +NUMA no slow start:    0.252620063 seconds time elapsed                   ( +-  1.13% )

   +NUMA 1sec delay:       0.248076230 seconds time elapsed                   ( +-  1.35% )

The implementation is simple and straightforward, most of the patch
deals with adding the /proc/sys/kernel/numa_balancing_scan_delay_ms tunable
knob.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
[ Wrote the changelog, ran measurements, tuned the default. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:42:47 +00:00
Peter Zijlstra
6e5fb223e8 mm: sched: numa: Implement constant, per task Working Set Sampling (WSS) rate
Previously, to probe the working set of a task, we'd use
a very simple and crude method: mark all of its address
space PROT_NONE.

That method has various (obvious) disadvantages:

 - it samples the working set at dissimilar rates,
   giving some tasks a sampling quality advantage
   over others.

 - creates performance problems for tasks with very
   large working sets

 - over-samples processes with large address spaces but
   which only very rarely execute

Improve that method by keeping a rotating offset into the
address space that marks the current position of the scan,
and advance it by a constant rate (in a CPU cycles execution
proportional manner). If the offset reaches the last mapped
address of the mm then it then it starts over at the first
address.

The per-task nature of the working set sampling functionality in this tree
allows such constant rate, per task, execution-weight proportional sampling
of the working set, with an adaptive sampling interval/frequency that
goes from once per 100ms up to just once per 8 seconds.  The current
sampling volume is 256 MB per interval.

As tasks mature and converge their working set, so does the
sampling rate slow down to just a trickle, 256 MB per 8
seconds of CPU time executed.

This, beyond being adaptive, also rate-limits rarely
executing systems and does not over-sample on overloaded
systems.

[ In AutoNUMA speak, this patch deals with the effective sampling
  rate of the 'hinting page fault'. AutoNUMA's scanning is
  currently rate-limited, but it is also fundamentally
  single-threaded, executing in the knuma_scand kernel thread,
  so the limit in AutoNUMA is global and does not scale up with
  the number of CPUs, nor does it scan tasks in an execution
  proportional manner.

  So the idea of rate-limiting the scanning was first implemented
  in the AutoNUMA tree via a global rate limit. This patch goes
  beyond that by implementing an execution rate proportional
  working set sampling rate that is not implemented via a single
  global scanning daemon. ]

[ Dan Carpenter pointed out a possible NULL pointer dereference in the
  first version of this patch. ]

Based-on-idea-by: Andrea Arcangeli <aarcange@redhat.com>
Bug-Found-By: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
[ Wrote changelog and fixed bug. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:42:46 +00:00
Peter Zijlstra
cbee9f88ec mm: numa: Add fault driven placement and migration
NOTE: This patch is based on "sched, numa, mm: Add fault driven
	placement and migration policy" but as it throws away all the policy
	to just leave a basic foundation I had to drop the signed-offs-by.

This patch creates a bare-bones method for setting PTEs pte_numa in the
context of the scheduler that when faulted later will be faulted onto the
node the CPU is running on.  In itself this does nothing useful but any
placement policy will fundamentally depend on receiving hints on placement
from fault context and doing something intelligent about it.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:42:45 +00:00
Mel Gorman
a720094ded mm: mempolicy: Hide MPOL_NOOP and MPOL_MF_LAZY from userspace for now
The use of MPOL_NOOP and MPOL_MF_LAZY to allow an application to
explicitly request lazy migration is a good idea but the actual
API has not been well reviewed and once released we have to support it.
For now this patch prevents an application using the services. This
will need to be revisited.

Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:44 +00:00
Mel Gorman
4b10e7d562 mm: mempolicy: Implement change_prot_numa() in terms of change_protection()
This patch converts change_prot_numa() to use change_protection(). As
pte_numa and friends check the PTE bits directly it is necessary for
change_protection() to use pmd_mknuma(). Hence the required
modifications to change_protection() are a little clumsy but the
end result is that most of the numa page table helpers are just one or
two instructions.

Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:44 +00:00
Lee Schermerhorn
b24f53a0be mm: mempolicy: Add MPOL_MF_LAZY
NOTE: Once again there is a lot of patch stealing and the end result
	is sufficiently different that I had to drop the signed-offs.
	Will re-add if the original authors are ok with that.

This patch adds another mbind() flag to request "lazy migration".  The
flag, MPOL_MF_LAZY, modifies MPOL_MF_MOVE* such that the selected
pages are marked PROT_NONE. The pages will be migrated in the fault
path on "first touch", if the policy dictates at that time.

"Lazy Migration" will allow testing of migrate-on-fault via mbind().
Also allows applications to specify that only subsequently touched
pages be migrated to obey new policy, instead of all pages in range.
This can be useful for multi-threaded applications working on a
large shared data area that is initialized by an initial thread
resulting in all pages on one [or a few, if overflowed] nodes.
After PROT_NONE, the pages in regions assigned to the worker threads
will be automatically migrated local to the threads on 1st touch.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:42:43 +00:00
Mel Gorman
4daae3b4b9 mm: mempolicy: Use _PAGE_NUMA to migrate pages
Note: Based on "mm/mpol: Use special PROT_NONE to migrate pages" but
	sufficiently different that the signed-off-bys were dropped

Combine our previous _PAGE_NUMA, mpol_misplaced and migrate_misplaced_page()
pieces into an effective migrate on fault scheme.

Note that (on x86) we rely on PROT_NONE pages being !present and avoid
the TLB flush from try_to_unmap(TTU_MIGRATION). This greatly improves the
page-migration performance.

Based-on-work-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:42 +00:00
Peter Zijlstra
7039e1dbec mm: migrate: Introduce migrate_misplaced_page()
Note: This was originally based on Peter's patch "mm/migrate: Introduce
	migrate_misplaced_page()" but borrows extremely heavily from Andrea's
	"autonuma: memory follows CPU algorithm and task/mm_autonuma stats
	collection". The end result is barely recognisable so signed-offs
	had to be dropped. If original authors are ok with it, I'll
	re-add the signed-off-bys.

Add migrate_misplaced_page() which deals with migrating pages from
faults.

Based-on-work-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Based-on-work-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Based-on-work-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:42:41 +00:00
Lee Schermerhorn
771fb4d806 mm: mempolicy: Check for misplaced page
This patch provides a new function to test whether a page resides
on a node that is appropriate for the mempolicy for the vma and
address where the page is supposed to be mapped.  This involves
looking up the node where the page belongs.  So, the function
returns that node so that it may be used to allocated the page
without consulting the policy again.

A subsequent patch will call this function from the fault path.
Because of this, I don't want to go ahead and allocate the page, e.g.,
via alloc_page_vma() only to have to free it if it has the correct
policy.  So, I just mimic the alloc_page_vma() node computation
logic--sort of.

Note:  we could use this function to implement a MPOL_MF_STRICT
behavior when migrating pages to match mbind() mempolicy--e.g.,
to ensure that pages in an interleaved range are reinterleaved
rather than left where they are when they reside on any page in
the interleave nodemask.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
[ Added MPOL_F_LAZY to trigger migrate-on-fault;
  simplified code now that we don't have to bother
  with special crap for interleaved ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:41 +00:00
Lee Schermerhorn
d3a710337b mm: mempolicy: Add MPOL_NOOP
This patch augments the MPOL_MF_LAZY feature by adding a "NOOP" policy
to mbind().  When the NOOP policy is used with the 'MOVE and 'LAZY
flags, mbind() will map the pages PROT_NONE so that they will be
migrated on the next touch.

This allows an application to prepare for a new phase of operation
where different regions of shared storage will be assigned to
worker threads, w/o changing policy.  Note that we could just use
"default" policy in this case.  However, this also allows an
application to request that pages be migrated, only if necessary,
to follow any arbitrary policy that might currently apply to a
range of pages, without knowing the policy, or without specifying
multiple mbind()s for ranges with different policies.

[ Bug in early version of mpol_parse_str() reported by Fengguang Wu. ]

Bug-Reported-by: Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:40 +00:00
Peter Zijlstra
479e2802d0 mm: mempolicy: Make MPOL_LOCAL a real policy
Make MPOL_LOCAL a real and exposed policy such that applications that
relied on the previous default behaviour can explicitly request it.

Requested-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:39 +00:00
Mel Gorman
d10e63f294 mm: numa: Create basic numa page hinting infrastructure
Note: This patch started as "mm/mpol: Create special PROT_NONE
	infrastructure" and preserves the basic idea but steals *very*
	heavily from "autonuma: numa hinting page faults entry points" for
	the actual fault handlers without the migration parts.	The end
	result is barely recognisable as either patch so all Signed-off
	and Reviewed-bys are dropped. If Peter, Ingo and Andrea are ok with
	this version, I will re-add the signed-offs-by to reflect the history.

In order to facilitate a lazy -- fault driven -- migration of pages, create
a special transient PAGE_NUMA variant, we can then use the 'spurious'
protection faults to drive our migrations from.

The meaning of PAGE_NUMA depends on the architecture but on x86 it is
effectively PROT_NONE. Actual PROT_NONE mappings will not generate these
NUMA faults for the reason that the page fault code checks the permission on
the VMA (and will throw a segmentation fault on actual PROT_NONE mappings),
before it ever calls handle_mm_fault.

[dhillf@gmail.com: Fix typo]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:42:39 +00:00
Andrea Arcangeli
0b9d705297 mm: numa: Support NUMA hinting page faults from gup/gup_fast
Introduce FOLL_NUMA to tell follow_page to check
pte/pmd_numa. get_user_pages must use FOLL_NUMA, and it's safe to do
so because it always invokes handle_mm_fault and retries the
follow_page later.

KVM secondary MMU page faults will trigger the NUMA hinting page
faults through gup_fast -> get_user_pages -> follow_page ->
handle_mm_fault.

Other follow_page callers like KSM should not use FOLL_NUMA, or they
would fail to get the pages if they use follow_page instead of
get_user_pages.

[ This patch was picked up from the AutoNUMA tree. ]

Originally-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
[ ported to this tree. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:42:37 +00:00
Andrea Arcangeli
be3a728427 mm: numa: pte_numa() and pmd_numa()
Implement pte_numa and pmd_numa.

We must atomically set the numa bit and clear the present bit to
define a pte_numa or pmd_numa.

Once a pte or pmd has been set as pte_numa or pmd_numa, the next time
a thread touches a virtual address in the corresponding virtual range,
a NUMA hinting page fault will trigger. The NUMA hinting page fault
will clear the NUMA bit and set the present bit again to resolve the
page fault.

The expectation is that a NUMA hinting page fault is used as part
of a placement policy that decides if a page should remain on the
current node or migrated to a different node.

Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:36 +00:00
Mel Gorman
397487db69 mm: compaction: Add scanned and isolated counters for compaction
Compaction already has tracepoints to count scanned and isolated pages
but it requires that ftrace be enabled and if that information has to be
written to disk then it can be disruptive. This patch adds vmstat counters
for compaction called compact_migrate_scanned, compact_free_scanned and
compact_isolated.

With these counters, it is possible to define a basic cost model for
compaction. This approximates of how much work compaction is doing and can
be compared that with an oprofile showing TLB misses and see if the cost of
compaction is being offset by THP for example. Minimally a compaction patch
can be evaluated in terms of whether it increases or decreases cost. The
basic cost model looks like this

Fundamental unit u:	a word	sizeof(void *)

Ca  = cost of struct page access = sizeof(struct page) / u

Cmc = Cost migrate page copy = (Ca + PAGE_SIZE/u) * 2
Cmf = Cost migrate failure   = Ca * 2
Ci  = Cost page isolation    = (Ca + Wi)
	where Wi is a constant that should reflect the approximate
	cost of the locking operation.

Csm = Cost migrate scanning = Ca
Csf = Cost free    scanning = Ca

Overall cost =	(Csm * compact_migrate_scanned) +
	      	(Csf * compact_free_scanned)    +
	      	(Ci  * compact_isolated)	+
		(Cmc * pgmigrate_success)	+
		(Cmf * pgmigrate_failed)

Where the values are read from /proc/vmstat.

This is very basic and ignores certain costs such as the allocation cost
to do a migrate page copy but any improvement to the model would still
use the same vmstat counters.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:28:35 +00:00
Mel Gorman
7b2a2d4a18 mm: migrate: Add a tracepoint for migrate_pages
The pgmigrate_success and pgmigrate_fail vmstat counters tells the user
about migration activity but not the type or the reason. This patch adds
a tracepoint to identify the type of page migration and why the page is
being migrated.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:28:35 +00:00
Mel Gorman
5647bc293a mm: compaction: Move migration fail/success stats to migrate.c
The compact_pages_moved and compact_pagemigrate_failed events are
convenient for determining if compaction is active and to what
degree migration is succeeding but it's at the wrong level. Other
users of migration may also want to know if migration is working
properly and this will be particularly true for any automated
NUMA migration. This patch moves the counters down to migration
with the new events called pgmigrate_success and pgmigrate_fail.
The compact_blocks_moved counter is removed because while it was
useful for debugging initially, it's worthless now as no meaningful
conclusions can be drawn from its value.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:28:35 +00:00
Peter Zijlstra
7da4d641c5 mm: Count the number of pages affected in change_protection()
This will be used for three kinds of purposes:

 - to optimize mprotect()

 - to speed up working set scanning for working set areas that
   have not been touched

 - to more accurately scan per real working set

No change in functionality from this patch.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-12-11 14:28:34 +00:00
Rik van Riel
2c3cf556b2 x86/mm: Introduce pte_accessible()
We need pte_present to return true for _PAGE_PROTNONE pages, to indicate that
the pte is associated with a page.

However, for TLB flushing purposes, we would like to know whether the pte
points to an actually accessible page.  This allows us to skip remote TLB
flushes for pages that are not actually accessible.

Fill in this method for x86 and provide a safe (but slower) method
on other architectures.

Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixed-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-66p11te4uj23gevgh4j987ip@git.kernel.org
[ Added Linus's review fixes. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-12-11 14:28:34 +00:00
Mauro Carvalho Chehab
77c53d0b56 Merge branch 'for_3.8-rc1' into v4l_for_linus
* for_3.8-rc1: (243 commits)
  [media] omap3isp: Replace cpu_is_omap3630() with ISP revision check
  [media] omap3isp: Prepare/unprepare clocks before/after enable/disable
  [media] omap3isp: preview: Add support for 8-bit formats at the sink pad
  [media] omap3isp: Replace printk with dev_*
  [media] omap3isp: Find source pad from external entity
  [media] omap3isp: Configure CSI-2 phy based on platform data
  [media] omap3isp: Add PHY routing configuration
  [media] omap3isp: Add CSI configuration registers from control block to ISP resources
  [media] omap3isp: Remove unneeded module memory address definitions
  [media] omap3isp: Use monotonic timestamps for statistics buffers
  [media] uvcvideo: Fix control value clamping for unsigned integer controls
  [media] uvcvideo: Mark first output terminal as default video node
  [media] uvcvideo: Add VIDIOC_[GS]_PRIORITY support
  [media] uvcvideo: Return -ENOTTY for unsupported ioctls
  [media] uvcvideo: Set device_caps in VIDIOC_QUERYCAP
  [media] uvcvideo: Don't fail when an unsupported format is requested
  [media] uvcvideo: Return -EACCES when trying to access a read/write-only control
  [media] uvcvideo: Set error_idx properly for extended controls API failures
  [media] rtl28xxu: add NOXON DAB/DAB+ USB dongle rev 2
  [media] fc2580: write some registers conditionally
  ...
2012-12-11 11:28:37 -02:00
Christoph Lameter
3c58346525 slab: Simplify bootstrap
The nodelists field in kmem_cache is pointing to the first unused
object in the array field when bootstrap is complete.

A problem with the current approach is that the statically sized
kmem_cache structure use on boot can only contain NR_CPUS entries.
If the number of nodes plus the number of cpus is greater then we
would overwrite memory following the kmem_cache_boot definition.

Increase the size of the array field to ensure that also the node
pointers fit into the array field.

Once we do that we no longer need the kmem_cache_nodelists
array and we can then also use that structure elsewhere.

Acked-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-12-11 12:14:27 +02:00
Vitaly Andrianov
4009793e15 drivers: cma: represent physical addresses as phys_addr_t
This commit changes the CMA early initialization code to use phys_addr_t
for representing physical addresses instead of unsigned long.

Without this change, among other things, dma_declare_contiguous() simply
discards any memory regions whose address is not representable as unsigned
long.

This is a problem on 32-bit PAE machines where unsigned long is 32-bit
but physical address space is larger.

Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2012-12-11 09:28:09 +01:00
Mike Turquette
7c045a55c9 clk: introduce optional disable_unused callback
Some gate clocks have special needs which must be handled during the
disable-unused clocks sequence.  These needs might be driven by software
due to the fact that we're disabling a clock outside of the normal
clk_disable path and a clk's enable_count will not be accurate.  On the
other hand a specific hardware programming sequence might need to be
followed for this corner case.

This change is needed for the upcoming OMAP port to the common clock
framework.  Specifically, it is undesirable to treat the disable-unused
path identically to the normal clk_disable path since other software
layers are involved.  In this case OMAP's clockdomain code throws WARNs
and bails early due to the clock's enable_count being set to zero.  A
custom callback mitigates this problem nicely.

Cc: Paul Walmsley <paul@pwsan.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
2012-12-10 22:35:02 -08:00
Namjae Jeon
457d08ee4f f2fs: introduce accessor to retrieve number of dentry slots
Simplify code by providing the accessor macro to retrieve the
number of dentry slots for a given filename length.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
2012-12-11 13:43:45 +09:00
Jaegeuk Kim
25ca923b2a f2fs: fix endian conversion bugs reported by sparse
This patch should resolve the bugs reported by the sparse tool.
Initial reports were written by "kbuild test robot" managed by fengguang.wu.

In my local machines, I've tested also by running:
> make C=2 CF="-D__CHECK_ENDIAN__"

Accordingly, I've found lots of warnings and bugs related to the endian
conversion. And I've fixed all at this moment.

Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-12-11 13:43:42 +09:00
Jaegeuk Kim
39a53e0ce0 f2fs: add superblock and major in-memory structure
This adds the following major in-memory structures in f2fs.

- f2fs_sb_info:
  contains f2fs-specific information, two special inode pointers for node and
  meta address spaces, and orphan inode management.

- f2fs_inode_info:
  contains vfs_inode and other fs-specific information.

- f2fs_nm_info:
  contains node manager information such as NAT entry cache, free nid list,
  and NAT page management.

- f2fs_node_info:
  represents a node as node id, inode number, block address, and its version.

- f2fs_sm_info:
  contains segment manager information such as SIT entry cache, free segment
  map, current active logs, dirty segment management, and segment utilization.
  The specific structures are sit_info, free_segmap_info, dirty_seglist_info,
  curseg_info.

In addition, add F2FS_SUPER_MAGIC in magic.h.

Signed-off-by: Chul Lee <chur.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-12-11 13:43:40 +09:00
Jaegeuk Kim
dd31866b0d f2fs: add on-disk layout
This adds a header file describing the on-disk layout of f2fs.

Signed-off-by: Changman Lee <cm224.lee@samsung.com>
Signed-off-by: Chul Lee <chur.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-12-11 13:43:40 +09:00
Mark Brown
8e24a6e696 Merge remote-tracking branch 'regmap/topic/table' into regmap-next 2012-12-11 12:39:32 +09:00
Mark Brown
db760fbecd Merge remote-tracking branch 'regmap/topic/lock' into regmap-next 2012-12-11 12:39:30 +09:00
Mark Brown
4d348e6e0a Merge remote-tracking branch 'regmap/topic/domain' into regmap-next 2012-12-11 12:39:29 +09:00
Mark Brown
bcf86687d6 Merge remote-tracking branch 'regmap/topic/debugfs' into regmap-next 2012-12-11 12:39:20 +09:00
Russell King
0b99cb7310 Merge branches 'cache-l2x0', 'fixes', 'hdrs', 'misc', 'mmci', 'vic' and 'warnings' into for-next 2012-12-11 00:20:18 +00:00
Bjorn Helgaas
1cb73f8c47 Merge branch 'pci/mjg-pci-roms-from-efi' into next
* pci/mjg-pci-roms-from-efi:
  PCI: Use phys_addr_t for physical ROM address
2012-12-10 16:20:12 -07:00
Gabor Juhos
ab5c4f71d8 ath9k: allow to load EEPROM content via firmware API
The calibration data for devices w/o a separate
EEPROM chip can be specified via the 'eeprom_data'
field of 'ath9k_platform_data'. The 'eeprom_data'
is usually filled from board specific setup
functions. It is easy if the EEPROM data is mapped
to the memory, but it can be complicated if it is
stored elsewhere.

The patch adds support for loading of the EEPROM
data via the firmware API to avoid this limitation.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-12-10 15:49:57 -05:00
Linus Torvalds
caf491916b Revert "revert "Revert "mm: remove __GFP_NO_KSWAPD""" and associated damage
This reverts commits a50915394f and
d7c3b937bd.

This is a revert of a revert of a revert.  In addition, it reverts the
even older i915 change to stop using the __GFP_NO_KSWAPD flag due to the
original commits in linux-next.

It turns out that the original patch really was bogus, and that the
original revert was the correct thing to do after all.  We thought we
had fixed the problem, and then reverted the revert, but the problem
really is fundamental: waking up kswapd simply isn't the right thing to
do, and direct reclaim sometimes simply _is_ the right thing to do.

When certain allocations fail, we simply should try some direct reclaim,
and if that fails, fail the allocation.  That's the right thing to do
for THP allocations, which can easily fail, and the GPU allocations want
to do that too.

So starting kswapd is sometimes simply wrong, and removing the flag that
said "don't start kswapd" was a mistake.  Let's hope we never revisit
this mistake again - and certainly not this many times ;)

Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-10 11:03:05 -08:00
Bjorn Helgaas
dbd3fc3345 PCI: Use phys_addr_t for physical ROM address
Use phys_addr_t rather than "void *" for physical memory address.
This removes casts and fixes a "cast from pointer to integer of different
size" warning on ppc44x_defconfig.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-12-10 11:24:42 -07:00
Vivien Didelot
759f5f3752 gpio: add TS-5500 DIO blocks support
Technologic Systems TS-5500 provides digital I/O lines exposed through
pin blocks. On this platform, there are three of them, named DIO1, DIO2
and LCD port, that may be used as a DIO block.

The TS-5500 pin blocks are described in the product's wiki:
http://wiki.embeddedarm.com/wiki/TS-5500#Digital_I.2FO

This driver is not limited to the TS-5500 blocks. It can be extended to
support similar boards pin blocks, such as on the TS-5600.

This patch is the V2 of the previous https://lkml.org/lkml/2012/9/25/671
with corrections suggested by Linus Walleij.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Jerome Oufella <jerome.oufella@savoirfairelinux.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2012-12-10 11:23:25 +01:00
Maarten Lankhorst
97a875cbdf drm/ttm: remove no_wait_reserve, v3
All items on the lru list are always reservable, so this is a stupid
thing to keep. Not only that, it is used in a way which would
guarantee deadlocks if it were ever to be set to block on reserve.

This is a lot of churn, but mostly because of the removal of the
argument which can be nested arbitrarily deeply in many places.

No change of code in this patch except removal of the no_wait_reserve
argument, the previous patch removed the use of no_wait_reserve.

v2:
 - Warn if -EBUSY is returned on reservation, all objects on the list
   should be reservable. Adjusted patch slightly due to conflicts.
v3:
 - Focus on no_wait_reserve removal only.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-10 20:21:30 +10:00
Dave Airlie
1a1494def7 Merge branch 'drm-next-3.8' of git://people.freedesktop.org/~agd5f/linux into drm-next
Alex writes:
Pretty minor -next pull request.  We some additional new bits waiting
internally for release.  Hopefully Monday we can get at least some of
them out.  The others will probably take a few more weeks.

Highlights of the current request:
- ELD registers for passing audio information to the sound hardware
- Handle GPUVM page faults more gracefully
- Misc fixes

Merge radeon test
* 'drm-next-3.8' of git://people.freedesktop.org/~agd5f/linux: (483 commits)
  drm/radeon: bump driver version for new info ioctl requests
  drm/radeon: fix eDP clk and lane setup for scaled modes
  drm/radeon: add new INFO ioctl requests
  drm/radeon/dce32+: use fractional fb dividers for high clocks
  drm/radeon: use cached memory when evicting for vram on non agp
  drm/radeon: add a CS flag END_OF_FRAME
  drm/radeon: stop page faults from hanging the system (v2)
  drm/radeon/dce4/5: add registers for ELD handling
  drm/radeon/dce3.2: add registers for ELD handling
  radeon: fix pll/ctrc mapping on dce2 and dce3 hardware
  Linux 3.7-rc7
  powerpc/eeh: Do not invalidate PE properly
  Revert "drm/i915: enable rc6 on ilk again"
  ALSA: hda - Fix build without CONFIG_PM
  of/address: sparc: Declare of_iomap as an extern function for sparc again
  PM / QoS: fix wrong error-checking condition
  bnx2x: remove redundant warning log
  vxlan: fix command usage in its doc
  8139cp: revert "set ring address before enabling receiver"
  MPI: Fix compilation on MIPS with GCC 4.4 and newer
  ...

Conflicts:
	drivers/gpu/drm/exynos/exynos_drm_encoder.c
	drivers/gpu/drm/exynos/exynos_drm_fbdev.c
	drivers/gpu/drm/nouveau/core/engine/disp/nv50.c
2012-12-10 20:03:58 +10:00
Takashi Iwai
97768a8e65 ASoC: Updates for v3.8
Some incremental updates, nothing too exciting.  The biggest block here
 is the __dev annotation removal stuff from Bill, everything else is the
 usual driver-specific stuff - a combination of fixes and development.
 
 There will be at least more more set of fixes to come but I wanted to
 get these out ready for the merge window to make sure Bill's stuff makes
 it in.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQxK9SAAoJELSic+t+oim9MNEP/RzdU9/vHK2xeZ8O7SsIC4T+
 TMIzNLNyjMGA8qOJXmd5iwdTE/Cxh7OEYJvG/qUony8XipjdGC6V4e+Ybbl55Kc8
 mf/5fytyTX3+1eZ0uYHVrTMGcSfZ7qelmUXbIASo4jiqZUm2Uo/jar10Omy5EG/J
 mn7dtt67bOoiWugMhBEeKslwDSOQFBL2qcMbkWVRfqpvCLPqT7gg12OfQbrfb/ZV
 Ucj9zVMblel3ziWkElKxgm1Sfo60ExyekbnR11yLlSe8NfX6ky+46lUj3SEwCIzv
 8sknZYwGdGgb0bgNMzeqwUtPS1smCH46/5UVul6of9ox9DafKklfu01jdRg27zTj
 c95YQDDw3CtQap8FXjKIB5pn3zRl8pK7zYllZdEdUF0tYhaFSrCt566nXLfuZ3Fq
 jVkpwQe4u/oCmDlBpZUGlzyv0Vo7xarm79q2RYvgpNPN310mlufKjcJyhm5kAxdp
 6RTKcxtW6IrB8r5MmkBcyLiRoVjt/3e1dK0lOgp66NjSmo+GPSi7Yf4KKjTS6dDL
 kJ8Tiv9xI2daY0ZKwNu/3ifQPx3c2KLw9aoEb/R9RMe+LbulZu8DJcm0xpc1jYWP
 x63MSLfYwCO373dIuobv7O4PuI5hu11tArIYosBuq2vUuyQ9ymqGdsquYMXM10DA
 FKHsSz3oC3jMr4nhsUyt
 =Ywvf
 -----END PGP SIGNATURE-----

Merge tag 'asoc-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-next

ASoC: Updates for v3.8

Some incremental updates, nothing too exciting.  The biggest block here
is the __dev annotation removal stuff from Bill, everything else is the
usual driver-specific stuff - a combination of fixes and development.

There will be at least more more set of fixes to come but I wanted to
get these out ready for the merge window to make sure Bill's stuff makes
it in.
2012-12-10 09:00:45 +01:00
Mark Brown
f4244c68ff Merge remote-tracking branch 'regulator/topic/tol' into regulator-next 2012-12-10 12:43:22 +09:00
Mark Brown
4247bfe20a Merge remote-tracking branch 'regulator/topic/stub' into regulator-next 2012-12-10 12:43:20 +09:00
Mark Brown
adca48f7c6 Merge remote-tracking branch 'regulator/topic/min' into regulator-next 2012-12-10 12:43:00 +09:00
Mark Brown
9e21867073 Merge remote-tracking branch 'regulator/topic/max8997' into regulator-next 2012-12-10 12:43:00 +09:00
Mark Brown
f675649e70 Merge remote-tracking branch 'regulator/topic/max8973' into regulator-next 2012-12-10 12:42:59 +09:00
Mark Brown
1f9cc5f771 Merge remote-tracking branch 'regulator/topic/hotplug' into regulator-next 2012-12-10 12:42:55 +09:00
Mark Brown
db58e0270c Merge remote-tracking branch 'regulator/topic/da9055' into regulator-next 2012-12-10 12:42:53 +09:00
Mark Brown
6234427eb8 Merge remote-tracking branch 'regulator/topic/change' into regulator-next 2012-12-10 12:42:52 +09:00
Lars-Peter Clausen
9d2951bcd9 asm-generic/mmu.h: Add support for FDPIC
No-MMU architectures often have support for FDPIC binaries. FDPIC support
requires two additional fields in the mm_context_t struct. This patch adds these
fields to the generic mm_context_t definition if support for FDPIC binaries is
enabled. This allows to use the generic mmu.h for a few more architectures.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-12-09 23:14:14 +01:00
Lars-Peter Clausen
20154fd370 asm-generic/mmu.h: Remove unused vmlist field from mm_context_t
Nothing is using the vmlist field in mm_context_t anymore. It has been removed
from the non-generic versions over 3 years ago 8feae1311 ("NOMMU: Make VMAs per
MM as for MMU-mode linux").

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-12-09 23:14:10 +01:00
Marcelo Tosatti
d2ff4fc557 Merge branch 'for-upstream' of https://github.com/agraf/linux-2.6 into queue
* 'for-upstream' of https://github.com/agraf/linux-2.6: (28 commits)
  KVM: PPC: booke: Get/set guest EPCR register using ONE_REG interface
  KVM: PPC: bookehv: Add EPCR support in mtspr/mfspr emulation
  KVM: PPC: bookehv: Add guest computation mode for irq delivery
  KVM: PPC: Make EPCR a valid field for booke64 and bookehv
  KVM: PPC: booke: Extend MAS2 EPN mask for 64-bit
  KVM: PPC: e500: Mask MAS2 EPN high 32-bits in 32/64 tlbwe emulation
  KVM: PPC: Mask ea's high 32-bits in 32/64 instr emulation
  KVM: PPC: e500: Add emulation helper for getting instruction ea
  KVM: PPC: bookehv64: Add support for interrupt handling
  KVM: PPC: bookehv: Remove GET_VCPU macro from exception handler
  KVM: PPC: booke: Fix get_tb() compile error on 64-bit
  KVM: PPC: e500: Silence bogus GCC warning in tlb code
  KVM: PPC: Book3S HV: Handle guest-caused machine checks on POWER7 without panicking
  KVM: PPC: Book3S HV: Improve handling of local vs. global TLB invalidations
  MAINTAINERS: Add git tree link for PPC KVM
  KVM: PPC: Book3S PR: MSR_DE doesn't exist on Book 3S
  KVM: PPC: Book3S PR: Fix VSX handling
  KVM: PPC: Book3S PR: Emulate PURR, SPURR and DSCR registers
  KVM: PPC: Book3S HV: Don't give the guest RW access to RO pages
  KVM: PPC: Book3S HV: Report correct HPT entry index when reading HPT
  ...
2012-12-09 18:44:10 -02:00
Mark Brown
06f1c66324 Merge remote-tracking branch 'asoc/topic/wm8994' into asoc-next 2012-12-10 00:22:37 +09:00
Mark Brown
719454d213 Merge remote-tracking branch 'asoc/topic/wm2200' into asoc-next 2012-12-10 00:22:24 +09:00
Mark Brown
c0324fb3a1 Merge remote-tracking branch 'asoc/topic/tlv320aic32x4' into asoc-next 2012-12-10 00:22:20 +09:00
Mark Brown
ceb8ef5e6d Merge remote-tracking branch 'asoc/topic/samsung' into asoc-next 2012-12-10 00:22:17 +09:00
Mark Brown
d576741839 Merge remote-tracking branch 'asoc/topic/omap' into asoc-next 2012-12-10 00:22:16 +09:00
Mark Brown
954f497f71 Merge remote-tracking branch 'asoc/topic/fsi' into asoc-next 2012-12-10 00:22:08 +09:00
Mark Brown
1bd202e4c7 Merge remote-tracking branch 'asoc/topic/davinci' into asoc-next 2012-12-10 00:22:07 +09:00
Mark Brown
f20eca1c06 Merge remote-tracking branch 'asoc/topic/cs4271' into asoc-next 2012-12-10 00:22:04 +09:00
Mark Brown
93ac820df5 Merge remote-tracking branch 'asoc/topic/atmel' into asoc-next 2012-12-10 00:22:02 +09:00
Mark Brown
daa5ab9e0d Merge remote-tracking branch 'asoc/topic/arizona' into asoc-next 2012-12-10 00:22:00 +09:00
Mark Brown
be2f6f5a78 mfd: wm5102: Add readback of DSP status 3 register
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2012-12-09 15:40:52 +01:00
Jason Wang
986a4f4d45 virtio_net: multiqueue support
This patch adds the multiqueue (VIRTIO_NET_F_MQ) support to virtio_net
driver. VIRTIO_NET_F_MQ capable device could allow the driver to do packet
transmission and reception through multiple queue pairs and does the packet
steering to get better performance. By default, one one queue pair is used, user
could change the number of queue pairs by ethtool in the next patch.

When multiple queue pairs is used and the number of queue pairs is equal to the
number of vcpus. Driver does the following optimizations to implement per-cpu
virt queue pairs:

- select the txq based on the smp processor id.
- smp affinity hint to the cpu that owns the queue pairs.

This could be used with the flow steering support of the device to guarantee the
packets of a single flow is handled by the same cpu.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-09 00:30:55 -05:00
Joseph Gasparakis
6a674e9c75 net: Add support for hardware-offloaded encapsulation
This patch adds support in the kernel for offloading in the NIC Tx and Rx
checksumming for encapsulated packets (such as VXLAN and IP GRE).

For Tx encapsulation offload, the driver will need to set the right bits
in netdev->hw_enc_features. The protocol driver will have to set the
skb->encapsulation bit and populate the inner headers, so the NIC driver will
use those inner headers to calculate the csum in hardware.

For Rx encapsulation offload, the driver will need to set again the
skb->encapsulation flag and the skb->ip_csum to CHECKSUM_UNNECESSARY.
In that case the protocol driver should push the decapsulated packet up
to the stack, again with CHECKSUM_UNNECESSARY. In ether case, the protocol
driver should set the skb->encapsulation flag back to zero. Finally the
protocol driver should have NETIF_F_RXCSUM flag set in its features.

Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-09 00:20:28 -05:00
Ingo Molnar
cc1b39dbf9 Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core
Pull ftrace updates from Steve Rostedt.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-12-08 15:54:35 +01:00
Ingo Molnar
7e0dd574cd Merge branch 'uprobes/core' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/core
Pull uprobes fixes, cleanups and preparation for the ARM port from Oleg Nesterov.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-12-08 15:51:10 +01:00
Ingo Molnar
38130ec087 Some more cputime cleanups:
* Get rid of underscores polluting the vtime namespace
 
 * Consolidate context switch and tick handling
 
 * Improve debuggability by detecting irq unsafe callers
 
 Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQq52nAAoJEIUkVEdQjox3ZRoP/RuDC59hNGu3rR0ERMM3TqW3
 SIaMSHlHQh3h8P8OASpRqBb9s0BoWD0l3xZ68TEACnRdLS50Rre2P0SSxqpkdbnL
 cj0+I7gmbxKa9c9zpm+mn1TvL2bEhg6hkWCMK9jn2SSBl33cOqKUGUfy8Gx0nryc
 q+cOZrXSgMvYKCixGubCqsTl8MKs9CrpyrLSYtFUiHFVWREPfndS9M9BB5yfKHL1
 t9qmdb5WRq2NpU6apoZBMBdPQcmQr5WswLpbhoTocpvCiEmt5RZGkSDOawPa1DHP
 2SPM7fGZIDrXCMW/g9d2mt43j/HxS9LYu9lToZCbbMqehe2Bf5jYqO1Kwi7FhedR
 NSofoXbW/j589+7I+pN66lo0pctNWxd59YDvLw22SqUFcBEUSmypM6eUwbrbVUg7
 /H0a8T/5bPwx2ukNrCW0+Zsd9X3If4K290j4lNOMLki9ikYG6IXfGw1GMwsiyFSo
 LNSnDs0ekovvWOAg1iRq8DW8j/TWoZuZUSRME2LdCde9SbkMEGgWaNYCwNLMenie
 6jZHar7SfpdRDPP6NCY85jMy5MRbyN3mzSFhMfqMKQgmFNd7ay7oRKppIkwT+qkD
 VozCvdPmCxd+orNMbWINDAhNY5RUlcPj/Em8Mue1U152rpjfNt/WZOfmujmLwNW2
 /RPQtHo+F7w7KhbylFpx
 =cL3t
 -----END PGP SIGNATURE-----

Merge tag 'sched-cputime-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into sched/core

Pull more cputime cleanups from Frederic Weisbecker:

 * Get rid of underscores polluting the vtime namespace

 * Consolidate context switch and tick handling

 * Improve debuggability by detecting irq unsafe callers

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-12-08 15:44:43 +01:00
Ingo Molnar
e783377e93 Cputime cleanups on reader side:
* Improve naming and code location
 
 * Consolidate adjustment code
 
 * Comment the adjustement code
 
 Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQt5oaAAoJEIUkVEdQjox3hNwP/2QP7p9BHCPwGWenIi4aVUWH
 tlDLwWvQE919YPYL4AUgz4b9f4G7U7dbBozIJRxhB0rjqrbXU6PDvVCIwVyDH2xQ
 mTp5qdqyysgzqgZ7q0t27zLfHEANRcH8Tnrqj2XustqvdYcIzZKZeNkFsF3QRiDw
 utIEmE8A9mBnWDP7O4fDmo8onHNUmJc50Y0c/WJW7fbtq5aCh2vn87efV4GYGNjk
 e1qZuLRWdZYXkDnO6zqD5tUe/kB0ioPzXXyBkYAHXCMhCpkMDu7c18N+IrY80kBb
 vBQqeAGlpUuXnJ/MDFazqqbmezBYhnTIbnojyWO4ONzi2z6L3K9F1/zukM4WtvLv
 RNDF4MS7smFjyXXXfliIGOhvI5C5O9bosPOzBtvwHSYrnS5KGL8fv8N8tXixqytW
 nX5NEcjfCZXpNpm4TELcDyAvOrVMFe2CQwKgLBPSY1zRch34nJi9G55uKKSjg1xd
 Z1aDbVZFNt9R3ozV1rVaptNzagEa/023bvmnB8IiuA9oh6rNZOHhsc/lo1T2VaeO
 PhJqD50JPbJyycJ1m0pIW8iVSUxfIvJtICEHgVSCPH5A58PsKFr+8ELs+InTPTDt
 11V7dxHAmspar1CO1mqYMMIS4VKgPfwNI6zuaO+JlmU4nMB42y8WAZn/lzMyafQE
 Uswa6UTBBiU159HNzgDh
 =FRxY
 -----END PGP SIGNATURE-----

Merge tag 'cputime-adjustment-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into sched/core

Pull cputime cleanups from Frederic Weisbecker:

 * Improve naming and code location

 * Consolidate adjustment code

 * Comment the adjustement code

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-12-08 15:31:07 +01:00
Ingo Molnar
f0b9abfb04 Merge branch 'linus' into perf/core
Conflicts:
	tools/perf/Makefile
	tools/perf/builtin-test.c
	tools/perf/perf.h
	tools/perf/tests/parse-events.c
	tools/perf/util/evsel.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-12-08 15:25:06 +01:00
Alex Deucher
2e1a7674f6 drm/radeon: add new INFO ioctl requests
Add requests to get the number of shader engines (SE) and
the number of SH per SE.  These are needed for geometry
and tesselation shaders in the 3D driver as well as setting
up PA_SC_RASTER_CONFIG on SI asics.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-07 19:48:22 -05:00
Marek Olšák
57f5708383 drm/radeon: add a CS flag END_OF_FRAME
No version bump is required because setting the flag on older DRM has
no effect.

This only reserves the bit and doesn't use it. I assume we will use it
for buffer eviction heuristics.

Signed-off-by: Marek Olšák <maraeo@gmail.com>
2012-12-07 19:48:20 -05:00
Rafael J. Wysocki
8ab788f002 Merge branch 'acpi-dev-pm'
* acpi-dev-pm:
  ACPI / PM: Fix header of acpi_dev_pm_detach() in acpi.h
2012-12-07 23:14:11 +01:00
Rafael J. Wysocki
bf58cdffac Merge branch 'pm-devfreq'
* pm-devfreq: (23 commits)
  PM / devfreq: remove compiler error with module governors (2)
  PM / devfreq: Fix return value in devfreq_remove_governor()
  PM / devfreq: Fix incorrect argument in error message
  PM / devfreq: missing rcu_read_lock() added for find_device_opp()
  PM / devfreq: remove compiler error when a governor is module
  PM / devfreq: exynos4_bus.c: Fixed an alignment of the func call args.
  PM / devfreq: Add sysfs node to expose available governors
  PM / devfreq: allow sysfs governor node to switch governor
  PM / devfreq: governors: add GPL module license and allow module build
  PM / devfreq: map devfreq drivers to governor using name
  PM / devfreq: register governors with devfreq framework
  PM / devfreq: provide hooks for governors to be registered
  PM / devfreq: export update_devfreq
  PM / devfreq: Add sysfs node for representing frequency transition information.
  PM / devfreq: Add sysfs node to expose available frequencies
  PM / devfreq: documentation cleanups for devfreq header
  PM / devfreq: Use devm_* functions in exynos4_bus.c
  PM / devfreq: make devfreq_class static
  PM / devfreq: fix sscanf handling for writable sysfs entries
  PM / devfreq: kernel-doc typo corrections
  ...
2012-12-07 23:13:36 +01:00
Eric Dumazet
c3c7c254b2 net: gro: fix possible panic in skb_gro_receive()
commit 2e71a6f808 (net: gro: selective flush of packets) added
a bug for skbs using frag_list. This part of the GRO stack is rarely
used, as it needs skb not using a page fragment for their skb->head.

Most drivers do use a page fragment, but some of them use GFP_KERNEL
allocations for the initial fill of their RX ring buffer.

napi_gro_flush() overwrite skb->prev that was used for these skb to
point to the last skb in frag_list.

Fix this using a separate field in struct napi_gro_cb to point to the
last fragment.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-07 14:39:29 -05:00
Yuchung Cheng
93b174ad71 tcp: bug fix Fast Open client retransmission
If SYN-ACK partially acks SYN-data, the client retransmits the
remaining data by tcp_retransmit_skb(). This increments lost recovery
state variables like tp->retrans_out in Open state. If loss recovery
happens before the retransmission is acked, it triggers the WARN_ON
check in tcp_fastretrans_alert(). For example: the client sends
SYN-data, gets SYN-ACK acking only ISN, retransmits data, sends
another 4 data packets and get 3 dupacks.

Since the retransmission is not caused by network drop it should not
update the recovery state variables. Further the server may return a
smaller MSS than the cached MSS used for SYN-data, so the retranmission
needs a loop. Otherwise some data will not be retransmitted until timeout
or other loss recovery events.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-07 14:39:28 -05:00
Cong Wang
ee07c6e7a6 bridge: export multicast database via netlink
V5: fix two bugs pointed out by Thomas
    remove seq check for now, mark it as TODO

V4: remove some useless #include
    some coding style fix

V3: drop debugging printk's
    update selinux perm table as well

V2: drop patch 1/2, export ifindex directly
    Redesign netlink attributes
    Improve netlink seq check
    Handle IPv6 addr as well

This patch exports bridge multicast database via netlink
message type RTM_GETMDB. Similar to fdb, but currently bridge-specific.
We may need to support modify multicast database too (RTM_{ADD,DEL}MDB).

(Thanks to Thomas for patient reviews)

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-07 14:32:52 -05:00
Thomas Graf
45122ca26c sctp: Add RCU protection to assoc->transport_addr_list
peer.transport_addr_list is currently only protected by sk_sock
which is inpractical to acquire for procfs dumping purposes.

This patch adds RCU protection allowing for the procfs readers to
enter RCU read-side critical sections.

Modification of the list continues to be serialized via sk_lock.

V2: Use list_del_rcu() in sctp_association_free() to be safe
    Skip transports marked dead when dumping for procfs

Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-07 14:15:04 -05:00
Bjorn Helgaas
27e1c8ee01 Merge branch 'pci/bjorn-pcie-cap' into next
* pci/bjorn-pcie-cap:
  ath9k: Use standard #defines for PCIe Capability ASPM fields
  iwlwifi: Use standard #defines for PCIe Capability ASPM fields
  iwlwifi: collapse wrapper for pcie_capability_read_word()
  iwlegacy: Use standard #defines for PCIe Capability ASPM fields
  iwlegacy: collapse wrapper for pcie_capability_read_word()
  cxgb3: Use standard #defines for PCIe Capability ASPM fields
  PCI: Add standard PCIe Capability Link ASPM field names
  PCI/portdrv: Use PCI Express Capability accessors
  PCI: Use standard PCIe Capability Link register field names
  PCI: Add and use standard PCI-X Capability register names
2012-12-07 12:11:52 -07:00
Guennadi Liakhovetski
9f1fb60a23 mmc: add a card-event host operation
Some hosts need to perform additional actions upon card insertion or
ejection. Add a host operation to be called from card detection handlers.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Reviewed-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-12-07 13:55:31 -05:00
Bjorn Helgaas
7508320678 PCI: Add standard PCIe Capability Link ASPM field names
Add standard #defines for ASPM fields in PCI Express Link Capability and
Link Control registers.

Previously we used PCIE_LINK_STATE_L0S and PCIE_LINK_STATE_L1 directly, but
these are defined for the Linux ASPM interfaces, e.g.,
pci_disable_link_state(), and only coincidentally match the actual register
bits.  PCIE_LINK_STATE_CLKPM, also part of that interface, does not match
the register bit.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
2012-12-07 11:18:31 -07:00
John W. Linville
8024dc1910 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2012-12-07 13:03:50 -05:00
trem
6ded7cd605 net: phy: smsc: force all capable mode if the phy is started in powerdown mode
A SMSC PHY in power down mode can't be used.
If a SMSC PHY is in this mode in the config_init
stage, the mode "all capable" is set. So the PHY
could then be used.

Signed-off-by: Philippe Reynes <tremyfr@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-07 12:48:00 -05:00
Grant Likely
7730cba2a5 Linux 3.7-rc8
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQEcBAABAgAGBQJQvPxHAAoJEHm+PkMAQRiGkFUIAJz761Kp4J4Nj/wrv5ZHGQso
 MHRbzMkSfRNz6lGCkgxS61ydYKtrV2vuE6VH8HriGlLkI8Lj7MaQTXvYSdj/O0zy
 yV/2H5R3s7n5JZTw3g3eOf3K33tL6xhwd4tYHI7QHjdzSzQyaNhuUuNhxrlT95iv
 twNetm0tyhpf76TurRzF14hLUaShVRXT/FrqWK9wgmGjg7Ij0xp+UFNkeUGUwbeF
 3HMJ98fdd0VD/W8qF5GZr3USks4C+NKtXEya8zQKc59XumKCiRJZmbE6JsJlp+OP
 CsHs7ZaNlInvPcKTFzkNs8ThYWC/NHBqLO5tX5UphW4qFSS39EmHd8igrwXLPaI=
 =RS1F
 -----END PGP SIGNATURE-----

Merge tag 'v3.7-rc8' into spi/next

Linux 3.7-rc8
2012-12-07 17:02:47 +00:00
Tomi Valkeinen
348be69d30 OMAPDSS: export dispc functions
Export DISPC functions.

Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2012-12-07 17:06:00 +02:00
Tomi Valkeinen
eda3427363 OMAPDSS: export dss_feat functions
Export dss_features related functions.

Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2012-12-07 17:05:59 +02:00
Tomi Valkeinen
a97a963475 OMAPDSS: export dss_mgr_ops functions
Export dss_mgr_ops related functions.

Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2012-12-07 17:05:59 +02:00
Tomi Valkeinen
549acbe7a3 OMAPDSS: move omap_dispc_wait_for_irq_interruptible_timeout to dispc-compat.c
We have two functions to wait for a dispc interrupt:

int omap_dispc_wait_for_irq_timeout(u32 irqmask, unsigned long timeout);
int omap_dispc_wait_for_irq_interruptible_timeout(u32 irqmask,

Of these, the former is not used at all, and can be removed. The latter
is only used by the compat layer, and can be moved to the compat layer
code.

Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2012-12-07 17:05:56 +02:00
Tomi Valkeinen
8dd2491a42 OMAPDSS: add omapdss_compat_init()
Add two new exported functions, omapdss_compat_init and
omapdss_compat_uninit, which are to be used by omapfb, omap_vout to
enable compatibility mode for omapdss. The functions are called by
omapdss internally for now, and moved to other drivers later.

The compatibility mode is implemented fully in the following patches.
For now, enabling compat mode only sets up the private data in apply.c.

Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2012-12-07 17:05:53 +02:00
Ingo Molnar
222e82bef4 Merge branch 'linus' into sched/core
Pick up the autogroups fix and other fixes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-12-07 12:15:33 +01:00
Lamarque V. Souza
4529eefad0 HID: hidp: fallback to input session properly if hid is blacklisted
This patch against kernel 3.7.0-rc8 fixes a kernel oops when turning on the
bluetooth mouse with id 0458:0058 [1].

The mouse in question supports both input and hid sessions, however it is
blacklisted in drivers/hid/hid-core.c so the input session is one that should
be used. Long ago (around kernel 3.0.0) some changes in the bluetooth
subsystem made the kernel do not fallback to input session when hid session is
not supported or blacklisted. This patch restore that behaviour by making the
kernel try the input session if hid_add_device returns ENODEV.

The patch exports hid_ignore() from hid-core.c so that it can be used in the
bluetooth subsystem.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=39882

Signed-off-by: Lamarque V. Souza <lamarque@gmail.com>
Acked-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-12-07 11:12:27 +01:00
Kuninori Morimoto
805f864ebe gpio: pcf857x: use client->irq for gpio_to_irq()
6e20a0a429
(gpio: pcf857x: enable gpio_to_irq() support)
added gpio_to_irq() support on pcf857x driver,
but it used pdata->irq.
This patch modifies driver to use client->irq instead of it.
It modifies kzm9g board platform settings,
and device probe information too.
This patch is tested on kzm9g board

Reported-by: Christian Engelmayer <christian.engelmayer@frequentis.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2012-12-07 09:16:12 +01:00
Daniel Mack
d0c6c482f6 ASoC: McASP: remove unused variables
codec_fmt and sample_rate variables are unused in both snd_platform_data
and davinci_audio_dev, so drop them.

Signed-off-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-12-07 14:46:56 +09:00
Peter Ujfalusi
eb044c4844 ASoC: omap-twl4030: Update the header file to support more boards
The common machine driver will be able to support new boards where the voice
port is also in use.
At the same time add possibility to fine tune the connections from twl4030
on the board: jack detection GPIO, input/output selection

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-12-07 12:54:31 +09:00
Bjorn Helgaas
72e1e868ca Merge branch 'pci/mjg-pci-roms-from-efi' into next
* pci/mjg-pci-roms-from-efi:
  x86: Use PCI setup data
  PCI: Add support for non-BAR ROMs
  PCI: Add pcibios_add_device
  EFI: Stash ROMs if they're not in the PCI BAR
2012-12-06 14:37:32 -07:00
Hauke Mehrtens
bde327eff8 ssb: register watchdog driver
Register the watchdog driver to the system if it is a SoC. Using the
watchdog on a non SoC device, like a PCI card, will make the PCI
card die when the timeout expired, but starting it again is not
supported by ssb.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-12-06 14:58:58 -05:00
Hauke Mehrtens
9f640a6376 ssb: extif: add methods for watchdog driver
The watchdog driver wants to set the watchdog timeout in ms and not in
ticks, add a method converting ms to ticks before setting the watchdog
register. Return the ticks or millisecond the timer was set to in case
the provided value was bigger than the max allowed value.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-12-06 14:58:58 -05:00
Hauke Mehrtens
7280b51a29 ssb: extif: add check for max value before setting watchdog register
Prevent the watchdog register on the extif core to be set to a too
high value.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-12-06 14:58:57 -05:00
Hauke Mehrtens
7ffbffe37d ssb: add methods for watchdog driver
The watchdog driver wants to set the watchdog timeout in ms and not in
ticks, which is depending on the SoC type and the clock.
Calculate the number of ticks per millisecond and provide two functions
for the watchdog driver. Also return the ticks or millisecond the timer
was set to in case the provided value was bigger than the max allowed
value.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-12-06 14:58:57 -05:00
Hauke Mehrtens
a4855f39d4 bcma: register watchdog driver
Register the watchdog driver to the system if this is a SoC. Using the
watchdog on a non SoC device, like a PCIe card, will make the PCIe
card die when the timeout expired, but starting it again is not
supported by bcma.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-12-06 14:58:57 -05:00
Hauke Mehrtens
a22a3114a8 bcma: add methods for watchdog driver
The watchdog driver wants to set the watchdog timeout in ms and not in
ticks, which is depending on the SoC type and the clock.
Calculate the number of ticks per millisecond and provide two functions
for the watchdog driver. Also return the ticks or millisecond the timer
was set to in case the provided value was bigger than the max allowed
value.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-12-06 14:58:56 -05:00
Hauke Mehrtens
bc245cc36c ssb/bcma: add common header for watchdog
This adds a common header for watchdog functions, so a watchdog driver
just needs to use this and could provide watchdog functionality for ssb
and bcma based SoCs. Patches for a watchdog driver using this interface
will be send later.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-12-06 14:58:56 -05:00
John W. Linville
403e16731f Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Conflicts:
	drivers/net/wireless/mwifiex/sta_ioctl.c
	net/mac80211/scan.c
2012-12-06 14:58:41 -05:00
Mel Gorman
18a2f371f5 tmpfs: fix shared mempolicy leak
This fixes a regression in 3.7-rc, which has since gone into stable.

Commit 00442ad04a ("mempolicy: fix a memory corruption by refcount
imbalance in alloc_pages_vma()") changed get_vma_policy() to raise the
refcount on a shmem shared mempolicy; whereas shmem_alloc_page() went
on expecting alloc_page_vma() to drop the refcount it had acquired.
This deserves a rework: but for now fix the leak in shmem_alloc_page().

Hugh: shmem_swapin() did not need a fix, but surely it's clearer to use
the same refcounting there as in shmem_alloc_page(), delete its onstack
mempolicy, and the strange mpol_cond_copy() and __mpol_cond_copy() -
those were invented to let swapin_readahead() make an unknown number of
calls to alloc_pages_vma() with one mempolicy; but since 00442ad04a,
alloc_pages_vma() has kept refcount in balance, so now no problem.

Reported-and-tested-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-06 11:56:43 -08:00
Daniel Drake
6a66180a25 mmc: sdhci: add quirk for lack of 1.8v support
The OLPC XO-1.75 laptop includes a SDHCI controller which is 1.8v
capable, and it truthfully reports so in its capabilities. This
alternate voltage is used for driving new "UHS-I" SD cards at their
full speed.

However, what the controller doesn't know is that the motherboard
physically doesn't have a 1.8v supply available.

Add a quirk so that systems such as this one can override disable
1.8v support, adding support for UHS-I cards (by running them at
3.3v).

This avoids a problem where the system would first try to run the
card at 1.8v, fail, and then not be able to fully reset the card
to retry at the normal 3.3v voltage.

This is more appropriate than using the MISSING_CAPS quirk, which
is intended for cases where the SDHCI controller is actually lying
about its capabilities, and would force us to somehow override both
caps words from another source.

Signed-off-by: Daniel Drake <dsd@laptop.org>
Reviewed-by: Philip Rakity <prakity@nvidia.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-12-06 13:55:04 -05:00
Abhilash Kesavan
ab269128a2 mmc: dw_mmc: Add sdio power bindings
Add dt-based retrieval of host sdio pm capabilities. Based on
the dt based discovery do a bus init in the resume function.

Signed-off-by: Olof Johansson <olofj@chromium.org>
Signed-off-by: Abhilash Kesavan <a.kesavan@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-12-06 13:54:54 -05:00
Kevin Liu
7c52d7bb87 mmc: sdhci-pxav3: add quirks2
Acked-by: Zhangfei Gao <zhangfei.gao@marvell.com>
Signed-off-by: Kevin Liu <kliu5@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-12-06 13:54:51 -05:00
Loic Pallardy
67c79db8d9 mmc: core: Add mmc_set_blockcount feature
Provide support for automatically sending Set Block Count
(CMD23) messages. Used at least for RPMB support.

Signed-off-by: Alex Macro <alex.macro@stericsson.com>
Signed-off-by: Loic Pallardy <loic.pallardy@stericsson.com>
Reviewed-by: Namjae Jeon <linkinjeon@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Johan Rudholm <johan.rudholm@stericsson.com>
Acked-by: Krishna Konda <kkonda@codeaurora.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-12-06 13:54:48 -05:00
Loic Pallardy
090d25fe22 mmc: core: Expose access to RPMB partition
Following JEDEC standard, if the mmc supports RPMB partition,
a new interface is created and exposed via /dev/block.
Users will be able to access RPMB partition using standard
mmc IOCTL commands.

Signed-off-by: Alex Macro <alex.macro@stericsson.com>
Signed-off-by: Loic Pallardy <loic.pallardy@stericsson.com>
Reviewed-by: Namjae Jeon <linkinjeon@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Johan Rudholm <johan.rudholm@stericsson.com>
Acked-by: Krishna Konda <kkonda@codeaurora.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-12-06 13:54:46 -05:00
Kevin Liu
ed9dbb6eff mmc: host: Make UHS timing values fully unique
Both of MMC_TIMING_LEGACY and MMC_TIMING_UHS_SDR12 are defined
to 0. And ios->timing is set to MMC_TIMING_LEGACY during power up.
But set_ios can't distinguish these two timing if host support
spec 3.0. Just adjust timing values to be different can resolve
this issue without any other impact.

Reviewed-by: Girish K S <girish.shivananjappa@linaro.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Kevin Liu <kliu5@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-12-06 13:54:46 -05:00
Lee Jones
5f1a4dd037 mmc: Standardise capability type
There are discrepancies with regards to how MMC capabilities
are carried throughout the subsystem. Let's standardise them
to eliminate any confusion.

Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-12-06 13:54:44 -05:00
Fabio Estevam
d6ed91aff6 mmc: mxs-mmc: Remove platform data
All MXS users have been converted to device tree and the board files have
been removed.

No need to keep platform data in the driver.

Also move bus_width declaration in the beggining of mxs_mmc_probe() to
avoid: 'warning: ISO C90 forbids mixed declarations and code'.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Marek Vasut <marex@denx.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-12-06 13:54:44 -05:00
Grant Likely
a34fc82e23 Merge branch 'spi-next' from git://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc.git
Pull in the changes Mark has queued up for SPI
2012-12-06 13:58:31 +00:00
Bart Van Assche
80729beb33 bsg: Remove unused function bsg_goose_queue()
The function bsg_goose_queue() does not have any in-tree callers,
so let's remove it.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-12-06 14:33:02 +01:00
Bart Van Assche
24faf6f604 block: Make blk_cleanup_queue() wait until request_fn finished
Some request_fn implementations, e.g. scsi_request_fn(), unlock
the queue lock internally. This may result in multiple threads
executing request_fn for the same queue simultaneously. Keep
track of the number of active request_fn calls and make sure that
blk_cleanup_queue() waits until all active request_fn invocations
have finished. A block driver may start cleaning up resources
needed by its request_fn as soon as blk_cleanup_queue() finished,
so blk_cleanup_queue() must wait for all outstanding request_fn
invocations to finish.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reported-by: Chanho Min <chanho.min@lge.com>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-12-06 14:33:00 +01:00
Bart Van Assche
c246e80d86 block: Avoid that request_fn is invoked on a dead queue
A block driver may start cleaning up resources needed by its
request_fn as soon as blk_cleanup_queue() finished, so request_fn
must not be invoked after draining finished. This is important
when blk_run_queue() is invoked without any requests in progress.
As an example, if blk_drain_queue() and scsi_run_queue() run in
parallel, blk_drain_queue() may have finished all requests after
scsi_run_queue() has taken a SCSI device off the starved list but
before that last function has had a chance to run the queue.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Chanho Min <chanho.min@lge.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-12-06 14:32:01 +01:00
Bart Van Assche
3f3299d5c0 block: Rename queue dead flag
QUEUE_FLAG_DEAD is used to indicate that queuing new requests must
stop. After this flag has been set queue draining starts. However,
during the queue draining phase it is still safe to invoke the
queue's request_fn, so QUEUE_FLAG_DYING is a better name for this
flag.

This patch has been generated by running the following command
over the kernel source tree:

git grep -lEw 'blk_queue_dead|QUEUE_FLAG_DEAD' |
    xargs sed -i.tmp -e 's/blk_queue_dead/blk_queue_dying/g'      \
        -e 's/QUEUE_FLAG_DEAD/QUEUE_FLAG_DYING/g';                \
sed -i.tmp -e "s/QUEUE_FLAG_DYING$(printf \\t)*5/QUEUE_FLAG_DYING$(printf \\t)5/g" \
    include/linux/blkdev.h;                                       \
sed -i.tmp -e 's/ DEAD/ DYING/g' -e 's/dead queue/a dying queue/' \
    -e 's/Dead queue/A dying queue/' block/blk-core.c

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Chanho Min <chanho.min@lge.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-12-06 14:30:58 +01:00
Johannes Berg
01331040e6 wireless: fix VHT max AMPDU exponent definition
This is really a 3-bit field, not a single bit,
so declare a mask and shift. Also fix hwsim, it
advertises the maximum possible.

While at it reindent all the defines using tabs
instead of spaces.

Change-Id: I7cd81c0d72f76deb5010aba5bfa3dd312006e898
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-12-06 14:02:51 +01:00
Nadia Yvette Chambers
6d49e352ae propagate name change to comments in kernel source
I've legally changed my name with New York State, the US Social Security
Administration, et al. This patch propagates the name change and change
in initials and login to comments in the kernel source as well.

Signed-off-by: Nadia Yvette Chambers <nyc@holomorphy.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-12-06 10:39:54 +01:00
Jussi Kivilinna
044ab52578 crypto: cast5/cast6 - move lookup tables to shared module
CAST5 and CAST6 both use same lookup tables, which can be moved shared module
'cast_common'.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-12-06 17:16:26 +08:00
Marek Szyprowski
d1e7de3007 regulators: add regulator_can_change_voltage() function
Introduce a regulator_can_change_voltage() function for the subsytems or
drivers which might check if applying voltage change is possible and use
special workaround code when the driver is used with fixed regulators or
regulators with disabled ability to change the voltage.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2012-12-06 15:13:34 +09:00
Dave Airlie
8de9e41775 Merge branch 'connector-to-object-prop' of git://github.com/robclark/kernel-omap4 into drm-next
* 'connector-to-object-prop' of git://github.com/robclark/kernel-omap4:
  drm: remove legacy drm_connector_property fxns
  drm/nouveau: drm_connector_property -> drm_object_property
  drm/i915: One more drm_connector_property -> drm_object_property
  drm/i2c: drm_connector_property -> drm_object_property
  drm/vmwgfx: drm_connector_property -> drm_object_property
  drm/udl: drm_connector_property -> drm_object_property
  drm/shmob: drm_connector_property -> drm_object_property
  drm/radeon: drm_connector_property -> drm_object_property
  drm/gma500: drm_connector_property -> drm_object_property
2012-12-06 14:08:09 +10:00
David Woodhouse
cf66bb93e0 byteorder: allow arch to opt to use GCC intrinsics for byteswapping
Since GCC 4.4, there have been __builtin_bswap32() and __builtin_bswap16()
intrinsics. A __builtin_bswap16() came a little later (4.6 for PowerPC,
48 for other platforms).

By using these instead of the inline assembler that most architectures
have in their __arch_swabXX() macros, we let the compiler see what's
actually happening. The resulting code should be at least as good, and
much *better* in the cases where it can be combined with a nearby load
or store, using a load-and-byteswap or store-and-byteswap instruction
(e.g. lwbrx/stwbrx on PowerPC, movbe on Atom).

When GCC is sufficiently recent *and* the architecture opts in to using
the intrinsics by setting CONFIG_ARCH_USE_BUILTIN_BSWAP, they will be
used in preference to the __arch_swabXX() macros. An architecture which
does not set ARCH_USE_BUILTIN_BSWAP will continue to use its own
hand-crafted macros.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
2012-12-06 01:22:31 +00:00
Paul Mackerras
a2932923cc KVM: PPC: Book3S HV: Provide a method for userspace to read and write the HPT
A new ioctl, KVM_PPC_GET_HTAB_FD, returns a file descriptor.  Reads on
this fd return the contents of the HPT (hashed page table), writes
create and/or remove entries in the HPT.  There is a new capability,
KVM_CAP_PPC_HTAB_FD, to indicate the presence of the ioctl.  The ioctl
takes an argument structure with the index of the first HPT entry to
read out and a set of flags.  The flags indicate whether the user is
intending to read or write the HPT, and whether to return all entries
or only the "bolted" entries (those with the bolted bit, 0x10, set in
the first doubleword).

This is intended for use in implementing qemu's savevm/loadvm and for
live migration.  Therefore, on reads, the first pass returns information
about all HPTEs (or all bolted HPTEs).  When the first pass reaches the
end of the HPT, it returns from the read.  Subsequent reads only return
information about HPTEs that have changed since they were last read.
A read that finds no changed HPTEs in the HPT following where the last
read finished will return 0 bytes.

The format of the data provides a simple run-length compression of the
invalid entries.  Each block of data starts with a header that indicates
the index (position in the HPT, which is just an array), the number of
valid entries starting at that index (may be zero), and the number of
invalid entries following those valid entries.  The valid entries, 16
bytes each, follow the header.  The invalid entries are not explicitly
represented.

Signed-off-by: Paul Mackerras <paulus@samba.org>
[agraf: fix documentation]
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-12-06 01:33:57 +01:00
Alexander Graf
914daba865 KVM: Distangle eventfd code from irqchip
The current eventfd code assumes that when we have eventfd, we also have
irqfd for in-kernel interrupt delivery. This is not necessarily true. On
PPC we don't have an in-kernel irqchip yet, but we can still support easily
support eventfd.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-12-06 01:33:49 +01:00
Trond Myklebust
c05eecf636 SUNRPC: Don't allow low priority tasks to pre-empt higher priority ones
Currently, the priority queues attempt to be 'fair' to lower priority
tasks by scheduling them after a certain number of higher priority tasks
have run. The problem is that both the transport send queue and
the NFSv4.1 session slot queue have strong ordering requirements.

This patch therefore removes the fairness code in favour of strong
ordering of task priorities.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:53 +01:00
Trond Myklebust
62ae082d88 NFSv4: Reorder the XDR structures to put sequence at the top, not bottom
Pre-condition for optimising the slot allocation and reintroducing FIFO
behaviour.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:52 +01:00
Trond Myklebust
8fe72bac8d NFSv4: Clean up handling of privileged operations
Privileged rpc calls are those that are run by the state recovery thread,
in cases where we're trying to recover the system after a server reboot
or a network partition. In those cases, we want to fence off all other
rpc calls (see nfs4_begin_drain_session()) so that they don't end up
using stateids or clientids that are in the process of being recovered.

Prior to this patch, we had to set up special callback functions in
order to declare an rpc call as being privileged.
By adding a new field to the sequence arguments, this patch simplifies
things considerably, and allows us to declare the rpc call as privileged
before it is run.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:50 +01:00
Trond Myklebust
76e697ba7e NFSv4.1: Move slot table and session struct definitions to nfs4session.h
Clean up. Gather NFSv4.1 slot definitions in fs/nfs/nfs4session.h.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:46 +01:00
Trond Myklebust
c34309a45e NFS: Remove unused function slot_idx
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:46 +01:00
Trond Myklebust
87dda67e73 NFSv4.1: Allow SEQUENCE to resize the slot table on the fly
Instead of an array of slots, use a singly linked list of slots that
can be dynamically appended to or shrunk.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:42 +01:00
Trond Myklebust
97e548a93d NFSv4.1: Support dynamic resizing of the session slot table
Allow the server to control the size of the session slot table
by adjusting the value of sr_target_max_slots in the reply to the
SEQUENCE operation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:42 +01:00
Trond Myklebust
da0507b7c9 NFSv4.1: Reset the sequence number for slots that have been deallocated
When the server tells us that it is dynamically resizing the session
replay cache, we should reset the sequence number for those slots
that have been deallocated.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:17 +01:00
Trond Myklebust
464ee9f966 NFSv4.1: Ensure that the client tracks the server target_highest_slotid
Dynamic slot allocation in NFSv4.1 depends on the client being able to
track the server's target value for the highest slotid in the
slot table.  See the reference in Section 2.10.6.1 of RFC5661.

To avoid ordering problems in the case where 2 SEQUENCE replies contain
conflicting updates to this target value, we also introduce a generation
counter, to track whether or not an RPC containing a SEQUENCE operation
was launched before or after the last update.

Also rename the nfs4_slot_table target_max_slots field to
'target_highest_slotid' to avoid confusion with a slot
table size or number of slots.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:29:47 +01:00
Alexander Shiyan
161b96c383 spi/clps711x: New SPI master driver
This patch add new driver for CLPS711X SPI master controller.
Due to platform limitations driver supports only 8 bit transfer mode.
Chip select control is handled via GPIO.

Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2012-12-05 23:14:38 +00:00
Matthew Garrett
84c1b80e32 PCI: Add support for non-BAR ROMs
Platforms may provide their own mechanisms for obtaining ROMs. Add support
for using data provided by the platform in that case.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Seth Forshee <seth.forshee@canonical.com>
2012-12-05 14:38:26 -07:00
Matthew Garrett
eca0d4676d PCI: Add pcibios_add_device
Platforms may want to provide architecture-specific functionality during
PCI enumeration. Add a pcibios_add_device() call that architectures can
override to do so.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Seth Forshee <seth.forshee@canonical.com>
2012-12-05 14:38:26 -07:00
Matthew Garrett
dd5fc854de EFI: Stash ROMs if they're not in the PCI BAR
EFI provides support for providing PCI ROMs via means other than the ROM
BAR. This support vanishes after we've exited boot services, so add support
for stashing copies of the ROMs in setup_data if they're not otherwise
available.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Seth Forshee <seth.forshee@canonical.com>
2012-12-05 14:33:26 -07:00
David S. Miller
c2d3babfaf bridge: implement multicast fast leave
V3: make it a flag
V2: make the toggle per-port

Fast leave allows bridge to immediately stops the multicast
traffic on the port receives IGMP Leave when IGMP snooping is enabled,
no timeouts are observed.

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-12-05 16:24:45 -05:00
Bjorn Helgaas
7793eeabc8 PCI: Add and use standard PCI-X Capability register names
Add and use #defines for PCI-X Capability registers and fields.
Note that the PCI-X Capability has a different layout for
type 0 (endpoint) and type 1 (bridge) devices.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-12-05 13:51:17 -07:00
Jeff Moyer
8fa72d234d bdi: add a user-tunable cpu_list for the bdi flusher threads
In realtime environments, it may be desirable to keep the per-bdi
flusher threads from running on certain cpus.  This patch adds a
cpu_list file to /sys/class/bdi/* to enable this.  The default is to tie
the flusher threads to the same numa node as the backing device (though
I could be convinced to make it a mask of all cpus to avoid a change in
behaviour).

Thanks to Jeremy Eder for the original idea.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-12-05 20:17:21 +01:00
Vivien Didelot
46d7846292 hwmon: (ads7828) driver cleanup
As there is no reliable way to identify the chip, it is preferable to
remove the detect callback, to avoid misdetection.

Module parameters are not worth it here, so let's get rid of them and
add an ads7828_platform_data structure instead.

Clean the code by removing unused macros, fixing coding style issues,
avoiding function prototypes and using convenient macros such as
module_i2c_driver().

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2012-12-05 10:55:54 -08:00
Michael S. Tsirkin
01f2188037 kvm: add kvm_set_irq_inatomic
Add an API to inject IRQ from atomic context.
Return EWOULDBLOCK if impossible (e.g. for multicast).
Only MSI is supported ATM.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2012-12-05 15:10:45 +02:00
Rafael J. Wysocki
d71f2f8882 ACPI / PM: Fix header of acpi_dev_pm_detach() in acpi.h
The header of acpi_dev_pm_detach() in include/linux/acpi.h has an
incorrect return type, which should be void.  Fix that.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-12-05 11:59:22 +01:00
Stanislaw Gruszka
5b632fe85e mac80211: introduce IEEE80211_HW_TEARDOWN_AGGR_ON_BAR_FAIL
Commit f0425beda4 "mac80211: retry sending
failed BAR frames later instead of tearing down aggr" caused regression
on rt2x00 hardware (connection hangs). This regression was fixed by
commit be03d4a45c "rt2x00: Don't let
mac80211 send a BAR when an AMPDU subframe fails". But the latter
commit caused yet another problem reported in
https://bugzilla.kernel.org/show_bug.cgi?id=42828#c22

After long discussion in this thread:
http://mid.gmane.org/20121018075615.GA18212@redhat.com
and testing various alternative solutions, which failed on one or other
setup, we have no other good fix for the issues like just revert both
mentioned earlier commits.

To do not affect other hardware which benefit from commit
f0425beda4, instead of reverting it,
introduce flag that when used will restore mac80211 behaviour before
the commit.

Cc: stable@vger.kernel.org
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
[replaced link with mid.gmane.org that has message-id]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-12-05 09:53:31 +01:00
Nicholas Bellinger
0ff8754981 target: Add link_magic for fabric allow_link destination target_items
This patch adds [dev,lun]_link_magic value assignment + checks within generic
target_fabric_port_link() and target_fabric_mappedlun_link() code to ensure
destination config_item *target_item sent from configfs_symlink() ->
config_item_operations->allow_link() is the underlying se_device->dev_group
and se_lun->lun_group that we expect to symlink.

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2012-12-05 00:11:36 -08:00
Nicolas Dichtel
9a68ac72a4 ipmr/ip6mr: report origin of mfc entry into rtnl msg
A mfc entry can be static or not (added via the mroute_sk socket). The patch
reports MFC_STATIC flag into rtm_protocol by setting rtm_protocol to
RTPROT_STATIC or RTPROT_MROUTED.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-04 13:08:11 -05:00
Nicolas Dichtel
adfa85e45d ipmr/ip6mr: advertise mfc stats via rtnetlink
These statistics can be checked only via /proc/net/ip_mr_cache or
SIOCGETSGCNT[_IN6] and thus only for the table RT_TABLE_DEFAULT.
Advertising them via rtnetlink allows to get statistics for all cache entries,
whatever the table is.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-04 13:08:10 -05:00
Nicolas Dichtel
d67b8c616b netconf: advertise mc_forwarding status
This patch advertise the MC_FORWARDING status for IPv4 and IPv6.
This field is readonly, only multicast engine in the kernel updates it.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-04 13:08:10 -05:00
David S. Miller
e8ad1a8fab Merge branch 'master' of git://1984.lsi.us.es/nf-next
Pablo Neira Ayuso says:

====================
* Remove limitation in the maximum number of supported sets in ipset.
  Now ipset automagically increments the number of slots in the array
  of sets by 64 new spare slots, from Jozsef Kadlecsik.

* Partially remove the generic queue infrastructure now that ip_queue
  is gone. Its only client is nfnetlink_queue now, from Florian
  Westphal.

* Add missing attribute policy checkings in ctnetlink, from Florian
  Westphal.

* Automagically kill conntrack entries that use the wrong output
  interface for the masquerading case in case of routing changes,
  from Jozsef Kadlecsik.

* Two patches two improve ct object traceability. Now ct objects are
  always placed in any of the existing lists. This allows us to dump
  the content of unconfirmed and dying conntracks via ctnetlink as
  a way to provide more instrumentation in case you suspect leaks,
  from myself.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-04 13:01:19 -05:00
Linus Torvalds
df2fc246c8 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module fixes from Rusty Russell:
 "Module signing build fixes for blackfin and metag"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  modsign: add symbol prefix to certificate list
  linux/kernel.h: define SYMBOL_PREFIX
2012-12-04 09:32:12 -08:00
J. Bruce Fields
8af345f58a svcrpc: track rpc data length separately from sk_tcplen
Keep a separate field, sk_datalen, that tracks only the data contained
in a fragment, not including the fragment header.

For now, this is always just max(0, sk_tcplen - 4), but after we allow
multiple fragments sk_datalen will accumulate the total rpc data size
while sk_tcplen only tracks progress receiving the current fragment.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-04 07:49:14 -05:00
J. Bruce Fields
cc248d4b1d svcrpc: don't byte-swap sk_reclen in place
Byte-swapping in place is always a little dubious.

Let's instead define this field to always be big-endian, and do the
swapping on demand where we need it.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-04 07:47:23 -05:00
Inki Dae
2a3098ff6c drm/exynos: add userptr feature for g2d module
This patch adds userptr feautre for G2D module.

The userptr means user space address allocated by malloc().
And the purpose of this feature is to make G2D's dma able
to access the user space region.

To user this feature, user should flag G2D_BUF_USRPTR to
offset variable of struct drm_exynos_g2d_cmd and fill
struct drm_exynos_g2d_userptr with user space address
and size for it and then should set a pointer to
drm_exynos_g2d_userptr object to data variable of struct
drm_exynos_g2d_cmd. The last bit of offset variable is used
to check if the cmdlist's buffer type is userptr or not.
If userptr, the g2d driver gets user space address and size
and then gets pages through get_user_pages().
(another case is counted as gem handle)

Below is sample codes:

static void set_cmd(struct drm_exynos_g2d_cmd *cmd,
		unsigned long offset, unsigned long data)
{
	cmd->offset = offset;
	cmd->data = data;
}

static int solid_fill_test(int x, int y, unsigned long userptr)
{
	struct drm_exynos_g2d_cmd cmd_gem[5];
	struct drm_exynos_g2d_userptr g2d_userptr;
	unsigned int gem_nr = 0;
	...

	g2d_userptr.userptr = userptr;
	g2d_userptr.size = x * y * 4;

	set_cmd(&cmd_gem[gem_nr++], DST_BASE_ADDR_REG |
					G2D_BUF_USERPTR,
			(unsigned long)&g2d_userptr);
	...
}

int main(int argc, char **argv)
{
	unsigned long addr;
	...

	addr = malloc(x * y * 4);
	...

	solid_fill_test(x, y, addr);
	...
}

And next, the pages are mapped with iommu table and the device
address is set to cmdlist so that G2D's dma can access it.
As you may know, the pages from get_user_pages() are pinned.
In other words, they CAN NOT be migrated and also swapped out.
So the dma access would be safe.

But the use of userptr feature has performance overhead so
this patch also has memory pool to the userptr feature.
Please, assume that user sends cmdlist filled with userptr
and size every time to g2d driver, and the get_user_pages
funcion will be called every time.

The memory pool has maximum 64MB size and the userptr that
user had ever sent, is holded in the memory pool.
This meaning is that if the userptr from user is same as one
in the memory pool, device address to the userptr in the memory
pool is set to cmdlist.

And last, the pages from get_user_pages() will be freed once
user calls free() and the dma access is completed. Actually,
get_user_pages() takes 2 reference counts if the user process
has never accessed user region allocated by malloc(). Then, if
the user calls free(), the page reference count becomes 1 and
becomes 0 with put_page() call. And the reverse holds as well.
This means how the pages backed are used by dma and freed.

This patch is based on "drm/exynos: add iommu support for g2d",
	https://patchwork.kernel.org/patch/1629481/

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-12-04 14:46:01 +09:00
Michael S. Tsirkin
5d09710925 tun: only queue packets on device
Historically tun supported two modes of operation:
- in default mode, a small number of packets would get queued
  at the device, the rest would be queued in qdisc
- in one queue mode, all packets would get queued at the device

This might have made sense up to a point where we made the
queue depth for both modes the same and set it to
a huge value (500) so unless the consumer
is stuck the chance of losing packets is small.

Thus in practice both modes behave the same, but the
default mode has some problems:
- if packets are never consumed, fragments are never orphaned
  which cases a DOS for sender using zero copy transmit
- overrun errors are hard to diagnose: fifo error is incremented
  only once so you can not distinguish between
  userspace that is stuck and a transient failure,
  tcpdump on the device does not show any traffic

Userspace solves this simply by enabling IFF_ONE_QUEUE
but there seems to be little point in not doing the
right thing for everyone, by default.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-12-03 15:07:36 -05:00