Commit Graph

53964 Commits

Author SHA1 Message Date
Peter Zijlstra
cbee9f88ec mm: numa: Add fault driven placement and migration
NOTE: This patch is based on "sched, numa, mm: Add fault driven
	placement and migration policy" but as it throws away all the policy
	to just leave a basic foundation I had to drop the signed-offs-by.

This patch creates a bare-bones method for setting PTEs pte_numa in the
context of the scheduler that when faulted later will be faulted onto the
node the CPU is running on.  In itself this does nothing useful but any
placement policy will fundamentally depend on receiving hints on placement
from fault context and doing something intelligent about it.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:42:45 +00:00
Mel Gorman
a720094ded mm: mempolicy: Hide MPOL_NOOP and MPOL_MF_LAZY from userspace for now
The use of MPOL_NOOP and MPOL_MF_LAZY to allow an application to
explicitly request lazy migration is a good idea but the actual
API has not been well reviewed and once released we have to support it.
For now this patch prevents an application using the services. This
will need to be revisited.

Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:44 +00:00
Mel Gorman
4b10e7d562 mm: mempolicy: Implement change_prot_numa() in terms of change_protection()
This patch converts change_prot_numa() to use change_protection(). As
pte_numa and friends check the PTE bits directly it is necessary for
change_protection() to use pmd_mknuma(). Hence the required
modifications to change_protection() are a little clumsy but the
end result is that most of the numa page table helpers are just one or
two instructions.

Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:44 +00:00
Lee Schermerhorn
b24f53a0be mm: mempolicy: Add MPOL_MF_LAZY
NOTE: Once again there is a lot of patch stealing and the end result
	is sufficiently different that I had to drop the signed-offs.
	Will re-add if the original authors are ok with that.

This patch adds another mbind() flag to request "lazy migration".  The
flag, MPOL_MF_LAZY, modifies MPOL_MF_MOVE* such that the selected
pages are marked PROT_NONE. The pages will be migrated in the fault
path on "first touch", if the policy dictates at that time.

"Lazy Migration" will allow testing of migrate-on-fault via mbind().
Also allows applications to specify that only subsequently touched
pages be migrated to obey new policy, instead of all pages in range.
This can be useful for multi-threaded applications working on a
large shared data area that is initialized by an initial thread
resulting in all pages on one [or a few, if overflowed] nodes.
After PROT_NONE, the pages in regions assigned to the worker threads
will be automatically migrated local to the threads on 1st touch.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:42:43 +00:00
Mel Gorman
4daae3b4b9 mm: mempolicy: Use _PAGE_NUMA to migrate pages
Note: Based on "mm/mpol: Use special PROT_NONE to migrate pages" but
	sufficiently different that the signed-off-bys were dropped

Combine our previous _PAGE_NUMA, mpol_misplaced and migrate_misplaced_page()
pieces into an effective migrate on fault scheme.

Note that (on x86) we rely on PROT_NONE pages being !present and avoid
the TLB flush from try_to_unmap(TTU_MIGRATION). This greatly improves the
page-migration performance.

Based-on-work-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:42 +00:00
Peter Zijlstra
7039e1dbec mm: migrate: Introduce migrate_misplaced_page()
Note: This was originally based on Peter's patch "mm/migrate: Introduce
	migrate_misplaced_page()" but borrows extremely heavily from Andrea's
	"autonuma: memory follows CPU algorithm and task/mm_autonuma stats
	collection". The end result is barely recognisable so signed-offs
	had to be dropped. If original authors are ok with it, I'll
	re-add the signed-off-bys.

Add migrate_misplaced_page() which deals with migrating pages from
faults.

Based-on-work-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Based-on-work-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Based-on-work-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:42:41 +00:00
Lee Schermerhorn
771fb4d806 mm: mempolicy: Check for misplaced page
This patch provides a new function to test whether a page resides
on a node that is appropriate for the mempolicy for the vma and
address where the page is supposed to be mapped.  This involves
looking up the node where the page belongs.  So, the function
returns that node so that it may be used to allocated the page
without consulting the policy again.

A subsequent patch will call this function from the fault path.
Because of this, I don't want to go ahead and allocate the page, e.g.,
via alloc_page_vma() only to have to free it if it has the correct
policy.  So, I just mimic the alloc_page_vma() node computation
logic--sort of.

Note:  we could use this function to implement a MPOL_MF_STRICT
behavior when migrating pages to match mbind() mempolicy--e.g.,
to ensure that pages in an interleaved range are reinterleaved
rather than left where they are when they reside on any page in
the interleave nodemask.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
[ Added MPOL_F_LAZY to trigger migrate-on-fault;
  simplified code now that we don't have to bother
  with special crap for interleaved ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:41 +00:00
Lee Schermerhorn
d3a710337b mm: mempolicy: Add MPOL_NOOP
This patch augments the MPOL_MF_LAZY feature by adding a "NOOP" policy
to mbind().  When the NOOP policy is used with the 'MOVE and 'LAZY
flags, mbind() will map the pages PROT_NONE so that they will be
migrated on the next touch.

This allows an application to prepare for a new phase of operation
where different regions of shared storage will be assigned to
worker threads, w/o changing policy.  Note that we could just use
"default" policy in this case.  However, this also allows an
application to request that pages be migrated, only if necessary,
to follow any arbitrary policy that might currently apply to a
range of pages, without knowing the policy, or without specifying
multiple mbind()s for ranges with different policies.

[ Bug in early version of mpol_parse_str() reported by Fengguang Wu. ]

Bug-Reported-by: Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:40 +00:00
Peter Zijlstra
479e2802d0 mm: mempolicy: Make MPOL_LOCAL a real policy
Make MPOL_LOCAL a real and exposed policy such that applications that
relied on the previous default behaviour can explicitly request it.

Requested-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:39 +00:00
Mel Gorman
d10e63f294 mm: numa: Create basic numa page hinting infrastructure
Note: This patch started as "mm/mpol: Create special PROT_NONE
	infrastructure" and preserves the basic idea but steals *very*
	heavily from "autonuma: numa hinting page faults entry points" for
	the actual fault handlers without the migration parts.	The end
	result is barely recognisable as either patch so all Signed-off
	and Reviewed-bys are dropped. If Peter, Ingo and Andrea are ok with
	this version, I will re-add the signed-offs-by to reflect the history.

In order to facilitate a lazy -- fault driven -- migration of pages, create
a special transient PAGE_NUMA variant, we can then use the 'spurious'
protection faults to drive our migrations from.

The meaning of PAGE_NUMA depends on the architecture but on x86 it is
effectively PROT_NONE. Actual PROT_NONE mappings will not generate these
NUMA faults for the reason that the page fault code checks the permission on
the VMA (and will throw a segmentation fault on actual PROT_NONE mappings),
before it ever calls handle_mm_fault.

[dhillf@gmail.com: Fix typo]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:42:39 +00:00
Andrea Arcangeli
0b9d705297 mm: numa: Support NUMA hinting page faults from gup/gup_fast
Introduce FOLL_NUMA to tell follow_page to check
pte/pmd_numa. get_user_pages must use FOLL_NUMA, and it's safe to do
so because it always invokes handle_mm_fault and retries the
follow_page later.

KVM secondary MMU page faults will trigger the NUMA hinting page
faults through gup_fast -> get_user_pages -> follow_page ->
handle_mm_fault.

Other follow_page callers like KSM should not use FOLL_NUMA, or they
would fail to get the pages if they use follow_page instead of
get_user_pages.

[ This patch was picked up from the AutoNUMA tree. ]

Originally-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
[ ported to this tree. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:42:37 +00:00
Andrea Arcangeli
be3a728427 mm: numa: pte_numa() and pmd_numa()
Implement pte_numa and pmd_numa.

We must atomically set the numa bit and clear the present bit to
define a pte_numa or pmd_numa.

Once a pte or pmd has been set as pte_numa or pmd_numa, the next time
a thread touches a virtual address in the corresponding virtual range,
a NUMA hinting page fault will trigger. The NUMA hinting page fault
will clear the NUMA bit and set the present bit again to resolve the
page fault.

The expectation is that a NUMA hinting page fault is used as part
of a placement policy that decides if a page should remain on the
current node or migrated to a different node.

Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
2012-12-11 14:42:36 +00:00
Mel Gorman
397487db69 mm: compaction: Add scanned and isolated counters for compaction
Compaction already has tracepoints to count scanned and isolated pages
but it requires that ftrace be enabled and if that information has to be
written to disk then it can be disruptive. This patch adds vmstat counters
for compaction called compact_migrate_scanned, compact_free_scanned and
compact_isolated.

With these counters, it is possible to define a basic cost model for
compaction. This approximates of how much work compaction is doing and can
be compared that with an oprofile showing TLB misses and see if the cost of
compaction is being offset by THP for example. Minimally a compaction patch
can be evaluated in terms of whether it increases or decreases cost. The
basic cost model looks like this

Fundamental unit u:	a word	sizeof(void *)

Ca  = cost of struct page access = sizeof(struct page) / u

Cmc = Cost migrate page copy = (Ca + PAGE_SIZE/u) * 2
Cmf = Cost migrate failure   = Ca * 2
Ci  = Cost page isolation    = (Ca + Wi)
	where Wi is a constant that should reflect the approximate
	cost of the locking operation.

Csm = Cost migrate scanning = Ca
Csf = Cost free    scanning = Ca

Overall cost =	(Csm * compact_migrate_scanned) +
	      	(Csf * compact_free_scanned)    +
	      	(Ci  * compact_isolated)	+
		(Cmc * pgmigrate_success)	+
		(Cmf * pgmigrate_failed)

Where the values are read from /proc/vmstat.

This is very basic and ignores certain costs such as the allocation cost
to do a migrate page copy but any improvement to the model would still
use the same vmstat counters.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:28:35 +00:00
Mel Gorman
7b2a2d4a18 mm: migrate: Add a tracepoint for migrate_pages
The pgmigrate_success and pgmigrate_fail vmstat counters tells the user
about migration activity but not the type or the reason. This patch adds
a tracepoint to identify the type of page migration and why the page is
being migrated.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:28:35 +00:00
Mel Gorman
5647bc293a mm: compaction: Move migration fail/success stats to migrate.c
The compact_pages_moved and compact_pagemigrate_failed events are
convenient for determining if compaction is active and to what
degree migration is succeeding but it's at the wrong level. Other
users of migration may also want to know if migration is working
properly and this will be particularly true for any automated
NUMA migration. This patch moves the counters down to migration
with the new events called pgmigrate_success and pgmigrate_fail.
The compact_blocks_moved counter is removed because while it was
useful for debugging initially, it's worthless now as no meaningful
conclusions can be drawn from its value.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
2012-12-11 14:28:35 +00:00
Peter Zijlstra
7da4d641c5 mm: Count the number of pages affected in change_protection()
This will be used for three kinds of purposes:

 - to optimize mprotect()

 - to speed up working set scanning for working set areas that
   have not been touched

 - to more accurately scan per real working set

No change in functionality from this patch.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-12-11 14:28:34 +00:00
Rik van Riel
2c3cf556b2 x86/mm: Introduce pte_accessible()
We need pte_present to return true for _PAGE_PROTNONE pages, to indicate that
the pte is associated with a page.

However, for TLB flushing purposes, we would like to know whether the pte
points to an actually accessible page.  This allows us to skip remote TLB
flushes for pages that are not actually accessible.

Fill in this method for x86 and provide a safe (but slower) method
on other architectures.

Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixed-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-66p11te4uj23gevgh4j987ip@git.kernel.org
[ Added Linus's review fixes. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-12-11 14:28:34 +00:00
Linus Torvalds
0cad3ff404 Merge branch 'akpm' (Fixes from Andrew)
Merge misc fixes from Andrew Morton.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (12 patches)
  revert "mm: fix-up zone present pages"
  tmpfs: change final i_blocks BUG to WARNING
  tmpfs: fix shmem_getpage_gfp() VM_BUG_ON
  mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address
  mm: revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures"
  rapidio: fix kernel-doc warnings
  swapfile: fix name leak in swapoff
  memcg: fix hotplugged memory zone oops
  mips, arc: fix build failure
  memcg: oom: fix totalpages calculation for memory.swappiness==0
  mm: fix build warning for uninitialized value
  mm: add anon_vma_lock to validate_mm()
2012-11-16 15:26:38 -08:00
Andrew Morton
5576646f3c revert "mm: fix-up zone present pages"
Revert commit 7f1290f2f2 ("mm: fix-up zone present pages")

That patch tried to fix a issue when calculating zone->present_pages,
but it caused a regression on 32bit systems with HIGHMEM.  With that
change, reset_zone_present_pages() resets all zone->present_pages to
zero, and fixup_zone_present_pages() is called to recalculate
zone->present_pages when the boot allocator frees core memory pages into
buddy allocator.  Because highmem pages are not freed by bootmem
allocator, all highmem zones' present_pages becomes zero.

Various options for improving the situation are being discussed but for
now, let's return to the 3.6 code.

Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David Rientjes <rientjes@google.com>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
Randy Dunlap
2ca3cb50ed rapidio: fix kernel-doc warnings
Fix rapidio kernel-doc warnings:

  Warning(drivers/rapidio/rio.c:415): No description found for parameter 'local'
  Warning(drivers/rapidio/rio.c:415): Excess function parameter 'lstart' description in 'rio_map_inb_region'
  Warning(include/linux/rio.h:290): No description found for parameter 'switches'
  Warning(include/linux/rio.h:290): No description found for parameter 'destid_table'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Acked-by: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
Hugh Dickins
bea8c150a7 memcg: fix hotplugged memory zone oops
When MEMCG is configured on (even when it's disabled by boot option),
when adding or removing a page to/from its lru list, the zone pointer
used for stats updates is nowadays taken from the struct lruvec.  (On
many configurations, calculating zone from page is slower.)

But we have no code to update all the lruvecs (per zone, per memcg) when
a memory node is hotadded.  Here's an extract from the oops which
results when running numactl to bind a program to a newly onlined node:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000f60
  IP:  __mod_zone_page_state+0x9/0x60
  Pid: 1219, comm: numactl Not tainted 3.6.0-rc5+ #180 Bochs Bochs
  Process numactl (pid: 1219, threadinfo ffff880039abc000, task ffff8800383c4ce0)
  Call Trace:
    __pagevec_lru_add_fn+0xdf/0x140
    pagevec_lru_move_fn+0xb1/0x100
    __pagevec_lru_add+0x1c/0x30
    lru_add_drain_cpu+0xa3/0x130
    lru_add_drain+0x2f/0x40
   ...

The natural solution might be to use a memcg callback whenever memory is
hotadded; but that solution has not been scoped out, and it happens that
we do have an easy location at which to update lruvec->zone.  The lruvec
pointer is discovered either by mem_cgroup_zone_lruvec() or by
mem_cgroup_page_lruvec(), and both of those do know the right zone.

So check and set lruvec->zone in those; and remove the inadequate
attempt to set lruvec->zone from lruvec_init(), which is called before
NODE_DATA(node) has been allocated in such cases.

Ah, there was one exceptionr.  For no particularly good reason,
mem_cgroup_force_empty_list() has its own code for deciding lruvec.
Change it to use the standard mem_cgroup_zone_lruvec() and
mem_cgroup_get_lru_size() too.  In fact it was already safe against such
an oops (the lru lists in danger could only be empty), but we're better
proofed against future changes this way.

I've marked this for stable (3.6) since we introduced the problem in 3.5
(now closed to stable); but I have no idea if this is the only fix
needed to get memory hotadd working with memcg in 3.6, and received no
answer when I enquired twice before.

Reported-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 14:33:04 -08:00
David Rientjes
fa0cbbf145 mm, oom: reintroduce /proc/pid/oom_adj
This is mostly a revert of 01dc52ebdf ("oom: remove deprecated oom_adj")
from Davidlohr Bueso.

It reintroduces /proc/pid/oom_adj for backwards compatibility with earlier
kernels.  It simply scales the value linearly when /proc/pid/oom_score_adj
is written.

The major difference is that its scheduled removal is no longer included
in Documentation/feature-removal-schedule.txt.  We do warn users with a
single printk, though, to suggest the more powerful and supported
/proc/pid/oom_score_adj interface.

Reported-by: Artem S. Tashkinov <t.artem@lycos.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-16 10:15:35 -08:00
Linus Torvalds
f4bcd79c88 ARM: SoC fixes for 3.7
We've been sitting on this longer than we meant to due to travel and
 other activities, but the number of patches is luckily not that high.
 
 Biggest changes are from a batch of OMAP bugfixes, but there are a
 few for the broader set of SoCs too (bcm2835, pxa, highbank, tegra,
 at91 and i.MX).
 
 The OMAP patches contain some fixes for MUSB/PHY on omap4 which
 ends up being a bit on the large side but needed for legacy (non-DT)
 platforms. Beyond that there are a handful of hwmod/pm changes.
 
 So, fairly noncontroversial stuff all in all, and as usual around this
 time the fixes are well targeted at specific problems.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQpmQOAAoJEIwa5zzehBx37+YP/jzcKS1pU5Wf73GYIUcCPqO0
 iRwLziexucUkWXnqIIPLedUJ8Dze/8q1tSTnQ7/JSXy9SYtJf651aj5OAo8w/cXO
 d58+y1S+VTsFHbppKfQHbGpYq2n2f4PPvrL24ftp40OmomVl/ktqOB4fDN1/YuAw
 lfTeo0v8MNfyVni5ij21rCNS17IC0Tl4Mfth8zIILWe6qdqajpla7CoO7ppmUM08
 OAPi6NJL/l8vqfqNtGuk2x9cOca0jY1rdib/rfrL1LxrtLT//NP0d6h+wKaSxLWm
 Qvr9nAsnmZNV0pFnYjVxfSMwM6Gf2SBh0QG3lF0akwfe3bEXqfnG5muAtWEhpTlt
 MVx54UgKSWVBgfBH8/SsDkJ3UydxNO5XjHz9YOix1Sj390J2zpP3E24Y0vmYYaNn
 c2seHeH4SckMZ1mddZgy1NT8y7/zaXQ2OSRLyVigJ9wjKmduuOW013BpKUlHFI9E
 Uzh1GLpWe2Cwl3wKWlHRlhgp0NzpWEHhrPW254rOfai8P9xeFMMQH8eqUlSWZbDN
 GAzqUWZ4/F6NxPV1GPrVCZjrA9IYKKtZ5GkPwC1FibC9Kfyb0rp9kr7KM4mzWUY3
 7MxpFatMS4rdbJk5lXCQ1+EIIWF6xdNbURCyMAnoiaowMzdv+LkfBIW+17fDPDgI
 WAu7+QCKTQd+i/bvMsKh
 =aKWl
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "We've been sitting on this longer than we meant to due to travel and
  other activities, but the number of patches is luckily not that high.

  Biggest changes are from a batch of OMAP bugfixes, but there are a few
  for the broader set of SoCs too (bcm2835, pxa, highbank, tegra, at91
  and i.MX).

  The OMAP patches contain some fixes for MUSB/PHY on omap4 which ends
  up being a bit on the large side but needed for legacy (non-DT)
  platforms.  Beyond that there are a handful of hwmod/pm changes.

  So, fairly noncontroversial stuff all in all, and as usual around this
  time the fixes are well targeted at specific problems."

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: imx: ehci: fix host power mask bit
  ARM i.MX: fix error-valued pointer dereference in clk_register_gate2()
  ARM: at91/usbh: fix overcurrent gpio setup
  ARM: at91/AT91SAM9G45: fix crypto peripherals irq issue due to sparse irq support
  ARM: boot: Fix usage of kecho
  ARM: OMAP: ocp2scp: create omap device for ocp2scp
  ARM: OMAP4: add _dev_attr_ to ocp2scp for representing usb_phy
  drivers: bus: ocp2scp: add pdata support
  irqchip: irq-bcm2835: Add terminating entry for of_device_id table
  ARM: highbank: retry wfi on reset request
  ARM: OMAP4: PM: fix regulator name for VDD_MPU
  ARM: OMAP4: hwmod data: do not enable or reset the McPDM during kernel init
  ARM: OMAP2+: hwmod: add flag to prevent hwmod code from touching IP block during init
  ARM: dt: tegra: fix length of pad control and mux registers
  ARM: OMAP: hwmod: wait for sysreset complete after enabling hwmod
  ARM: OMAP2+: clockdomain: Fix OMAP4 ISS clk domain to support only SWSUP
  ARM: pxa/spitz_pm: Fix hang when resuming from STR
  ARM: pxa: hx4700: Fix backlight PWM device number
  ARM: OMAP2+: PM: add missing newline to VC warning message
2012-11-16 10:08:45 -08:00
Arnd Bergmann
57260e4088 ARM i.MX fixes for 3.7-rc
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABCAAGBQJQplk7AAoJEPFlmONMx+ezYSMP/2oG4bGXA3O0ktyTOC2VA63v
 VEoElBItx1ZdwKmVOMOQfH5IHHFUnOxrIBB27+67qr22/gfjlrUftpVMkiwpNhqP
 itYkPZWynkJ+1mZMpV7nJC9PEuyKdw5FKQJovsrsGYkwOfXfylx8kJWJH+zLwXRg
 wLcTsb33U+H+zyWz2TCLr8+SHEiepVcRrsBgjDp1PlIK1eMhZPztdQWVvD6zZjej
 vnB4ea+VUE8Q6HS7CwtBf++u97hsehD7ZZ1raEllhdiXi6SXtK0ARMKk0gSalQXU
 mg/tV2yU3blFZFFcebPG1Mdfow6E+5xzrboCJwB0X3LnkHkdBsUr5KA4AzqedDt9
 KmQqXj1vsLpDQaGMrT+icRkZmKgeBnNb84IxU69WJiEiStJeUNz7WWtGhmGcMn7L
 T9H33gaJbrYqcVj2NSiHP1XbHG0NKv79PgogzfPxQdR0oLXCVpFdw1cgy898bNFj
 Oche3MG1w9p2hTfDZbQpivJOSZWqh92ug8/yTPFR8EwoM6WHhROv13E1GdO22LF0
 QPQOcXtyiYsPlPB02rqpkfAC49At5o03i3IGcT5wn4wTbyEa8BQl5ZyDwAXg2ZAi
 MBMskQ2V4u5sbf8Ok0Vj7ji/OLgKz/k7D3IfQ6CyBaOtILIF6X96Mh+sWGsX82sj
 QTwkEhTr/Tgk5wtivn8i
 =3SGE
 -----END PGP SIGNATURE-----

Merge tag 'imx-fixes-rc' of git://git.pengutronix.de/git/imx/linux-2.6 into fixes

From Sascha Hauer <s.hauer@pengutronix.de>:

ARM i.MX fixes for 3.7-rc

* tag 'imx-fixes-rc' of git://git.pengutronix.de/git/imx/linux-2.6:
  ARM: imx: ehci: fix host power mask bit
  ARM i.MX: fix error-valued pointer dereference in clk_register_gate2()

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2012-11-16 16:42:59 +01:00
Igor Mazanov
93532c8a48 clk: remove inline usage from clk-provider.h
Users of GCC 4.7 have reported compiler errors due to having inline
applied to function declarations in clk-provider.h.  The definitions
exist in drivers/clk/clk.c.  An example error:

In file included from arch/arm/mach-omap2/clockdomain.c:25:0:
arch/arm/mach-omap2/clockdomain.c: In function ‘clkdm_clk_disable’:
include/linux/clk-provider.h:338:12: error: inlining failed in call to always_inline ‘__clk_get_enable_count’: function body not available
arch/arm/mach-omap2/clockdomain.c:1001:28: error: called from here
make[1]: *** [arch/arm/mach-omap2/clockdomain.o] Error 1
make: *** [arch/arm/mach-omap2] Error 2

This patch removes the use of inline from include/linux/clk-provider.h
but keeps the function definitions in drivers/clk/clk.c as inlined since
they are one-liners.

Signed-off-by: Igor Mazanov <i.mazanov@gmail.com>
Acked-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
[mturquette@linaro.org: improved subject, added changelog]
2012-11-15 11:38:34 -08:00
Linus Torvalds
b251f0f399 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Bug fixes galore, mostly in drivers as is often the case:

  1) USB gadget and cdc_eem drivers need adjustments to their frame size
     lengths in order to handle VLANs correctly.  From Ian Coolidge.

  2) TIPC and several network drivers erroneously call tasklet_disable
     before tasklet_kill, fix from Xiaotian Feng.

  3) r8169 driver needs to apply the WOL suspend quirk to more chipsets,
     fix from Cyril Brulebois.

  4) Fix multicast filters on RTL_GIGA_MAC_VER_35 r8169 chips, from
     Nathan Walp.

  5) FDB netlink dumps should use RTM_NEWNEIGH as the message type, not
     zero.  From John Fastabend.

  6) Fix smsc95xx tx checksum offload on big-endian, from Steve
     Glendinning.

  7) __inet_diag_dump() needs to repsect and report the error value
     returned from inet_diag_lock_handler() rather than ignore it.
     Otherwise if an inet diag handler is not available for a particular
     protocol, we essentially report success instead of giving an error
     indication.  Fix from Cyrill Gorcunov.

  8) When the QFQ packet scheduler sees TSO/GSO packets it does not
     handle things properly, and in fact ends up corrupting it's
     datastructures as well as mis-schedule packets.  Fix from Paolo
     Valente.

  9) Fix oopser in skb_loop_sk(), from Eric Leblond.

  10) CXGB4 passes partially uninitialized datastructures in to FW
      commands, fix from Vipul Pandya.

  11) When we send unsolicited ipv6 neighbour advertisements, we should
      send them to the link-local allnodes multicast address, as per
      RFC4861.  Fix from Hannes Frederic Sowa.

  12) There is some kind of bug in the usbnet's kevent deferral
      mechanism, but more immediately when it triggers an uncontrolled
      stream of kernel messages spam the log.  Rate limit the error log
      message triggered when this problem occurs, as sending thousands
      of error messages into the kernel log doesn't help matters at all,
      and in fact makes further diagnosis more difficult.

      From Steve Glendinning.

  13) Fix gianfar restore from hibernation, from Wang Dongsheng.

  14) The netlink message attribute sizes are wrong in the ipv6 GRE
      driver, it was using the size of ipv4 addresses instead of ipv6
      ones :-) Fix from Nicolas Dichtel."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  gre6: fix rtnl dump messages
  gianfar: ethernet vanishes after restoring from hibernation
  usbnet: ratelimit kevent may have been dropped warnings
  ipv6: send unsolicited neighbour advertisements to all-nodes
  net: usb: cdc_eem: Fix rx skb allocation for 802.1Q VLANs
  usb: gadget: g_ether: fix frame size check for 802.1Q
  cxgb4: Fix initialization of SGE_CONTROL register
  isdn: Make CONFIG_ISDN depend on CONFIG_NETDEVICES
  cxgb4: Initialize data structures before using.
  af-packet: fix oops when socket is not present
  pkt_sched: enable QFQ to support TSO/GSO
  net: inet_diag -- Return error code if protocol handler is missed
  net: bnx2x: Fix typo in bnx2x driver
  smsc95xx: fix tx checksum offload for big endian
  rtnetlink: Use nlmsg type RTM_NEWNEIGH from dflt fdb dump
  ptp: update adjfreq callback description
  r8169: allow multicast packets on sub-8168f chipset.
  r8169: Fix WoL on RTL8168d/8111d.
  drivers/net: use tasklet_kill in device remove/close process
  tipc: do not use tasklet_disable before tasklet_kill
2012-11-10 22:03:49 +01:00
Linus Torvalds
2b1768f39a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
 "Several build/bug fixes for sparc, including:

  1) Configuring a mix of static vs.  modular sparc64 crypto modules
     didn't work, remove an ill-conceived attempt to only have to build
     the device match table for these drivers once to fix the problem.

     Reported by Meelis Roos.

  2) Make the montgomery multiple/square and mpmul instructions actually
     usable in 32-bit tasks.  Essentially this involves providing 32-bit
     userspace with a way to use a 64-bit stack when it needs to.

  3) Our sparc64 atomic backoffs don't yield cpu strands properly on
     Niagara chips.  Use pause instruction when available to achieve
     this, otherwise use a benign instruction we know blocks the strand
     for some time.

  4) Wire up kcmp

  5) Fix the build of various drivers by removing the unnecessary
     blocking of OF_GPIO when SPARC.

  6) Fix unintended regression wherein of_address_to_resource stopped
     being provided.  Fix from Andreas Larsson.

  7) Fix NULL dereference in leon_handle_ext_irq(), also from Andreas
     Larsson."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc64: Fix build with mix of modular vs. non-modular crypto drivers.
  sparc: Support atomic64_dec_if_positive properly.
  of/address: sparc: Declare of_address_to_resource() as an extern function for sparc again
  sparc32, leon: Check for existent irq_map entry in leon_handle_ext_irq
  sparc: Add sparc support for platform_get_irq()
  sparc: Allow OF_GPIO on sparc.
  qlogicpti: Fix build warning.
  sparc: Wire up sys_kcmp.
  sparc64: Improvde documentation and readability of atomic backoff code.
  sparc64: Use pause instruction when available.
  sparc64: Fix cpu strand yielding.
  sparc64: Make montmul/montsqr/mpmul usable in 32-bit threads.
2012-11-10 21:58:34 +01:00
Linus Torvalds
0020dd0b8c Bug-fixes:
* Fix compile issues on ARM.
  * Fix hypercall fallback code for old hypervisors.
  * Print out which HVM parameter failed if it fails.
  * Fix idle notifier call after irq_enter.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQEcBAABAgAGBQJQnQdGAAoJEFjIrFwIi8fJPBAIAMX1HRx3udqhv7fziynZvFTb
 hj47XYIJHOK7P4fK7vZoSNgMHjL6LW5cUqC8VN67G3zUSkX9JYFsPBj6v4bWn+rG
 b9CS+MW7hS80LGbbqkh1F+YSEfZ863RlF9PPX2acaHTw49MlIgIqwhxIo6hy+Nm6
 thu6SlbEIJkSUdhbYMOAmy5aH/3+UuuQg+oq3P7mzV8fZjEihnrrF0NlT4wOZK1o
 gsfrKYKJLVT526W9PF/L23/A/MCHMpvjNStpaDLOGNjV9sBMpJI8JRax6+657+q1
 0kXvN5mAwTKWOaXBl4LEC9R8n1IKB91TgOY6HJAcXkb1eoP5KAeNSmU8RbsZ2T0=
 =XZ+0
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.7-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Pull Xen fixes from Konrad Rzeszutek Wilk:
 "There are three ARM compile fixes (we forgot to export certain
  functions and if the drivers are built as an module - we go belly-up).

  There is also an mismatch of irq_enter() / exit_idle() calls sequence
  which were fixed some time ago in other piece of codes, but failed to
  appear in the Xen code.

  Lastly a fix for to help in the field with troubleshooting in case we
  cannot get the appropriate parameter and also fallback code when
  working with very old hypervisors."

Bug-fixes:
 - Fix compile issues on ARM.
 - Fix hypercall fallback code for old hypervisors.
 - Print out which HVM parameter failed if it fails.
 - Fix idle notifier call after irq_enter.

* tag 'stable/for-linus-3.7-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/arm: Fix compile errors when drivers are compiled as modules (export more).
  xen/arm: Fix compile errors when drivers are compiled as modules.
  xen/generic: Disable fallback build on ARM.
  xen/events: fix RCU warning, or Call idle notifier after irq_enter()
  xen/hvm: If we fail to fetch an HVM parameter print out which flag it is.
  xen/hypercall: fix hypercall fallback code for very old hypervisors
2012-11-10 06:56:21 +01:00
Andreas Larsson
0bce04be44 of/address: sparc: Declare of_address_to_resource() as an extern function for sparc again
This bug-fix makes sure that of_address_to_resource is defined extern for sparc
so that the sparc-specific implementation of of_address_to_resource() is once
again used when including include/linux/of_address.h in a sparc context. A
number of drivers in mainline relies on this function working for sparc.

The bug was introduced in a850a75544, "of/address:
add empty static inlines for !CONFIG_OF". Contrary to that commit title, the
static inlines are added for !CONFIG_OF_ADDRESS, and CONFIG_OF_ADDRESS is never
defined for sparc. This is good behavior for the other functions in
include/linux/of_address.h, as the extern functions defined in
drivers/of/address.c only gets linked when OF_ADDRESS is configured. However,
for of_address_to_resource there exists a sparc-specific implementation in
arch/sparc/arch/sparc/kernel/of_device_common.c

Solution suggested by: Sam Ravnborg <sam@ravnborg.org>

Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-09 16:30:50 -08:00
Linus Torvalds
a4275153cc MMC fixes for 3.7-rc5:
- sdhci: fix a NULL dereference at resume-time, seen on OLPC XO-4
  - sdhci: fix against 3.7-rc1 for UHS modes without a vqmmc regulator
  - sdhci-of-esdhc: disable CMD23 on boards where it's broken
  - sdhci-s3c: fix against 3.7-rc1 for card detection with runtime PM
  - dw_mmc, omap_hsmmc: fix potential NULL derefs, compiler warnings
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQnF8PAAoJEHNBYZ7TNxYMoe0P/0W9UtyoJ9UJy/QWQDU1vf8F
 c358UBmV3VkzlzKNEtwxjStwvHf3XcCqF1A0z1GAP5efhSQtb4aZxBYtNcA2FfJ0
 eOnrQaqi8hvy82hBHOqBTDnXA7HDRHCzb1U9x3XL5nvEXi6NLXhxiGO16PzkYDH2
 2JCxVwMpcEHSl+fVQVt+XR55HG53u3Qu17T2jKD8ys7TkEPIukch0rHsbvgri4r+
 k8OXIbxa66UMVePfdDH5inHt9jNfloBe0mc+xbIqfU740NcCH8pS9SuIX0U8Mevh
 Vmc6vzqp2X+ccuUCO+1fWw/neXfMDmyeXVFKGsoNG5LqDkLQv/58OMCAYS2DOcMX
 DhGcAujwKAlmnD0qvb1OvtppihFLfhFnn2pFbJGkynm7Lbv24fCNobkxhS9gadbr
 jrxjiZh7CkTzPz+gh61q5P50MCJRo7VhZUd8Omg7ygtWKJ/NAuf/j/BXerxMq4m8
 p7shlNgbJKTp9oezjsaKdE/pAuF38sFyp9KBGscqoWSHNpfL/U1vLyK1BQteCkYp
 j09x+dZ2aPlnPRvIu+29D4ZDR06nPxVlCqPd8iuB0POEVW9lydiuK5v0F616x1TI
 lh+wl7Ao8ufr8Zz+1ybuPFnTdGq2btRN7aCGj/48m2qOq0stHMkTKFN6jU3huZ+d
 +fZuZnrSsJ0Bm3V1aWUG
 =bi4C
 -----END PGP SIGNATURE-----

Merge tag 'mmc-fixes-for-3.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc

Pull MMC fixes from Chris Ball:
 - sdhci: fix a NULL dereference at resume-time, seen on OLPC XO-4
 - sdhci: fix against 3.7-rc1 for UHS modes without a vqmmc regulator
 - sdhci-of-esdhc: disable CMD23 on boards where it's broken
 - sdhci-s3c: fix against 3.7-rc1 for card detection with runtime PM
 - dw_mmc, omap_hsmmc: fix potential NULL derefs, compiler warnings

* tag 'mmc-fixes-for-3.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: sdhci-s3c: fix the card detection in runtime-pm
  mmc: sdhci-s3c: use clk_prepare_enable and clk_disable_unprepare
  mmc: dw_mmc: constify dw_mci_idmac_ops in exynos back-end
  mmc: dw_mmc: fix modular build for exynos back-end
  mmc: sdhci: fix NULL dereference in sdhci_request() tuning
  mmc: sdhci: fix IS_ERR() checking of regulator_get()
  mmc: fix sdhci-dove probe/removal
  mmc: sh_mmcif: fix use after free
  mmc: sdhci-pci: fix 'Invalid iomem size' error message condition
  mmc: mxcmmc: Fix MODULE_ALIAS
  mmc: omap_hsmmc: fix NULL pointer dereference for dt boot
  mmc: omap_hsmmc: fix host reference after mmc_free_host
  mmc: dw_mmc: fix multiple drv_data NULL dereferences
  mmc: dw_mmc: enable controller interrupt before calling mmc_start_host
  mmc: sdhci-of-esdhc: disable CMD23 for some Freescale SoCs
  mmc: dw_mmc: remove _dev_info compile warning
  mmc: dw_mmc: convert the variable type of irq
2012-11-09 21:32:33 +01:00
Andrew Morton
a80a6b85b4 revert "epoll: support for disabling items, and a self-test app"
Revert commit 03a7beb55b ("epoll: support for disabling items, and a
self-test app") pending resolution of the issues identified by Michael
Kerrisk, copied below.

We'll revisit this for 3.8.

: I've taken a look at this patch as it currently stands in 3.7-rc1, and
: done a bit of testing. (By the way, the test program
: tools/testing/selftests/epoll/test_epoll.c does not compile...)
:
: There are one or two places where the behavior seems a little strange,
: so I have a question or two at the end of this mail. But other than
: that, I want to check my understanding so that the interface can be
: correctly documented.
:
: Just to go though my understanding, the problem is the following
: scenario in a multithreaded application:
:
: 1. Multiple threads are performing epoll_wait() operations,
:    and maintaining a user-space cache that contains information
:    corresponding to each file descriptor being monitored by
:    epoll_wait().
:
: 2. At some point, a thread wants to delete (EPOLL_CTL_DEL)
:    a file descriptor from the epoll interest list, and
:    delete the corresponding record from the user-space cache.
:
: 3. The problem with (2) is that some other thread may have
:    previously done an epoll_wait() that retrieved information
:    about the fd in question, and may be in the middle of using
:    information in the cache that relates to that fd. Thus,
:    there is a potential race.
:
: 4. The race can't solved purely in user space, because doing
:    so would require applying a mutex across the epoll_wait()
:    call, which would of course blow thread concurrency.
:
: Right?
:
: Your solution is the EPOLL_CTL_DISABLE operation. I want to
: confirm my understanding about how to use this flag, since
: the description that has accompanied the patches so far
: has been a bit sparse
:
: 0. In the scenario you're concerned about, deleting a file
:    descriptor means (safely) doing the following:
:    (a) Deleting the file descriptor from the epoll interest list
:        using EPOLL_CTL_DEL
:    (b) Deleting the corresponding record in the user-space cache
:
: 1. It's only meaningful to use this EPOLL_CTL_DISABLE in
:    conjunction with EPOLLONESHOT.
:
: 2. Using EPOLL_CTL_DISABLE without using EPOLLONESHOT in
:    conjunction is a logical error.
:
: 3. The correct way to code multithreaded applications using
:    EPOLL_CTL_DISABLE and EPOLLONESHOT is as follows:
:
:    a. All EPOLL_CTL_ADD and EPOLL_CTL_MOD operations should
:       should EPOLLONESHOT.
:
:    b. When a thread wants to delete a file descriptor, it
:       should do the following:
:
:       [1] Call epoll_ctl(EPOLL_CTL_DISABLE)
:       [2] If the return status from epoll_ctl(EPOLL_CTL_DISABLE)
:           was zero, then the file descriptor can be safely
:           deleted by the thread that made this call.
:       [3] If the epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY,
:           then the descriptor is in use. In this case, the calling
:           thread should set a flag in the user-space cache to
:           indicate that the thread that is using the descriptor
:           should perform the deletion operation.
:
: Is all of the above correct?
:
: The implementation depends on checking on whether
: (events & ~EP_PRIVATE_BITS) == 0
: This replies on the fact that EPOLL_CTL_AD and EPOLL_CTL_MOD always
: set EPOLLHUP and EPOLLERR in the 'events' mask, and EPOLLONESHOT
: causes those flags (as well as all others in ~EP_PRIVATE_BITS) to be
: cleared.
:
: A corollary to the previous paragraph is that using EPOLL_CTL_DISABLE
: is only useful in conjunction with EPOLLONESHOT. However, as things
: stand, one can use EPOLL_CTL_DISABLE on a file descriptor that does
: not have EPOLLONESHOT set in 'events' This results in the following
: (slightly surprising) behavior:
:
: (a) The first call to epoll_ctl(EPOLL_CTL_DISABLE) returns 0
:     (the indicator that the file descriptor can be safely deleted).
: (b) The next call to epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY.
:
: This doesn't seem particularly useful, and in fact is probably an
: indication that the user made a logic error: they should only be using
: epoll_ctl(EPOLL_CTL_DISABLE) on a file descriptor for which
: EPOLLONESHOT was set in 'events'. If that is correct, then would it
: not make sense to return an error to user space for this case?

Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: "Paton J. Lewis" <palewis@adobe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-09 06:41:46 +01:00
Arnd Bergmann
8e2b36ea6e mmc: dw_mmc: constify dw_mci_idmac_ops in exynos back-end
The of_device_id match data is now marked as const and
must not be modified. This changes the dw_mmc to mark
all pointers passing the dw_mci_drv_data or dw_mci_dma_ops
structures as const, and also marks the static definitions
as const.

drivers/mmc/host/dw_mmc-exynos.c: In function 'dw_mci_exynos_probe':
drivers/mmc/host/dw_mmc-exynos.c:234:11: warning: assignment discards 'const' qualifier from pointer target type [enabled by default]

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Abraham <thomas.abraham@linaro.org>
Cc: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-11-07 15:02:55 -05:00
Jerry Huang
63ef5d8c28 mmc: sdhci-of-esdhc: disable CMD23 for some Freescale SoCs
CMD23 causes lots of errors in kernel on some freescale SoCs
(P1020, P1021, P1022, P1024, P1025 and P4080) when MMC card used,
which is because these controllers does not support CMD23,
even on the SoCs which declares CMD23 is supported.
Therefore, we'll not use CMD23.

Signed-off-by: Jerry Huang <Chang-Ming.Huang@freescale.com>
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Acked-by: Anton Vorontsov <cbouatmailru@gmail.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-11-07 14:55:29 -05:00
Seungwon Jeon
d676188e44 mmc: dw_mmc: convert the variable type of irq
Even though platform_get_irq returns error, 'host->irq'
always has an unsigned value. Less-than-zero comparison
of an unsigned value is never true. Type of 'unsigned int'
will be changed for 'int'.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Acked-by: Will Newton <will.newton@imgtec.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2012-11-07 14:50:16 -05:00
Kishon Vijay Abraham I
0133370f93 drivers: bus: ocp2scp: add pdata support
ocp2scp was not having pdata support which makes *musb* fail for non-dt
boot in OMAP platform. The pdata will have information about the devices
that is connected to ocp2scp. ocp2scp driver will now make use of this
information to create the devices that is attached to ocp2scp.

This is needed to fix MUSB regression caused by commit c9e4412a
(arm: omap: phy: remove unused functions from omap-phy-internal.c)

Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Acked-by: Felipe Balbi <balbi@ti.com>
[tony@atomide.com: updated comments for regression info]
Signed-off-by: Tony Lindgren <tony@atomide.com>
2012-11-07 09:35:53 -08:00
Konrad Rzeszutek Wilk
6d877e6b85 xen/hvm: If we fail to fetch an HVM parameter print out which flag it is.
Makes it easier to troubleshoot in the field.

Acked-by: Ian Campbell <ian.campbell@citrix.com>
[v1: Use macro per Ian's suggestion]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-11-07 10:40:33 -05:00
Jacob Keller
87f4d7c1d3 ptp: update adjfreq callback description
This patch updates the adjfreq callback description to include a note that the
delta in ppb is always relative to the base frequency, and not to the current
frequency of the hardware clock.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
CC: stable@vger.kernel.org [v3.5+]
CC: Richard Cochran <richard.cochran@gmail.com>
CC: John Stultz <john.stultz@linaro.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-03 15:27:07 -04:00
Linus Torvalds
0f89a5733a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "First post-Sandy pull request"

 1) Fix antenna gain handling and initialization of chan->max_reg_power
    in wireless, from Felix Fietkau.

 2) Fix nexthop handling in H.232 conntrack helper, from Julian
    Anastasov.

 3) Only process 80211 mesh config header in certain kinds of frames,
    from Javier Cardona.

 4) 80211 management frame header length needs to be validated, from
    Johannes Berg.

 5) Don't access free'd SKBs in ath9k driver, from Felix Fietkay.

 6) Test for permanent state correctly in VXLAN driver, from Stephen
    Hemminger.

 7) BNX2X bug fixes from Yaniv Rosner and Dmitry Kravkov.

 8) Fix off by one errors in bonding, from Nikolay ALeksandrov.

 9) Fix divide by zero in TCP-Illinois congestion control.  From Jesper
    Dangaard Brouer.

10) TCP metrics code says "Yo dawg, I heard you like sizeof, so I did a
    sizeof of a sizeof, so you can size your size" Fix from Julian
    Anastasov.

11) Several drivers do mdiobus_free without first doing an
    mdiobus_unregister leading to stray pointer references.  Fix from
    Peter Senna Tschudin.

12) Fix OOPS in l2tp_eth_create() error path, it's another danling
    pointer kinda situation.  Fix from Tom Parkin.

13) Hardware driven by the vmxnet driver can't handle larger than 16K
    fragments, so split them up when necessary.  From Eric Dumazet.

14) Handle zero length data length in tcp_send_rcvq() properly.  Fix
    from Pavel Emelyanov.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits)
  tcp-repair: Handle zero-length data put in rcv queue
  vmxnet3: must split too big fragments
  l2tp: fix oops in l2tp_eth_create() error path
  cxgb4: Fix unable to get UP event from the LLD
  drivers/net/phy/mdio-bitbang.c: Call mdiobus_unregister before mdiobus_free
  drivers/net/ethernet/nxp/lpc_eth.c: Call mdiobus_unregister before mdiobus_free
  bnx2x: fix HW initialization using fw 7.8.x
  tcp: Fix double sizeof in new tcp_metrics code
  net: fix divide by zero in tcp algorithm illinois
  net: sctp: Fix typo in net/sctp
  bonding: fix second off-by-one error
  bonding: fix off-by-one error
  bnx2x: Disable FCoE for 57840 since not yet supported by FW
  bnx2x: Fix no link on 577xx 10G-baseT
  bnx2x: Fix unrecognized SFP+ module after driver is loaded
  bnx2x: Fix potential incorrect link speed provision
  bnx2x: Restore global registers back to default.
  bnx2x: Fix link down in 57712 following LFA
  bnx2x: Fix 57810 1G-KR link against certain switches.
  ixgbe: PTP get_ts_info missing software support
  ...
2012-11-02 20:48:41 -07:00
Linus Torvalds
66b6a0c979 Bug-fixes:
* Use appropriate macros instead of hand-rolling our own (ARM).
  * Fixes if FB/KBD closed unexpectedly.
  * Fix memory leak in /dev/gntdev ioctl calls.
  * Fix overflow check in xenbus_file_write.
  * Document cleanup.
  * Performance optimization when migrating guests.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQEcBAABAgAGBQJQk9ngAAoJEFjIrFwIi8fJXOcH/jEmTaV2rbUCCnnivlQGj5B2
 AAXt03MM2F7Ohifo8IEHDhvJlUqQnglQq4wcku/8X/bqSkxtqJMfa/UAStmS2e6r
 605msiMws/GKiDPgKywWHjMPk7JJow/T7du9mpT2Swla12+DXc7e0P6Sqm6qGtB5
 tCBFYe3CS+j8Xi/siPhveAoLoDVmC8RpNzV8EWBdUKhNeD6U4s5M3+ChVexOrB/6
 43YkzurkY/FOsP+8YhNnKFSFrpYleRB1GdFcr8PN5mv85sNKts7vHCb4qJFzZdbk
 BMImdLrTUnKArE4y4FS0iqabOTGXaUplEXfyxDw5hweESGa1qzrd29ocyMQ5p/U=
 =LQxc
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.7-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Pull Xen bugfixes from Konrad Rzeszutek Wilk:
 - Use appropriate macros instead of hand-rolling our own (ARM).
 - Fixes if FB/KBD closed unexpectedly.
 - Fix memory leak in /dev/gntdev ioctl calls.
 - Fix overflow check in xenbus_file_write.
 - Document cleanup.
 - Performance optimization when migrating guests.

* tag 'stable/for-linus-3.7-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/mmu: Use Xen specific TLB flush instead of the generic one.
  xen/arm: use the __HVC macro
  xen/xenbus: fix overflow check in xenbus_file_write()
  xen-kbdfront: handle backend CLOSED without CLOSING
  xen-fbfront: handle backend CLOSED without CLOSING
  xen/gntdev: don't leak memory from IOCTL_GNTDEV_MAP_GRANT_REF
  x86: remove obsolete comment from asm/xen/hypervisor.h
2012-11-02 13:26:11 -07:00
Sasha Levin
d9b482c8ba hashtable: introduce a small and naive hashtable
This hashtable implementation is using hlist buckets to provide a simple
hashtable to prevent it from getting reimplemented all over the kernel.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
[ Merging this now, so that subsystems can start applying Sasha's
  patches that use this   - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-11-02 12:44:51 -07:00
Linus Torvalds
8c23f406c6 Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fix from Marcelo Tosatti.

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: fix vcpu->mmio_fragments overflow
2012-11-01 08:27:02 -07:00
Linus Torvalds
33046957cd Sound fixes for 3.7-rc4
This contains unexpectedly many changes in a wide range due to the
 fixes for races at disconnection of USB audio devices.  In the end,
 we end up covering fairly core parts of sound subsystem.
 
 Other than that, just a few usual small fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJQkXVMAAoJEGwxgFQ9KSmkXTYP/A0n7Ni1n2ve/GXn9qK2z0Mp
 WRsfIOF4kMBASFtArqwcud98ryDJ6++5qZKVCodCUNJl25LCLCQHuGZQz+1QLtv7
 mTET45H21WH1hShvHyj2pGSGM8p/ek1aXjRwzOhO0GvtwNPh4OnutHh46+dOR92j
 /0WNKkLoxn5LZK/t2Axs5bdcMVaPJXJ8eWJE6oyo1fwb1410jk5fSN0OXZTaaPiA
 vpe9oh1g+wGj0Vj6X7NB6+Di3PM47nyW3B4iYSA0SUHcTrPR+CQC/XflKvXZGAYv
 O9nk92hYJ5T3CzySPHHa+At5DLlNZaGrJapO0UkT16VSaobyywQovXqoXPT8TjU/
 ldGVaDXKhDxvrOMyp+XjlVhZgs8AmOaqPAI6CljcGb698XAKqhNlpOgF+KWKUCyB
 9s0YRiRlMxNbaSdMANjX/tTsVppLDv6Ytmy42Goz0BgBz07IDDCTOcOXOkTcOCkm
 cAKL2sTFwY740U1yxfsoCJ7KPnUxYRgysgkuE3QbMyP/d5Wi6lBlahkTcOvcqOmf
 /WtMCOImbzhbCz8JLhlMgFJwTqgswveBdT+IKv/dNnA/pTTkt6sycFoBCMavFuD1
 iHQB5IjmZ5O9uSFREUn79CAM4QRBczxE/OzVK4LPin0t4+iM0MCQdadFtsXZKOwU
 InTxst5Jur92PE/yLF/W
 =bxYB
 -----END PGP SIGNATURE-----

Merge tag 'sound-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "This contains unexpectedly many changes in a wide range due to the
  fixes for races at disconnection of USB audio devices.  In the end, we
  end up covering fairly core parts of sound subsystem.

  Other than that, just a few usual small fixes."

* tag 'sound-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: ice1724: Fix rate setup after resume
  ALSA: Avoid endless sleep after disconnect
  ALSA: Add a reference counter to card instance
  ALSA: usb-audio: Fix races at disconnection in mixer_quirks.c
  ALSA: usb-audio: Use rwsem for disconnect protection
  ALSA: usb-audio: Fix races at disconnection
  ALSA: PCM: Fix some races at disconnection
  ASoC: omap-dmic: Correct functional clock name
  ASoC: zoom2: Fix compile error by including correct header files
  ALSA: hda - Fix mute-LED setup for HP dv5 laptop
2012-10-31 15:38:32 -07:00
Xiao Guangrong
87da7e66a4 KVM: x86: fix vcpu->mmio_fragments overflow
After commit b3356bf0db (KVM: emulator: optimize "rep ins" handling),
the pieces of io data can be collected and write them to the guest memory
or MMIO together

Unfortunately, kvm splits the mmio access into 8 bytes and store them to
vcpu->mmio_fragments. If the guest uses "rep ins" to move large data, it
will cause vcpu->mmio_fragments overflow

The bug can be exposed by isapc (-M isapc):

[23154.818733] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[ ......]
[23154.858083] Call Trace:
[23154.859874]  [<ffffffffa04f0e17>] kvm_get_cr8+0x1d/0x28 [kvm]
[23154.861677]  [<ffffffffa04fa6d4>] kvm_arch_vcpu_ioctl_run+0xcda/0xe45 [kvm]
[23154.863604]  [<ffffffffa04f5a1a>] ? kvm_arch_vcpu_load+0x17b/0x180 [kvm]

Actually, we can use one mmio_fragment to store a large mmio access then
split it when we pass the mmio-exit-info to userspace. After that, we only
need two entries to store mmio info for the cross-mmio pages access

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-31 20:36:30 -02:00
John W. Linville
0627291148 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2012-10-31 13:10:01 -04:00
Konrad Rzeszutek Wilk
95a7d76897 xen/mmu: Use Xen specific TLB flush instead of the generic one.
As Mukesh explained it, the MMUEXT_TLB_FLUSH_ALL allows the
hypervisor to do a TLB flush on all active vCPUs. If instead
we were using the generic one (which ends up being xen_flush_tlb)
we end up making the MMUEXT_TLB_FLUSH_LOCAL hypercall. But
before we make that hypercall the kernel will IPI all of the
vCPUs (even those that were asleep from the hypervisor
perspective). The end result is that we needlessly wake them
up and do a TLB flush when we can just let the hypervisor
do it correctly.

This patch gives around 50% speed improvement when migrating
idle guest's from one host to another.

Oracle-bug: 14630170

CC: stable@vger.kernel.org
Tested-by:  Jingjie Jiang <jingjie.jiang@oracle.com>
Suggested-by:  Mukesh Rathor <mukesh.rathor@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-10-31 12:38:31 -04:00
Linus Torvalds
2df4f26167 Some fixes for md in 3.7
- one recently introduced crash for dm-raid10 with discard
  - one bug in new functionality that has been around for a few releases.
  - minor bug in md's 'faulty' personality
 
  and UAPI disintegration for md.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQIVAwUAUJB3rDnsnt1WYoG5AQLuAw/8C2I1LNHRc9zccO4akAg9AyYcpoNGcY6I
 PG1SR7sQiKuQYNTwc7xqqYJ241r3U+Ablh8nurr0rbCmYX8rcnjwTZzhH6h0ER5Q
 M31i7CKb2OY7VGKjs1FtlVnRtdRWVkLHHappEaT0NzjHUqpCDGZYcLMoSaLaCNdE
 8P8GlAI+w8kachkWRnp1a4pdR7Kc1SnP97aZJ304EDy63gYwcsOg+m8zZj5h74u9
 gJpVES1yqflN12CHIkK3K22QM9a1KbP9L9TKQSsevmOe4ju/ID3IlTKjKJvvYoUS
 r9FJIJsGbzOREr1iap4hr81+rrH56t4o1FxgWCuj2wpw7EWelMFrTH0iMNNaxjyk
 z+g7ZElnSjkOYxQXirKcWTJ+F5F4jEc48XlFNjtuvHz771xby3Q5dTN/+hMCQ9k1
 JNML2A9QquK0jLZauRIsbBpVy2uC+vOoJ2BX2kcMOvuHUeCzK78x4HZjZi7mP6Dg
 O9E4+ocGnFZsqnCPtBAxv9G8RE36Efp3uxms9HlwY6TeTGJWyZuiWDyNea2tRLct
 OARMseYVxkup7DOnHirtb9Pywc3kkLqtXcWbZH68Hi5uHMrGFUO2ZhSwjfsC5+rZ
 Nyt1lcRLZaxy/JFgHXzOeLqA2o/nY62OiMEgP+ENbASNJ4HKf685ytzmg2BVetsY
 9E/KUQBEJqY=
 =plEs
 -----END PGP SIGNATURE-----

Merge tag 'md-3.7-fixes' of git://neil.brown.name/md

Pull md fixes from NeilBrown:
 "Some fixes for md in 3.7
   - one recently introduced crash for dm-raid10 with discard
   - one bug in new functionality that has been around for a few
     releases.
   - minor bug in md's 'faulty' personality

  and UAPI disintegration for md."

* tag 'md-3.7-fixes' of git://neil.brown.name/md:
  MD RAID10: Fix oops when creating RAID10 arrays via dm-raid.c
  md/raid1: Fix assembling of arrays containing Replacements.
  md faulty: use disk_stack_limits()
  UAPI: (Scripted) Disintegrate include/linux/raid
2012-10-30 19:48:48 -07:00
Takashi Iwai
a0830dbd4e ALSA: Add a reference counter to card instance
For more strict protection for wild disconnections, a refcount is
introduced to the card instance, and let it up/down when an object is
referred via snd_lookup_*() in the open ops.

The free-after-last-close check is also changed to check this refcount
instead of the empty list, too.

Reported-by: Matthieu CASTET <matthieu.castet@parrot.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2012-10-30 11:07:10 +01:00
John W. Linville
efec22b468 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 2012-10-29 14:14:48 -04:00
Mikulas Patocka
1bf11c5353 percpu-rw-semaphores: use rcu_read_lock_sched
Use rcu_read_lock_sched / rcu_read_unlock_sched / synchronize_sched
instead of rcu_read_lock / rcu_read_unlock / synchronize_rcu.

This is an optimization. The RCU-protected region is very small, so
there will be no latency problems if we disable preempt in this region.

So we use rcu_read_lock_sched / rcu_read_unlock_sched that translates
to preempt_disable / preempt_disable. It is smaller (and supposedly
faster) than preemptible rcu_read_lock / rcu_read_unlock.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-28 10:59:36 -07:00
Mikulas Patocka
5c1eabe685 percpu-rw-semaphores: use light/heavy barriers
This patch introduces new barrier pair light_mb() and heavy_mb() for
percpu rw semaphores.

This patch fixes a bug in percpu-rw-semaphores where a barrier was
missing in percpu_up_write.

This patch improves performance on the read path of
percpu-rw-semaphores: on non-x86 cpus, there was a smp_mb() in
percpu_up_read. This patch changes it to a compiler barrier and removes
the "#if defined(X86) ..." condition.

From: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-28 10:59:36 -07:00