Commit Graph

441234 Commits

Author SHA1 Message Date
Jianyu Zhan
a2a1f9eaf9 cgroup: replace pr_warning with preferred pr_warn
As suggested by scripts/checkpatch.pl, substitude all pr_warning()
with pr_warn().

No functional change.

Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2014-04-25 18:28:03 -04:00
Jianyu Zhan
f8719ccf7b cgroup: remove orphaned cgroup_pidlist_seq_operations
6612f05b88 ("cgroup: unify pidlist and other file handling")
has removed the only user of cgroup_pidlist_seq_operations :
cgroup_pidlist_open().

This patch removes it.

Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2014-04-25 18:28:03 -04:00
Jianyu Zhan
2f0edc04e7 cgroup: clean up obsolete comment for parse_cgroupfs_options()
1d5be6b287 ("cgroup: move module ref handling into
rebind_subsystems()") makes parse_cgroupfs_options() no longer takes
refcounts on subsystems.

And unified hierachy makes parse_cgroupfs_options not need to call
with cgroup_mutex held to protect the cgroup_subsys[].

So this patch removes BUG_ON() and the comment.  As the comment
doesn't contain useful information afterwards, the whole comment is
removed.

Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2014-04-25 18:28:03 -04:00
Tejun Heo
6573157800 cgroup: add documentation about unified hierarchy
Unified hierarchy will be the new version of cgroup interface.  This
patch adds Documentation/cgroups/unified-hierarchy.txt which describes
the design and rationales of unified hierarchy.

v2: Grammatical updates as per Randy Dunlap's review.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
2014-04-25 18:28:02 -04:00
Tejun Heo
842b597ee0 cgroup: implement cgroup.populated for the default hierarchy
cgroup users often need a way to determine when a cgroup's
subhierarchy becomes empty so that it can be cleaned up.  cgroup
currently provides release_agent for it; unfortunately, this mechanism
is riddled with issues.

* It delivers events by forking and execing a userland binary
  specified as the release_agent.  This is a long deprecated method of
  notification delivery.  It's extremely heavy, slow and cumbersome to
  integrate with larger infrastructure.

* There is single monitoring point at the root.  There's no way to
  delegate management of a subtree.

* The event isn't recursive.  It triggers when a cgroup doesn't have
  any tasks or child cgroups.  Events for internal nodes trigger only
  after all children are removed.  This again makes it impossible to
  delegate management of a subtree.

* Events are filtered from the kernel side.  "notify_on_release" file
  is used to subscribe to or suppress release event.  This is
  unnecessarily complicated and probably done this way because event
  delivery itself was expensive.

This patch implements interface file "cgroup.populated" which can be
used to monitor whether the cgroup's subhierarchy has tasks in it or
not.  Its value is 0 if there is no task in the cgroup and its
descendants; otherwise, 1, and kernfs_notify() notificaiton is
triggers when the value changes, which can be monitored through poll
and [di]notify.

This is a lot ligther and simpler and trivially allows delegating
management of subhierarchy - subhierarchy monitoring can block further
propgation simply by putting itself or another process in the root of
the subhierarchy and monitor events that it's interested in from there
without interfering with monitoring higher in the tree.

v2: Patch description updated as per Serge.

v3: "cgroup.subtree_populated" renamed to "cgroup.populated".  The
    subtree_ prefix was a bit confusing because
    "cgroup.subtree_control" uses it to denote the tree rooted at the
    cgroup sans the cgroup itself while the populated state includes
    the cgroup itself.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Lennart Poettering <lennart@poettering.net>
2014-04-25 18:28:02 -04:00
Tejun Heo
50bce01b0e Merge branch 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core into for-3.16
Pull in driver-core-next to receive kernfs_notify() updates which will
be used by the planned "cgroup.populated" implementation.

Signed-off-by: Tejun Heo <tj@kernel.org>
2014-04-25 18:25:55 -04:00
Michael Marineau
86d56134f1 kobject: Make support for uevent_helper optional.
Support for uevent_helper, aka hotplug, is not required on many systems
these days but it can still be enabled via sysfs or sysctl.

Reported-by: Darren Shepherd <darren.s.shepherd@gmail.com>
Signed-off-by: Michael Marineau <mike@marineau.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-25 12:00:49 -07:00
Tejun Heo
d911d98748 kernfs: make kernfs_notify() trigger inotify events too
kernfs_notify() is used to indicate either new data is available or
the content of a file has changed.  It currently only triggers poll
which may not be the most convenient to monitor especially when there
are a lot to monitor.  Let's hook it up to fsnotify too so that the
events can be monitored via inotify too.

fsnotify_modify() requires file * but kernfs_notify() doesn't have any
specific file associated; however, we can walk all super_blocks
associated with a kernfs_root and as kernfs always associate one ino
with inode and one dentry with an inode, it's trivial to look up the
dentry associated with a given kernfs_node.  As any active monitor
would pin dentry, just looking up existing dentry is enough.  This
patch looks up the dentry associated with the specified kernfs_node
and generates events equivalent to fsnotify_modify().

Note that as fsnotify doesn't provide fsnotify_modify() equivalent
which can be called with dentry, kernfs_notify() directly calls
fsnotify_parent() and fsnotify().  It might be better to add a wrapper
in fsnotify.h instead.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-25 11:43:31 -07:00
Tejun Heo
7d568a8383 kernfs: implement kernfs_root->supers list
Currently, there's no way to find out which super_blocks are
associated with a given kernfs_root.  Let's implement it - the planned
inotify extension to kernfs_notify() needs it.

Make kernfs_super_info point back to the super_block and chain it at
kernfs_root->supers.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-25 11:43:31 -07:00
Tejun Heo
f8f22e53a2 cgroup: implement dynamic subtree controller enable/disable on the default hierarchy
cgroup is switching away from multiple hierarchies and will use one
unified default hierarchy where controllers can be dynamically enabled
and disabled per subtree.  The default hierarchy will serve as the
unified hierarchy to which all controllers are attached and a css on
the default hierarchy would need to also serve the tasks of descendant
cgroups which don't have the controller enabled - ie. the tree may be
collapsed from leaf towards root when viewed from specific
controllers.  This has been implemented through effective css in the
previous patches.

This patch finally implements dynamic subtree controller
enable/disable on the default hierarchy via a new knob -
"cgroup.subtree_control" which controls which controllers are enabled
on the child cgroups.  Let's assume a hierarchy like the following.

  root - A - B - C
               \ D

root's "cgroup.subtree_control" determines which controllers are
enabled on A.  A's on B.  B's on C and D.  This coincides with the
fact that controllers on the immediate sub-level are used to
distribute the resources of the parent.  In fact, it's natural to
assume that resource control knobs of a child belong to its parent.
Enabling a controller in "cgroup.subtree_control" declares that
distribution of the respective resources of the cgroup will be
controlled.  Note that this means that controller enable states are
shared among siblings.

The default hierarchy has an extra restriction - only cgroups which
don't contain any task may have controllers enabled in
"cgroup.subtree_control".  Combined with the other properties of the
default hierarchy, this guarantees that, from the view point of
controllers, tasks are only on the leaf cgroups.  In other words, only
leaf csses may contain tasks.  This rules out situations where child
cgroups compete against internal tasks of the parent, which is a
competition between two different types of entities without any clear
way to determine resource distribution between the two.  Different
controllers handle it differently and all the implemented behaviors
are ambiguous, ad-hoc, cumbersome and/or just wrong.  Having this
structural constraints imposed from cgroup core removes the burden
from controller implementations and enables showing one consistent
behavior across all controllers.

When a controller is enabled or disabled, css associations for the
controller in the subtrees of each child should be updated.  After
enabling, the whole subtree of a child should point to the new css of
the child.  After disabling, the whole subtree of a child should point
to the cgroup's css.  This is implemented by first updating cgroup
states such that cgroup_e_css() result points to the appropriate css
and then invoking cgroup_update_dfl_csses() which migrates all tasks
in the affected subtrees to the self cgroup on the default hierarchy.

* When read, "cgroup.subtree_control" lists all the currently enabled
  controllers on the children of the cgroup.

* White-space separated list of controller names prefixed with either
  '+' or '-' can be written to "cgroup.subtree_control".  The ones
  prefixed with '+' are enabled on the controller and '-' disabled.

* A controller can be enabled iff the parent's
  "cgroup.subtree_control" enables it and disabled iff no child's
  "cgroup.subtree_control" has it enabled.

* If a cgroup has tasks, no controller can be enabled via
  "cgroup.subtree_control".  Likewise, if "cgroup.subtree_control" has
  some controllers enabled, tasks can't be migrated into the cgroup.

* All controllers which aren't bound on other hierarchies are
  automatically associated with the root cgroup of the default
  hierarchy.  All the controllers which are bound to the default
  hierarchy are listed in the read-only file "cgroup.controllers" in
  the root directory.

* "cgroup.controllers" in all non-root cgroups is read-only file whose
  content is equal to that of "cgroup.subtree_control" of the parent.
  This indicates which controllers can be used in the cgroup's
  "cgroup.subtree_control".

This is still experimental and there are some holes, one of which is
that ->can_attach() failure during cgroup_update_dfl_csses() may leave
the cgroups in an undefined state.  The issues will be addressed by
future patches.

v2: Non-root cgroups now also have "cgroup.controllers".

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-04-23 11:13:16 -04:00
Tejun Heo
f817de9851 cgroup: prepare migration path for unified hierarchy
Unified hierarchy implementation would require re-migrating tasks onto
the same cgroup on the default hierarchy to reflect updated effective
csses.  Update cgroup_migrate_prepare_dst() so that it accepts NULL as
the destination cgrp.  When NULL is specified, the destination is
considered to be the cgroup on the default hierarchy associated with
each css_set.

After this change, the identity check in cgroup_migrate_add_src()
isn't sufficient for noop detection as the associated csses may change
without any cgroup association changing.  The only way to tell whether
a migration is noop or not is testing whether the source and
destination csets are identical.  The noop check in
cgroup_migrate_add_src() is removed and cset identity test is added to
cgroup_migreate_prepare_dst().  If it's detected that source and
destination csets are identical, the cset is removed removed from
@preloaded_csets and all the migration nodes are cleared which makes
cgroup_migrate() ignore the cset.

Also, make the function append the destination css_sets to
@preloaded_list so that destination css_sets always come after source
css_sets.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-04-23 11:13:16 -04:00
Tejun Heo
7fd8c565d8 cgroup: update subsystem rebind restrictions
Because the default root couldn't have any non-root csses attached to
it, rebinding away from it was always allowed; however, the default
hierarchy will soon host the unified hierarchy and have non-root csses
so the rebind restrictions need to be updated accordingly.

Instead of special casing rebinding from the default hierarchy and
then checking whether the source hierarchy has children cgroups, which
implies non-root csses for !dfl hierarchies, simply check whether the
source hierarchy has non-root csses for the subsystem using
css_next_child().

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-04-23 11:13:16 -04:00
Tejun Heo
6803c00628 cgroup: add css_set->dfl_cgrp
To implement the unified hierarchy behavior, we'll need to be able to
determine the associated cgroup on the default hierarchy from css_set.
Let's add css_set->dfl_cgrp so that it can be accessed conveniently
and efficiently.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-04-23 11:13:16 -04:00
Tejun Heo
bd53d617b3 cgroup: allow cgroup creation and suppress automatic css creation in the unified hierarchy
Now that effective css handling has been added and iterators updated
accordingly, it's safe to allow cgroup creation in the default
hierarchy.  Unblock cgroup creation in the default hierarchy.

As the default hierarchy will implement explicit enabling and
disabling of controllers on each cgroup, suppress automatic css
enabling on cgroup creation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-04-23 11:13:16 -04:00
Tejun Heo
e329780310 cgroup: cgroup->subsys[] should be cleared after the css is offlined
After a css finishes offlining, offline_css() mistakenly performs
RCU_INIT_POINTER(css->cgroup->subsys[ss->id], css) which just sets the
cgroup->subsys[] pointer to the current value.  The intention was to
clear it after offline is complete, not reassign the same value.

Update it to assign NULL instead of the current value.  This makes
cgroup_css() to return NULL once offline is complete.  All the
existing users of the function either can handle NULL return already
or guarantee that the css doesn't get offlined.

While this is a bugfix, as css lifetime is currently tied to the
cgroup it belongs to, this bug doesn't cause any actual problems.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-04-23 11:13:15 -04:00
Tejun Heo
3ebb2b6ef3 cgroup: teach css_task_iter about effective csses
Currently, css_task_iter iterates tasks associated with a css by
visiting each css_set associated with the owning cgroup and walking
tasks of each of them.  This works fine for !unified hierarchies as
each cgroup has its own css for each associated subsystem on the
hierarchy; however, on the planned unified hierarchy, a cgroup may not
have csses associated and its tasks would be considered associated
with the matching css of the nearest ancestor which has the subsystem
enabled.

This means that on the default unified hierarchy, just walking all
tasks associated with a cgroup isn't enough to walk all tasks which
are associated with the specified css.  If any of its children doesn't
have the matching css enabled, task iteration should also include all
tasks from the subtree.  We already added cgroup->e_csets[] to list
all css_sets effectively associated with a given css and walk css_sets
on that list instead to achieve such iteration.

This patch updates css_task_iter iteration such that it walks css_sets
on cgroup->e_csets[] instead of cgroup->cset_links if iteration is
requested on an non-dummy css.  Thanks to the previous iteration
update, this change can be achieved with the addition of
css_task_iter->ss and minimal updates to css_advance_task_iter() and
css_task_iter_start().

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-04-23 11:13:15 -04:00
Tejun Heo
0f0a2b4fa6 cgroup: reorganize css_task_iter
This patch reorganizes css_task_iter so that adding effective css
support is easier.

* s/->cset_link/->cset_pos/ and s/->task/->task_pos/ for consistency

* ->origin_css is used to determine whether the iteration reached the
  last css_set.  Replace it with explicit ->cset_head so that
  css_advance_task_iter() doesn't have to know the termination
  condition directly.

* css_task_iter_next() currently assumes that it's walking list of
  cgrp_cset_link and reaches into the current cset through the current
  link to determine the termination conditions for task walking.  As
  this won't always be true for effective css walking, add
  ->tasks_head and ->mg_tasks_head and use them to control task
  walking so that css_task_iter_next() doesn't have to know how
  css_sets are being walked.

This patch doesn't make any behavior changes.  The iteration logic
stays unchanged after the patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-04-23 11:13:15 -04:00
Tejun Heo
3b281afbc3 cgroup: make css_next_child() skip missing csses
css_next_child() walks the children of the specified css.  It does
this by finding the next cgroup and then returning the requested css.
On the default unified hierarchy, a cgroup may not have a css
associated with it even if the hierarchy has the subsystem enabled.
This patch updates css_next_child() so that it skips children without
the requested css associated.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-04-23 11:13:15 -04:00
Tejun Heo
2d8f243a5e cgroup: implement cgroup->e_csets[]
On the default unified hierarchy, a cgroup may be associated with
csses of its ancestors, which means that a css of a given cgroup may
be associated with css_sets of descendant cgroups.  This means that we
can't walk all tasks associated with a css by iterating the css_sets
associated with the cgroup as there are css_sets which are pointing to
the css but linked on the descendants.

This patch adds per-subsystem list heads cgroup->e_csets[].  Any
css_set which is pointing to a css is linked to
css->cgroup->e_csets[$SUBSYS_ID] through
css_set->e_cset_node[$SUBSYS_ID].  The lists are protected by
css_set_rwsem and will allow us to walk all css_sets associated with a
given css so that we can find out all associated tasks.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-04-23 11:13:15 -04:00
Tejun Heo
aec3dfcb2e cgroup: introduce effective cgroup_subsys_state
In the planned default unified hierarchy, controllers may get
dynamically attached to and detached from a cgroup and a cgroup may
not have csses for all the controllers associated with the hierarchy.

When a cgroup doesn't have its own css for a given controller, the css
of the nearest ancestor with the controller enabled will be used,
which is called the effective css.  This patch introduces
cgroup_e_css() and for_each_e_css() to access the effective csses and
convert compare_css_sets(), find_existing_css_set() and
cgroup_migrate() to use the effective csses so that they can handle
cgroups with partial csses correctly.

This means that for two css_sets to be considered identical, they
should have both matching csses and cgroups.  compare_css_sets()
already compares both, not for correctness but for optimization.  As
this now becomes a matter of correctness, update the comments
accordingly.

For all !default hierarchies, cgroup_e_css() always equals
cgroup_css(), so this patch doesn't change behavior.

While at it, fix incorrect locking comment for for_each_css().

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-04-23 11:13:14 -04:00
Tejun Heo
f392e51cd6 cgroup: update cgroup->subsys_mask to ->child_subsys_mask and restore cgroup_root->subsys_mask
944196278d ("cgroup: move ->subsys_mask from cgroupfs_root to
cgroup") moved ->subsys_mask from cgroup_root to cgroup to prepare for
the unified hierarhcy; however, it turns out that carrying the
subsys_mask of the children in the parent, instead of itself, is a lot
more natural.  This patch restores cgroup_root->subsys_mask and morphs
cgroup->subsys_mask into cgroup->child_subsys_mask.

* Uses of root->cgrp.subsys_mask are restored to root->subsys_mask.

* Remove automatic setting and clearing of cgrp->subsys_mask and
  instead just inherit ->child_subsys_mask from the parent during
  cgroup creation.  Note that this doesn't affect any current
  behaviors.

* Undo __kill_css() separation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-04-23 11:13:14 -04:00
Tejun Heo
ea8fd3b47f cgroup: cgroup_apply_cftypes() shouldn't skip the default hierarhcy
cgroup_apply_cftypes() skip creating or removing files if the
subsystem is attached to the default hierarchy, which led to missing
files in the root of the default hierarchy.

Skipping made sense when the default hierarchy was dummy; however, now
that the default hierarchy is full functional and planned to be used
as the unified hierarchy, it shouldn't be skipped over.

Reported-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-04-23 11:13:14 -04:00
Linus Torvalds
a798c10faf Linux 3.15-rc2 2014-04-20 11:08:50 -07:00
Linus Torvalds
372feacb36 Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave-dmaengine fixes from Vinod Koul:
 "Back from long weekend here in India and now the time to send fixes
  for slave dmaengine.
   - Dan's fix of sirf xlate code
   - Jean's fix for timberland
   - edma fixes by Sekhar for SG handling and Yuan for changing init
     call"

* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
  dma: fix eDMA driver as a subsys_initcall
  dmaengine: sirf: off by one in of_dma_sirfsoc_xlate()
  platform: Fix timberdale dependencies
  dma: edma: fix incorrect SG list handling
2014-04-20 10:35:31 -07:00
Linus Torvalds
5269519f9f IOMMU Fixes for Linux v3.15-rc1
Fixes for regressions:
 
 	* Fix wrong IOMMU enumeration causing some SCSI device drivers
 	  initialization failures
 	* ARM-SMMU fixes for a panic condition and a wrong return value.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJTU5YcAAoJECvwRC2XARrjQF0P/3OFeKUG2CQK7WDVj6ogqgwh
 OpvmdzEmOjZc6igZMIKvD7qYhFHR51L4M0FD91jWcEIgEuQZgwYMXPEhxaQRgx7E
 t8bKbkyFRZfCDYDbZJN14Sk+YYStnwi/5n84G938V2HZC/1OMKyqohhDBXUaIcAK
 tzu92P7m3yn11wCV2WQ28OyCtgcE5SPQyUxU8SN8YTrtsUmHXljVvQ17hJ5DgpeX
 XQSyUcoXnub6spr93PQWuzh3nAwPZUlrRKR9p2nxspL4Q4TxbqjFOdeZg3p12+9W
 Et6DKPkf8Xq4cbGJQU8bPYXWEp26NasFcGMDqtlcZBFNOO9stwaCf28YE4r/LUxQ
 StkjaZcUzhDxEjyUqksQjU6kCUh/6/nIl1DtnrpF2y3aZgkBSxx5rj4Vowqpbnf3
 AnR0BbeOUgWtuH98nUCKXJdDCaTWiFUAwzlAWN1Byp922Ol05obfE7QgyPHT2Gsb
 j1GJqBTw1lAM5s1I/vl1+uJm9W/jD+9tCHmOfmTxPswKnFhji7Un9hXoopSEhFEE
 pc29KMofFmYAM6n2a4p4FrGFLkbaYg+SC6vlNiQuLIi/KRid60xVeZGgzEU+0ajB
 O+MhARdd3ulFMPxoOWKR8XQkz1vo3cb4ArJfCG72dfg/AUqDO8vTp3kIsi5dx61e
 8308WSsq+gv2QNFVbrdj
 =vhSn
 -----END PGP SIGNATURE-----

Merge tag 'iommu-fixes-v3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:
 "Fixes for regressions:

   - fix wrong IOMMU enumeration causing some SCSI device drivers
     initialization failures
   - ARM-SMMU fixes for a panic condition and a wrong return value"

* tag 'iommu-fixes-v3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/arm-smmu: fix panic in arm_smmu_alloc_init_pte
  iommu/arm-smmu: Return 0 on unmap failure
  iommu/vt-d: fix bug in matching PCI devices with DRHD/RMRR descriptors
  iommu/vt-d: Fix get_domain_for_dev() handling of upstream PCIe bridges
  iommu/vt-d: fix memory leakage caused by commit ea8ea46
2014-04-20 10:33:49 -07:00
Linus Torvalds
200bde278d Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf tooling fixes from Ingo Molnar:
 "Three small tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools: Improve error reporting
  perf tools: Adjust symbols in VDSO
  perf kvm: Fix 'Min time' counting in report command
2014-04-20 10:32:33 -07:00
Ingo Molnar
fd741edc25 perf/urgent fixes:
User visible:
 
 . Adjust symbols in VDSO to properly resolve its function names (Vladimir Nikulichev)
 
 . Improve error reporting for record session failure (Adrien BAK)
 
 . Fix 'Min time' counting in report command (Alexander Yarygin)
 
 Signed-off-by: Jiri Olsa <jolsa@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTUvYSAAoJEPZqUSBWB3s9NK4P/0sNNlm6GFrA/AvwkodtIrTa
 MAVOcmWxP/ayafJkSVmPjDHwi4tlcfdzgilD1jG+M9OerwNvBi5yAcO6BZCRGueO
 9Inej82QzGwZES0Llq68HYIh8nx+85XP2ASJga84JvFKRuSLVlgPkkXzhQTa/Gpp
 FMI+h4GvPkIXnakfiq1u9Icth5zgcJuJMDmgDs1h9F9c2Ry4ecaghx+PSrc7vf10
 TWAYIvNNuykE9twOJWiBh8SuN0ws3rVA3ozJPkTlBtq9WIgNjxsFhBDV28njcaNY
 bokkDIjfgTPkWSPVBE3TbN1WwBBd/B3zreRKIYXGLZLJ0ey1C2aT/0uaW45UjyIr
 r7cCkSvdFIJIBf1nD/zC/KYJ/OW8ciUJo/Iwn3+M6cRx2nxkWoRhZphOuehLjqzF
 5aQBoE5USUVusi26BXGopP+fr+Inr2chEXB3gdq0MKV9HbTIegWze+/y5vEgjUFQ
 HR3RAgNbLXLdYWDk/PHvUHywBX64KE8AU/eGXg7BRJt+gB+ShYC4LP4xBfnWwxa4
 zKWbdOsqPwUsO9vVERxw8bQFsFiGkZhbllqKVMjhxjPL3hCG508ggG7jZRqFRCYW
 BNoHaCIPv4pTzTh5opncaxQtApsVW5LcfM2GwwEIaNyfOBT8qOVnM0AHyzLuab2i
 loqvffbTT+bopK0x+OiL
 =CtdG
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/urgent

Pull perf/urgent fixes from Jiri Olsa:

User visible changes:

  * Adjust symbols in VDSO to properly resolve its function names (Vladimir Nikulichev)

  * Improve error reporting for record session failure (Adrien BAK)

  * Fix 'Min time' counting in report command (Alexander Yarygin)

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-20 09:53:55 +02:00
Adrien BAK
ffa91880a9 perf tools: Improve error reporting
In the current version, when using perf record, if something goes
wrong in tools/perf/builtin-record.c:375
  session = perf_session__new(file, false, NULL);

The error message:
"Not enough memory for reading per file header"

is issued. This error message seems to be outdated and is not very
helpful. This patch proposes to replace this error message by
"Perf session creation failed"

I believe this issue has been brought to lkml:
https://lkml.org/lkml/2014/2/24/458
although this patch only tackles a (small) part of the issue.

Additionnaly, this patch improves error reporting in
tools/perf/util/data.c open_file_write.

Currently, if the call to open fails, the user is unaware of it.
This patch logs the error, before returning the error code to
the caller.

Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Adrien BAK <adrien.bak@metascale.org>
Link: http://lkml.kernel.org/r/1397786443.3093.4.camel@beast
[ Reorganize the changelog into paragraphs ]
[ Added empty line after fd declaration in open_file_write ]
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-20 00:15:12 +02:00
Vladimir Nikulichev
922d0e4d9f perf tools: Adjust symbols in VDSO
pert-report doesn't resolve function names in VDSO:

$ perf report --stdio -g flat,0.0,15,callee --sort pid
...
            8.76%
               0x7fff6b1fe861
               __gettimeofday
               ACE_OS::gettimeofday()
...

In this case symbol values should be adjusted the same way as for executables,
relocatable objects and prelinked libraries.

After fix:

$ perf report --stdio -g flat,0.0,15,callee --sort pid
...
            8.76%
               __vdso_gettimeofday
               __gettimeofday
               ACE_OS::gettimeofday()

Signed-off-by: Vladimir Nikulichev <nvs@tbricks.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/969812.163009436-sendEmail@nvs
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-20 00:15:11 +02:00
Alexander Yarygin
acb61fc8ed perf kvm: Fix 'Min time' counting in report command
Every event in the perf-kvm has a 'stats' structure, which contains
max/min/average/etc times of handling this event.
The problem is that the 'perf-kvm stat report' command always shows
that 'min time' is 0us for every event. Example:

 # perf kvm stat report

 Analyze events for all VCPUs:

    VM-EXIT    Samples  Samples%     Time%   Min Time   Max Time Avg time
  [..]
  0xB2 MSCH         12     0.07%     0.00%        0us        8us 7.31us ( +-   2.11% )
  0xB2 CHSC         12     0.07%     0.00%        0us       18us 9.39us ( +-   9.49% )
  0xB2 STPX          8     0.05%     0.00%        0us        2us 1.88us ( +-   7.18% )
  0xB2 STSI          7     0.04%     0.00%        0us       44us 16.49us ( +-  38.20% )
  [..]

This happens because the 'stats' structure is not initialized and
stats->min equals to 0. Lets initialize the structure for every
event after its allocation using init_stats() function. This initializes
stats->min to -1 and makes 'Min time' statistics counting work:

 # perf kvm stat report

 Analyze events for all VCPUs:

    VM-EXIT    Samples  Samples%     Time%   Min Time   Max Time Avg time
  [..]
  0xB2 MSCH         12     0.07%     0.00%        6us        8us 7.31us ( +-   2.11% )
  0xB2 CHSC         12     0.07%     0.00%        7us       18us 9.39us ( +-   9.49% )
  0xB2 STPX          8     0.05%     0.00%        1us        2us 1.88us ( +-   7.18% )
  0xB2 STSI          7     0.04%     0.00%        1us       44us 16.49us ( +-  38.20% )
  [..]

Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1397053319-2130-3-git-send-email-borntraeger@de.ibm.com
[ Fixing the perf examples changelog output ]
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-20 00:14:08 +02:00
Eric Dumazet
404ca80eb5 coredump: fix va_list corruption
A va_list needs to be copied in case it needs to be used twice.

Thanks to Hugh for debugging this issue, leading to various panics.

Tested:

  lpq84:~# echo "|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern

'produce_core' is simply : main() { *(int *)0 = 1;}

  lpq84:~# ./produce_core
  Segmentation fault (core dumped)
  lpq84:~# dmesg | tail -1
  [  614.352947] Core dump to |/foobar12345 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 (null) pipe failed

Notice the last argument was replaced by a NULL (we were lucky enough to
not crash, but do not try this on your production machine !)

After fix :

  lpq83:~# echo "|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern
  lpq83:~# ./produce_core
  Segmentation fault
  lpq83:~# dmesg | tail -1
  [  740.800441] Core dump to |/foobar12345 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 pipe failed

Fixes: 5fe9d8ca21 ("coredump: cn_vprintf() has no reason to call vsnprintf() twice")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Diagnosed-by: Hugh Dickins <hughd@google.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-19 13:23:31 -07:00
Linus Torvalds
6d4596905b Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
 "This fixes the preemption-count imbalance crash reported by Owen
  Kibel"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Fix CMCI preemption bugs
2014-04-19 10:41:43 -07:00
Linus Torvalds
8f98f6f5d6 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
 "Two fixes:

   - a SCHED_DEADLINE task selection fix
   - a sched/numa related lockdep splat fix"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Check for stop task appearance when balancing happens
  sched/numa: Fix task_numa_free() lockdep splat
2014-04-19 10:40:51 -07:00
Linus Torvalds
8de3f7a705 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Two kernel side fixes:

   - an Intel uncore PMU driver potential crash fix
   - a kprobes/perf-call-graph interaction fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Use rdmsrl_safe() when initializing RAPL PMU
  kprobes/x86: Fix page-fault handling logic
2014-04-19 10:40:11 -07:00
Linus Torvalds
b93124202f Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "Unfortunately this contains no easter eggs, its a bit larger than I'd
  like, but I included a patch that just moves code from one file to
  another and I'd like to avoid merge conflicts with that later, so it
  makes it seem worse than it is,

  Otherwise:
   - radeon: fixes to use new microcode to stabilise some cards, use
     some common displayport code, some runtime pm fixes, pll regression
     fixes
   - i915: fix for some context oopses, a warn in a used path, backlight
     fixes
   - nouveau: regression fix
   - omap: a bunch of fixes"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (51 commits)
  drm: bochs: drop unused struct fields
  drm: bochs: add power management support
  drm: cirrus: add power management support
  drm: Split out drm_probe_helper.c from drm_crtc_helper.c
  drm/plane-helper: Don't fake-implement primary plane disabling
  drm/ast: fix value check in cbr_scan2
  drm/nouveau/bios: fix a bit shift error introduced by 457e77b
  drm/radeon/ci: make sure mc ucode is loaded before checking the size
  drm/radeon/si: make sure mc ucode is loaded before checking the size
  drm/radeon: improve PLL params if we don't match exactly v2
  drm/radeon: memory leak on bo reservation failure. v2
  drm/radeon: fix VCE fence command
  drm/radeon: re-enable mclk dpm on R7 260X asics
  drm/radeon: add support for newer mc ucode on CI (v2)
  drm/radeon: add support for newer mc ucode on SI (v2)
  drm/radeon: apply more strict limits for PLL params v2
  drm/radeon: update CI DPM powertune settings
  drm/radeon: fix runpm handling on APUs (v4)
  drm/radeon: disable mclk dpm on R7 260X
  drm/tegra: Remove gratuitous pad field
  ...
2014-04-19 10:35:30 -07:00
Dave Airlie
a42892ed10 Merge branch 'drm-next-3.15-wip' of git://people.freedesktop.org/~deathsimple/linux into drm-next
Some i2c fixes over DisplayPort.

* 'drm-next-3.15-wip' of git://people.freedesktop.org/~deathsimple/linux:
  drm/radeon: Improve vramlimit module param documentation
  drm/radeon: fix audio pin counts for DCE6+ (v2)
  drm/radeon/dp: switch to the common i2c over aux code
  drm/dp/i2c: Update comments about common i2c over dp assumptions (v3)
  drm/dp/i2c: send bare addresses to properly reset i2c connections (v4)
  drm/radeon/dp: handle zero sized i2c over aux transactions (v2)
  drm/i915: support address only i2c-over-aux transactions
  drm/tegra: dp: Support address-only I2C-over-AUX transactions
2014-04-19 11:16:02 +10:00
Linus Torvalds
ebfc45ee70 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull more networking fixes from David Miller:

 1) Fix mlx4_en_netpoll implementation, it needs to schedule a NAPI
    context, not synchronize it.  From Chris Mason.

 2) Ipv4 flow input interface should never be zero, it should be
    LOOPBACK_IFINDEX instead.  From Cong Wang and Julian Anastasov.

 3) Properly configure MAC to PHY connection in mvneta devices, from
    Thomas Petazzoni.

 4) sys_recv should use SYSCALL_DEFINE.  From Jan Glauber.

 5) Tunnel driver ioctls do not use the correct namespace, fix from
    Nicolas Dichtel.

 6) Fix memory leak on seccomp filter attach, from Kees Cook.

 7) Fix lockdep warning for nested vlans, from Ding Tianhong.

 8) Crashes can happen in SCTP due to how the auth_enable value is
    managed, fix from Vlad Yasevich.

 9) Wireless fixes from John W Linville and co.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits)
  net: sctp: cache auth_enable per endpoint
  tg3: update rx_jumbo_pending ring param only when jumbo frames are enabled
  vlan: Fix lockdep warning when vlan dev handle notification
  seccomp: fix memory leak on filter attach
  isdn: icn: buffer overflow in icn_command()
  ip6_tunnel: use the right netns in ioctl handler
  sit: use the right netns in ioctl handler
  ip_tunnel: use the right netns in ioctl handler
  net: use SYSCALL_DEFINEx for sys_recv
  net: mdio-gpio: Add support for separate MDI and MDO gpio pins
  net: mdio-gpio: Add support for active low gpio pins
  net: mdio-gpio: Use devm_ functions where possible
  ipv4, route: pass 0 instead of LOOPBACK_IFINDEX to fib_validate_source()
  ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif
  mlx4_en: don't use napi_synchronize inside mlx4_en_netpoll
  net: mvneta: properly configure the MAC <-> PHY connection in all situations
  net: phy: add minimal support for QSGMII PHY
  sfc:On MCDI timeout, issue an FLR (and mark MCDI to fail-fast)
  mwifiex: fix hung task on command timeout
  mwifiex: process event before command response
  ...
2014-04-18 17:53:46 -07:00
Linus Torvalds
6e66d5dab5 Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
 "A set of 5 small cifs fixes"

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  cif: fix dead code
  cifs: fix error handling cifs_user_readv
  fs: cifs: remove unused variable.
  Return correct error on query of xattr on file with empty xattrs
  cifs: Wait for writebacks to complete before attempting write.
2014-04-18 17:52:39 -07:00
Linus Torvalds
25bfe4f5f1 Char/Misc driver fixes for 3.15-rc2
Here are a few driver fixes for char/misc drivers that resolve reported
 issues.
 
 All have been in linux-next successfully for a few days.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iEYEABECAAYFAlNRlx4ACgkQMUfUDdst+ylYCgCdHm8SDiXwRfhUQJcYYlXrI1xs
 skwAn3p1ydnIyVZJ/B3uxTA/0/1Jof5Z
 =no/7
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are a few driver fixes for char/misc drivers that resolve
  reported issues.

  All have been in linux-next successfully for a few days"

* tag 'char-misc-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  Drivers: hv: vmbus: Negotiate version 3.0 when running on ws2012r2 hosts
  Tools: hv: Handle the case when the target file exists correctly
  vme_tsi148: Utilize to_pci_dev() macro
  vme_tsi148: Fix PCI address mapping assumption
  vme_tsi148: Fix typo in tsi148_slave_get()
  w1: avoid recursive device_add
  w1: fix netlink refcnt leak on error path
  misc: Grammar s/addition/additional/
  drivers: mcb: fix memory leak in chameleon_parse_cells() error path
  mei: ignore client writing state during cb completion
  mei: me: do not load the driver if the FW doesn't support MEI interface
  GenWQE: Increase driver version number
  GenWQE: Fix multithreading problems
  GenWQE: Ensure rc is not returning an uninitialized value
  GenWQE: Add wmb before DDCB is started
  GenWQE: Enable access to VPD flash area
2014-04-18 17:02:35 -07:00
Linus Torvalds
60fbf2bda1 driver core fixes for 3.15-rc2
Here are some driver core fixes for 3.15-rc2.  Also in here are some
 documentation updates, as well as an API removal that had to wait for
 after -rc1 due to the cleanups coming into you from multiple developer
 trees (this one and the PPC tree.)
 
 All have been in linux next successfully.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iEYEABECAAYFAlNRl6cACgkQMUfUDdst+yllxACfV9fZ/A6IQja60AdPEo+oa6Cw
 RiIAoJtH0D0G0eC4+/Qs9GSRMoB4jPPC
 =Wi3a
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here are some driver core fixes for 3.15-rc2.  Also in here are some
  documentation updates, as well as an API removal that had to wait for
  after -rc1 due to the cleanups coming into you from multiple developer
  trees (this one and the PPC tree.)

  All have been in linux next successfully"

* tag 'driver-core-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  drivers/base/dd.c incorrect pr_debug() parameters
  Documentation: Update stable address in Chinese and Japanese translations
  topology: Fix compilation warning when not in SMP
  Chinese: add translation of io_ordering.txt
  stable_kernel_rules: spelling/word usage
  sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()
  kernfs: protect lazy kernfs_iattrs allocation with mutex
  fs: Don't return 0 from get_anon_bdev
2014-04-18 16:59:52 -07:00
Linus Torvalds
8cb652bb10 staging driver fixes for 3.15-rc2
Here are a few staging driver fixes for issues that have been reported
 for 3.15-rc2.
 
 Also dominating the diffstat for the pull request is the removal of the
 rtl8187se driver.  It's no longer needed in staging as a "real" driver
 for this hardware is now merged in the tree in the "correct" location in
 drivers/net/
 
 All of these patches have been tested in linux-next.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iEYEABECAAYFAlNRmSwACgkQMUfUDdst+ympmwCgmht0boH4QJSf8ojnkrgXhZOG
 JUsAoJqfC22c/dKfLbMgXYuU12Kuidhe
 =poU3
 -----END PGP SIGNATURE-----

Merge tag 'staging-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging driver fixes from Greg KH:
 "Here are a few staging driver fixes for issues that have been reported
  for 3.15-rc2.

  Also dominating the diffstat for the pull request is the removal of
  the rtl8187se driver.  It's no longer needed in staging as a "real"
  driver for this hardware is now merged in the tree in the "correct"
  location in drivers/net/

  All of these patches have been tested in linux-next"

* tag 'staging-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: r8188eu: Fix case where ethtype was never obtained and always be checked against 0
  staging: r8712u: Fix case where ethtype was never obtained and always be checked against 0
  staging: r8188eu: Calling rtw_get_stainfo() with a NULL sta_addr will return NULL
  staging: comedi: fix circular locking dependency in comedi_mmap()
  staging: r8723au: Add missing initialization of change_inx in sort algorithm
  Staging: unisys: use after free in list_for_each()
  staging: unisys: use after free in error messages
  staging: speakup: fix misuse of kstrtol() in handle_goto()
  staging: goldfish: Call free_irq in error path
  staging: delete rtl8187se wireless driver
  staging: rtl8723au: Fix buffer overflow in rtw_get_wfd_ie()
  staging: gs_fpgaboot: remove __TIMESTAMP__ macro
  staging: vme: fix memory leak in vme_user_probe()
  staging: fpgaboot: clean up Makefile
  staging/usbip: fix store_attach() sscanf return value check
  staging/usbip: userspace - fix usbipd SIGSEGV from refresh_exported_devices()
  staging: rtl8188eu: remove spaces, correct counts to unbreak P2P ioctls
  staging/rtl8821ae: Fix OOM handling in _rtl_init_deferred_work()
2014-04-18 16:58:47 -07:00
Linus Torvalds
575a292981 TTY/Serial driver fixes for 3.15-rc2
Here are a number of small tty/serial driver fixes for 3.15-rc2.  Also
 in here are some Documentation file removals for drivers that we removed
 a long time ago, no need to keep it around any longer.
 
 All of these have been in linux-next for a bit.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iEYEABECAAYFAlNRmhMACgkQMUfUDdst+yl1egCgi9FsrVUE57dZ2AJNuVwxeh90
 cDQAoIAbaKSE1ooen/txjFI27rB92qgZ
 =d+yo
 -----END PGP SIGNATURE-----

Merge tag 'tty-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial driver fixes from Greg KH:
 "Here are a number of small tty/serial driver fixes for 3.15-rc2.  Also
  in here are some Documentation file removals for drivers that we
  removed a long time ago, no need to keep it around any longer.

  All of these have been in linux-next for a bit"

* tag 'tty-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  Revert "serial: 8250, disable "too much work" messages"
  serial: amba-pl011: fix regression, causing an Oops on rmmod
  tty: Fix help text of SYNCLINK_CS
  tty: fix memleak in alloc_pid
  ttyprintk: Allow built as a module
  ttyprintk: Fix wrong tty_unregister_driver() call in the error path
  serial: 8250, disable "too much work" messages
  Documentation/serial: Delete obsolete driver documentation
  serial: omap: Fix missing pm_runtime_resume handling by simplifying code
  serial_core: Fix pm imbalance on unbind
  serial: pl011: change Rx burst size to half of trigger level
  serial: timberdale: Depend on X86_32
  serial: st-asc: Fix SysRq char handling
  Revert "serial: clps711x: Give a chance to perform useful tasks during wait loop"
  serial_core: Fix conditional start_tx on ring buffer not empty
  serial: efm32: use $vendor,$device scheme for compatible string
  serial: omap: free the wakeup settings in remove
2014-04-18 16:57:53 -07:00
Linus Torvalds
7e55f81ecf USB fixes for 3.15-rc2
Here are a number of tiny USB fixes and new device ids for 3.15-rc2.
 Nothing major, just issues some people have reported.
 
 All of these have been in linux-next.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iEYEABECAAYFAlNRmoQACgkQMUfUDdst+ynbMgCgn32vVlhBxMJuS5MXRMH+WUUj
 6PkAn1jn+8gLz2aX5BkMKUuE2pgU2Phz
 =pKvA
 -----END PGP SIGNATURE-----

Merge tag 'usb-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are a number of tiny USB fixes and new device ids for 3.15-rc2.
  Nothing major, just issues some people have reported.

  All of these have been in linux-next"

* tag 'usb-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  uas: fix deadlocky memory allocations
  uas: fix error handling during scsi_scan()
  uas: fix GFP_NOIO under spinlock
  uwb: adds missing error handling
  USB: cdc-acm: Remove Motorola/Telit H24 serial interfaces from ACM driver
  USB: ohci-jz4740: FEAT_POWER is a port feature, not a hub feature
  USB: ohci-jz4740: Fix uninitialized variable warning
  USB: EHCI: tegra: set txfill_tuning
  usb: ehci-platform: Return immediately from suspend if ehci_suspend fails
  usb: ehci-exynos: Return immediately from suspend if ehci_suspend fails
  USB: fix crash during hotplug of PCI USB controller card
  USB: cdc-acm: fix double usb_autopm_put_interface() in acm_port_activate()
  usb: usb-common: fix typo for usb_state_string
  USB: usb_wwan: fix handling of missing bulk endpoints
  USB: pl2303: add ids for Hewlett-Packard HP POS pole displays
  USB: cp210x: Add 8281 (Nanotec Plug & Drive)
  usb: option driver, add support for Telit UE910v2
  Revert "USB: serial: add usbid for dell wwan card to sierra.c"
  USB: serial: ftdi_sio: add id for Brainboxes serial cards
2014-04-18 16:57:00 -07:00
Linus Torvalds
ea2388f281 Merge branch 'akpm' (incoming from Andrew)
Merge misc fixes from Andrew Morton:
 "13 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  thp: close race between split and zap huge pages
  mm: fix new kernel-doc warning in filemap.c
  mm: fix CONFIG_DEBUG_VM_RB description
  mm: use paravirt friendly ops for NUMA hinting ptes
  mips: export flush_icache_range
  mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages()
  wait: explain the shadowing and type inconsistencies
  Shiraz has moved
  Documentation/vm/numa_memory_policy.txt: fix wrong document in numa_memory_policy.txt
  powerpc/mm: fix ".__node_distance" undefined
  kernel/watchdog.c:touch_softlockup_watchdog(): use raw_cpu_write()
  init/Kconfig: move the trusted keyring config option to general setup
  vmscan: reclaim_clean_pages_from_list() must use mod_zone_page_state()
2014-04-18 16:40:31 -07:00
Kirill A. Shutemov
b5a8cad376 thp: close race between split and zap huge pages
Sasha Levin has reported two THP BUGs[1][2].  I believe both of them
have the same root cause.  Let's look to them one by one.

The first bug[1] is "kernel BUG at mm/huge_memory.c:1829!".  It's
BUG_ON(mapcount != page_mapcount(page)) in __split_huge_page().  From my
testing I see that page_mapcount() is higher than mapcount here.

I think it happens due to race between zap_huge_pmd() and
page_check_address_pmd().  page_check_address_pmd() misses PMD which is
under zap:

	CPU0						CPU1
						zap_huge_pmd()
						  pmdp_get_and_clear()
__split_huge_page()
  anon_vma_interval_tree_foreach()
    __split_huge_page_splitting()
      page_check_address_pmd()
        mm_find_pmd()
	  /*
	   * We check if PMD present without taking ptl: no
	   * serialization against zap_huge_pmd(). We miss this PMD,
	   * it's not accounted to 'mapcount' in __split_huge_page().
	   */
	  pmd_present(pmd) == 0

  BUG_ON(mapcount != page_mapcount(page)) // CRASH!!!

						  page_remove_rmap(page)
						    atomic_add_negative(-1, &page->_mapcount)

The second bug[2] is "kernel BUG at mm/huge_memory.c:1371!".
It's VM_BUG_ON_PAGE(!PageHead(page), page) in zap_huge_pmd().

This happens in similar way:

	CPU0						CPU1
						zap_huge_pmd()
						  pmdp_get_and_clear()
						  page_remove_rmap(page)
						    atomic_add_negative(-1, &page->_mapcount)
__split_huge_page()
  anon_vma_interval_tree_foreach()
    __split_huge_page_splitting()
      page_check_address_pmd()
        mm_find_pmd()
	  pmd_present(pmd) == 0	/* The same comment as above */
  /*
   * No crash this time since we already decremented page->_mapcount in
   * zap_huge_pmd().
   */
  BUG_ON(mapcount != page_mapcount(page))

  /*
   * We split the compound page here into small pages without
   * serialization against zap_huge_pmd()
   */
  __split_huge_page_refcount()
						VM_BUG_ON_PAGE(!PageHead(page), page); // CRASH!!!

So my understanding the problem is pmd_present() check in mm_find_pmd()
without taking page table lock.

The bug was introduced by me commit with commit 117b0791ac. Sorry for
that. :(

Let's open code mm_find_pmd() in page_check_address_pmd() and do the
check under page table lock.

Note that __page_check_address() does the same for PTE entires
if sync != 0.

I've stress tested split and zap code paths for 36+ hours by now and
don't see crashes with the patch applied. Before it took <20 min to
trigger the first bug and few hours for second one (if we ignore
first).

[1] https://lkml.kernel.org/g/<53440991.9090001@oracle.com>
[2] https://lkml.kernel.org/g/<5310C56C.60709@oracle.com>

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Bob Liu <lliubbo@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michel Lespinasse <walken@google.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>	[3.13+]

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18 16:40:09 -07:00
Randy Dunlap
b59b8cbca6 mm: fix new kernel-doc warning in filemap.c
Fix new kernel-doc warning in mm/filemap.c:

  Warning(mm/filemap.c:2600): Excess function parameter 'ppos' description in '__generic_file_aio_write'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18 16:40:09 -07:00
Davidlohr Bueso
a663dad65f mm: fix CONFIG_DEBUG_VM_RB description
This appears to be a copy/paste error.  Update the description to
reflect extra rbtree debug and checks for the config option instead of
duplicating CONFIG_DEBUG_VM.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18 16:40:09 -07:00
Mel Gorman
29c7787075 mm: use paravirt friendly ops for NUMA hinting ptes
David Vrabel identified a regression when using automatic NUMA balancing
under Xen whereby page table entries were getting corrupted due to the
use of native PTE operations.  Quoting him

	Xen PV guest page tables require that their entries use machine
	addresses if the preset bit (_PAGE_PRESENT) is set, and (for
	successful migration) non-present PTEs must use pseudo-physical
	addresses.  This is because on migration MFNs in present PTEs are
	translated to PFNs (canonicalised) so they may be translated back
	to the new MFN in the destination domain (uncanonicalised).

	pte_mknonnuma(), pmd_mknonnuma(), pte_mknuma() and pmd_mknuma()
	set and clear the _PAGE_PRESENT bit using pte_set_flags(),
	pte_clear_flags(), etc.

	In a Xen PV guest, these functions must translate MFNs to PFNs
	when clearing _PAGE_PRESENT and translate PFNs to MFNs when setting
	_PAGE_PRESENT.

His suggested fix converted p[te|md]_[set|clear]_flags to using
paravirt-friendly ops but this is overkill.  He suggested an alternative
of using p[te|md]_modify in the NUMA page table operations but this is
does more work than necessary and would require looking up a VMA for
protections.

This patch modifies the NUMA page table operations to use paravirt
friendly operations to set/clear the flags of interest.  Unfortunately
this will take a performance hit when updating the PTEs on
CONFIG_PARAVIRT but I do not see a way around it that does not break
Xen.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: David Vrabel <david.vrabel@citrix.com>
Tested-by: David Vrabel <david.vrabel@citrix.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Noonan <steven@uplinklabs.net>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18 16:40:09 -07:00
Kees Cook
8229f1a044 mips: export flush_icache_range
The lkdtm module performs tests against executable memory ranges, so it
needs to flush the icache for proper behaviors.  Other architectures
already export this, so do the same for MIPS.

[akpm@linux-foundation.org: relocate export sites]
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Cc: John Crispin <blogic@openwrt.org>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18 16:40:09 -07:00
Mizuma, Masayoshi
7848a4bf51 mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages()
soft lockup in freeing gigantic hugepage fixed in commit 55f67141a8 "mm:
hugetlb: fix softlockup when a large number of hugepages are freed." can
happen in return_unused_surplus_pages(), so let's fix it.

Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-18 16:40:08 -07:00