Some drivers dealing with a gpio_chip might need to act on its
descriptors directly; one example is pinctrl drivers that need to lock a
GPIO for being used as IRQ using gpiod_lock_as_irq().
This patch exports a gpiochip_get_desc() function that returns the
GPIO descriptor at the requested index. It also sweeps the
gpio_to_chip() function out of the consumer interface since any holder
of a gpio_chip reference can manipulate its GPIOs way beyond what a
consumer should be allowed to do.
As a result, gpio_chip is not visible anymore to simple GPIO consumers.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Jean-Jacques Hiblot <jjhiblot@traphandler.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The need to know the number of array elements in a property is
a common pattern. To prevent duplication of open-coded implementations
add a helper static function that also centralises strict sanity
checking and DTB format details, as well as a set of wrapper functions
for u8, u16, u32 and u64.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Heiko Stuebner <heiko.stuebner@bqreaders.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Grant Likely <grant.likely@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Currently, cgroupfs_root and its ->top_cgroup are separated reference
counted and the latter's is ignored. There's no reason to do this
separately. This patch removes cgroupfs_root->refcnt and destroys
cgroupfs_root when the top_cgroup is released.
* cgroup_put() updated to ignore cgroup_is_dead() test for top
cgroups. cgroup_free_fn() updated to handle root destruction when
releasing a top cgroup.
* As root destruction is now bounced through cgroup destruction, it is
asynchronous. Update cgroup_mount() so that it waits for pending
release which is currently implemented using msleep(). Converting
this to proper wait_queue isn't hard but likely unnecessary.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
root->number_of_cgroups is currently an integer protected with
cgroup_mutex. Except for sanity checks and proc reporting, the only
place it's used is to check whether the root has any child during
remount; however, this is a bit flawed as the counter is not
decremented when the cgroup is unlinked but when it's released,
meaning that there could be an extended period where all cgroups are
removed but remount is still not allowed because some internal objects
are lingering. While not perfect either, it'd be better to use
emptiness test on root->top_cgroup.children.
This patch updates cgroup_remount() to test top_cgroup's children
instead, which makes number_of_cgroups only actual usage statistics
printing in proc implemented in proc_cgroupstats_show(). Let's
shorten its name and make it an atomic_t so that we don't have to
worry about its synchronization. It's purely auxiliary at this point.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
cgroup->name handling became quite complicated over time involving
dedicated struct cgroup_name for RCU protection. Now that cgroup is
on kernfs, we can drop all of it and simply use kernfs_name/path() and
friends. Replace cgroup->name and all related code with kernfs
name/path constructs.
* Reimplement cgroup_name() and cgroup_path() as thin wrappers on top
of kernfs counterparts, which involves semantic changes.
pr_cont_cgroup_name() and pr_cont_cgroup_path() added.
* cgroup->name handling dropped from cgroup_rename().
* All users of cgroup_name/path() updated to the new semantics. Users
which were formatting the string just to printk them are converted
to use pr_cont_cgroup_name/path() instead, which simplifies things
quite a bit. As cgroup_name() no longer requires RCU read lock
around it, RCU lockings which were protecting only cgroup_name() are
removed.
v2: Comment above oom_info_lock updated as suggested by Michal.
v3: dummy_top doesn't have a kn associated and
pr_cont_cgroup_name/path() ended up calling the matching kernfs
functions with NULL kn leading to oops. Test for NULL kn and
print "/" if so. This issue was reported by Fengguang Wu.
v4: Rebased on top of 0ab02ca8f8 ("cgroup: protect modifications to
cgroup_idr with cgroup_mutex").
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
cftype_set was added primarily to allow registering the same cftype
array more than once for different subsystems. Nobody uses or needs
such thing and it's already broken because each cftype has ->ss
pointer which is initialized during registration.
Let's add list_head ->node to cftype and use the first cftype entry in
the array to link them instead of allocating separate cftype_set.
While at it, trigger WARN if cft seems previously initialized during
registration.
This simplifies cftype handling a bit.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Mount option "xattr" is no longer necessary as it's enabled by default
on kernfs. Warn if "xattr" is specified with "sane_behavior" so that
the option can be removed in the future.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
There's no driver using this flag and consequently no userspace
application is actually looking at it. As it seems unlikely for
any driver to start using it, remove it and the (very little)
code that used it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This patch queries the register SVGA_REG_MOB_MAX_SIZE for the
maximum size of a single mob.
Signed-off-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
This is a part of preliminary works for modernizing the ALSA device
structure. So far, we set card->dev at later point after the object
creation. Because of this, the core layer doesn't always know which
device is being handled before it's actually registered, and it makes
impossible to show the device in error messages, for example. The
first goal is to achieve a proper struct device initialization at the
very beginning of probing.
As a first step, this patch introduces snd_card_new() function (yes
there was the same named function in the very past), in order to
receive the parent device pointer from the very beginning.
snd_card_create() is marked as deprecated.
At this point, there is no functional change other than that. The
actual change of the device creation scheme will follow later.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The last argument, name, of snd_oss_register_device() is nowhere
referred in the function in the current code. Let's drop it.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Inserting a entry into flowcache, or flushing flowcache should be based
on per net scope. The reason to do so is flushing operation from fat
netns crammed with flow entries will also making the slim netns with only
a few flow cache entries go away in original implementation.
Since flowcache is tightly coupled with IPsec, so it would be easier to
put flow cache global parameters into xfrm namespace part. And one last
thing needs to do is bumping flow cache genid, and flush flow cache should
also be made in per net style.
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
As compared with skb_to_sgvec, skb_to_sgvec_nomark only map skb to given sglist
without mark the sg which contain last skb data as the end. So the caller can
mannipulate sg list as will when padding new data after the first call without
calling sg_unmark_end to expend sg list.
Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
The goal of multi-platform kernels is to remove the need for mach
directories and machine descriptors. To further that goal,
introduce CPU_METHOD_OF_DECLARE() to allow cpu hotplug/smp
support to be separated from the machine descriptors.
Implementers should specify an enable-method property in their
cpus node and then implement a matching set of smp_ops in their
hotplug/smp code, wiring it up with the CPU_METHOD_OF_DECLARE()
macro. When the kernel is compiled we'll collect all the
enable-method smp_ops into one section for use at boot.
At boot time we'll look for an enable-method in each cpu node and
try to match that against all known CPU enable methods in the
kernel. If there are no enable-methods in the cpu nodes we
fallback to the cpus node and try to use any enable-method found
there. If that doesn't work we fall back to the old way of using
the machine descriptor.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: <devicetree@vger.kernel.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Kumar Gala <galak@codeaurora.org>
Quirks that enable ACS-compatible functionality on a device need some way
to track whether a given device has been enabled. Rather than create new
data structures for this, allocate one of the pci_dev_flags to indicate
this setup.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some devices support PCI ACS-like features, but don't report it using the
standard PCIe capabilities. We already provide hooks for device-specific
testing of ACS, but not for device-specific enabling of ACS. This provides
that setup hook.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Pull DeviceTree fixes from Rob Herring:
- Fix compile error drivers/spi/spi-rspi.c with !CONFIG_OF
- Fix warnings for unused/uninitialized variables with !CONFIG_OF
- Fix PCIe bus matching for powerpc
- Add documentation for various vendor strings
* tag 'dt-fixes-for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
DT: Add vendor prefix for Spansion Inc.
of/device: Nullify match table in of_match_device() for CONFIG_OF=n
dt-bindings: add vendor-prefix for neonode
of: fix PCI bus match for PCIe slots
of: restructure for_each macros to fix compile warnings
of: add vendor prefix for Honeywell
of: Update qcom vendor prefix description
of: add vendor prefix for Allwinner Technology
Pull networking updates from David Miller:
1) Fix flexcan build on big endian, from Arnd Bergmann
2) Correctly attach cpsw to GPIO bitbang MDIO drive, from Stefan Roese
3) udp_add_offload has to use GFP_ATOMIC since it can be invoked from
non-sleepable contexts. From Or Gerlitz
4) vxlan_gro_receive() does not iterate over all possible flows
properly, fix also from Or Gerlitz
5) CAN core doesn't use a proper SKB destructor when it hooks up
sockets to SKBs. Fix from Oliver Hartkopp
6) ip_tunnel_xmit() can use an uninitialized route pointer, fix from
Eric Dumazet
7) Fix address family assignment in IPVS, from Michal Kubecek
8) Fix ath9k build on ARM, from Sujith Manoharan
9) Make sure fail_over_mac only applies for the correct bonding modes,
from Ding Tianhong
10) The udp offload code doesn't use RCU correctly, from Shlomo Pongratz
11) Handle gigabit features properly in generic PHY code, from Florian
Fainelli
12) Don't blindly invoke link operations in
rtnl_link_get_slave_info_data_size, they are optional. Fix from
Fernando Luis Vazquez Cao
13) Add USB IDs for Netgear Aircard 340U, from Bjørn Mork
14) Handle netlink packet padding properly in openvswitch, from Thomas
Graf
15) Fix oops when deleting chains in nf_tables, from Patrick McHardy
16) Fix RX stalls in xen-netback driver, from Zoltan Kiss
17) Fix deadlock in mac80211 stack, from Emmanuel Grumbach
18) inet_nlmsg_size() forgets to consider ifa_cacheinfo, fix from Geert
Uytterhoeven
19) tg3_change_mtu() can deadlock, fix from Nithin Sujir
20) Fix regression in setting SCTP local source addresses on accepted
sockets, caused by some generic ipv6 socket changes. Fix from
Matija Glavinic Pecotic
21) IPPROTO_* must be pure defines, otherwise module aliases don't get
constructed properly. Fix from Jan Moskyto
22) IPV6 netconsole setup doesn't work properly unless an explicit
source address is specified, fix from Sabrina Dubroca
23) Use __GFP_NORETRY for high order skb page allocations in
sock_alloc_send_pskb and skb_page_frag_refill. From Eric Dumazet
24) Fix a regression added in netconsole over bridging, from Cong Wang
25) TCP uses an artificial offset of 1ms for SRTT, but this doesn't jive
well with TCP pacing which needs the SRTT to be accurate. Fix from
Eric Dumazet
26) Several cases of missing header file includes from Rashika Kheria
27) Add ZTE MF667 device ID to qmi_wwan driver, from Raymond Wanyoike
28) TCP Small Queues doesn't handle nonagle properly in some corner
cases, fix from Eric Dumazet
29) Remove extraneous read_unlock in bond_enslave, whoops. From Ding
Tianhong
30) Fix 9p trans_virtio handling of vmalloc buffers, from Richard Yao
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (136 commits)
6lowpan: fix lockdep splats
alx: add missing stats_lock spinlock init
9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
bonding: remove unwanted bond lock for enslave processing
USB2NET : SR9800 : One chip USB2.0 USB2NET SR9800 Device Driver Support
tcp: tsq: fix nonagle handling
bridge: Prevent possible race condition in br_fdb_change_mac_address
bridge: Properly check if local fdb entry can be deleted when deleting vlan
bridge: Properly check if local fdb entry can be deleted in br_fdb_delete_by_port
bridge: Properly check if local fdb entry can be deleted in br_fdb_change_mac_address
bridge: Fix the way to check if a local fdb entry can be deleted
bridge: Change local fdb entries whenever mac address of bridge device changes
bridge: Fix the way to find old local fdb entries in br_fdb_change_mac_address
bridge: Fix the way to insert new local fdb entries in br_fdb_changeaddr
bridge: Fix the way to find old local fdb entries in br_fdb_changeaddr
tcp: correct code comment stating 3 min timeout for FIN_WAIT2, we only do 1 min
net: vxge: Remove unused device pointer
net: qmi_wwan: add ZTE MF667
3c59x: Remove unused pointer in vortex_eisa_cleanup()
net: fix 'ip rule' iif/oif device rename
...
Try to detect 'ls -l' by having nfs_getattr() look at whether or not
there is an opendir() file descriptor for the parent directory.
If so, then assume that we want to force use of readdirplus in order
to avoid the multiple GETATTR calls over the wire.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Since TCP is a stream protocol, our callback read code needs to take into
account the fact that RPC callbacks are not always confined to a single
TCP segment.
This patch adds support for multiple TCP segments by ensuring that we
only remove the rpc_rqst structure from the 'free backchannel requests'
list once the data has been completely received. We rely on the fact
that TCP data is ordered for the duration of the connection.
Reported-by: shaobingqing <shaobingqing@bwstor.com.cn>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Currently acpi_dma_request_slave_chan_by_index() and
acpi_dma_request_slave_chan_by_name() return only requested channel or NULL.
This patch converts them to return appropriate error code instead of NULL in
case of unsuccessfull request.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
cgroup filesystem code was derived from the original sysfs
implementation which was heavily intertwined with vfs objects and
locking with the goal of re-using the existing vfs infrastructure.
That experiment turned out rather disastrous and sysfs switched, a
long time ago, to distributed filesystem model where a separate
representation is maintained which is queried by vfs. Unfortunately,
cgroup stuck with the failed experiment all these years and
accumulated even more problems over time.
Locking and object lifetime management being entangled with vfs is
probably the most egregious. vfs is never designed to be misused like
this and cgroup ends up jumping through various convoluted dancing to
make things work. Even then, operations across multiple cgroups can't
be done safely as it'll deadlock with rename locking.
Recently, kernfs is separated out from sysfs so that it can be used by
users other than sysfs. This patch converts cgroup to use kernfs,
which will bring the following benefits.
* Separation from vfs internals. Locking and object lifetime
management is contained in cgroup proper making things a lot
simpler. This removes significant amount of locking convolutions,
hairy object lifetime rules and the restriction on multi-cgroup
operations.
* Can drop a lot of code to implement filesystem interface as most are
provided by kernfs.
* Proper "severing" semantics, which allows controllers to not worry
about lingering file accesses after offline.
While the preceding patches did as much as possible to make the
transition less painful, large part of the conversion has to be one
discrete step making this patch rather large. The rest of the commit
message lists notable changes in different areas.
Overall
-------
* vfs constructs replaced with kernfs ones. cgroup->dentry w/ ->kn,
cgroupfs_root->sb w/ ->kf_root.
* All dentry accessors are removed. Helpers to map from kernfs
constructs are added.
* All vfs plumbing around dentry, inode and bdi removed.
* cgroup_mount() now directly looks for matching root and then
proceeds to create a new one if not found.
Synchronization and object lifetime
-----------------------------------
* vfs inode locking removed. Among other things, this removes the
need for the convolution in cgroup_cfts_commit(). Future patches
will further simplify it.
* vfs refcnting replaced with cgroup internal ones. cgroup->refcnt,
cgroupfs_root->refcnt added. cgroup_put_root() now directly puts
root->refcnt and when it reaches zero proceeds to destroy it thus
merging cgroup_put_root() and the former cgroup_kill_sb().
Simliarly, cgroup_put() now directly schedules cgroup_free_rcu()
when refcnt reaches zero.
* Unlike before, kernfs objects don't hold onto cgroup objects. When
cgroup destroys a kernfs node, all existing operations are drained
and the association is broken immediately. The same for
cgroupfs_roots and mounts.
* All operations which come through kernfs guarantee that the
associated cgroup is and stays valid for the duration of operation;
however, there are two paths which need to find out the associated
cgroup from dentry without going through kernfs -
css_tryget_from_dir() and cgroupstats_build(). For these two,
kernfs_node->priv is RCU managed so that they can dereference it
under RCU read lock.
File and directory handling
---------------------------
* File and directory operations converted to kernfs_ops and
kernfs_syscall_ops.
* xattrs is implicitly supported by kernfs. No need to worry about it
from cgroup. This means that "xattr" mount option is no longer
necessary. A future patch will add a deprecated warning message
when sane_behavior.
* When cftype->max_write_len > PAGE_SIZE, it's necessary to make a
private copy of one of the kernfs_ops to set its atomic_write_len.
cftype->kf_ops is added and cgroup_init/exit_cftypes() are updated
to handle it.
* cftype->lockdep_key added so that kernfs lockdep annotation can be
per cftype.
* Inidividual file entries and open states are now managed by kernfs.
No need to worry about them from cgroup. cfent, cgroup_open_file
and their friends are removed.
* kernfs_nodes are created deactivated and kernfs_activate()
invocations added to places where creation of new nodes are
committed.
* cgroup_rmdir() uses kernfs_[un]break_active_protection() for
self-removal.
v2: - Li pointed out in an earlier patch that specifying "name="
during mount without subsystem specification should succeed if
there's an existing hierarchy with a matching name although it
should fail with -EINVAL if a new hierarchy should be created.
Prior to the conversion, this used by handled by deferring
failure from NULL return from cgroup_root_from_opts(), which was
necessary because root was being created before checking for
existing ones. Note that cgroup_root_from_opts() returned an
ERR_PTR() value for error conditions which require immediate
mount failure.
As we now have separate search and creation steps, deferring
failure from cgroup_root_from_opts() is no longer necessary.
cgroup_root_from_opts() is updated to always return ERR_PTR()
value on failure.
- The logic to match existing roots is updated so that a mount
attempt with a matching name but different subsys_mask are
rejected. This was handled by a separate matching loop under
the comment "Check for name clashes with existing mounts" but
got lost during conversion. Merge the check into the main
search loop.
- Add __rcu __force casting in RCU_INIT_POINTER() in
cgroup_destroy_locked() to avoid the sparse address space
warning reported by kbuild test bot. Maybe we want an explicit
interface to use kn->priv as RCU protected pointer?
v3: Make CONFIG_CGROUPS select CONFIG_KERNFS.
v4: Rebased on top of 0ab02ca8f8 ("cgroup: protect modifications to
cgroup_idr with cgroup_mutex").
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: kbuild test robot fengguang.wu@intel.com>
* Un-inline seq_css(). After kernfs conversion, the function will
need to dereference internal data structures.
* Add cgroup_get/put_root() and replace direct super_block->s_active
manipulatinos with them. These will be converted to kernfs_root
refcnting.
* Add cgroup_get/put() and replace dget/put() on cgrp->dentry with
them. These will be converted to kernfs refcnting.
* Update current_css_set_cg_links_read() to use cgroup_name() instead
of reaching into the dentry name. The end result is the same.
These changes don't make functional differences but will make
transition to kernfs easier.
v2: Rebased on top of 0ab02ca8f8 ("cgroup: protect modifications to
cgroup_idr with cgroup_mutex").
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
mm/memory-failure.c::hwpoison_filter_task() has been reaching into
cgroup to extract the associated ino to be used as a filtering
criterion. This is an implementation detail which shouldn't be
depended upon from outside cgroup proper and is about to change with
the scheduled kernfs conversion.
This patch introduces a proper interface to determine the associated
ino, cgroup_ino(), and updates hwpoison_filter_task() to use it
instead of reaching directly into cgroup.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Wu Fengguang <fengguang.wu@intel.com>
cftype->max_write_len is used to extend the maximum size of writes.
It's interpreted in such a way that the actual maximum size is one
less than the specified value. The default size is defined by
CGROUP_LOCAL_BUFFER_SIZE. Its interpretation is quite confusing - its
value is decremented by 1 and then compared for equality with max
size, which means that the actual default size is
CGROUP_LOCAL_BUFFER_SIZE - 2, which is 62 chars.
There's no point in having a limit that low. Update its definition so
that it means the actual string length sans termination and anything
below PAGE_SIZE-1 is treated as PAGE_SIZE-1.
.max_write_len for "release_agent" is updated to PATH_MAX-1 and
cgroup_release_agent_write() is updated so that the redundant strlen()
check is removed and it uses strlcpy() instead of strcpy().
.max_write_len initializations in blk-throttle.c and cfq-iosched.c are
no longer necessary and removed. The one in cpuset is kept unchanged
as it's an approximated value to begin with.
This will also make transition to kernfs smoother.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Currently, cgroup_subsys->base_cftypes registration is different from
dynamic cftypes registartion. Instead of going through
cgroup_add_cftypes(), cgroup_init_subsys() invokes
cgroup_init_cftsets() which makes use of cgroup_subsys->base_cftset
which doesn't involve dynamic allocation.
While avoiding dynamic allocation is somewhat nice, having two
separate paths for cftypes registration is nasty, especially as we're
planning to add more operations during cftypes registration.
This patch drops cgroup_init_cftsets() and cgroup_subsys->base_cftset
and registers base_cftypes using cgroup_add_cftypes(). This is done
as a separate step in cgroup_init() instead of a part of
cgroup_init_subsys(). This is because cgroup_init_subsys() can be
called very early during boot when kmalloc() isn't available yet.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
css_from_dir() returns the matching css (cgroup_subsys_state) given a
dentry and subsystem. The function doesn't pin the css before
returning and requires the caller to be holding RCU read lock or
cgroup_mutex and handling pinning on the caller side.
Given that users of the function are likely to want to pin the
returned css (both existing users do) and that getting and putting
css's are very cheap, there's no reason for the interface to be tricky
like this.
Rename css_from_dir() to css_tryget_from_dir() and make it try to pin
the found css and return it only if pinning succeeded. The callers
are updated so that they no longer do RCU locking and pinning around
the function and just use the returned css.
This will also ease converting cgroup to kernfs.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
This patch introduces "devm_kstrdup" API so that the
device's driver can allocate memory and copy string.
Signed-off-by: Manish Badarkhe <badarkhe.manish@gmail.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Pull for-3.14-fixes to receive 0ab02ca8f8 ("cgroup: protect
modifications to cgroup_idr with cgroup_mutex") prior to kernfs
conversion series to avoid non-trivial conflicts.
Signed-off-by: Tejun Heo <tj@kernel.org>
Immutable biovecs changed the way bio segments are treated in such a way that
bio_for_each_segment() cannot now do what we want for discard/write same bios,
since bi_size means something completely different for them.
Fortunately discard and write same bios never have more than a single biovec, so
bio_for_each_segment() is unnecessary and not terribly meaningful for them, but
we still have to special case them in a few places.
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Setup cgroupfs like this:
# mount -t cgroup -o cpuacct xxx /cgroup
# mkdir /cgroup/sub1
# mkdir /cgroup/sub2
Then run these two commands:
# for ((; ;)) { mkdir /cgroup/sub1/tmp && rmdir /mnt/sub1/tmp; } &
# for ((; ;)) { mkdir /cgroup/sub2/tmp && rmdir /mnt/sub2/tmp; } &
After seconds you may see this warning:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 25243 at lib/idr.c:527 sub_remove+0x87/0x1b0()
idr_remove called for id=6 which is not allocated.
...
Call Trace:
[<ffffffff8156063c>] dump_stack+0x7a/0x96
[<ffffffff810591ac>] warn_slowpath_common+0x8c/0xc0
[<ffffffff81059296>] warn_slowpath_fmt+0x46/0x50
[<ffffffff81300aa7>] sub_remove+0x87/0x1b0
[<ffffffff810f3f02>] ? css_killed_work_fn+0x32/0x1b0
[<ffffffff81300bf5>] idr_remove+0x25/0xd0
[<ffffffff810f2bab>] cgroup_destroy_css_killed+0x5b/0xc0
[<ffffffff810f4000>] css_killed_work_fn+0x130/0x1b0
[<ffffffff8107cdbc>] process_one_work+0x26c/0x550
[<ffffffff8107eefe>] worker_thread+0x12e/0x3b0
[<ffffffff81085f96>] kthread+0xe6/0xf0
[<ffffffff81570bac>] ret_from_fork+0x7c/0xb0
---[ end trace 2d1577ec10cf80d0 ]---
It's because allocating/removing cgroup ID is not properly synchronized.
The bug was introduced when we converted cgroup_ida to cgroup_idr.
While synchronization is already done inside ida_simple_{get,remove}(),
users are responsible for concurrent calls to idr_{alloc,remove}().
tj: Refreshed on top of b58c89986a ("cgroup: fix error return from
cgroup_create()").
Fixes: 4e96ee8e98 ("cgroup: convert cgroup_ida to cgroup_idr")
Cc: <stable@vger.kernel.org> #3.12+
Reported-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Commit d52eefb47d ("ia64/xen: Remove Xen support for ia64") removed
the Kconfig symbol XEN_XENCOMM. But it didn't remove the code depending
on that symbol. Remove that code now.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
No reason to have separate file in header in include/linux folder
if this is purely driver specific structure.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
This patch replaces custom lcd_power() callback with
regulator API over LCD class.
Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
In case of beacon_loss with IEEE80211_HW_CONNECTION_MONITOR
device, mac80211 probes the ap (and disconnects on timeout)
but ignores the ack.
If we already got an ack, there's no reason to continue
disconnecting. this can help devices that supports
IEEE80211_HW_CONNECTION_MONITOR only partially (e.g. take
care of keep alives, but does not probe the ap.
In case the device wants to disconnect without probing,
it can just call ieee80211_connection_loss.
Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Just minor stuff really, on vlv dp fix and two patches to tune down some
opregion sanity check. Plus MAINTAINERS update for the new git repo, which
is the only reason I've really bothered with this pull request.
* tag 'drm-intel-fixes-2014-02-06' of ssh://git.freedesktop.org/git/drm-intel:
drm/i915: demote opregion excessive timeout WARN_ONCE to DRM_INFO_ONCE
drm: add DRM_INFO_ONCE() to print a one-time DRM_INFO() message
MAINTAINERS: Update drm/i915 git repo
drm/i915: vlv: fix DP PHY lockup due to invalid PP sequencer setup
da9846ae15 ("kernfs: make kernfs_deactivate() honor KERNFS_LOCKDEP
flag") in driver-core-linus conflicts with kernfs_drain() updates in
driver-core-next. The former just adds the missing KERNFS_LOCKDEP
checks which are already handled by kernfs_lockdep() checks in
driver-core-next. The conflict can be resolved by taking code from
driver-core-next.
Conflicts:
fs/kernfs/dir.c
Use what we already do for arch_disable_smp_support() to fix these:
arch/x86/kernel/smpboot.c:1155:6: warning: symbol 'arch_enable_nonboot_cpus_begin' was not declared. Should it be static?
arch/x86/kernel/smpboot.c:1160:6: warning: symbol 'arch_enable_nonboot_cpus_end' was not declared. Should it be static?
kernel/cpu.c:512:13: warning: symbol 'arch_enable_nonboot_cpus_begin' was not declared. Should it be static?
kernel/cpu.c:516:13: warning: symbol 'arch_enable_nonboot_cpus_end' was not declared. Should it be static?
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rework dev_pm_qos_add_ancestor_request() so that device PM QoS type
is passed to it as the third argument and make it support the
DEV_PM_QOS_LATENCY_TOLERANCE device PM QoS type (in addition to
DEV_PM_QOS_RESUME_LATENCY).
That will allow the drivers of devices without latency tolerance
hardware support to use their ancestors having it as proxies for
their latency tolerance requirements.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In some cases it may be necessary to perform certain setup/cleanup
operations on a device object representing a physical device after
it has been associated with an ACPI companion by acpi_bind_one() or
before disassociating it from that companion by acpi_unbind_one(),
respectively. If there is a struct acpi_bus_type object for the
given device's bus type, the .setup()/.cleanup() callbacks from there
are executed for these purposes. However, an analogous mechanism will
be necessary for devices whose bus types don't have corresponding
struct acpi_bus_type objects and that have specific ACPI scan handlers.
For those devices, add new .bind() and .unbind() callbacks to struct
acpi_scan_handler that will be executed by acpi_platform_notify()
right after the given device has been associated with an ACPI
comapnion and by acpi_platform_notify_remove() right before calling
acpi_unbind_one() for that device, respectively.
To make that work for scan handlers registering new devices in their
.attach() callbacks, modify acpi_scan_attach_handler() to set the
ACPI device object's handler field before calling .attach() from the
scan handler at hand.
This changeset includes a fix from Mika Westerberg.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a new latency tolerance device PM QoS type to be use for
specifying active state (RPM_ACTIVE) memory access (DMA) latency
tolerance requirements for devices. It may be used to prevent
hardware from choosing overly aggressive energy-saving operation
modes (causing too much latency to appear) for the whole platform.
This feature reqiures hardware support, so it only will be
available for devices having a new .set_latency_tolerance()
callback in struct dev_pm_info populated, in which case the
routine pointed to by it should implement whatever is necessary
to transfer the effective requirement value to the hardware.
Whenever the effective latency tolerance changes for the device,
its .set_latency_tolerance() callback will be executed and the
effective value will be passed to it. If that value is negative,
which means that the list of latency tolerance requirements for
the device is empty, the callback is expected to switch the
underlying hardware latency tolerance control mechanism to an
autonomous mode if available. If that value is PM_QOS_LATENCY_ANY,
in turn, and the hardware supports a special "no requirement"
setting, the callback is expected to use it. That allows software
to prevent the hardware from automatically updating the device's
latency tolerance in response to its power state changes (e.g. during
transitions from D3cold to D0), which generally may be done in the
autonomous latency tolerance control mode.
If .set_latency_tolerance() is present for the device, a new
pm_qos_latency_tolerance_us attribute will be present in the
devivce's power directory in sysfs. Then, user space can use
that attribute to specify its latency tolerance requirement for
the device, if any. Writing "any" to it means "no requirement, but
do not let the hardware control latency tolerance" and writing
"auto" to it allows the hardware to be switched to the autonomous
mode if there are no other requirements from the kernel side in the
device's list.
This changeset includes a fix from Mika Westerberg.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a new field, no_constraints_value, to struct pm_qos_constraints
representing a list of PM QoS constraint requests to be returned by
pm_qos_get_value() when that list of requests is empty.
That field will be equal to default_value for all of the existing
global PM QoS classes and for the resume latency device PM QoS type,
but it will be different from default_value for the new latency
tolerance device PM QoS type introduced by the next changeset.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rename symbols, variables, functions and structure fields related do
the resume latency device PM QoS type so that it is clear where they
belong (in particular, to avoid confusion with the latency tolerance
device PM QoS type introduced by a subsequent changeset).
Update the PM QoS documentation to better reflect its current state.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If OSPMs have something should appear after actypes.h to reference type
definitions, the platform/acxxx.h is not sufficient as it is included by
platform/acenv.h before including actypes.h.
This patch introduces an OSPMs declarable headers to allow OSPMs to handle
such requirement for their own purposes.
This kind of header can also be used by Linux to collect the divergences
that haven't been back ported yet. Lv Zheng.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Konrad writes:
Please git pull the following branch:
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git stable/for-jens-3.14
which is based off v3.13-rc6. If you would like me to rebase it on
a different branch/tag I would be more than happy to do so.
The patches are all bug-fixes and hopefully can go in 3.14.
They deal with xen-blkback shutdown and cause memory leaks
as well as shutdown races. They should go to stable tree and if you
are OK with I will ask them to backport those fixes.
There is also a fix to xen-blkfront to deal with unexpected state
transition. And lastly a fix to the header where it was using the
__aligned__ unnecessarily.