linux

Author	SHA1	Message	Date
Arik Nemtsov	2cedd87960	mac80211: add BSS coex IE to TDLS setup frames Add the BSS coex IE in case we support HT40 channels, as mandated by section 8.5.13 in IEEE802.11 2012. Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:44:52 +01:00
Liad Kaufman	1277b4a9f5	mac80211: retransmit TDLS teardown packet through AP if not ACKed Since the TDLS peer station might not receive the teardown packet (e.g., when in PS), this makes sure the packet is retransmitted - this time through the AP - if the TDLS peer didn't ACK the packet. Signed-off-by: Liad Kaufman <liad.kaufman@intel.com> Signed-off-by: Arik Nemtsov <arik@wizery.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-19 18:44:29 +01:00
Mike Snitzer	ffcc393641	dm: enhance internal suspend and resume interface Rename dm_internal_{suspend,resume} to dm_internal_{suspend,resume}_fast -- dm-stats will continue using these methods to avoid all the extra suspend/resume logic that is not needed in order to quickly flush IO. Introduce dm_internal_suspend_noflush() variant that actually calls the mapped_device's target callbacks -- otherwise target-specific hooks are avoided (e.g. dm-thin's thin_presuspend and thin_postsuspend). Common code between dm_internal_{suspend_noflush,resume} and dm_{suspend,resume} was factored out as __dm_{suspend,resume}. Update dm_internal_{suspend_noflush,resume} to always take and release the mapped_device's suspend_lock. Also update dm_{suspend,resume} to be aware of potential for DM_INTERNAL_SUSPEND_FLAG to be set and respond accordingly by interruptibly waiting for the DM_INTERNAL_SUSPEND_FLAG to be cleared. Add lockdep annotation to dm_suspend() and dm_resume(). The existing DM_SUSPEND_FLAG remains unchanged. DM_INTERNAL_SUSPEND_FLAG is set by dm_internal_suspend_noflush() and cleared by dm_internal_resume(). Both DM_SUSPEND_FLAG and DM_INTERNAL_SUSPEND_FLAG may be set if a device was already suspended when dm_internal_suspend_noflush() was called -- this can be thought of as a "nested suspend". A "nested suspend" can occur with legacy userspace dm-thin code that might suspend all active thin volumes before suspending the pool for resize. But otherwise, in the normal dm-thin-pool suspend case moving forward: the thin-pool will have DM_SUSPEND_FLAG set and all active thins from that thin-pool will have DM_INTERNAL_SUSPEND_FLAG set. Also add DM_INTERNAL_SUSPEND_FLAG to status report. This new DM_INTERNAL_SUSPEND_FLAG state is being reported to assist with debugging (e.g. 'dmsetup info' will report an internally suspended device accordingly). Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Joe Thornber <ejt@redhat.com>	2014-11-19 12:31:17 -05:00
Daniel Vetter	54499b2a92	Merge tag 'drm-intel-fixes-2014-11-19' into drm-intel-next-queued So with all the code movement and extraction in intel_pm.c in -next git is hopelessly confused with commit `2208d655a9` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Fri Nov 14 09:25:29 2014 +0100 drm/i915: drop WaSetupGtModeTdRowDispatch:snb from -fixes. Worse even small changes in -next move around the conflict context so rerere is equally useless. Let's just backmerge and be done with it. Conflicts: drivers/gpu/drm/i915/i915_drv.c drivers/gpu/drm/i915/intel_pm.c Except for git getting lost no tricky conflicts really. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2014-11-19 18:17:38 +01:00
J. Bruce Fields	56429e9b3b	merge nfs bugfixes into nfsd for-3.19 branch In addition to nfsd bugfixes, there are some fixes in -rc5 for client bugs that can interfere with my testing.	2014-11-19 12:06:30 -05:00
Mike Snitzer	d67ee213fa	dm: add presuspend_undo hook to target_type The DM thin-pool target now must undo the changes performed during pool_presuspend() so introduce presuspend_undo hook in target_type. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Joe Thornber <ejt@redhat.com>	2014-11-19 11:24:59 -05:00
Johan Hedberg	0378b59770	Bluetooth: Convert link keys list to use RCU This patch converts the hdev->link_keys list to be protected through RCU, thereby eliminating the need to hold the hdev lock while accessing the list. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-19 16:19:47 +01:00
Dave Hansen	62e88b1c00	mm: Make arch_unmap()/bprm_mm_init() available to all architectures The x86 MPX patch set calls arch_unmap() and arch_bprm_mm_init() from fs/exec.c, so we need at least a stub for them in all architectures. They are only called under an #ifdef for CONFIG_MMU=y, so we can at least restict this to architectures with MMU support. blackfin/c6x have no MMU support, so do not call arch_unmap(). They also do not include mm_hooks.h or mmu_context.h at all and do not need to be touched. s390, um and unicore32 do not use asm-generic/mm_hooks.h, so got their own arch_unmap() versions. (I also moved um's arch_dup_mmap() to be closer to the other mm_hooks.h functions). xtensa only includes mm_hooks when MMU=y, which should be fine since arch_unmap() is called only from MMU=y code. For the rest, we use the stub copies of these functions in asm-generic/mm_hook.h. I cross compiled defconfigs for cris (to check NOMMU) and s390 to make sure that this works. I also checked a 64-bit build of UML and all my normal x86 builds including PARAVIRT on and off. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Hansen <dave@sr71.net> Cc: linux-arch@vger.kernel.org Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20141118182350.8B4AA2C2@viggo.jf.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-11-19 11:54:13 +01:00
Mark Brown	e975cec295	Merge branch 'topic/regmap' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into asoc-ac97	2014-11-19 10:48:20 +00:00
Mark Brown	c534107fd2	Merge branch 'topic/ac97' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap into asoc-ac97	2014-11-19 10:47:58 +00:00
Lars-Peter Clausen	20feb88198	ASoC: Add helper functions for deferred regmap setup Some drivers (most notably the AC'97 drivers) do not have access to their regmap struct when the component/codec is registered. For those drivers the automatic regmap setup will not work and needs to be done manually, typically from the component/CODEC drivers probe callback. This patch adds a set of helper function to handle deferred regmap initialization as well as early regmap tear-down. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Mark Brown <broonie@kernel.org>	2014-11-19 10:46:03 +00:00
James Morris	a6aacbde40	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into next	2014-11-19 21:36:07 +11:00
James Morris	b10778a00d	Merge commit 'v3.17' into next	2014-11-19 21:32:12 +11:00
Mark Brown	22853223d1	regmap: ac97: Add generic AC'97 callbacks Use the recently added support for bus operations to provide a standard mapping for AC'97 register I/O. Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>	2014-11-19 10:28:14 +00:00
Jens Axboe	b352172976	Merge branch 'master' into for-3.19/drivers	2014-11-18 19:43:46 -07:00
Yoshifumi Hosoya	dc3cf93d89	ARM: shmobile: r8a7794: Add MMP and VSP1 clocks to device tree Signed-off-by: Yoshifumi Hosoya <yoshifumi.hosoya.wj@renesas.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>	2014-11-19 09:22:12 +09:00
Kouei Abe	3e58a5424c	ARM: shmobile: r8a7794: Add SGX clock to device tree Signed-off-by: Kouei Abe <kouei.abe.cp@renesas.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>	2014-11-19 09:22:12 +09:00
Joe Stringer	11bf7828a5	vxlan: Inline vxlan_gso_check(). Suggested-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-18 15:38:44 -05:00
Rick Jones	e3e3217029	icmp: Remove some spurious dropped packet profile hits from the ICMP path If icmp_rcv() has successfully processed the incoming ICMP datagram, we should use consume_skb() rather than kfree_skb() because a hit on the likes of perf -e skb:kfree_skb is not called-for. Signed-off-by: Rick Jones <rick.jones2@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-18 15:28:28 -05:00
John W. Linville	f48ecb19bc	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg <johan.hedberg@gmail.com> says: "Here's another bluetooth-next pull request for 3.19. We've got: - Various fixes, cleanups and improvements to ieee802154/mac802154 - Support for a Broadcom BCM20702A1 variant - Lots of lockdep fixes - Fixed handling of LE CoC errors that should trigger SMP" Signed-off-by: John W. Linville <linville@tuxdriver.com>	2014-11-18 15:13:26 -05:00
Alexei Starovoitov	d0003ec01c	bpf: allow eBPF programs to use maps expose bpf_map_lookup_elem(), bpf_map_update_elem(), bpf_map_delete_elem() map accessors to eBPF programs Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-18 13:44:00 -05:00
Alexei Starovoitov	28fbcfa08d	bpf: add array type of eBPF maps add new map type BPF_MAP_TYPE_ARRAY and its implementation - optimized for fastest possible lookup() . in the future verifier/JIT may recognize lookup() with constant key and optimize it into constant pointer. Can optimize non-constant key into direct pointer arithmetic as well, since pointers and value_size are constant for the life of the eBPF program. In other words array_map_lookup_elem() may be 'inlined' by verifier/JIT while preserving concurrent access to this map from user space - two main use cases for array type: . 'global' eBPF variables: array of 1 element with key=0 and value is a collection of 'global' variables which programs can use to keep the state between events . aggregation of tracing events into fixed set of buckets - all array elements pre-allocated and zero initialized at init time - key as an index in array and can only be 4 byte - map_delete_elem() returns EINVAL, since elements cannot be deleted - map_update_elem() replaces elements in an non-atomic way (for atomic updates hashtable type should be used instead) Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-18 13:43:59 -05:00
Alexei Starovoitov	0f8e4bd8a1	bpf: add hashtable type of eBPF maps add new map type BPF_MAP_TYPE_HASH and its implementation - maps are created/destroyed by userspace. Both userspace and eBPF programs can lookup/update/delete elements from the map - eBPF programs can be called in_irq(), so use spin_lock_irqsave() mechanism for concurrent updates - key/value are opaque range of bytes (aligned to 8 bytes) - user space provides 3 configuration attributes via BPF syscall: key_size, value_size, max_entries - map takes care of allocating/freeing key/value pairs - map_update_elem() must fail to insert new element when max_entries limit is reached to make sure that eBPF programs cannot exhaust memory - map_update_elem() replaces elements in an atomic way - optimized for speed of lookup() which can be called multiple times from eBPF program which itself is triggered by high volume of events . in the future JIT compiler may recognize lookup() call and optimize it further, since key_size is constant for life of eBPF program Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-18 13:43:25 -05:00
Alexei Starovoitov	3274f52073	bpf: add 'flags' attribute to BPF_MAP_UPDATE_ELEM command the current meaning of BPF_MAP_UPDATE_ELEM syscall command is: either update existing map element or create a new one. Initially the plan was to add a new command to handle the case of 'create new element if it didn't exist', but 'flags' style looks cleaner and overall diff is much smaller (more code reused), so add 'flags' attribute to BPF_MAP_UPDATE_ELEM command with the following meaning: #define BPF_ANY 0 /* create new element or update existing / #define BPF_NOEXIST 1 / create new element if it didn't exist / #define BPF_EXIST 2 / update existing element / bpf_update_elem(fd, key, value, BPF_NOEXIST) call can fail with EEXIST if element already exists. bpf_update_elem(fd, key, value, BPF_EXIST) can fail with ENOENT if element doesn't exist. Userspace will call it as: int bpf_update_elem(int fd, void key, void *value, __u64 flags) { union bpf_attr attr = { .map_fd = fd, .key = ptr_to_u64(key), .value = ptr_to_u64(value), .flags = flags; }; return bpf(BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr)); } First two bits of 'flags' are used to encode style of bpf_update_elem() command. Bits 2-63 are reserved for future use. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-11-18 13:43:25 -05:00
Kevin Cernekee	53a4ab96c6	of: Change of_device_is_available() to return bool This function can only return true or false; using a bool makes it more obvious to the reader. Signed-off-by: Kevin Cernekee <cernekee@gmail.com> Signed-off-by: Grant Likely <grant.likely@linaro.org>	2014-11-18 17:32:55 +00:00
Lars-Peter Clausen	358a8bb562	ASoC: ac97: Push snd_ac97 pointer to the driver level Now that the ASoC core no longer needs a handle to the AC'97 device that is associated with a CODEC we can remove it from the snd_soc_codec struct and push it into the individual driver state structs like we do for other communication buses. Doing so creates a clean separation between the AC'97 bus support and the ASoC core. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@kernel.org>	2014-11-18 15:38:03 +00:00
Lars-Peter Clausen	bc26321404	ASoC: Rename snd_soc_dai_driver struct ac97_control field to bus_control Setting the ac97_control field on a CPU DAI tells the ASoC core that this DAI in addition to audio data also transports control data to the CODEC. This causes the core to suspend the DAI after the CODEC and resume it before the CODEC so communication to the CODEC is still possible. This is not necessarily something that is specific to AC'97 and can be used by other buses with the same requirement. This patch renames the flag from ac97_control to bus_control to make this explicit. While we are at it also change the type from int to bool. The following semantich patch was used for automatic conversion of the drivers: // <smpl> @@ identifier drv; @@ struct snd_soc_dai_driver drv = { - .ac97_control + .bus_control = - 1 + true }; // </smpl> Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Mark Brown <broonie@kernel.org>	2014-11-18 15:38:03 +00:00
Lars-Peter Clausen	6794f709b7	ASoC: ac97: Drop delayed device registration We have all the information and dependencies we need to initialize and register the device available in snd_soc_new_ac97_codec(). So there is no need to delay the device registration until after the card itself as been registered. This makes the code significantly simpler and also makes it possible to use the AC'97 device in the CODECs probe function. The later will be required to be able to convert the AC'97 CODEC drivers to regmap. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Mark Brown <broonie@kernel.org>	2014-11-18 15:37:58 +00:00
Lars-Peter Clausen	ca005f324e	ASoC: ac97: Drop support for setting platform data via the CPU DAI This has no users since commit `f0fba2ad1b` ("ASoC: multi-component - ASoC Multi-Component Support") which was almost 5 years ago. Given that this runs after CODEC probe functions have been run it also doesn't seem to be that useful. So drop it altogether to make the code simpler. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Mark Brown <broonie@kernel.org>	2014-11-18 15:37:58 +00:00
Lars-Peter Clausen	eda1a701fd	ASoC: ac97: Use static ac97_bus We always pass soc_ac97_ops to snd_soc_new_ac97_codec(). So instead of allocating a snd_ac97_bus in snd_soc_new_ac97_codec() just use a static one that gets initialized when snd_soc_set_ac97_ops() is called. Also drop the device number parameter from snd_soc_new_ac97_codec(). We currently only support one device per bus and all drivers pass 0 for the device number. And if we should ever support multiple devices per bus it wouldn't be up to individual AC'97 device drivers to pick their number, but rather either the AC'97 adapter driver or the core code will assign them. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@kernel.org>	2014-11-18 15:37:46 +00:00
Lars-Peter Clausen	336b8423e2	ASoC: Move AC'97 support to its own file Currently the AC'97 support is splattered all throughout soc-core.c. Some parts are #ifdef'd some parts are not. This patch moves the AC'97 support to its own file, this should make the code a bit more clearer and also makes it possible to easily not compile it into the kernel when not needed. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Mark Brown <broonie@kernel.org>	2014-11-18 15:26:06 +00:00
Dong Aisheng	98e69016a1	can: dev: add can_is_canfd_skb() API The CAN device drivers can use can_is_canfd_skb() to check if the frame to send is on CAN FD mode or normal CAN mode. Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Dong Aisheng <b29396@freescale.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2014-11-18 13:23:31 +01:00
Peter Griffin	cc149f7ae2	ARM: STi: DT: STiH410: Add defines for STiH410 DT clocks Although most clock outputs are the same as stih407 SoC, stih410 also has some additional new clock outputs. Signed-off-by: Peter Griffin <peter.griffin@linaro.org> Acked-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Maxime Coquelin <maxime.coquelin@st.com>	2014-11-18 12:54:55 +01:00
Jiang Liu	6b1972493a	iommu/vt-d: Implement DMAR unit hotplug framework On Intel platforms, an IO Hub (PCI/PCIe host bridge) may contain DMAR units, so we need to support DMAR hotplug when supporting PCI host bridge hotplug on Intel platforms. According to Section 8.8 "Remapping Hardware Unit Hot Plug" in "Intel Virtualization Technology for Directed IO Architecture Specification Rev 2.2", ACPI BIOS should implement ACPI _DSM method under the ACPI object for the PCI host bridge to support DMAR hotplug. This patch introduces interfaces to parse ACPI _DSM method for DMAR unit hotplug. It also implements state machines for DMAR unit hot-addition and hot-removal. The PCI host bridge hotplug driver should call dmar_hotplug_hotplug() before scanning PCI devices connected for hot-addition and after destroying all PCI devices for hot-removal. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Reviewed-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-11-18 11:18:35 +01:00
Jiang Liu	78d8e70461	iommu/vt-d: Dynamically allocate and free seq_id for DMAR units Introduce functions to support dynamic IOMMU seq_id allocating and releasing, which will be used to support DMAR hotplug. Also rename IOMMU_UNITS_SUPPORTED as DMAR_UNITS_SUPPORTED. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Reviewed-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-11-18 11:18:35 +01:00
Jiang Liu	c2a0b538d2	iommu/vt-d: Introduce helper function dmar_walk_resources() Introduce helper function dmar_walk_resources to walk resource entries in DMAR table and ACPI buffer object returned by ACPI _DSM method for IOMMU hot-plug. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>	2014-11-18 11:18:35 +01:00
Johannes Berg	760a52e80f	Merge remote-tracking branch 'wireless-next/master' into mac80211-next This brings in some mwifiex changes that further patches will need to work on top to not cause merge conflicts. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-11-18 09:32:44 +01:00
Tejun Heo	eeecbd1971	cgroup: implement cgroup_get_e_css() Implement cgroup_get_e_css() which finds and gets the effective css for the specified cgroup and subsystem combination. This function always returns a valid pinned css. This will be used by cgroup writeback support. While at it, add comment to cgroup_e_css() to explain why that function is different from cgroup_get_e_css() and has to test cgrp->child_subsys_mask instead of cgroup_css(cgrp, ss). Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Zefan Li <lizefan@huawei.com>	2014-11-18 02:49:52 -05:00
Tejun Heo	56c807ba4e	cgroup: add cgroup_subsys->css_e_css_changed() Add a new cgroup_subsys operatoin ->css_e_css_changed(). This is invoked if any of the effective csses seen from the css's cgroup may have changed. This will be used to implement cgroup writeback support. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Zefan Li <lizefan@huawei.com>	2014-11-18 02:49:51 -05:00
Tejun Heo	7d172cc89b	cgroup: add cgroup_subsys->css_released() Add a new cgroup subsys callback css_released(). This is called when the reference count of the css (cgroup_subsys_state) reaches zero before RCU scheduling free. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Zefan Li <lizefan@huawei.com>	2014-11-18 02:49:51 -05:00
Dmitry Kasatkin	6fb5032ebb	VFS: refactor vfs_read() integrity_kernel_read() duplicates the file read operations code in vfs_read(). This patch refactors vfs_read() code creating a helper function __vfs_read(). It is used by both vfs_read() and integrity_kernel_read(). Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2014-11-17 23:14:22 -05:00
Dmitry Kasatkin	c9cd2ce2bc	integrity: provide a hook to load keys when rootfs is ready Keys can only be loaded once the rootfs is mounted. Initcalls are not suitable for that. This patch defines a special hook to load the x509 public keys onto the IMA keyring, before attempting to access any file. The keys are required for verifying the file's signature. The hook is called after the root filesystem is mounted and before the kernel calls 'init'. Changes in v3: * added more explanation to the patch description (Mimi) Changes in v2: * Hook renamed as 'integrity_load_keys()' to handle both IMA and EVM keys by integrity subsystem. * Hook patch moved after defining loading functions Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>	2014-11-17 23:12:01 -05:00
Johan Hedberg	38da170306	Bluetooth: Use shorter "rand" name for "randomizer" The common short form of "randomizer" is "rand" in many places (including the Bluetooth specification). The shorter version also makes for easier to read code with less forced line breaks. This patch renames all occurences of "randomizer" to "rand" in the Bluetooth subsystem code. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2014-11-18 01:53:15 +01:00
Rafael J. Wysocki	2e015da0d5	Merge back 'pm-domains' material for 3.19-rc1.	2014-11-18 01:21:39 +01:00
Ulf Hansson	00e7c29596	PM / Domains: Move struct pm_domain_data to pm_domain.h The definition of the struct pm_domain_data better belongs in the header for the PM domains, let's move it there. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-11-18 01:20:15 +01:00
Dave Hansen	1de4fa14ee	x86, mpx: Cleanup unused bound tables The previous patch allocates bounds tables on-demand. As noted in an earlier description, these can add up to HUGE amounts of memory. This has caused OOMs in practice when running tests. This patch adds support for freeing bounds tables when they are no longer in use. There are two types of mappings in play when unmapping tables: 1. The mapping with the actual data, which userspace is munmap()ing or brk()ing away, etc... 2. The mapping for the bounds table backing the data (is tagged with VM_MPX, see the patch "add MPX specific mmap interface"). If userspace use the prctl() indroduced earlier in this patchset to enable the management of bounds tables in kernel, when it unmaps the first type of mapping with the actual data, the kernel needs to free the mapping for the bounds table backing the data. This patch hooks in at the very end of do_unmap() to do so. We look at the addresses being unmapped and find the bounds directory entries and tables which cover those addresses. If an entire table is unused, we clear associated directory entry and free the table. Once we unmap the bounds table, we would have a bounds directory entry pointing at empty address space. That address space might now be allocated for some other (random) use, and the MPX hardware might now try to walk it as if it were a bounds table. That would be bad. So any unmapping of an enture bounds table has to be accompanied by a corresponding write to the bounds directory entry to invalidate it. That write to the bounds directory can fault, which causes the following problem: Since we are doing the freeing from munmap() (and other paths like it), we hold mmap_sem for write. If we fault, the page fault handler will attempt to acquire mmap_sem for read and we will deadlock. To avoid the deadlock, we pagefault_disable() when touching the bounds directory entry and use a get_user_pages() to resolve the fault. The unmapping of bounds tables happends under vm_munmap(). We also (indirectly) call vm_munmap() to _do_ the unmapping of the bounds tables. We avoid unbounded recursion by disallowing freeing of bounds tables for bounds tables. This would not occur normally, so should not have any practical impact. Being strict about it here helps ensure that we do not have an exploitable stack overflow. Based-on-patch-by: Qiaowei Ren <qiaowei.ren@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-mm@kvack.org Cc: linux-mips@linux-mips.org Cc: Dave Hansen <dave@sr71.net> Link: http://lkml.kernel.org/r/20141114151831.E4531C4A@viggo.jf.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-11-18 00:58:54 +01:00
Dave Hansen	fe3d197f84	x86, mpx: On-demand kernel allocation of bounds tables This is really the meat of the MPX patch set. If there is one patch to review in the entire series, this is the one. There is a new ABI here and this kernel code also interacts with userspace memory in a relatively unusual manner. (small FAQ below). Long Description: This patch adds two prctl() commands to provide enable or disable the management of bounds tables in kernel, including on-demand kernel allocation (See the patch "on-demand kernel allocation of bounds tables") and cleanup (See the patch "cleanup unused bound tables"). Applications do not strictly need the kernel to manage bounds tables and we expect some applications to use MPX without taking advantage of this kernel support. This means the kernel can not simply infer whether an application needs bounds table management from the MPX registers. The prctl() is an explicit signal from userspace. PR_MPX_ENABLE_MANAGEMENT is meant to be a signal from userspace to require kernel's help in managing bounds tables. PR_MPX_DISABLE_MANAGEMENT is the opposite, meaning that userspace don't want kernel's help any more. With PR_MPX_DISABLE_MANAGEMENT, the kernel won't allocate and free bounds tables even if the CPU supports MPX. PR_MPX_ENABLE_MANAGEMENT will fetch the base address of the bounds directory out of a userspace register (bndcfgu) and then cache it into a new field (->bd_addr) in the 'mm_struct'. PR_MPX_DISABLE_MANAGEMENT will set "bd_addr" to an invalid address. Using this scheme, we can use "bd_addr" to determine whether the management of bounds tables in kernel is enabled. Also, the only way to access that bndcfgu register is via an xsaves, which can be expensive. Caching "bd_addr" like this also helps reduce the cost of those xsaves when doing table cleanup at munmap() time. Unfortunately, we can not apply this optimization to #BR fault time because we need an xsave to get the value of BNDSTATUS. ==== Why does the hardware even have these Bounds Tables? ==== MPX only has 4 hardware registers for storing bounds information. If MPX-enabled code needs more than these 4 registers, it needs to spill them somewhere. It has two special instructions for this which allow the bounds to be moved between the bounds registers and some new "bounds tables". They are similar conceptually to a page fault and will be raised by the MPX hardware during both bounds violations or when the tables are not present. This patch handles those #BR exceptions for not-present tables by carving the space out of the normal processes address space (essentially calling the new mmap() interface indroduced earlier in this patch set.) and then pointing the bounds-directory over to it. The tables need to be accessed and controlled by userspace because the instructions for moving bounds in and out of them are extremely frequent. They potentially happen every time a register pointing to memory is dereferenced. Any direct kernel involvement (like a syscall) to access the tables would obviously destroy performance. ==== Why not do this in userspace? ==== This patch is obviously doing this allocation in the kernel. However, MPX does not strictly require anything in the kernel. It can theoretically be done completely from userspace. Here are a few ways this could be done. I don't think any of them are practical in the real-world, but here they are. Q: Can virtual space simply be reserved for the bounds tables so that we never have to allocate them? A: As noted earlier, these tables are HUGE. An X-GB virtual area needs 4X GB of virtual space, plus 2GB for the bounds directory. If we were to preallocate them for the 128TB of user virtual address space, we would need to reserve 512TB+2GB, which is larger than the entire virtual address space today. This means they can not be reserved ahead of time. Also, a single process's pre-popualated bounds directory consumes 2GB of virtual AND* physical memory. IOW, it's completely infeasible to prepopulate bounds directories. Q: Can we preallocate bounds table space at the same time memory is allocated which might contain pointers that might eventually need bounds tables? A: This would work if we could hook the site of each and every memory allocation syscall. This can be done for small, constrained applications. But, it isn't practical at a larger scale since a given app has no way of controlling how all the parts of the app might allocate memory (think libraries). The kernel is really the only place to intercept these calls. Q: Could a bounds fault be handed to userspace and the tables allocated there in a signal handler instead of in the kernel? A: (thanks to tglx) mmap() is not on the list of safe async handler functions and even if mmap() would work it still requires locking or nasty tricks to keep track of the allocation state there. Having ruled out all of the userspace-only approaches for managing bounds tables that we could think of, we create them on demand in the kernel. Based-on-patch-by: Qiaowei Ren <qiaowei.ren@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-mm@kvack.org Cc: linux-mips@linux-mips.org Cc: Dave Hansen <dave@sr71.net> Link: http://lkml.kernel.org/r/20141114151829.AD4310DE@viggo.jf.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-11-18 00:58:53 +01:00
Qiaowei Ren	4aae7e436f	x86, mpx: Introduce VM_MPX to indicate that a VMA is MPX specific MPX-enabled applications using large swaths of memory can potentially have large numbers of bounds tables in process address space to save bounds information. These tables can take up huge swaths of memory (as much as 80% of the memory on the system) even if we clean them up aggressively. In the worst-case scenario, the tables can be 4x the size of the data structure being tracked. IOW, a 1-page structure can require 4 bounds-table pages. Being this huge, our expectation is that folks using MPX are going to be keen on figuring out how much memory is being dedicated to it. So we need a way to track memory use for MPX. If we want to specifically track MPX VMAs we need to be able to distinguish them from normal VMAs, and keep them from getting merged with normal VMAs. A new VM_ flag set only on MPX VMAs does both of those things. With this flag, MPX bounds-table VMAs can be distinguished from other VMAs, and userspace can also walk /proc/$pid/smaps to get memory usage for MPX. In addition to this flag, we also introduce a special ->vm_ops specific to MPX VMAs (see the patch "add MPX specific mmap interface"), but currently different ->vm_ops do not by themselves prevent VMA merging, so we still need this flag. We understand that VM_ flags are scarce and are open to other options. Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-mm@kvack.org Cc: linux-mips@linux-mips.org Cc: Dave Hansen <dave@sr71.net> Link: http://lkml.kernel.org/r/20141114151825.565625B3@viggo.jf.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-11-18 00:58:53 +01:00
Qiaowei Ren	ee1b58d36a	mpx: Extend siginfo structure to include bound violation information This patch adds new fields about bound violation into siginfo structure. si_lower and si_upper are respectively lower bound and upper bound when bound violation is caused. Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-mm@kvack.org Cc: linux-mips@linux-mips.org Cc: Dave Hansen <dave@sr71.net> Link: http://lkml.kernel.org/r/20141114151819.1908C900@viggo.jf.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-11-18 00:58:53 +01:00
Richard Guy Briggs	0288d7183c	audit: convert status version to a feature bitmap The version field defined in the audit status structure was found to have limitations in terms of its expressibility of features supported. This is distict from the get/set features call to be able to command those features that are present. Converting this field from a version number to a feature bitmap will allow distributions to selectively backport and support certain features and will allow upstream to be able to deprecate features in the future. It will allow userspace clients to first query the kernel for which features are actually present and supported. Currently, EINVAL is returned rather than EOPNOTSUP, which isn't helpful in determining if there was an error in the command, or if it simply isn't supported yet. Past features are not represented by this bitmap, but their use may be converted to EOPNOTSUP if needed in the future. Since "version" is too generic to convert with a #define, use a union in the struct status, introducing the member "feature_bitmap" unionized with "version". Convert existing AUDIT_VERSION_* macros over to AUDIT_FEATURE_BITMAP* counterparts, leaving the former for backwards compatibility. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> [PM: minor whitespace tweaks] Signed-off-by: Paul Moore <pmoore@redhat.com>	2014-11-17 16:53:51 -05:00

... 43 44 45 46 47 ...

73029 Commits