linux

Author	SHA1	Message	Date
Ingo Molnar	786133f6e8	Merge branch 'core/irq_work' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into irq/core irq_work fixes and cleanups, in preparation for full dyntics support. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-01-24 12:48:41 +01:00
Greg Kroah-Hartman	ed408f7c0f	Merge 3.9-rc4 into driver-core-next This is to fix up a build problem with a wireless driver due to the dynamic-debug patches in this branch. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-17 19:48:18 -08:00
Kirill Smelkov	3a55fb0d9f	Tell the world we gave up on pushing CC_OPTIMIZE_FOR_SIZE In commit `281dc5c5ec` ("Give up on pushing CC_OPTIMIZE_FOR_SIZE") we already changed the actual default value, but the help-text still suggested 'y'. Fix the help text too, for all the same reasons. Sadly, -Os keeps on generating some very suboptimal code for certain cases, to the point where any I$ miss upside is swamped by the downside. The main ones are: - using "rep movsb" for memcpy, even on CPU's where that is horrendously bad for performance. - not honoring branch prediction information, so any I$ footprint you win from smaller code, you lose from less code density in the I$. - using divide instructions when that is very expensive. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-01-16 12:42:57 -08:00
Kees Cook	19c9239981	init: remove depends on CONFIG_EXPERIMENTAL The CONFIG_EXPERIMENTAL config item has not carried much meaning for a while now and is almost always enabled by default. As agreed during the Linux kernel summit, remove it from any "depends on" lines in Kconfigs. CC: "Eric W. Biederman" <ebiederm@xmission.com> CC: Serge Hallyn <serge.hallyn@canonical.com> CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>	2013-01-11 11:39:33 -08:00
Kees Cook	5a958db311	make CONFIG_EXPERIMENTAL invisible and default This config item has not carried much meaning for a while now and is almost always enabled by default (especially in distro builds). As agreed during the Linux kernel summit, it should be removed. As a first step, remove it from being listed, and default it to on. Once it has been removed from all subsystem Kconfigs, it will be dropped entirely. For items that really are experimental, maintainers should use "default n", optionally include "(EXPERIMENTAL)" in the title, and add language to the help text indicating why the item should be considered experimental. For items that are dangerously experimental, the maintainer is encouraged to follow the above title recommendation, add stronger language to the help text, and optionally use (depending on the extent of the danger, from least to most dangerous): printk(), add_taint(TAINT_WARN), add_taint(TAINT_CRAP), WARN_ON(1), and CONFIG_BROKEN. CC: Greg KH <gregkh@linuxfoundation.org> CC: "Eric W. Biederman" <ebiederm@xmission.com> CC: Serge Hallyn <serge.hallyn@canonical.com> CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2013-01-11 11:38:03 -08:00
Vineet Gupta	b6fca72536	sysctl: Enable IA64 "ignore-unaligned-usertrap" to be used cross-arch IA64 defines /proc/sys/kernel/ignore-unaligned-usertrap to control verbose warnings on unaligned access emulation. Although the exact mechanics of what to do with sysctl (ignore/shout) are arch specific, this change enables the sysctl to be usable cross-arch. Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2013-01-09 10:48:34 -08:00
Glauber Costa	d7f25f8a2f	memcg: infrastructure to match an allocation to the right cache The page allocator is able to bind a page to a memcg when it is allocated. But for the caches, we'd like to have as many objects as possible in a page belonging to the same cache. This is done in this patch by calling memcg_kmem_get_cache in the beginning of every allocation function. This function is patched out by static branches when kernel memory controller is not being used. It assumes that the task allocating, which determines the memcg in the page allocator, belongs to the same cgroup throughout the whole process. Misaccounting can happen if the task calls memcg_kmem_get_cache() while belonging to a cgroup, and later on changes. This is considered acceptable, and should only happen upon task migration. Before the cache is created by the memcg core, there is also a possible imbalance: the task belongs to a memcg, but the cache being allocated from is the global cache, since the child cache is not yet guaranteed to be ready. This case is also fine, since in this case the GFP_KMEMCG will not be passed and the page allocator will not attempt any cgroup accounting. Signed-off-by: Glauber Costa <glommer@parallels.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: JoonSoo Kim <js1304@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Michal Hocko <mhocko@suse.cz> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-18 15:02:14 -08:00
Glauber Costa	510fc4e11b	memcg: kmem accounting basic infrastructure Add the basic infrastructure for the accounting of kernel memory. To control that, the following files are created: * memory.kmem.usage_in_bytes * memory.kmem.limit_in_bytes * memory.kmem.failcnt * memory.kmem.max_usage_in_bytes They have the same meaning of their user memory counterparts. They reflect the state of the "kmem" res_counter. Per cgroup kmem memory accounting is not enabled until a limit is set for the group. Once the limit is set the accounting cannot be disabled for that group. This means that after the patch is applied, no behavioral changes exists for whoever is still using memcg to control their memory usage, until memory.kmem.limit_in_bytes is set for the first time. We always account to both user and kernel resource_counters. This effectively means that an independent kernel limit is in place when the limit is set to a lower value than the user memory. A equal or higher value means that the user limit will always hit first, meaning that kmem is effectively unlimited. People who want to track kernel memory but not limit it, can set this limit to a very high number (like RESOURCE_MAX - 1page - that no one will ever hit, or equal to the user memory) [akpm@linux-foundation.org: MEMCG_MMEM only works with slab and slub] Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: JoonSoo Kim <js1304@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-18 15:02:12 -08:00
Linus Torvalds	6a2b60b17b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "While small this set of changes is very significant with respect to containers in general and user namespaces in particular. The user space interface is now complete. This set of changes adds support for unprivileged users to create user namespaces and as a user namespace root to create other namespaces. The tyranny of supporting suid root preventing unprivileged users from using cool new kernel features is broken. This set of changes completes the work on setns, adding support for the pid, user, mount namespaces. This set of changes includes a bunch of basic pid namespace cleanups/simplifications. Of particular significance is the rework of the pid namespace cleanup so it no longer requires sending out tendrils into all kinds of unexpected cleanup paths for operation. At least one case of broken error handling is fixed by this cleanup. The files under /proc/<pid>/ns/ have been converted from regular files to magic symlinks which prevents incorrect caching by the VFS, ensuring the files always refer to the namespace the process is currently using and ensuring that the ptrace_mayaccess permission checks are always applied. The files under /proc/<pid>/ns/ have been given stable inode numbers so it is now possible to see if different processes share the same namespaces. Through the David Miller's net tree are changes to relax many of the permission checks in the networking stack to allowing the user namespace root to usefully use the networking stack. Similar changes for the mount namespace and the pid namespace are coming through my tree. Two small changes to add user namespace support were commited here adn in David Miller's -net tree so that I could complete the work on the /proc/<pid>/ns/ files in this tree. Work remains to make it safe to build user namespaces and 9p, afs, ceph, cifs, coda, gfs2, ncpfs, nfs, nfsd, ocfs2, and xfs so the Kconfig guard remains in place preventing that user namespaces from being built when any of those filesystems are enabled. Future design work remains to allow root users outside of the initial user namespace to mount more than just /proc and /sys." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (38 commits) proc: Usable inode numbers for the namespace file descriptors. proc: Fix the namespace inode permission checks. proc: Generalize proc inode allocation userns: Allow unprivilged mounts of proc and sysfs userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file procfs: Print task uids and gids in the userns that opened the proc file userns: Implement unshare of the user namespace userns: Implent proc namespace operations userns: Kill task_user_ns userns: Make create_new_namespaces take a user_ns parameter userns: Allow unprivileged use of setns. userns: Allow unprivileged users to create new namespaces userns: Allow setting a userns mapping to your current uid. userns: Allow chown and setgid preservation userns: Allow unprivileged users to create user namespaces. userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped userns: fix return value on mntns_install() failure vfs: Allow unprivileged manipulation of the mount namespace. vfs: Only support slave subtrees across different user namespaces vfs: Add a user namespace reference from struct mnt_namespace ...	2012-12-17 15:44:47 -08:00
Linus Torvalds	3d59eebc5e	Automatic NUMA Balancing V11 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQIcBAABAgAGBQJQx0kQAAoJEHzG/DNEskfi4fQP/R5PRovayroZALBMLnVJDaLD Ttr9p40VNXbiJ+MfRgatJjSSJZ4Jl+fC3NEqBhcwVZhckZZb9R2s0WtrSQo5+ZbB vdRfiuKoCaKM4cSZ08C12uTvsF6xjhjd27CTUlMkyOcDoKxMEFKelv0hocSxe4Wo xqlv3eF+VsY7kE1BNbgBP06SX4tDpIHRxXfqJPMHaSKQmre+cU0xG2GcEu3QGbHT DEDTI788YSaWLmBfMC+kWoaQl1+bV/FYvavIAS8/o4K9IKvgR42VzrXmaFaqrbgb 72ksa6xfAi57yTmZHqyGmts06qYeBbPpKI+yIhCMInxA9CY3lPbvHppRf0RQOyzj YOi4hovGEMJKE+BCILukhJcZ9jCTtS3zut6v1rdvR88f4y7uhR9RfmRfsxuW7PNj 3Rmh191+n0lVWDmhOs2psXuCLJr3LEiA0dFffN1z8REUTtTAZMsj8Rz+SvBNAZDR hsJhERVeXB6X5uQ5rkLDzbn1Zic60LjVw7LIp6SF2OYf/YKaF8vhyWOA8dyCEu8W CGo7AoG0BO8tIIr8+LvFe8CweypysZImx4AjCfIs4u9pu/v11zmBvO9NO5yfuObF BreEERYgTes/UITxn1qdIW4/q+Nr0iKO3CTqsmu6L1GfCz3/XzPGs3U26fUhllqi Ka0JKgnWvsa6ez6FSzKI =ivQa -----END PGP SIGNATURE----- Merge tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma Pull Automatic NUMA Balancing bare-bones from Mel Gorman: "There are three implementations for NUMA balancing, this tree (balancenuma), numacore which has been developed in tip/master and autonuma which is in aa.git. In almost all respects balancenuma is the dumbest of the three because its main impact is on the VM side with no attempt to be smart about scheduling. In the interest of getting the ball rolling, it would be desirable to see this much merged for 3.8 with the view to building scheduler smarts on top and adapting the VM where required for 3.9. The most recent set of comparisons available from different people are mel: https://lkml.org/lkml/2012/12/9/108 mingo: https://lkml.org/lkml/2012/12/7/331 tglx: https://lkml.org/lkml/2012/12/10/437 srikar: https://lkml.org/lkml/2012/12/10/397 The results are a mixed bag. In my own tests, balancenuma does reasonably well. It's dumb as rocks and does not regress against mainline. On the other hand, Ingo's tests shows that balancenuma is incapable of converging for this workloads driven by perf which is bad but is potentially explained by the lack of scheduler smarts. Thomas' results show balancenuma improves on mainline but falls far short of numacore or autonuma. Srikar's results indicate we all suffer on a large machine with imbalanced node sizes. My own testing showed that recent numacore results have improved dramatically, particularly in the last week but not universally. We've butted heads heavily on system CPU usage and high levels of migration even when it shows that overall performance is better. There are also cases where it regresses. Of interest is that for specjbb in some configurations it will regress for lower numbers of warehouses and show gains for higher numbers which is not reported by the tool by default and sometimes missed in treports. Recently I reported for numacore that the JVM was crashing with NullPointerExceptions but currently it's unclear what the source of this problem is. Initially I thought it was in how numacore batch handles PTEs but I'm no longer think this is the case. It's possible numacore is just able to trigger it due to higher rates of migration. These reports were quite late in the cycle so I/we would like to start with this tree as it contains much of the code we can agree on and has not changed significantly over the last 2-3 weeks." * tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits) mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable mm/rmap: Convert the struct anon_vma::mutex to an rwsem mm: migrate: Account a transhuge page properly when rate limiting mm: numa: Account for failed allocations and isolations as migration failures mm: numa: Add THP migration for the NUMA working set scanning fault case build fix mm: numa: Add THP migration for the NUMA working set scanning fault case. mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG mm: sched: numa: Control enabling and disabling of NUMA balancing mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely task<->node relationships mm: numa: migrate: Set last_nid on newly allocated page mm: numa: split_huge_page: Transfer last_nid on tail page mm: numa: Introduce last_nid to the page frame sched: numa: Slowly increase the scanning period as NUMA faults are handled mm: numa: Rate limit setting of pte_numa if node is saturated mm: numa: Rate limit the amount of memory that is migrated between nodes mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting mm: numa: Migrate pages handled during a pmd_numa hinting fault mm: numa: Migrate on reference policy ...	2012-12-16 15:18:08 -08:00
Mel Gorman	1a687c2e9a	mm: sched: numa: Control enabling and disabling of NUMA balancing This patch adds Kconfig options and kernel parameters to allow the enabling and disabling of automatic NUMA balancing. The existance of such a switch was and is very important when debugging problems related to transparent hugepages and we should have the same for automatic NUMA placement. Signed-off-by: Mel Gorman <mgorman@suse.de>	2012-12-11 14:42:55 +00:00
Andrea Arcangeli	be3a728427	mm: numa: pte_numa() and pmd_numa() Implement pte_numa and pmd_numa. We must atomically set the numa bit and clear the present bit to define a pte_numa or pmd_numa. Once a pte or pmd has been set as pte_numa or pmd_numa, the next time a thread touches a virtual address in the corresponding virtual range, a NUMA hinting page fault will trigger. The NUMA hinting page fault will clear the NUMA bit and set the present bit again to resolve the page fault. The expectation is that a NUMA hinting page fault is used as part of a placement policy that decides if a page should remain on the current node or migrated to a different node. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Mel Gorman <mgorman@suse.de>	2012-12-11 14:42:36 +00:00
Frederic Weisbecker	91d1aa43d3	context_tracking: New context tracking susbsystem Create a new subsystem that probes on kernel boundaries to keep track of the transitions between level contexts with two basic initial contexts: user or kernel. This is an abstraction of some RCU code that use such tracking to implement its userspace extended quiescent state. We need to pull this up from RCU into this new level of indirection because this tracking is also going to be used to implement an "on demand" generic virtual cputime accounting. A necessary step to shutdown the tick while still accounting the cputime. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Gilad Ben-Yossef <gilad@benyossef.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> [ paulmck: fix whitespace error and email address. ] Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-11-30 11:40:07 -08:00
Frederic Weisbecker	74876a98a8	printk: Wake up klogd using irq_work klogd is woken up asynchronously from the tick in order to do it safely. However if printk is called when the tick is stopped, the reader won't be woken up until the next interrupt, which might not fire for a while. As a result, the user may miss some message. To fix this, lets implement the printk tick using a lazy irq work. This subsystem takes care of the timer tick state and can fix up accordingly. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-11-18 01:01:49 +01:00
Frederic Weisbecker	6147a9d807	irq_work: Remove CONFIG_HAVE_IRQ_WORK irq work can run on any arch even without IPI support because of the hook on update_process_times(). So lets remove HAVE_IRQ_WORK because it doesn't reflect any backend requirement. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-11-17 19:25:12 +01:00
Paul E. McKenney	3fbfbf7a3b	rcu: Add callback-free CPUs RCU callback execution can add significant OS jitter and also can degrade both scheduling latency and, in asymmetric multiprocessors, energy efficiency. This commit therefore adds the ability for selected CPUs ("rcu_nocbs=" boot parameter) to have their callbacks offloaded to kthreads. If the "rcu_nocb_poll" boot parameter is also specified, these kthreads will do polling, removing the need for the offloaded CPUs to do wakeups. At least one CPU must be doing normal callback processing: currently CPU 0 cannot be selected as a no-CBs CPU. In addition, attempts to offline the last normal-CBs CPU will fail. This feature was inspired by Jim Houston's and Joe Korty's JRCU, and this commit includes fixes to problems located by Fengguang Wu's kbuild test robot. [ paulmck: Added gfp.h include file as suggested by Fengguang Wu. ] Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-11-16 10:05:56 -08:00
Eric W. Biederman	499dcf2024	userns: Support fuse interacting with multiple user namespaces Use kuid_t and kgid_t in struct fuse_conn and struct fuse_mount_data. The connection between between a fuse filesystem and a fuse daemon is established when a fuse filesystem is mounted and provided with a file descriptor the fuse daemon created by opening /dev/fuse. For now restrict the communication of uids and gids between the fuse filesystem and the fuse daemon to the initial user namespace. Enforce this by verifying the file descriptor passed to the mount of fuse was opened in the initial user namespace. Ensuring the mount happens in the initial user namespace is not necessary as mounts from non-initial user namespaces are not yet allowed. In fuse_req_init_context convert the currrent fsuid and fsgid into the initial user namespace for the request that will be sent to the fuse daemon. In fuse_fill_attr convert the uid and gid passed from the fuse daemon from the initial user namespace into kuids and kgids. In iattr_to_fattr called from fuse_setattr convert kuids and kgids into the uids and gids in the initial user namespace before passing them to the fuse filesystem. In fuse_change_attributes_common called from fuse_dentry_revalidate, fuse_permission, fuse_geattr, and fuse_setattr, and fuse_iget convert the uid and gid from the fuse daemon into a kuid and a kgid to store on the fuse inode. By default fuse mounts are restricted to task whose uid, suid, and euid matches the fuse user_id and whose gid, sgid, and egid matches the fuse group id. Convert the user_id and group_id mount options into kuids and kgids at mount time, and use uid_eq and gid_eq to compare the in fuse_allow_task. Cc: Miklos Szeredi <miklos@szeredi.hu> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-11-14 22:05:33 -08:00
Eric W. Biederman	45634cd8cb	userns: Support autofs4 interacing with multiple user namespaces Use kuid_t and kgid_t in struct autofs_info and struct autofs_wait_queue. When creating directories and symlinks default the uid and gid of the mount requester to the global root uid and gid. autofs4_wait will update these fields when a mount is requested. When generating autofsv5 packets report the uid and gid of the mount requestor in user namespace of the process that opened the pipe, reporting unmapped uids and gids as overflowuid and overflowgid. In autofs_dev_ioctl_requester return the uid and gid of the last mount requester converted into the calling processes user namespace. When the uid or gid don't map return overflowuid and overflowgid as appropriate, allowing failure to find a mount requester to be distinguished from failure to map a mount requester. The uid and gid mount options specifying the user and group of the root autofs inode are converted into kuid and kgid as they are parsed defaulting to the current uid and current gid of the process that mounts autofs. Mounting of autofs for the present remains confined to processes in the initial user namespace. Cc: Ian Kent <raven@themaw.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-11-14 22:05:32 -08:00
Paul Gortmaker	af71befa28	rcu: Wordsmith help text for RCU_USER_QS kernel parameter This commit adds a "try" missing from the end of the first paragraph of the RCU_USER_QS help text. [ paulmck: Also fix up the last paragraph a bit. ] Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-10-24 11:07:09 -07:00
Paul E. McKenney	ba49df4767	rcu: Update RCU_FAST_NO_HZ help text The RCU_FAST_NO_HZ help text included a warning about overhead on large systems, but that issue has since been resolved. The main remaining issue with RCU_FAST_NO_HZ is increased real-time latency. This commit therefore updates the help text accordingly. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-10-23 14:54:07 -07:00
Linus Torvalds	d25282d1c9	Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module signing support from Rusty Russell: "module signing is the highlight, but it's an all-over David Howells frenzy..." Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG. * 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits) X.509: Fix indefinite length element skip error handling X.509: Convert some printk calls to pr_devel asymmetric keys: fix printk format warning MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking MODSIGN: Make mrproper should remove generated files. MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs MODSIGN: Use the same digest for the autogen key sig as for the module sig MODSIGN: Sign modules during the build process MODSIGN: Provide a script for generating a key ID from an X.509 cert MODSIGN: Implement module signature checking MODSIGN: Provide module signing public keys to the kernel MODSIGN: Automatically generate module signing keys if missing MODSIGN: Provide Kconfig options MODSIGN: Provide gitignore and make clean rules for extra files MODSIGN: Add FIPS policy module: signature checking hook X.509: Add a crypto key parser for binary (DER) X.509 certificates MPILIB: Provide a function to read raw data into an MPI X.509: Add an ASN.1 decoder X.509: Add simple ASN.1 grammar compiler ...	2012-10-14 13:39:34 -07:00
Linus Torvalds	9d55ab71b7	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU fixes from Ingo Molnar: "This tree includes a shutdown/cpu-hotplug deadlock fix and a documentation fix." * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rcu: Advise most users not to enable RCU user mode rcu: Grace-period initialization excludes only RCU notifier	2012-10-12 22:12:07 +09:00
Frederic Weisbecker	d677124b1f	rcu: Advise most users not to enable RCU user mode Discourage distros from enabling CONFIG_RCU_USER_QS because it brings overhead for no benefits yet. It's not a useful feature on its own until we can fully run an adaptive tickless kernel. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-10-10 17:35:01 -07:00
David Howells	48ba2462ac	MODSIGN: Implement module signature checking Check the signature on the module against the keys compiled into the kernel or available in a hardware key store. Currently, only RSA keys are supported - though that's easy enough to change, and the signature is expected to contain raw components (so not a PGP or PKCS#7 formatted blob). The signature blob is expected to consist of the following pieces in order: (1) The binary identifier for the key. This is expected to match the SubjectKeyIdentifier from an X.509 certificate. Only X.509 type identifiers are currently supported. (2) The signature data, consisting of a series of MPIs in which each is in the format of a 2-byte BE word sizes followed by the content data. (3) A 12 byte information block of the form: struct module_signature { enum pkey_algo algo : 8; enum pkey_hash_algo hash : 8; enum pkey_id_type id_type : 8; u8 __pad; __be32 id_length; __be32 sig_length; }; The three enums are defined in crypto/public_key.h. 'algo' contains the public-key algorithm identifier (0->DSA, 1->RSA). 'hash' contains the digest algorithm identifier (0->MD4, 1->MD5, 2->SHA1, etc.). 'id_type' contains the public-key identifier type (0->PGP, 1->X.509). '__pad' should be 0. 'id_length' should contain in the binary identifier length in BE form. 'sig_length' should contain in the signature data length in BE form. The lengths are in BE order rather than CPU order to make dealing with cross-compilation easier. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (minor Kconfig fix)	2012-10-10 20:06:10 +10:30
David Howells	ea0b6dcf71	MODSIGN: Provide Kconfig options Provide kernel configuration options for module signing. The following configuration options are added: CONFIG_MODULE_SIG_SHA1 CONFIG_MODULE_SIG_SHA224 CONFIG_MODULE_SIG_SHA256 CONFIG_MODULE_SIG_SHA384 CONFIG_MODULE_SIG_SHA512 These select the cryptographic hash used to digest the data prior to signing. Additionally, the crypto module selected will be built into the kernel as it won't be possible to load it as a module without incurring a circular dependency when the kernel tries to check its signature. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-10-10 20:01:20 +10:30
Rusty Russell	106a4ee258	module: signature checking hook We do a very simple search for a particular string appended to the module (which is cache-hot and about to be SHA'd anyway). There's both a config option and a boot parameter which control whether we accept or fail with unsigned modules and modules that are signed with an unknown key. If module signing is enabled, the kernel will be tainted if a module is loaded that is unsigned or has a signature for which we don't have the key. (Useful feedback and tweaks by David Howells <dhowells@redhat.com>) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-10-10 20:00:55 +10:30
Catalin Marinas	7ac57a89de	Kconfig: clean up the "#if defined(arch)" list for exception-trace sysctl entry Introduce SYSCTL_EXCEPTION_TRACE config option and selec it in the architectures requiring support for the "exception-trace" debug_table entry in kernel/sysctl.c. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-09 16:22:14 +09:00
Catalin Marinas	af1839eb4b	Kconfig: clean up the long arch list for the UID16 config option Introduce HAVE_UID16 config option and select it in corresponding architecture Kconfig files. UID16 now only depends on HAVE_UID16. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-09 16:22:13 +09:00
David Howells	4520c6a49a	X.509: Add simple ASN.1 grammar compiler Add a simple ASN.1 grammar compiler. This produces a bytecode output that can be fed to a decoder to inform the decoder how to interpret the ASN.1 stream it is trying to parse. Action functions can be specified in the grammar by interpolating: ({ foo }) after a type, for example: SubjectPublicKeyInfo ::= SEQUENCE { algorithm AlgorithmIdentifier, subjectPublicKey BIT STRING ({ do_key_data }) } The decoder is expected to call these after matching this type and parsing the contents if it is a constructed type. The grammar compiler does not currently support the SET type (though it does support SET OF) as I can't see a good way of tracking which members have been encountered yet without using up extra stack space. Currently, the grammar compiler will fail if more than 256 bytes of bytecode would be produced or more than 256 actions have been specified as it uses 8-bit jump values and action indices to keep space usage down. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-10-08 13:50:19 +10:30
Alex Kelly	046d662f48	coredump: make core dump functionality optional Adds an expert Kconfig option, CONFIG_COREDUMP, which allows disabling of core dump. This saves approximately 2.6k in the compiled kernel, and complements CONFIG_ELF_CORE, which now depends on it. CONFIG_COREDUMP also disables coredump-related sysctls, except for suid_dumpable and related functions, which are necessary for ptrace. [akpm@linux-foundation.org: fix binfmt_aout.c build] Signed-off-by: Alex Kelly <alex.page.kelly@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-06 03:05:15 +09:00
Andi Kleen	754b7b63d1	sections: disable const sections for PA-RISC v2 The PA-RISC tool chain seems to have some problem with correct read/write attributes on sections. This causes problems when the const sections are fixed up for other architecture to only contain truly read-only data. Disable const sections for PA-RISC This can cause a bit of noise with modpost. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-06 03:04:37 +09:00
Linus Torvalds	437589a74b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "This is a mostly modest set of changes to enable basic user namespace support. This allows the code to code to compile with user namespaces enabled and removes the assumption there is only the initial user namespace. Everything is converted except for the most complex of the filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs, nfs, ocfs2 and xfs as those patches need a bit more review. The strategy is to push kuid_t and kgid_t values are far down into subsystems and filesystems as reasonable. Leaving the make_kuid and from_kuid operations to happen at the edge of userspace, as the values come off the disk, and as the values come in from the network. Letting compile type incompatible compile errors (present when user namespaces are enabled) guide me to find the issues. The most tricky areas have been the places where we had an implicit union of uid and gid values and were storing them in an unsigned int. Those places were converted into explicit unions. I made certain to handle those places with simple trivial patches. Out of that work I discovered we have generic interfaces for storing quota by projid. I had never heard of the project identifiers before. Adding full user namespace support for project identifiers accounts for most of the code size growth in my git tree. Ultimately there will be work to relax privlige checks from "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing root in a user names to do those things that today we only forbid to non-root users because it will confuse suid root applications. While I was pushing kuid_t and kgid_t changes deep into the audit code I made a few other cleanups. I capitalized on the fact we process netlink messages in the context of the message sender. I removed usage of NETLINK_CRED, and started directly using current->tty. Some of these patches have also made it into maintainer trees, with no problems from identical code from different trees showing up in linux-next. After reading through all of this code I feel like I might be able to win a game of kernel trivial pursuit." Fix up some fairly trivial conflicts in netfilter uid/git logging code. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits) userns: Convert the ufs filesystem to use kuid/kgid where appropriate userns: Convert the udf filesystem to use kuid/kgid where appropriate userns: Convert ubifs to use kuid/kgid userns: Convert squashfs to use kuid/kgid where appropriate userns: Convert reiserfs to use kuid and kgid where appropriate userns: Convert jfs to use kuid/kgid where appropriate userns: Convert jffs2 to use kuid and kgid where appropriate userns: Convert hpfs to use kuid and kgid where appropriate userns: Convert btrfs to use kuid/kgid where appropriate userns: Convert bfs to use kuid/kgid where appropriate userns: Convert affs to use kuid/kgid wherwe appropriate userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids userns: On ia64 deal with current_uid and current_gid being kuid and kgid userns: On ppc convert current_uid from a kuid before printing. userns: Convert s390 getting uid and gid system calls to use kuid and kgid userns: Convert s390 hypfs to use kuid and kgid where appropriate userns: Convert binder ipc to use kuids userns: Teach security_path_chown to take kuids and kgids userns: Add user namespace support to IMA userns: Convert EVM to deal with kuids and kgids in it's hmac computation ...	2012-10-02 11:11:09 -07:00
Linus Torvalds	06d2fe153b	Driver core merge for 3.7-rc1 Here is the big driver core update for 3.7-rc1. A number of firmware_class.c updates (as you saw a month or so ago), and some hyper-v updates and some printk fixes as well. All patches that are outside of the drivers/base area have been acked by the respective maintainers, and have all been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iEYEABECAAYFAlBp3vkACgkQMUfUDdst+ylQoACgldktGFgkCLzH+rGYthrXOC5P 9hUAnjmOhdoHlMTL81vWTlH+BrGernym =khrr -----END PGP SIGNATURE----- Merge tag 'driver-core-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core merge from Greg Kroah-Hartman: "Here is the big driver core update for 3.7-rc1. A number of firmware_class.c updates (as you saw a month or so ago), and some hyper-v updates and some printk fixes as well. All patches that are outside of the drivers/base area have been acked by the respective maintainers, and have all been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" * tag 'driver-core-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (95 commits) memory: tegra{20,30}-mc: Fix reading incorrect register in mc_readl() device.h: Add missing inline to #ifndef CONFIG_PRINTK dev_vprintk_emit memory: emif: Add ifdef CONFIG_DEBUG_FS guard for emif_debugfs_[init\|exit] Documentation: Fixes some translation error in Documentation/zh_CN/gpio.txt Documentation: Remove 3 byte redundant code at the head of the Documentation/zh_CN/arm/booting Documentation: Chinese translation of Documentation/video4linux/omap3isp.txt device and dynamic_debug: Use dev_vprintk_emit and dev_printk_emit dev: Add dev_vprintk_emit and dev_printk_emit netdev_printk/netif_printk: Remove a superfluous logging colon netdev_printk/dynamic_netdev_dbg: Directly call printk_emit dev_dbg/dynamic_debug: Update to use printk_emit, optimize stack driver-core: Shut up dev_dbg_reatelimited() without DEBUG tools/hv: Parse /etc/os-release tools/hv: Check for read/write errors tools/hv: Fix exit() error code tools/hv: Fix file handle leak Tools: hv: Implement the KVP verb - KVP_OP_GET_IP_INFO Tools: hv: Rename the function kvp_get_ip_address() Tools: hv: Implement the KVP verb - KVP_OP_SET_IP_INFO Tools: hv: Add an example script to configure an interface ...	2012-10-01 12:10:44 -07:00
Linus Torvalds	81f56e5375	Linux support for the 64-bit ARM architecture (AArch64) Features currently supported: - 39-bit address space for user and kernel (each) - 4KB and 64KB page configurations - Compat (32-bit) user applications (ARMv7, EABI only) - Flattened Device Tree (mandated for all AArch64 platforms) - ARM generic timers -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) iQIcBAABAgAGBQJQabRiAAoJEGvWsS0AyF7xXgcQAK+FTXt0ikdQYMkV5AIZXb9i xHRhuiZWx2vKyk0mCqpyGLY58GSmSb6uTBg/2P2Ej7vXdH/RB2goPzjlspfjkDL4 o8RJp7eQ07Uz3KRDYEJgMP8xKZid6KFG93RJ6TjjpKZLuDBdwiG1GP1vb0jVcWfo ttZrj/aI8lMcqrh3Vq5qefP7GWP1OVATqeaGTiT7oo38pXwF3t237xfBr2iDGFBp ZgIRddrxpa7JYUesfJDDDdGHvLq7Vh2jJV+io9qasBZDrtppGJIhZ0vUni2DgIi7 r4i1LcynDN4JaG0maZ4U/YQm74TCD4BqxV8GJ7zwLPTWeN+of+skjhPSLOkA+0fp I+sWjXlv200gDfJZ9qnUld2kFpoDfJi2b7fNDouSDd2OhmVOVWG3jnVP4Z7meVSb O8BYzWDdsAiabuwciUY3OsmW6424lT93b2v86Vncs4unKMvEjOPxYZbUxhqX8f2j gsmWwwD/yS4THx2B6OyW9VT3I5J6miqs2Glt/GG6vPWT5AKQJn9jCxKaBGhPMPIs xe5/GycBYjdk/Y8qRjegxFbEqzQuiRzmkeFn5jwjmBLqpGNbZDpvMaL6adhAKM5/ v6UIKa91ra4fC9N0h6G61pOc9N9DbT8wPbCbdYY0RMTMRuLDZDgAM3Bvz0r2APdD 96leNy6vx684hbkCSLJs =buJB -----END PGP SIGNATURE----- Merge tag 'arm64-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64 Pull arm64 support from Catalin Marinas: "Linux support for the 64-bit ARM architecture (AArch64) Features currently supported: - 39-bit address space for user and kernel (each) - 4KB and 64KB page configurations - Compat (32-bit) user applications (ARMv7, EABI only) - Flattened Device Tree (mandated for all AArch64 platforms) - ARM generic timers" * tag 'arm64-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: (35 commits) arm64: ptrace: remove obsolete ptrace request numbers from user headers arm64: Do not set the SMP/nAMP processor bit arm64: MAINTAINERS update arm64: Build infrastructure arm64: Miscellaneous header files arm64: Generic timers support arm64: Loadable modules arm64: Miscellaneous library functions arm64: Performance counters support arm64: Add support for /proc/sys/debug/exception-trace arm64: Debugging support arm64: Floating point and SIMD arm64: 32-bit (compat) applications support arm64: User access library functions arm64: Signal handling support arm64: VDSO support arm64: System calls handling arm64: ELF definitions arm64: SMP support arm64: DMA mapping API ...	2012-10-01 11:51:57 -07:00
Linus Torvalds	0b981cb94b	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes from Ingo Molnar: "Continued quest to clean up and enhance the cputime code by Frederic Weisbecker, in preparation for future tickless kernel features. Other than that, smallish changes." Fix up trivial conflicts due to additions next to each other in arch/{x86/}Kconfig * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) cputime: Make finegrained irqtime accounting generally available cputime: Gather time/stats accounting config options into a single menu ia64: Reuse system and user vtime accounting functions on task switch ia64: Consolidate user vtime accounting vtime: Consolidate system/idle context detection cputime: Use a proper subsystem naming for vtime related APIs sched: cpu_power: enable ARCH_POWER sched/nohz: Clean up select_nohz_load_balancer() sched: Fix load avg vs. cpu-hotplug sched: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW sched: Fix nohz_idle_balance() sched: Remove useless code in yield_to() sched: Add time unit suffix to sched sysctl knobs sched/debug: Limit sd->*_idx range on sysctl sched: Remove AFFINE_WAKEUPS feature flag s390: Remove leftover account_tick_vtime() header cputime: Consolidate vtime handling on context switch sched: Move cputime code to its own file cputime: Generalize CONFIG_VIRT_CPU_ACCOUNTING tile: Remove SD_PREFER_LOCAL leftover ...	2012-10-01 10:43:39 -07:00
Frederic Weisbecker	1fd2b4425a	rcu: Userspace RCU extended QS selftest Provide a config option that enables the userspace RCU extended quiescent state on every CPUs by default. This is for testing purpose. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2012-09-26 15:47:16 +02:00
Frederic Weisbecker	2b1d5024e1	rcu: Settle config for userspace extended quiescent state Create a new config option under the RCU menu that put CPUs under RCU extended quiescent state (as in dynticks idle mode) when they run in userspace. This require some contribution from architectures to hook into kernel and userspace boundaries. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2012-09-26 15:44:04 +02:00
Frederic Weisbecker	fdf9c35650	cputime: Make finegrained irqtime accounting generally available There is no known reason for this option to be unavailable on other archs than x86. They just need to call enable_sched_clock_irqtime() if they have a sufficiently finegrained clock to make it working. Move it to the general option and let the user choose between it and pure tick based or virtual cputime accounting. Note that virtual cputime accounting already performs a finegrained irqtime accounting. CONFIG_IRQ_TIME_ACCOUNTING is a kind of middle ground between tick and virtual based accounting. So CONFIG_IRQ_TIME_ACCOUNTING and CONFIG_VIRT_CPU_ACCOUNTING are mutually exclusive choices. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>	2012-09-25 16:01:36 +02:00
Frederic Weisbecker	391dc69c68	cputime: Gather time/stats accounting config options into a single menu This debloats a bit the general config menu and make these config options easier to find. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>	2012-09-25 15:43:00 +02:00
Eric W. Biederman	7223546586	userns: Convert the ufs filesystem to use kuid/kgid where appropriate Cc: Evgeniy Dushistov <dushistov@mail.ru> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 04:28:00 -07:00
Eric W. Biederman	c2ba138a27	userns: Convert the udf filesystem to use kuid/kgid where appropriate Cc: Jan Kara <jack@suse.cz> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 04:18:54 -07:00
Eric W. Biederman	39241beb78	userns: Convert ubifs to use kuid/kgid Cc: Artem Bityutskiy <dedekind1@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:36 -07:00
Eric W. Biederman	61293ee274	userns: Convert squashfs to use kuid/kgid where appropriate Cc: Phillip Lougher <phillip@squashfs.org.uk> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:35 -07:00
Eric W. Biederman	df814654f3	userns: Convert reiserfs to use kuid and kgid where appropriate Cc: reiserfs-devel@vger.kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:34 -07:00
Eric W. Biederman	c18cdc1a3e	userns: Convert jfs to use kuid/kgid where appropriate Cc: Dave Kleikamp <shaggy@kernel.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:33 -07:00
Eric W. Biederman	0cfe53d3c3	userns: Convert jffs2 to use kuid and kgid where appropriate - General routine uid/gid conversion work - When storing posix acls treat ACL_USER and ACL_GROUP separately so I can call from_kuid or from_kgid as appropriate. - When reading posix acls treat ACL_USER and ACL_GROUP separately so I can call make_kuid or make_kgid as appropriate. Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:33 -07:00
Eric W. Biederman	0e1a43c716	userns: Convert hpfs to use kuid and kgid where appropriate Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:32 -07:00
Eric W. Biederman	2f2f43d3c7	userns: Convert btrfs to use kuid/kgid where appropriate Cc: Chris Mason <chris.mason@fusionio.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:31 -07:00
Eric W. Biederman	7f5b82b835	userns: Convert bfs to use kuid/kgid where appropriate Cc: "Tigran A. Aivazian" <tigran@aivazian.fsnet.co.uk> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:31 -07:00
Eric W. Biederman	8fed10be00	userns: Convert affs to use kuid/kgid wherwe appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:30 -07:00
Eric W. Biederman	4a2ebb93bf	userns: Convert binder ipc to use kuids Cc: Arve Hjønnevåg <arve@android.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:26 -07:00
Eric W. Biederman	8b94eea4bf	userns: Add user namespace support to IMA Use kuid's in the IMA rules. When reporting the current uid in audit logs use from_kuid to get a usable value. Cc: Mimi Zohar <zohar@us.ibm.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:24 -07:00
Eric W. Biederman	cf9c93526f	userns: Convert EVM to deal with kuids and kgids in it's hmac computation Cc: Mimi Zohar <zohar@us.ibm.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:24 -07:00
Eric W. Biederman	29f82ae56e	userns: Convert hostfs to use kuid and kgid where appropriate Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:23 -07:00
Eric W. Biederman	609fcd1b3a	userns: Convert tomoyo to use kuid and kgid where appropriate Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:22 -07:00
Eric W. Biederman	2db8145293	userns: Convert apparmor to use kuid and kgid where appropriate Cc: John Johansen <john.johansen@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:21 -07:00
Eric W. Biederman	e4849737f7	userns: Convert loop to use kuid_t instead of uid_t Cc: Jens Axboe <jaxboe@fusionio.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:20 -07:00
Eric W. Biederman	d03ca5820d	userns: Convert ipathfs to use GLOBAL_ROOT_UID and GLOBAL_ROOT_GID Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:19 -07:00
Eric W. Biederman	2f83ffa874	userns: Convert freevxfs to use kuid/kgid where appropriate Cc: Christoph Hellwig <hch@infradead.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:19 -07:00
Eric W. Biederman	a726ecce75	userns: Convert the sysv filesystem to use kuid/kgid where appropriate Cc: Christoph Hellwig <hch@infradead.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:18 -07:00
Eric W. Biederman	85a03d1bba	userns: Convert the qnx6 filesystem to use kuid/kgid where appropriate Cc: Kai Bankett <chaosman@ontika.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:17 -07:00
Eric W. Biederman	511728d778	userns: Convert the qnx4 filesystem to use kuid/kgid where appropriate Acked-by: Anders Larsen <al@alarsen.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:17 -07:00
Eric W. Biederman	80fcbe751f	userns: Convert omfs to use kuid and kgid where appropriate Acked-by: Bob Copeland <me@bobcopeland.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:16 -07:00
Eric W. Biederman	b29f7751c9	userns: Convert ntfs to use kuid and kgid where appropriate Cc: Anton Altaparmakov <anton@tuxera.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:15 -07:00
Eric W. Biederman	305d3d0dbc	userns: Convert nillfs2 to use kuid/kgid where appropriate Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:15 -07:00
Eric W. Biederman	f303bdc55e	userns: Convert minix to use kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:14 -07:00
Eric W. Biederman	1a0a994ebe	userns: Convert logfs to use kuid/kgid where appropriate Cc: Joern Engel <joern@logfs.org> Cc: Prasad Joshi <prasadjoshi.linux@gmail.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:13 -07:00
Eric W. Biederman	ba64e2b9e3	userns: Convert isofs to use kuid/kgid where appropriate Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:12 -07:00
Eric W. Biederman	16525e3f14	userns: Convert hfsplus to use kuid and kgid where appropriate Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:12 -07:00
Eric W. Biederman	43b5e4ccd4	userns: Convert hfs to use kuid and kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:11 -07:00
Eric W. Biederman	d001b05365	userns: Convert exofs to use kuid/kgid where appropriate Cc: Benny Halevy <bhalevy@tonian.com> Acked-by: Boaz Harrosh <bharrosh@panasas.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:10 -07:00
Eric W. Biederman	5d4ea4da6a	userns: Convert efs to use kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:10 -07:00
Eric W. Biederman	cdf8c58a35	userns: Convert ecryptfs to use kuid/kgid where appropriate Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Dustin Kirkland <dustin.kirkland@gazzang.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:09 -07:00
Eric W. Biederman	a7d9cfe97b	userns: Convert cramfs to use kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:08 -07:00
Eric W. Biederman	31aba059bb	userns: Convert befs to use kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:08 -07:00
Eric W. Biederman	c010d1ff4f	userns: Convert adfs to use kuid and kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:07 -07:00
Eric W. Biederman	9a11f4513c	userns: Convert xenfs to use kuid and kgid where appropriate Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:06 -07:00
Eric W. Biederman	a0eb3a05a8	userns: Convert hugetlbfs to use kuid/kgid where appropriate Note sysctl_hugetlb_shm_group can only be written in the root user in the initial user namespace, so we can assume sysctl_hugetlb_shm_group is in the initial user namespace. Cc: William Irwin <wli@holomorphy.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:05 -07:00
Eric W. Biederman	91fa2ccaa8	userns: Convert devtmpfs to use GLOBAL_ROOT_UID and GLOBAL_ROOT_GID Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:05 -07:00
Eric W. Biederman	b9b73f7c4d	userns: Convert usb functionfs to use kuid/kgid where appropriate Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Felipe Balbi <balbi@ti.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:12:51 -07:00
Eric W. Biederman	32d639c66e	userns: Convert gadgetfs to use kuid and kgid where appropriate Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Felipe Balbi <balbi@ti.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:12:16 -07:00
Eric W. Biederman	170782eb89	userns: Convert fat to use kuid/kgid where appropriate Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-20 06:11:55 -07:00
Eric W. Biederman	1a06d420ce	userns: Convert quota Now that the type changes are done, here is the final set of changes to make the quota code work when user namespaces are enabled. Small cleanups and fixes to make the code build when user namespaces are enabled. Cc: Jan Kara <jack@suse.cz> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-09-18 01:01:42 -07:00
Eric W. Biederman	431f19744d	userns: Convert quota netlink aka quota_send_warning Modify quota_send_warning to take struct kqid instead a type and identifier pair. When sending netlink broadcasts always convert uids and quota identifiers into the intial user namespace. There is as yet no way to send a netlink broadcast message with different contents to receivers in different namespaces, so for the time being just map all of the identifiers into the initial user namespace which preserves the current behavior. Change the callers of quota_send_warning in gfs2, xfs and dquot to generate a struct kqid to pass to quota send warning. When all of the user namespaces convesions are complete a struct kqid values will be availbe without need for conversion, but a conversion is needed now to avoid needing to convert everything at once. Cc: Ben Myers <bpm@sgi.com> Cc: Alex Elder <elder@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Jan Kara <jack@suse.cz> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-09-18 01:01:40 -07:00
Eric W. Biederman	74a8a10378	userns: Convert qutoactl Update the quotactl user space interface to successfull compile with user namespaces support enabled and to hand off quota identifiers to lower layers of the kernel in struct kqid instead of type and qid pairs. The quota on function is not converted because while it takes a quota type and an id. The id is the on disk quota format to use, which is something completely different. The signature of two struct quotactl_ops methods were changed to take struct kqid argumetns get_dqblk and set_dqblk. The dquot, xfs, and ocfs2 implementations of get_dqblk and set_dqblk are minimally changed so that the code continues to work with the change in parameter type. This is the first in a series of changes to always store quota identifiers in the kernel in struct kqid and only use raw type and qid values when interacting with on disk structures or userspace. Always using struct kqid internally makes it hard to miss places that need conversion to or from the kernel internal values. Cc: Jan Kara <jack@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Ben Myers <bpm@sgi.com> Cc: Alex Elder <elder@kernel.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-09-18 01:01:39 -07:00
Eric W. Biederman	69552c0c50	userns: Convert configfs to use kuid and kgid where appropriate Cc: Joel Becker <jlbec@evilplan.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-18 01:01:37 -07:00
Eric W. Biederman	af84df93ff	userns: Convert extN to support kuids and kgids in posix acls Convert ext2, ext3, and ext4 to fully support the posix acl changes, using e_uid e_gid instead e_id. Enabled building with posix acls enabled, all filesystems supporting user namespaces, now also support posix acls when user namespaces are enabled. Cc: Theodore Tso <tytso@mit.edu> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-18 01:01:36 -07:00
Eric W. Biederman	d20b92ab66	userns: Teach trace to use from_kuid - When tracing capture the kuid. - When displaying the data to user space convert the kuid into the user namespace of the process that opened the report file. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-18 01:01:34 -07:00
Eric W. Biederman	f8f3d4de2d	userns: Convert bsd process accounting to use kuid and kgid where appropriate BSD process accounting conveniently passes the file the accounting records will be written into to do_acct_process. The file credentials captured the user namespace of the opener of the file. Use the file credentials to format the uid and the gid of the current process into the user namespace of the user that started the bsd process accounting. Cc: Pavel Emelyanov <xemul@openvz.org> Reviewed-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-18 01:01:33 -07:00
Eric W. Biederman	4bd6e32ace	userns: Convert taskstats to handle the user and pid namespaces. - Explicitly limit exit task stat broadcast to the initial user and pid namespaces, as it is already limited to the initial network namespace. - For broadcast task stats explicitly generate all of the idenitiers in terms of the initial user namespace and the initial pid namespace. - For request stats report them in terms of the current user namespace and the current pid namespace. Netlink messages are delivered syncrhonously to the kernel allowing us to get the user namespace and the pid namespace from the current task. - Pass the namespaces for representing pids and uids and gids into bacct_add_task. Cc: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-18 01:01:32 -07:00
Eric W. Biederman	cca080d9b6	userns: Convert audit to work with user namespaces enabled - Explicitly format uids gids in audit messges in the initial user namespace. This is safe because auditd is restrected to be in the initial user namespace. - Convert audit_sig_uid into a kuid_t. - Enable building the audit code and user namespaces at the same time. The net result is that the audit subsystem now uses kuid_t and kgid_t whenever possible making it almost impossible to confuse a raw uid_t with a kuid_t preventing bugs. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-18 01:00:26 -07:00
Catalin Marinas	8c2c3df31e	arm64: Build infrastructure This patch adds Makefile and Kconfig files required for building an AArch64 kernel. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Tony Lindgren <tony@atomide.com> Acked-by: Nicolas Pitre <nico@linaro.org> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Acked-by: Arnd Bergmann <arnd@arndb.de>	2012-09-17 13:42:21 +01:00
Eric W. Biederman	c6089735e7	userns: net: Call key_alloc with GLOBAL_ROOT_UID, GLOBAL_ROOT_GID instead of 0, 0 In net/dns_resolver/dns_key.c and net/rxrpc/ar-key.c make them work with user namespaces enabled where key_alloc takes kuids and kgids. Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID instead of bare 0's. Cc: Sage Weil <sage@inktank.com> Cc: ceph-devel@vger.kernel.org Cc: David Howells <dhowells@redhat.com> Cc: David Miller <davem@davemloft.net> Cc: linux-afs@lists.infradead.org Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-13 18:28:04 -07:00
Eric W. Biederman	9a56c2db49	userns: Convert security/keys to the new userns infrastructure - Replace key_user ->user_ns equality checks with kuid_has_mapping checks. - Use from_kuid to generate key descriptions - Use kuid_t and kgid_t and the associated helpers instead of uid_t and gid_t - Avoid potential problems with file descriptor passing by displaying keys in the user namespace of the opener of key status proc files. Cc: linux-security-module@vger.kernel.org Cc: keyrings@linux-nfs.org Cc: David Howells <dhowells@redhat.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-13 18:28:02 -07:00
Eric W. Biederman	5fce5e0bbd	userns: Convert drm to use kuid and kgid and struct pid where appropriate Blink Blink this had not been converted to use struct pid ages ago? - On drm open capture the openers kuid and struct pid. - On drm close release the kuid and struct pid - When reporting the uid and pid convert the kuid and struct pid into values in the appropriate namespace. Cc: dri-devel@lists.freedesktop.org Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-13 14:32:24 -07:00
Eric W. Biederman	1efdb69b0b	userns: Convert ipc to use kuid and kgid where appropriate - Store the ipc owner and creator with a kuid - Store the ipc group and the crators group with a kgid. - Add error handling to ipc_update_perms, allowing it to fail if the uids and gids can not be converted to kuids or kgids. - Modify the proc files to display the ipc creator and owner in the user namespace of the opener of the proc file. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-06 22:17:20 -07:00
Eric W. Biederman	9582d90196	userns: Convert process event connector to handle kuids and kgids - Only allow asking for events from the initial user and pid namespace, where we generate the events in. - Convert kuids and kgids into the initial user namespace to report them via the process event connector. Cc: David Miller <davem@davemloft.net> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-06 19:37:10 -07:00
Eric W. Biederman	7dc05881b6	userns: Convert debugfs to use kuid/kgid where appropriate. Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-06 19:02:52 -07:00
Greg Kroah-Hartman	45f035ab9b	CONFIG_HOTPLUG should be always on CONFIG_HOTPLUG is a very old option, back when we had static systems and it was odd that any type of device would be removed or added after the system had started up. It is quite hard to disable it these days, and even if you do, it only saves you about 200 bytes. However, if it is disabled, lots of bugs show up because it is almost never tested if the option is disabled. This is a step to eventually just remove the option entirely, which will clean up all of the devinit* variable and function pointer options, that everyone (myself include) ends up getting wrong eventually, causing real problems when memory segments are removed yet we don't expect them to be. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Bjorn Helgaas <bhelgaas@google.com>	2012-09-06 13:26:16 -07:00
Eric W. Biederman	c9235f4872	userns: Make credential debugging user namespace safe. Cc: David Howells <dhowells@redhat.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-23 22:54:18 -07:00
Eric W. Biederman	bc45dae323	userns: Enable building of pf_key sockets when user namespace support is enabled. Enable building of pf_key sockets and user namespace support at the same time. This combination builds successfully so there is no reason to forbid it. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-08-23 22:52:54 -07:00
Frederic Weisbecker	b952741c80	cputime: Generalize CONFIG_VIRT_CPU_ACCOUNTING S390, ia64 and powerpc all define their own version of CONFIG_VIRT_CPU_ACCOUNTING. Generalize the config and its description to a single place to avoid duplication. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>	2012-08-17 16:31:08 +02:00
Eric W. Biederman	0625c883bc	userns: Convert tun/tap to use kuid and kgid where appropriate Cc: Maxim Krasnyansky <maxk@qualcomm.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:31 -07:00
Eric W. Biederman	1efa29cd41	userns: Make the airo wireless driver use kuids for proc uids and gids Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: John W. Linville <linville@tuxdriver.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:31 -07:00
Eric W. Biederman	26711a791e	userns: xt_owner: Add basic user namespace support. - Only allow adding matches from the initial user namespace - Add the appropriate conversion functions to handle matches against sockets in other user namespaces. Cc: Jan Engelhardt <jengelh@medozas.de> Cc: Patrick McHardy <kaber@trash.net> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:30 -07:00
Eric W. Biederman	da7428080a	userns xt_recent: Specify the owner/group of ip_list_perms in the initial user namespace xt_recent creates a bunch of proc files and initializes their uid and gids to the values of ip_list_uid and ip_list_gid. When initialize those proc files convert those values to kuids so they can continue to reside on the /proc inode. Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Jan Engelhardt <jengelh@medozas.de> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:29 -07:00
Eric W. Biederman	8c6e2a941a	userns: Convert xt_LOG to print socket kuids and kgids as uids and gids xt_LOG always writes messages via sb_add via printk. Therefore when xt_LOG logs the uid and gid of a socket a packet came from the values should be converted to be in the initial user namespace. Thus making xt_LOG as user namespace safe as possible. Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:29 -07:00
Eric W. Biederman	a6c6796c71	userns: Convert cls_flow to work with user namespaces enabled The flow classifier can use uids and gids of the sockets that are transmitting packets and do insert those uids and gids into the packet classification calcuation. I don't fully understand the details but it appears that we can depend on specific uids and gids when making traffic classification decisions. To work with user namespaces enabled map from kuids and kgids into uids and gids in the initial user namespace giving raw integer values the code can play with and depend on. To avoid issues of userspace depending on uids and gids in packet classifiers installed from other user namespaces and getting confused deny all packet classifiers that use uids or gids that are not comming from a netlink socket in the initial user namespace. Cc: Patrick McHardy <kaber@trash.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Changli Gao <xiaosuo@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:28 -07:00
Eric W. Biederman	9eea9515cb	userns: nfnetlink_log: Report socket uids in the log sockets user namespace At logging instance creation capture the peer netlink socket's user namespace. Use the captured peer user namespace when reporting socket uids to the peer. The peer socket's user namespace is guaranateed to be valid until the user closes the netlink socket. nfnetlink_log removes instances during the final close of a socket. __build_packet_message does not get called after an instance is destroyed. Therefore it is safe to let the peer netlink socket take care of the user namespace reference counting for us. Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:27 -07:00
Eric W. Biederman	d06ca95643	userns: Teach inet_diag to work with user namespaces Compute the user namespace of the socket that we are replying to and translate the kuids of reported sockets into that user namespace. Cc: Andrew Vagin <avagin@openvz.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:20 -07:00
Eric W. Biederman	d13fda8564	userns: Convert net/ax25 to use kuid_t where appropriate Cc: Ralf Baechle <ralf@linux-mips.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:49:42 -07:00
Eric W. Biederman	4f82f45730	net ip6 flowlabel: Make owner a union of struct pid * and kuid_t Correct a long standing omission and use struct pid in the owner field of struct ip6_flowlabel when the share type is IPV6_FL_S_PROCESS. This guarantees we don't have issues when pid wraparound occurs. Use a kuid_t in the owner field of struct ip6_flowlabel when the share type is IPV6_FL_S_USER to add user namespace support. In /proc/net/ip6_flowlabel capture the current pid namespace when opening the file and release the pid namespace when the file is closed ensuring we print the pid owner value that is meaning to the reader of the file. Similarly use from_kuid_munged to print uid values that are meaningful to the reader of the file. This requires exporting pid_nr_ns so that ipv6 can continue to built as a module. Yoiks what silliness Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:49:25 -07:00
Eric W. Biederman	7064d16e16	userns: Use kgids for sysctl_ping_group_range - Store sysctl_ping_group_range as a paire of kgid_t values instead of a pair of gid_t values. - Move the kgid conversion work from ping_init_sock into ipv4_ping_group_range - For invalid cases reset to the default disabled state. With the kgid_t conversion made part of the original value sanitation from userspace understand how the code will react becomes clearer and it becomes possible to set the sysctl ping group range from something other than the initial user namespace. Cc: Vasiliy Kulikov <segoon@openwall.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:49:10 -07:00
Eric W. Biederman	a7cb5a49bf	userns: Print out socket uids in a user namespace aware fashion. Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Sridhar Samudrala <sri@us.ibm.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:48:06 -07:00
Eric W. Biederman	fc5795c8a9	userns: Allow USER_NS and NET simultaneously in Kconfig Now that the networking core is user namespace safe allow networking and user namespaces to be built at the same time. Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:47:45 -07:00
Eric W. Biederman	d755586052	userns: Allow the usernamespace support to build after the removal of usbfs The user namespace code has an explicit "depends on USB_DEVICEFS = n" dependency to prevent building code that is not yet user namespace safe. With the removal of usbfs from the kernel it is now impossible to satisfy the USB_DEFICEFS = n dependency and thus it is impossible to enable user namespace support in 3.5-rc1. So remove the now useless depedency. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-03 08:28:01 -07:00
Andrew Morton	c255a45805	memcg: rename config variables Sanity: CONFIG_CGROUP_MEM_RES_CTLR -> CONFIG_MEMCG CONFIG_CGROUP_MEM_RES_CTLR_SWAP -> CONFIG_MEMCG_SWAP CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED -> CONFIG_MEMCG_SWAP_ENABLED CONFIG_CGROUP_MEM_RES_CTLR_KMEM -> CONFIG_MEMCG_KMEM [mhocko@suse.cz: fix missed bits] Cc: Glauber Costa <glommer@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-07-31 18:42:43 -07:00
Aneesh Kumar K.V	2bc64a2046	mm/hugetlb: add new HugeTLB cgroup Implement a new controller that allows us to control HugeTLB allocations. The extension allows to limit the HugeTLB usage per control group and enforces the controller limit during page fault. Since HugeTLB doesn't support page reclaim, enforcing the limit at page fault time implies that, the application will get SIGBUS signal if it tries to access HugeTLB pages beyond its limit. This requires the application to know beforehand how much HugeTLB pages it would require for its use. The charge/uncharge calls will be added to HugeTLB code in later patch. Support for cgroup removal will be added in later patches. [akpm@linux-foundation.org: s/CONFIG_CGROUP_HUGETLB_RES_CTLR/CONFIG_MEMCG_HUGETLB/g] [akpm@linux-foundation.org: s/CONFIG_MEMCG_HUGETLB/CONFIG_CGROUP_HUGETLB/g] Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: Hillf Danton <dhillf@gmail.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-07-31 18:42:40 -07:00
Will Deacon	8f827a1468	ARM: 7453/1: audit: only allow syscall auditing for pure EABI userspace The audit tools support only EABI userspace and, since there are no AUDIT_ARCH_* defines for the ARM OABI, it makes sense to allow syscall auditing on ARM only for EABI at the moment. Cc: Eric Paris <eparis@redhat.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2012-07-09 17:44:12 +01:00
Linus Torvalds	08615d7d85	Merge branch 'akpm' (Andrew's patch-bomb) Merge misc patches from Andrew Morton: - the "misc" tree - stuff from all over the map - checkpatch updates - fatfs - kmod changes - procfs - cpumask - UML - kexec - mqueue - rapidio - pidns - some checkpoint-restore feature work. Reluctantly. Most of it delayed a release. I'm still rather worried that we don't have a clear roadmap to completion for this work. * emailed from Andrew Morton <akpm@linux-foundation.org>: (78 patches) kconfig: update compression algorithm info c/r: prctl: add ability to set new mm_struct::exe_file c/r: prctl: extend PR_SET_MM to set up more mm_struct entries c/r: procfs: add arg_start/end, env_start/end and exit_code members to /proc/$pid/stat syscalls, x86: add __NR_kcmp syscall fs, proc: introduce /proc/<pid>/task/<tid>/children entry sysctl: make kernel.ns_last_pid control dependent on CHECKPOINT_RESTORE aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector() eventfd: change int to __u64 in eventfd_signal() fs/nls: add Apple NLS pidns: make killed children autoreap pidns: use task_active_pid_ns in do_notify_parent rapidio/tsi721: add DMA engine support rapidio: add DMA engine support for RIO data transfers ipc/mqueue: add rbtree node caching support tools/selftests: add mq_perf_tests ipc/mqueue: strengthen checks on mqueue creation ipc/mqueue: correct mq_attr_ok test ipc/mqueue: improve performance of send/recv selftests: add mq_open_tests ...	2012-05-31 18:10:18 -07:00
Randy Dunlap	0a4dd35c67	kconfig: update compression algorithm info There have been new compression algorithms added without updating nearby relevant descriptive text that refers to (a) the number of compression algorithms and (b) the most recent one. Fix these inconsistencies. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Reported-by: <qasdfgtyuiop@gmail.com> Cc: Lasse Collin <lasse.collin@tukaani.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Alain Knaff <alain@knaff.lu> Cc: Albin Tonnerre <albin.tonnerre@free-electrons.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:33 -07:00
Linus Torvalds	0d167518e0	Merge branch 'for-3.5/core' of git://git.kernel.dk/linux-block Merge block/IO core bits from Jens Axboe: "This is a bit bigger on the core side than usual, but that is purely because we decided to hold off on parts of Tejun's submission on 3.4 to give it a bit more time to simmer. As a consequence, it's seen a long cycle in for-next. It contains: - Bug fix from Dan, wrong locking type. - Relax splice gifting restriction from Eric. - A ton of updates from Tejun, primarily for blkcg. This improves the code a lot, making the API nicer and cleaner, and also includes fixes for how we handle and tie policies and re-activate on switches. The changes also include generic bug fixes. - A simple fix from Vivek, along with a fix for doing proper delayed allocation of the blkcg stats." Fix up annoying conflict just due to different merge resolution in Documentation/feature-removal-schedule.txt * 'for-3.5/core' of git://git.kernel.dk/linux-block: (92 commits) blkcg: tg_stats_alloc_lock is an irq lock vmsplice: relax alignement requirements for SPLICE_F_GIFT blkcg: use radix tree to index blkgs from blkcg blkcg: fix blkcg->css ref leak in __blkg_lookup_create() block: fix elvpriv allocation failure handling block: collapse blk_alloc_request() into get_request() blkcg: collapse blkcg_policy_ops into blkcg_policy blkcg: embed struct blkg_policy_data in policy specific data blkcg: mass rename of blkcg API blkcg: style cleanups for blk-cgroup.h blkcg: remove blkio_group->path[] blkcg: blkg_rwstat_read() was missing inline blkcg: shoot down blkgs if all policies are deactivated blkcg: drop stuff unused after per-queue policy activation update blkcg: implement per-queue policy activation blkcg: add request_queue->root_blkg blkcg: make request_queue bypassing on allocation blkcg: make sure blkg_lookup() returns %NULL if @q is bypassing blkcg: make blkg_conf_prep() take @pol and return with queue lock held blkcg: remove static policy ID enums ...	2012-05-30 08:52:42 -07:00
Linus Torvalds	c7523a7c88	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner. Various trivial conflict fixups in arch Kconfig due to addition of unrelated entries nearby. And one slightly more subtle one for sparc32 (new user of GENERIC_CLOCKEVENTS), fixed up as per Thomas. * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) timekeeping: Fix a few minor newline issues. time: remove obsolete declaration ntp: Fix a stale comment and a few stray newlines. ntp: Correct TAI offset during leap second timers: Fixup the Kconfig consolidation fallout x86: Use generic time config unicore32: Use generic time config um: Use generic time config tile: Use generic time config sparc: Use: generic time config sh: Use generic time config score: Use generic time config s390: Use generic time config openrisc: Use generic time config powerpc: Use generic time config mn10300: Use generic time config mips: Use generic time config microblaze: Use generic time config m68k: Use generic time config m32r: Use generic time config ...	2012-05-24 13:29:46 -07:00
Linus Torvalds	644473e9c6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace enhancements from Eric Biederman: "This is a course correction for the user namespace, so that we can reach an inexpensive, maintainable, and reasonably complete implementation. Highlights: - Config guards make it impossible to enable the user namespace and code that has not been converted to be user namespace safe. - Use of the new kuid_t type ensures the if you somehow get past the config guards the kernel will encounter type errors if you enable user namespaces and attempt to compile in code whose permission checks have not been updated to be user namespace safe. - All uids from child user namespaces are mapped into the initial user namespace before they are processed. Removing the need to add an additional check to see if the user namespace of the compared uids remains the same. - With the user namespaces compiled out the performance is as good or better than it is today. - For most operations absolutely nothing changes performance or operationally with the user namespace enabled. - The worst case performance I could come up with was timing 1 billion cache cold stat operations with the user namespace code enabled. This went from 156s to 164s on my laptop (or 156ns to 164ns per stat operation). - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value. Most uid/gid setting system calls treat these value specially anyway so attempting to use -1 as a uid would likely cause entertaining failures in userspace. - If setuid is called with a uid that can not be mapped setuid fails. I have looked at sendmail, login, ssh and every other program I could think of that would call setuid and they all check for and handle the case where setuid fails. - If stat or a similar system call is called from a context in which we can not map a uid we lie and return overflowuid. The LFS experience suggests not lying and returning an error code might be better, but the historical precedent with uids is different and I can not think of anything that would break by lying about a uid we can't map. - Capabilities are localized to the current user namespace making it safe to give the initial user in a user namespace all capabilities. My git tree covers all of the modifications needed to convert the core kernel and enough changes to make a system bootable to runlevel 1." Fix up trivial conflicts due to nearby independent changes in fs/stat.c * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits) userns: Silence silly gcc warning. cred: use correct cred accessor with regards to rcu read lock userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq userns: Convert cgroup permission checks to use uid_eq userns: Convert tmpfs to use kuid and kgid where appropriate userns: Convert sysfs to use kgid/kuid where appropriate userns: Convert sysctl permission checks to use kuid and kgids. userns: Convert proc to use kuid/kgid where appropriate userns: Convert ext4 to user kuid/kgid where appropriate userns: Convert ext3 to use kuid/kgid where appropriate userns: Convert ext2 to use kuid/kgid where appropriate. userns: Convert devpts to use kuid/kgid where appropriate userns: Convert binary formats to use kuid/kgid where appropriate userns: Add negative depends on entries to avoid building code that is userns unsafe userns: signal remove unnecessary map_cred_ns userns: Teach inode_capable to understand inodes whose uids map to other namespaces. userns: Fail exec for suid and sgid binaries with ids outside our user namespace. userns: Convert stat to return values mapped from kuids and kgids userns: Convert user specfied uids and gids in chown into kuids and kgid userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs ...	2012-05-23 17:42:39 -07:00
Linus Torvalds	269af9a1a0	Merge branch 'x86-extable-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull exception table generation updates from Ingo Molnar: "The biggest change here is to allow the build-time sorting of the exception table, to speed up booting. This is achieved by the architecture enabling BUILDTIME_EXTABLE_SORT. This option is enabled for x86 and MIPS currently. On x86 a number of fixes and changes were needed to allow build-time sorting of the exception table, in particular a relocation invariant exception table format was needed. This required the abstracting out of exception table protocol and the removal of 20 years of accumulated assumptions about the x86 exception table format. While at it, this tree also cleans up various other aspects of exception handling, such as early(er) exception handling for rdmsr_safe() et al. All in one, as the result of these changes the x86 exception code is now pretty nice and modern. As an added bonus any regressions in this code will be early and violent crashes, so if you see any of those, you'll know whom to blame!" Fix up trivial conflicts in arch/{mips,x86}/Kconfig files due to nearby modifications of other core architecture options. * 'x86-extable-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits) Revert "x86, extable: Disable presorted exception table for now" scripts/sortextable: Handle relative entries, and other cleanups x86, extable: Switch to relative exception table entries x86, extable: Disable presorted exception table for now x86, extable: Add _ASM_EXTABLE_EX() macro x86, extable: Remove open-coded exception table entries in arch/x86/ia32/ia32entry.S x86, extable: Remove open-coded exception table entries in arch/x86/include/asm/xsave.h x86, extable: Remove open-coded exception table entries in arch/x86/include/asm/kvm_host.h x86, extable: Remove the now-unused __ASM_EX_SEC macros x86, extable: Remove open-coded exception table entries in arch/x86/xen/xen-asm_32.S x86, extable: Remove open-coded exception table entries in arch/x86/um/checksum_32.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/usercopy_32.c x86, extable: Remove open-coded exception table entries in arch/x86/lib/putuser.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/getuser.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/csum-copy_64.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/copy_user_nocache_64.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/copy_user_64.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/checksum_32.S x86, extable: Remove open-coded exception table entries in arch/x86/kernel/test_rodata.c x86, extable: Remove open-coded exception table entries in arch/x86/kernel/entry_64.S ...	2012-05-23 10:44:35 -07:00
Linus Torvalds	2ff2b289a6	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf changes from Ingo Molnar: "Lots of changes: - (much) improved assembly annotation support in perf report, with jump visualization, searching, navigation, visual output improvements and more. - kernel support for AMD IBS PMU hardware features. Notably 'perf record -e cycles:p' and 'perf top -e cycles:p' should work without skid now, like PEBS does on the Intel side, because it takes advantage of IBS transparently. - the libtracevents library: it is the first step towards unifying tracing tooling and perf, and it also gives a tracing library for external tools like powertop to rely on. - infrastructure: various improvements and refactoring of the UI modules and related code - infrastructure: cleanup and simplification of the profiling targets code (--uid, --pid, --tid, --cpu, --all-cpus, etc.) - tons of robustness fixes all around - various ftrace updates: speedups, cleanups, robustness improvements. - typing 'make' in tools/ will now give you a menu of projects to build and a short help text to explain what each does. - ... and lots of other changes I forgot to list. The perf record make bzImage + perf report regression you reported should be fixed." * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (166 commits) tracing: Remove kernel_lock annotations tracing: Fix initial buffer_size_kb state ring-buffer: Merge separate resize loops perf evsel: Create events initially disabled -- again perf tools: Split term type into value type and term type perf hists: Fix callchain ip printf format perf target: Add uses_mmap field ftrace: Remove selecting FRAME_POINTER with FUNCTION_TRACER ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code() ftrace: Make ftrace_modify_all_code() global for archs to use ftrace: Return record ip addr for ftrace_location() ftrace: Consolidate ftrace_location() and ftrace_text_reserved() ftrace: Speed up search by skipping pages by address ftrace: Remove extra helper functions ftrace: Sort all function addresses, not just per page tracing: change CPU ring buffer state from tracing_cpumask tracing: Check return value of tracing_dentry_percpu() ring-buffer: Reset head page before running self test ring-buffer: Add integrity check at end of iter read ring-buffer: Make addition of pages in ring buffer atomic ...	2012-05-22 18:18:55 -07:00
Thomas Gleixner	764e0da14f	timers: Fixup the Kconfig consolidation fallout Sigh, I missed to check which architecture Kconfig files actually include the core Kconfig file. There are a few which did not. So we broke them. Instead of adding the includes to those, we are better off to move the include to init/Kconfig like we did already with irqs and others. This does not change anything for the architectures using the old style periodic timer mode. It just solves the build wreckage there. For those architectures which use the clock events infrastructure it moves the include of the core Kconfig file to "General setup" which is a way more logical place than having it at random locations specified by the architecture specific Kconfigs. Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Anna-Maria Gleixner <anna-maria@glx-um.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-05-21 23:43:46 +02:00
Eric W. Biederman	b38a86eb19	userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:30 -07:00
Eric W. Biederman	14a590c3f9	userns: Convert cgroup permission checks to use uid_eq Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:30 -07:00
Eric W. Biederman	8751e03958	userns: Convert tmpfs to use kuid and kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:29 -07:00
Eric W. Biederman	ab27b91b9f	userns: Convert sysfs to use kgid/kuid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:29 -07:00
Eric W. Biederman	091bd3ea4e	userns: Convert sysctl permission checks to use kuid and kgids. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:28 -07:00
Eric W. Biederman	dcb0f22282	userns: Convert proc to use kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:28 -07:00
Eric W. Biederman	08cefc7ab8	userns: Convert ext4 to user kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:27 -07:00
Eric W. Biederman	1523299d58	userns: Convert ext3 to use kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:27 -07:00
Eric W. Biederman	b8a9f9e183	userns: Convert ext2 to use kuid/kgid where appropriate. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:26 -07:00
Eric W. Biederman	f04c6ce2cf	userns: Convert devpts to use kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:26 -07:00
Eric W. Biederman	ebc887b278	userns: Convert binary formats to use kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:25 -07:00
Eric W. Biederman	e1c972b681	userns: Add negative depends on entries to avoid building code that is userns unsafe Add a new internal Kconfig option UIDGID_CONVERTED that is true when the selected Kconfig options have been converted to be user namespace safe, and guard USER_NS and guard the UIDGID_STRICT_TYPE_CHECK options with it. This keeps innocent kernel users from having the choice to enable the user namespace in the cases where it is known not to work. Most of the rest of the conversions are simple and straight forward but their sheer number means it is good not to count on having them all done and reviwed before thinking of merging this code. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:25 -07:00
Jens Axboe	0b7877d4ee	Linux 3.4-rc5 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJPnb50AAoJEHm+PkMAQRiGAE0H/A4zFZIUGmF3miKPDYmejmrZ oVDYxVAu6JHjHWhu8E3VsinvyVscowjV8dr15eSaQzmDmRkSHAnUQ+dB7Di7jLC2 MNopxsWjwyZ8zvvr3rFR76kjbWKk/1GYytnf7GPZLbJQzd51om2V/TY/6qkwiDSX U8Tt7ihSgHAezefqEmWp2X/1pxDCEt+VFyn9vWpkhgdfM1iuzF39MbxSZAgqDQ/9 JJrBHFXhArqJguhENwL7OdDzkYqkdzlGtS0xgeY7qio2CzSXxZXK4svT6FFGA8Za xlAaIvzslDniv3vR2ZKd6wzUwFHuynX222hNim3QMaYdXm012M+Nn1ufKYGFxI0= =4d4w -----END PGP SIGNATURE----- Merge tag 'v3.4-rc5' into for-3.5/core The core branch is behind driver commits that we want to build on for 3.5, hence I'm pulling in a later -rc. Linux 3.4-rc5 Conflicts: Documentation/feature-removal-schedule.txt Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-05-01 14:29:55 +02:00
Robert Richter	392d65a9ad	perf: Remove PERF_COUNTERS config option Renaming remaining PERF_COUNTERS options into PERF_EVENTS. Think we can get rid of PERF_COUNTERS now. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333643084-26776-5-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-26 13:52:52 +02:00
Paul E. McKenney	8932a63d5e	rcu: Reduce cache-miss initialization latencies for large systems Commit #0209f649 (rcu: limit rcu_node leaf-level fanout) set an upper limit of 16 on the leaf-level fanout for the rcu_node tree. This was needed to reduce lock contention that was induced by the synchronization of scheduling-clock interrupts, which was in turn needed to improve energy efficiency for moderate-sized lightly loaded servers. However, reducing the leaf-level fanout means that there are more leaf-level rcu_node structures in the tree, which in turn means that RCU's grace-period initialization incurs more cache misses. This is not a problem on moderate-sized servers with only a few tens of CPUs, but becomes a major source of real-time latency spikes on systems with many hundreds of CPUs. In addition, the workloads running on these large systems tend to be CPU-bound, which eliminates the energy-efficiency advantages of synchronizing scheduling-clock interrupts. Therefore, these systems need maximal values for the rcu_node leaf-level fanout. This commit addresses this problem by introducing a new kernel parameter named RCU_FANOUT_LEAF that directly controls the leaf-level fanout. This parameter defaults to 16 to handle the common case of a moderate sized lightly loaded servers, but may be set higher on larger systems. Reported-by: Mike Galbraith <efault@gmx.de> Reported-by: Dimitri Sivanich <sivanich@sgi.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-24 20:54:52 -07:00
Paul E. McKenney	c9336643e1	rcu: Clarify help text for RCU_BOOST_PRIO The old text confused real-time applications with real-time threads, so that you pretty much needed to understand how this kernel configuration parameter worked to understand the help text. This commit therefore attempts to make the help text human-readable. Reported-by: Jörn Engel <joern@purestorage.com> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-24 20:54:50 -07:00
David Daney	1dbdc6f177	kbuild/extable: Hook up sortextable into the build system. Define a config variable BUILDTIME_EXTABLE_SORT to control build time sorting of the kernel's exception table. Patch Makefile to do the sorting when BUILDTIME_EXTABLE_SORT is selected. Signed-off-by: David Daney <david.daney@cavium.com> Link: http://lkml.kernel.org/r/1334872799-14589-4-git-send-email-ddaney.cavm@gmail.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-04-19 15:06:56 -07:00
Eric W. Biederman	5673a94c14	userns: Add a Kconfig option to enforce strict kuid and kgid type checks Make it possible to easily switch between strong mandatory type checks and relaxed type checks so that the code can easily be tested with the type checks and then built with the strong type checks disabled so the resulting code can be used. Require strong mandatory type checks when enabling the user namespace. It is very simple to make a typo and use the wrong type allowing conversions to/from userspace values to be bypassed by accident, the strong type checks prevent this. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-04-07 17:11:01 -07:00
Tejun Heo	959d851caa	Merge branch 'for-3.5' of ../cgroup into block/for-3.5/core-merged cgroup/for-3.5 contains the following changes which blk-cgroup needs to proceed with the on-going cleanup. * Dynamic addition and removal of cftypes to make config/stat file handling modular for policies. * cgroup removal update to not wait for css references to drain to fix blkcg removal hang caused by cfq caching cfqgs. Pull in cgroup/for-3.5 into block/for-3.5/core. This causes the following conflicts in block/blk-cgroup.c. * `761b3ef50e` "cgroup: remove cgroup_subsys argument from callbacks" conflicts with blkiocg_pre_destroy() addition and blkiocg_attach() removal. Resolved by removing @subsys from all subsys methods. * `676f7c8f84` "cgroup: relocate cftype and cgroup_subsys definitions in controllers" conflicts with ->pre_destroy() and ->attach() updates and removal of modular config. Resolved by dropping forward declarations of the methods and applying updates to the relocated blkio_subsys. * `4baf6e3325` "cgroup: convert all non-memcg controllers to the new cftype interface" builds upon the previous item. Resolved by adding ->base_cftypes to the relocated blkio_subsys. Signed-off-by: Tejun Heo <tj@kernel.org>	2012-04-01 12:55:00 -07:00
Rusty Russell	5f054e31c6	documentation: remove references to cpu_*_map. This has been obsolescent for a while, fix documentation and misc comments. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-03-29 15:38:31 +10:30
Tejun Heo	32e380aedc	blkcg: make CONFIG_BLK_CGROUP bool Block cgroup core can be built as module; however, it isn't too useful as blk-throttle can only be built-in and cfq-iosched is usually the default built-in scheduler. Scheduled blkcg cleanup requires calling into blkcg from block core. To simplify that, disallow building blkcg as module by making CONFIG_BLK_CGROUP bool. If building blkcg core as module really matters, which I doubt, we can revisit it after blkcg API cleanup. -v2: Vivek pointed out that IOSCHED_CFQ was incorrectly updated to depend on BLK_CGROUP. Fixed. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-03-06 21:27:21 +01:00
Paul E. McKenney	5c8806a037	rcu: Move RCU_TRACE to lib/Kconfig.debug The RCU_TRACE kernel parameter has always been intended for debugging, not for production use. Formalize this by moving RCU_TRACE from init/Kconfig to lib/Kconfig.debug. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-02-21 09:03:26 -08:00
Linus Torvalds	f429ee3b80	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit: (29 commits) audit: no leading space in audit_log_d_path prefix audit: treat s_id as an untrusted string audit: fix signedness bug in audit_log_execve_info() audit: comparison on interprocess fields audit: implement all object interfield comparisons audit: allow interfield comparison between gid and ogid audit: complex interfield comparison helper audit: allow interfield comparison in audit rules Kernel: Audit Support For The ARM Platform audit: do not call audit_getname on error audit: only allow tasks to set their loginuid if it is -1 audit: remove task argument to audit_set_loginuid audit: allow audit matching on inode gid audit: allow matching on obj_uid audit: remove audit_finish_fork as it can't be called audit: reject entry,always rules audit: inline audit_free to simplify the look of generic code audit: drop audit_set_macxattr as it doesn't do anything audit: inline checks for not needing to collect aux records audit: drop some potentially inadvisable likely notations ... Use evil merge to fix up grammar mistakes in Kconfig file. Bad speling and horrible grammar (and copious swearing) is to be expected, but let's keep it to commit messages and comments, rather than expose it to users in config help texts or printouts.	2012-01-17 16:41:31 -08:00
Nathaniel Husted	29ef73b7a8	Kernel: Audit Support For The ARM Platform This patch provides functionality to audit system call events on the ARM platform. The implementation was based off the structure of the MIPS platform and information in this (http://lists.fedoraproject.org/pipermail/arm/2009-October/000382.html) mailing list thread. The required audit_syscall_exit and audit_syscall_entry checks were added to ptrace using the standard registers for system call values (r0 through r3). A thread information flag was added for auditing (TIF_SYSCALL_AUDIT) and a meta-flag was added (_TIF_SYSCALL_WORK) to simplify modifications to the syscall entry/exit. Now, if either the TRACE flag is set or the AUDIT flag is set, the syscall_trace function will be executed. The prober changes were made to Kconfig to allow CONFIG_AUDITSYSCALL to be enabled. Due to platform availability limitations, this patch was only tested on the Android platform running the modified "android-goldfish-2.6.29" kernel. A test compile was performed using Code Sourcery's cross-compilation toolset and the current linux-3.0 stable kernel. The changes compile without error. I'm hoping, due to the simple modifications, the patch is "obviously correct". Signed-off-by: Nathaniel Husted <nhusted@gmail.com> Signed-off-by: Eric Paris <eparis@redhat.com>	2012-01-17 16:17:01 -05:00
Eric Paris	633b454545	audit: only allow tasks to set their loginuid if it is -1 At the moment we allow tasks to set their loginuid if they have CAP_AUDIT_CONTROL. In reality we want tasks to set the loginuid when they log in and it be impossible to ever reset. We had to make it mutable even after it was once set (with the CAP) because on update and admin might have to restart sshd. Now sshd would get his loginuid and the next user which logged in using ssh would not be able to set his loginuid. Systemd has changed how userspace works and allowed us to make the kernel work the way it should. With systemd users (even admins) are not supposed to restart services directly. The system will restart the service for them. Thus since systemd is going to loginuid==-1, sshd would get -1, and sshd would be allowed to set a new loginuid without special permissions. If an admin in this system were to manually start an sshd he is inserting himself into the system chain of trust and thus, logically, it's his loginuid that should be used! Since we have old systems I make this a Kconfig option. Signed-off-by: Eric Paris <eparis@redhat.com>	2012-01-17 16:17:00 -05:00
Cyrill Gorcunov	067bce1a06	c/r: introduce CHECKPOINT_RESTORE symbol For checkpoint/restore we need auxilary features being compiled into the kernel, such as additional prctl codes, /proc/<pid>/map_files and etc... but same time these features are not mandatory for a regular kernel so CHECKPOINT_RESTORE config symbol should bring a way to disable them all at once if one wish to get rid of additional functionality. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Vagin <avagin@openvz.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: Vasiliy Kulikov <segoon@openwall.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-01-12 20:13:12 -08:00
Linus Torvalds	b8bf17d311	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix lockup by limiting load-balance retries on lock-break sched: Fix CONFIG_CGROUP_SCHED dependency sched: Remove empty #ifdefs	2012-01-11 22:52:48 -08:00
Fabio Estevam	6db9dc150e	sched: Fix CONFIG_CGROUP_SCHED dependency The dependency bug was pointed out by this build warning: warning: (SCHED_AUTOGROUP) selects CGROUP_SCHED which has unmet direct dependencies (CGROUPS && EXPERIMENTAL) Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Cc: a.p.zijlstra@chello.nl Cc: Fabio Estevam <festevam@gmail.com> Link: http://lkml.kernel.org/r/1326192383-5113-1-git-send-email-festevam@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-01-10 13:44:50 +01:00
Linus Torvalds	9753dfe19a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1958 commits) net: pack skb_shared_info more efficiently net_sched: red: split red_parms into parms and vars net_sched: sfq: extend limits cnic: Improve error recovery on bnx2x devices cnic: Re-init dev->stats_addr after chip reset net_sched: Bug in netem reordering bna: fix sparse warnings/errors bna: make ethtool_ops and strings const xgmac: cleanups net: make ethtool_ops const vmxnet3" make ethtool ops const xen-netback: make ops structs const virtio_net: Pass gfp flags when allocating rx buffers. ixgbe: FCoE: Add support for ndo_get_fcoe_hbainfo() call netdev: FCoE: Add new ndo_get_fcoe_hbainfo() call igb: reset PHY after recovering from PHY power down igb: add basic runtime PM support igb: Add support for byte queue limits. e1000: cleanup CE4100 MDIO registers access e1000: unmap ce4100_gbe_mdio_base_virt in e1000_remove ...	2012-01-06 17:22:09 -08:00
Glauber Costa	e5671dfae5	Basic kernel memory functionality for the Memory Controller This patch lays down the foundation for the kernel memory component of the Memory Controller. As of today, I am only laying down the following files: * memory.independent_kmem_limit * memory.kmem.limit_in_bytes (currently ignored) * memory.kmem.usage_in_bytes (always zero) Signed-off-by: Glauber Costa <glommer@parallels.com> CC: Kirill A. Shutemov <kirill@shutemov.name> CC: Paul Menage <paul@paulmenage.org> CC: Greg Thelen <gthelen@google.com> CC: Johannes Weiner <jweiner@redhat.com> CC: Michal Hocko <mhocko@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-12 19:03:55 -05:00
Paul E. McKenney	b807fbff31	rcu: Permit RCU_FAST_NO_HZ to be used by TREE_PREEMPT_RCU The new implementation of RCU_FAST_NO_HZ is compatible with preemptible RCU, so this commit removes the Kconfig restriction that previously prohibited this. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-12-11 10:31:45 -08:00
WANG Cong	c736de60ae	sysctl: make CONFIG_SYSCTL_SYSCALL default to n When I tried to send a patch to remove it, Andi told me we still need to keep compabitlies for old libc, so we can't remove this completely. Then just make it default to n and remove the doc from feature-removal-schedule.txt. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-11-02 16:07:02 -07:00
Linus Torvalds	8a4a8918ed	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) llist: Add back llist_add_batch() and llist_del_first() prototypes sched: Don't use tasklist_lock for debug prints sched: Warn on rt throttling sched: Unify the ->cpus_allowed mask copy sched: Wrap scheduler p->cpus_allowed access sched: Request for idle balance during nohz idle load balance sched: Use resched IPI to kick off the nohz idle balance sched: Fix idle_cpu() llist: Remove cpu_relax() usage in cmpxchg loops sched: Convert to struct llist llist: Add llist_next() irq_work: Use llist in the struct irq_work logic llist: Return whether list is empty before adding in llist_add() llist: Move cpu_relax() to after the cmpxchg() llist: Remove the platform-dependent NMI checks llist: Make some llist functions inline sched, tracing: Show PREEMPT_ACTIVE state in trace_sched_switch sched: Remove redundant test in check_preempt_tick() sched: Add documentation for bandwidth control sched: Return unused runtime on group dequeue ...	2011-10-26 17:08:43 +02:00
Paul E. McKenney	8008e129dc	rcu: Drive configuration directly from SMP and PREEMPT This commit eliminates the possibility of running TREE_PREEMPT_RCU when SMP=n and of running TINY_RCU when PREEMPT=y. People who really want these combinations can hand-edit init/Kconfig, but eliminating them as choices for production systems reduces the amount of testing required. It will also allow cutting out a few #ifdefs. Note that running TREE_RCU and TINY_RCU on single-CPU systems using SMP-built kernels is still supported. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:36:45 -07:00
Paul Turner	ab84d31e15	sched: Introduce primitives to account for CFS bandwidth tracking In this patch we introduce the notion of CFS bandwidth, partitioned into globally unassigned bandwidth, and locally claimed bandwidth. - The global bandwidth is per task_group, it represents a pool of unclaimed bandwidth that cfs_rqs can allocate from. - The local bandwidth is tracked per-cfs_rq, this represents allotments from the global pool bandwidth assigned to a specific cpu. Bandwidth is managed via cgroupfs, adding two new interfaces to the cpu subsystem: - cpu.cfs_period_us : the bandwidth period in usecs - cpu.cfs_quota_us : the cpu bandwidth (in usecs) that this tg will be allowed to consume over period above. Signed-off-by: Paul Turner <pjt@google.com> Signed-off-by: Nikhil Rao <ncrao@google.com> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110721184756.972636699@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-14 12:03:20 +02:00
WANG Cong	00a66d2974	mm: remove the leftovers of noswapaccount In commit `a2c8990aed` ("memsw: remove noswapaccount kernel parameter"), Michal forgot to remove some left pieces of noswapaccount in the tree, this patch removes them all. Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Acked-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-25 20:57:09 -07:00
Linus Torvalds	c0c463d34a	Merge branches 'x86-urgent-for-linus', 'core-debug-for-linus', 'irq-core-for-linus' and 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: um: Make rwsem.S depend on CONFIG_RWSEM_XCHGADD_ALGORITHM * 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: debug: Make CONFIG_EXPERT select CONFIG_DEBUG_KERNEL to unhide debug options * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: genirq: Remove unused CHECK_IRQ_PER_CPU() * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf tools, x86: Fix 32-bit compile on 64-bit system	2011-07-23 10:33:08 -07:00
Linus Torvalds	a99a7d1436	Merge branch 'timers-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: mips: Fix i8253 clockevent fallout i8253: Cleanup outb/inb magic arm: Footbridge: Use common i8253 clockevent mips: Use common i8253 clockevent x86: Use common i8253 clockevent i8253: Create common clockevent implementation i8253: Export i8253_lock unconditionally pcpskr: MIPS: Make config dependencies finer grained pcspkr: Cleanup Kconfig dependencies i8253: Move remaining content and delete asm/i8253.h i8253: Consolidate definitions of PIT_LATCH x86: i8253: Consolidate definitions of global_clock_event i8253: Alpha, PowerPC: Remove unused asm/8253pit.h alpha: i8253: Cleanup remaining users of i8253pit.h i8253: Remove I8253_LOCK config i8253: Make pcsp sound driver use the shared i8253_lock i8253: Make pcspkr input driver use the shared i8253_lock i8253: Consolidate all kernel definitions of i8253_lock i8253: Unify all kernel declarations of i8253_lock i8253: Create linux/i8253.h and use it in all 8253 related files	2011-07-22 16:51:56 -07:00
Josh Triplett	d2c3225879	gcov: disable CONFIG_CONSTRUCTORS when not needed by CONFIG_GCOV_KERNEL CONFIG_CONSTRUCTORS controls support for running constructor functions at kernel init time. According to commit `b99b87f70c` ("kernel: constructor support"), gcov (CONFIG_GCOV_KERNEL) needs this. However, CONFIG_CONSTRUCTORS currently defaults to y, with no option to disable it, and CONFIG_GCOV_KERNEL depends on it. Instead, default it to n and have CONFIG_GCOV_KERNEL select it, so that the normal case of CONFIG_GCOV_KERNEL=n will result in CONFIG_CONSTRUCTORS=n. Observed in the short list of =y values in a minimal kernel configuration. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Acked-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-06-15 20:04:01 -07:00
Josh Triplett	bd5dc17be8	uts: make default hostname configurable, rather than always using "(none)" The "hostname" tool falls back to setting the hostname to "localhost" if /etc/hostname does not exist. Distribution init scripts have the same fallback. However, if userspace never calls sethostname, such as when booting with init=/bin/sh, or otherwise booting a minimal system without the usual init scripts, the default hostname of "(none)" remains, unhelpfully appearing in various places such as prompts ("root@(none):~#") and logs. Furthermore, "(none)" doesn't typically resolve to anything useful. Make the default hostname configurable. This removes the need for the standard fallback, provides a useful default for systems that never call sethostname, and makes minimal systems that much more useful with less configuration. Distributions could choose to use "localhost" here to avoid the fallback, while embedded systems may wish to use a specific target hostname. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: David Miller <davem@davemloft.net> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Kel Modderman <kel@otaku42.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-06-15 20:04:00 -07:00
Ralf Baechle	8761f1ab71	pcspkr: Cleanup Kconfig dependencies Lenghty lists of the kind "depends on ARCH1 \|\| ARCH2 ... \|\| ARCH123" are usually either wrong or too coarse grained. Or plain an ugly sin. [ tglx: Fixed up amigaone ] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linux-alpha@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Gerhard Pircher <gerhard_pircher@gmx.net> Link: http://lkml.kernel.org/r/20110601180610.984881988@duck.linux-mips.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-06-09 15:01:41 +02:00
Ralf Baechle	15f304b664	i8253: Consolidate all kernel definitions of i8253_lock Move them to drivers/clocksource/i8253.c and remove the implementations in arch/ [ tglx: Avoid the extra file in lib - folded arch patches in. The export will become conditional in a later step ] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Link: http://lkml.kernel.org/r/20110601180610.221426078@duck.linux-mips.net Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-06-09 15:01:38 +02:00
Josh Triplett	f505c553db	debug: Make CONFIG_EXPERT select CONFIG_DEBUG_KERNEL to unhide debug options Several debugging options currently default to y, such as CONFIG_DEBUG_BUGVERBOSE and CONFIG_DEBUG_RODATA. Embedded users might want to turn those options off to save space; however, turning them off requires turning on CONFIG_DEBUG_KERNEL to unhide them. Since CONFIG_DEBUG_KERNEL exists specifically to unhide debugging options, and CONFIG_EXPERT exists specifically to unhide options potentially needed by experts and/or embedded users, make CONFIG_EXPERT automatically imply CONFIG_DEBUG_KERNEL. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20110606012358.GA1909@leaf Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-06-07 00:05:15 +02:00
Daniel Lezcano	a77aea9201	cgroup: remove the ns_cgroup The ns_cgroup is an annoying cgroup at the namespace / cgroup frontier and leads to some problems: * cgroup creation is out-of-control * cgroup name can conflict when pids are looping * it is not possible to have a single process handling a lot of namespaces without falling in a exponential creation time * we may want to create a namespace without creating a cgroup The ns_cgroup was replaced by a compatibility flag 'clone_children', where a newly created cgroup will copy the parent cgroup values. The userspace has to manually create a cgroup and add a task to the 'tasks' file. This patch removes the ns_cgroup as suggested in the following thread: https://lists.linux-foundation.org/pipermail/containers/2009-June/018616.html The 'cgroup_clone' function is removed because it is no longer used. This is a userspace-visible change. Commit `45531757b4` ("cgroup: notify ns_cgroup deprecated") (merged into 2.6.27) caused the kernel to emit a printk warning users that the feature is planned for removal. Since that time we have heard from XXX users who were affected by this. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Jamal Hadi Salim <hadi@cyberus.ca> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Acked-by: Matt Helsley <matthltc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-26 17:12:34 -07:00
Linus Torvalds	2bb732cdb4	Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6 * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6: kbuild: make KBUILD_NOCMDDEP=1 handle empty built-in.o scripts/kallsyms.c: fix potential segfault scripts/gen_initramfs_list.sh: Convert to a /bin/sh script kbuild: Fix GNU make v3.80 compatibility kbuild: Fix passing -Wno-* options to gcc 4.4+ kbuild: move scripts/basic/docproc.c to scripts/docproc.c kbuild: Fix Makefile.asm-generic for um kbuild: Allow to combine multiple W= levels kbuild: Disable -Wunused-but-set-variable for gcc 4.6.0 Fix handling of backlash character in LINUX_COMPILE_BY name kbuild: asm-generic support kbuild: implement several W= levels kbuild: Fix build with binutils <= 2.19 initramfs: Use KBUILD_BUILD_TIMESTAMP for generated entries kbuild: Allow to override LINUX_COMPILE_BY and LINUX_COMPILE_HOST macros kbuild: Drop unused LINUX_COMPILE_TIME and LINUX_COMPILE_DOMAIN macros kbuild: Use the deterministic mode of ar kbuild: Call gzip with -n kbuild: move KALLSYMS_EXTRA_PASS from Kconfig to Makefile Kconfig: improve KALLSYMS_ALL documentation Fix up trivial conflict in Makefile	2011-05-24 13:31:37 -07:00
Linus Torvalds	e98bae7592	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6: (28 commits) sparc32: fix build, fix missing cpu_relax declaration SCHED_TTWU_QUEUE is not longer needed since sparc32 now implements IPI sparc32,leon: Remove unnecessary page_address calls in LEON DMA API. sparc: convert old cpumask API into new one sparc32, sun4d: Implemented SMP IPIs support for SUN4D machines sparc32, sun4m: Implemented SMP IPIs support for SUN4M machines sparc32,leon: Implemented SMP IPIs for LEON CPU sparc32: implement SMP IPIs using the generic functions sparc32,leon: SMP power down implementation sparc32,leon: added some SMP comments sparc: add {read,write}*_be routines sparc32,leon: don't rely on bootloader to mask IRQs sparc32,leon: operate on boot-cpu IRQ controller registers sparc32: always define boot_cpu_id sparc32: removed unused code, implemented by generic code sparc32: avoid build warning at mm/percpu.c:1647 sparc32: always register a PROM based early console sparc32: probe for cpu info only during startup sparc: consolidate show_cpuinfo in cpu.c sparc32,leon: implement genirq CPU affinity ...	2011-05-22 22:06:24 -07:00
Linus Torvalds	281dc5c5ec	Give up on pushing CC_OPTIMIZE_FOR_SIZE I still happen to believe that I$ miss costs are a major thing, but sadly, -Os doesn't seem to be the solution. With or without it, gcc will miss some obvious code size improvements, and with it enabled gcc will sometimes make choices that aren't good even with high I$ miss ratios. For example, with -Os, gcc on x86 will turn a 20-byte constant memcpy into a "rep movsl". While I sincerely hope that x86 CPU's will some day do a good job at that, they certainly don't do it yet, and the cost is higher than a L1 I$ miss would be. Some day I hope we can re-enable this. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-22 14:30:36 -07:00
Daniel Hellstrom	17d9f311ec	SCHED_TTWU_QUEUE is not longer needed since sparc32 now implements IPI Signed-off-by: Daniel Hellstrom <daniel@gaisler.com> Reported-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-05-20 13:10:55 -07:00
Linus Torvalds	eb04f2f04e	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (78 commits) Revert "rcu: Decrease memory-barrier usage based on semi-formal proof" net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree() batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu() net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu() net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu() perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu() perf,rcu: convert call_rcu(free_ctx) to kfree_rcu() net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(net_generic_release) to kfree_rcu() net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu() net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu() security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu() net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu() net,rcu: convert call_rcu(xps_map_release) to kfree_rcu() net,rcu: convert call_rcu(rps_map_release) to kfree_rcu() ...	2011-05-19 18:14:34 -07:00
Linus Torvalds	80fe02b5da	Merge branches 'sched-core-for-linus' and 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits) sched: Fix and optimise calculation of the weight-inverse sched: Avoid going ahead if ->cpus_allowed is not changed sched, rt: Update rq clock when unthrottling of an otherwise idle CPU sched: Remove unused parameters from sched_fork() and wake_up_new_task() sched: Shorten the construction of the span cpu mask of sched domain sched: Wrap the 'cfs_rq->nr_spread_over' field with CONFIG_SCHED_DEBUG sched: Remove unused 'this_best_prio arg' from balance_tasks() sched: Remove noop in alloc_rt_sched_group() sched: Get rid of lock_depth sched: Remove obsolete comment from scheduler_tick() sched: Fix sched_domain iterations vs. RCU sched: Next buddy hint on sleep and preempt path sched: Make set__buddy() work on non-task entities sched: Remove need_migrate_task() sched: Move the second half of ttwu() to the remote cpu sched: Restructure ttwu() some more sched: Rename ttwu_post_activation() to ttwu_do_wakeup() sched: Remove rq argument from ttwu_stat() sched: Remove rq->lock from the first half of ttwu() sched: Drop rq->lock from sched_exec() ... 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Fix rt_rq runtime leakage bug	2011-05-19 17:41:22 -07:00
Ingo Molnar	9cb5baba5e	Merge commit 'v2.6.39-rc7' into sched/core	2011-05-12 09:36:18 +02:00
David Rientjes	21a43e397e	slub: Revert "[PARISC] slub: fix panic with DISCONTIGMEM" This reverts commit `4a5fa3590f`, which did not allow SLUB to be used on architectures that use DISCONTIGMEM without compiling NUMA support without CONFIG_BROKEN also set. The slub panic that it was intended to prevent is addressed by `d9b41e0b54` ("[PARISC] set memory ranges in N_NORMAL_MEMORY when onlined") on parisc so there is no further slub issues with such a configuration. The reverts allows SLUB now to be used on such architectures since there haven't been any reports of additional errors. Cc: James Bottomley <James.Bottomley@suse.de> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-10 17:37:34 -07:00
Paul E. McKenney	27f4d28057	rcu: priority boosting for TREE_PREEMPT_RCU Add priority boosting for TREE_PREEMPT_RCU, similar to that for TINY_PREEMPT_RCU. This is enabled by the default-off RCU_BOOST kernel parameter. The priority to which to boost preempted RCU readers is controlled by the RCU_BOOST_PRIO kernel parameter (defaulting to real-time priority 1) and the time to wait before boosting the readers who are blocking a given grace period is controlled by the RCU_BOOST_DELAY kernel parameter (defaulting to 500 milliseconds). Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-05-05 23:16:55 -07:00
Linus Torvalds	e8dad69408	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6 * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6: [PARISC] slub: fix panic with DISCONTIGMEM [PARISC] set memory ranges in N_NORMAL_MEMORY when onlined	2011-04-27 15:20:33 -07:00
Randy Dunlap	6befe5f69b	init/Kconfig: fix EXPERT menu list The EXPERT menu list was recently broken by the insertion of a kconfig symbol (EMBEDDED) at the beginning of the EXPERT list of kconfig items. Broken by: commit `6a108a14fa` Author: David Rientjes <rientjes@google.com> Date: Thu Jan 20 14:44:16 2011 -0800 kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT Restore the EXPERT menu list -- don't inject a symbol (EMBEDDED) that does not depend on EXPERT into the list. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: David Rientjes <rientjes@google.com> Cc: Peter Foley <pefoley2@verizon.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-04-26 20:48:37 -07:00
James Bottomley	4a5fa3590f	[PARISC] slub: fix panic with DISCONTIGMEM Slub makes assumptions about page_to_nid() which are violated by DISCONTIGMEM and !NUMA. This violation results in a panic because page_to_nid() can be non-zero for pages in the discontiguous ranges and this leads to a null return by get_node(). The assertion by the maintainer is that DISCONTIGMEM should only be allowed when NUMA is also defined. However, at least six architectures: alpha, ia64, m32r, m68k, mips, parisc violate this. The panic is a regression against slab, so just mark slub broken in the problem configuration to prevent users reporting these panics. Cc: stable@kernel.org Acked-by: David Rientjes <rientjes@google.com> Acked-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2011-04-22 15:42:46 -05:00
Artem Bityutskiy	1e2795a119	kbuild: move KALLSYMS_EXTRA_PASS from Kconfig to Makefile At the moment we have the CONFIG_KALLSYMS_EXTRA_PASS Kconfig switch, which users can enable or disable while configuring the kernel. This option is then used by 'make' to determine whether an extra kallsyms pass is needed or not. However, this approach is not nice and confusing, and this patch moves CONFIG_KALLSYMS_EXTRA_PASS from Kconfig to Makefile instead. The rationale is below. 1. CONFIG_KALLSYMS_EXTRA_PASS is really about the build time, not run-time. There is no real need for it to be in Kconfig. It is just an additional work-around which should be used only in rare cases, when someone breaks kallsyms, so Kbuild/Makefile is much better place for this option. 2. Grepping CONFIG_KALLSYMS_EXTRA_PASS shows that many defconfigs have it enabled, probably not because they try to work-around a kallsyms bug, but just because the Kconfig help text is confusing and does not really make it clear that this option should not be used unless except when kallsyms is broken. 3. And since many people have CONFIG_KALLSYMS_EXTRA_PASS enabled in their Kconfig, we do might fail to notice kallsyms bugs in time. E.g., many testers use "make allyesconfig" to test builds, which will enable CONFIG_KALLSYMS_EXTRA_PASS and kallsyms breakage will not be noticed. To address that, this patch: 1. Kills CONFIG_KALLSYMS_EXTRA_PASS 2. Changes Makefile so that people can use "make KALLSYMS_EXTRA_PASS=1" to enable the extra pass if needed. Additionally, they may define KALLSYMS_EXTRA_PASS as an environment variable. 3. By default KALLSYMS_EXTRA_PASS is disabled and if kallsyms has issues, "make" should print a warning and suggest using KALLSYMS_EXTRA_PASS Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> [mmarek: Removed make help text, is not necessary] Signed-off-by: Michal Marek <mmarek@suse.cz>	2011-04-15 15:56:15 +02:00
Artem Bityutskiy	71a83ec7da	Kconfig: improve KALLSYMS_ALL documentation Dumb users like myself are not able to grasp from the existing KALLSYMS_ALL documentation that this option is not what they need. Improve the help message and make it clearer that KALLSYMS is enough in the majority of use cases, and KALLSYMS_ALL should really be used very rarely. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: Michal Marek <mmarek@suse.cz>	2011-04-15 15:48:01 +02:00
Peter Zijlstra	317f394160	sched: Move the second half of ttwu() to the remote cpu Now that we've removed the rq->lock requirement from the first part of ttwu() and can compute placement without holding any rq->lock, ensure we execute the second half of ttwu() on the actual cpu we want the task to run on. This avoids having to take rq->lock and doing the task enqueue remotely, saving lots on cacheline transfers. As measured using: http://oss.oracle.com/~mason/sembench.c $ for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do echo performance > $i; done $ echo 4096 32000 64 128 > /proc/sys/kernel/sem $ ./sembench -t 2048 -w 1900 -o 0 unpatched: run time 30 seconds 647278 worker burns per second patched: run time 30 seconds 816715 worker burns per second Reviewed-by: Frank Rowand <frank.rowand@am.sony.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110405152729.515897185@chello.nl	2011-04-14 08:52:41 +02:00
Linus Torvalds	e16b396ce3	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (47 commits) doc: CONFIG_UNEVICTABLE_LRU doesn't exist anymore Update cpuset info & webiste for cgroups dcdbas: force SMI to happen when expected arch/arm/Kconfig: remove one to many l's in the word. asm-generic/user.h: Fix spelling in comment drm: fix printk typo 'sracth' Remove one to many n's in a word Documentation/filesystems/romfs.txt: fixing link to genromfs drivers:scsi Change printk typo initate -> initiate serial, pch uart: Remove duplicate inclusion of linux/pci.h header fs/eventpoll.c: fix spelling mm: Fix out-of-date comments which refers non-existent functions drm: Fix printk typo 'failled' coh901318.c: Change initate to initiate. mbox-db5500.c Change initate to initiate. edac: correct i82975x error-info reported edac: correct i82975x mci initialisation edac: correct commented info fs: update comments to point correct document target: remove duplicate include of target/target_core_device.h from drivers/target/target_core_hba.c ... Trivial conflict in fs/eventpoll.c (spelling vs addition)	2011-03-18 10:37:40 -07:00
Linus Torvalds	f74b944419	Merge branch 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl * 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: BKL: That's all, folks fs/locks.c: Remove stale FIXME left over from BKL conversion ipx: remove the BKL appletalk: remove the BKL x25: remove the BKL ufs: remove the BKL hpfs: remove the BKL drivers: remove extraneous includes of smp_lock.h tracing: don't trace the BKL adfs: remove the big kernel lock	2011-03-16 17:21:00 -07:00
Linus Torvalds	a5e6b135bd	Merge branch 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (50 commits) printk: do not mangle valid userspace syslog prefixes efivars: Add Documentation efivars: Expose efivars functionality to external drivers. efivars: Parameterize operations. efivars: Split out variable registration efivars: parameterize efivars efivars: Make efivars bin_attributes dynamic efivars: move efivars globals into struct efivars drivers:misc: ti-st: fix debugging code kref: Fix typo in kref documentation UIO: add PRUSS UIO driver support Fix spelling mistakes in Documentation/zh_CN/SubmittingPatches firmware: Fix unaligned memory accesses in dmi-sysfs firmware: Add documentation for /sys/firmware/dmi firmware: Expose DMI type 15 System Event Log firmware: Break out system_event_log in dmi-sysfs firmware: Basic dmi-sysfs support firmware: Add DMI entry types to the headers Driver core: convert platform_{get,set}_drvdata to static inline functions Translate linux-2.6/Documentation/magic-number.txt into Chinese ...	2011-03-16 15:05:40 -07:00
Linus Torvalds	a926021cb1	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (184 commits) perf probe: Clean up probe_point_lazy_walker() return value tracing: Fix irqoff selftest expanding max buffer tracing: Align 4 byte ints together in struct tracer tracing: Export trace_set_clr_event() tracing: Explain about unstable clock on resume with ring buffer warning ftrace/graph: Trace function entry before updating index ftrace: Add .ref.text as one of the safe areas to trace tracing: Adjust conditional expression latency formatting. tracing: Fix event alignment: skb:kfree_skb tracing: Fix event alignment: mce:mce_record tracing: Fix event alignment: kvm:kvm_hv_hypercall tracing: Fix event alignment: module:module_request tracing: Fix event alignment: ftrace:context_switch and ftrace:wakeup tracing: Remove lock_depth from event entry perf header: Stop using 'self' perf session: Use evlist/evsel for managing perf.data attributes perf top: Don't let events to eat up whole header line perf top: Fix events overflow in top command ring-buffer: Remove unused #include <linux/trace_irq.h> tracing: Add an 'overwrite' trace_option. ...	2011-03-15 18:31:30 -07:00
Aneesh Kumar K.V	990d6c2d7a	vfs: Add name to file handle conversion support The syscall also return mount id which can be used to lookup file system specific information such as uuid in /proc/<pid>/mountinfo Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-03-15 02:21:37 -04:00
Arnd Bergmann	4ba8216cd9	BKL: That's all, folks This removes the implementation of the big kernel lock, at last. A lot of people have worked on this in the past, I so the credit for this patch should be with everyone who participated in the hunt. The names on the Cc list are the people that were the most active in this, according to the recorded git history, in alphabetical order. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Alan Cox <alan@linux.intel.com> Cc: Alessio Igor Bogani <abogani@texware.it> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Hendry <andrew.hendry@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Hans Verkuil <hverkuil@xs4all.nl> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Jan Blunck <jblunck@infradead.org> Cc: John Kacur <jkacur@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Oliver Neukum <oliver@neukum.org> Cc: Paul Menage <menage@google.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-03-05 10:56:00 +01:00
Li Zefan	2d0f25201e	perf cgroup: Fix a typo in kernel config s/specificied/specified Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4D6F348C.2050804@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-03-04 11:32:51 +01:00
Stephane Eranian	e5d1367f17	perf: Add cgroup support This kernel patch adds the ability to filter monitoring based on container groups (cgroups). This is for use in per-cpu mode only. The cgroup to monitor is passed as a file descriptor in the pid argument to the syscall. The file descriptor must be opened to the cgroup name in the cgroup filesystem. For instance, if the cgroup name is foo and cgroupfs is mounted in /cgroup, then the file descriptor is opened to /cgroup/foo. Cgroup mode is activated by passing PERF_FLAG_PID_CGROUP in the flags argument to the syscall. For instance to measure in cgroup foo on CPU1 assuming cgroupfs is mounted under /cgroup: struct perf_event_attr attr; int cgroup_fd, fd; cgroup_fd = open("/cgroup/foo", O_RDONLY); fd = perf_event_open(&attr, cgroup_fd, 1, -1, PERF_FLAG_PID_CGROUP); close(cgroup_fd); Signed-off-by: Stephane Eranian <eranian@google.com> [ added perf_cgroup_{exit,attach} ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4d590250.114ddf0a.689e.4482@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-16 13:30:48 +01:00
Jiri Kosina	0a9d59a246	Merge branch 'master' into for-next	2011-02-15 10:24:31 +01:00
Ferenc Wagner	5d6a4ea576	sysfs: Capitalize description of SYSFS_DEPRECATED{_V2} options Signed-off-by: Ferenc Wagner <wferi@niif.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-03 16:36:40 -08:00
David Rientjes	6a108a14fa	kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option is used to configure any non-standard kernel with a much larger scope than only small devices. This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes references to the option throughout the kernel. A new CONFIG_EMBEDDED option is added that automatically selects CONFIG_EXPERT when enabled and can be used in the future to isolate options that should only be considered for embedded systems (RISC architectures, SLOB, etc). Calling the option "EXPERT" more accurately represents its intention: only expert users who understand the impact of the configuration changes they are making should enable it. Reviewed-by: Ingo Molnar <mingo@elte.hu> Acked-by: David Woodhouse <david.woodhouse@intel.com> Signed-off-by: David Rientjes <rientjes@google.com> Cc: Greg KH <gregkh@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jens Axboe <axboe@kernel.dk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Robin Holt <holt@sgi.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:05 -08:00
Michael Witten	c5e0591ae3	Kconfig: BLK_THROTTLE -> BLK_DEV_THROTTLING It would seem that `CONFIG_BLK_THROTTLE' doesn't exist, as it is only referenced in the documentation for `CONFIG_BLK_CGROUP'. The only other choice is `CONFIG_BLK_DEV_THROTTLING': $ git grep --cached THROTTL -- \*Kconfig block/Kconfig:config BLK_DEV_THROTTLING init/Kconfig: CONFIG_BLK_THROTTLE=y. Signed-off-by: Michael Witten <mfwitten@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-01-17 11:15:04 +01:00
Michael Witten	79e2e7597d	Kconfig: Typo: seti -> set Also, I introduced some punctuation to facilitate reading. Signed-off-by: Michael Witten <mfwitten@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-01-17 11:13:55 +01:00
Linus Torvalds	f9ee7f60d6	Merge branches 'core-fixes-for-linus', 'x86-fixes-for-linus', 'timers-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: avoid pointless blocked-task warnings rcu: demote SRCU_SYNCHRONIZE_DELAY from kernel-parameter status rtmutex: Fix comment about why new_owner can be NULL in wake_futex_pi() * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, olpc: Add missing Kconfig dependencies x86, mrst: Set correct APB timer IRQ affinity for secondary cpu x86: tsc: Fix calibration refinement conditionals to avoid divide by zero x86, ia64, acpi: Clean up x86-ism in drivers/acpi/numa.c * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: timekeeping: Make local variables static time: Rename misnamed minsec argument of clocks_calc_mult_shift() * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: Remove syscall_exit_fields tracing: Only process module tracepoints once perf record: Add "nodelay" mode, disabled by default perf sched: Fix list of events, dropping unsupported ':r' modifier Revert "perf tools: Emit clearer message for sys_perf_event_open ENOENT return" perf top: Fix annotate segv perf evsel: Fix order of event list deletion	2011-01-15 12:45:00 -08:00
Ingo Molnar	1161ec9449	Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/urgent	2011-01-14 15:30:16 +01:00
Paul E. McKenney	c072a388d5	rcu: demote SRCU_SYNCHRONIZE_DELAY from kernel-parameter status Because the adaptive synchronize_srcu_expedited() approach has worked very well in testing, remove the kernel parameter and replace it by a C-preprocessor macro. If someone finds problems with this approach, a more complex and aggressively adaptive approach might be required. Longer term, SRCU will be merged with the other RCU implementations, at which point synchronize_srcu_expedited() will be event driven, just as synchronize_sched_expedited() currently is. At that point, there will be no need for this adaptive approach. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-01-14 04:56:49 -08:00
Linus Torvalds	008d23e485	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits) Documentation/trace/events.txt: Remove obsolete sched_signal_send. writeback: fix global_dirty_limits comment runtime -> real-time ppc: fix comment typo singal -> signal drivers: fix comment typo diable -> disable. m68k: fix comment typo diable -> disable. wireless: comment typo fix diable -> disable. media: comment typo fix diable -> disable. remove doc for obsolete dynamic-printk kernel-parameter remove extraneous 'is' from Documentation/iostats.txt Fix spelling milisec -> ms in snd_ps3 module parameter description Fix spelling mistakes in comments Revert conflicting V4L changes i7core_edac: fix typos in comments mm/rmap.c: fix comment sound, ca0106: Fix assignment to 'channel'. hrtimer: fix a typo in comment init/Kconfig: fix typo anon_inodes: fix wrong function name in comment fix comment typos concerning "consistent" poll: fix a typo in comment ... Fix up trivial conflicts in: - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c) - fs/ext4/ext4.h Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.	2011-01-13 10:05:56 -08:00
Lasse Collin	3ebe12439b	decompressors: add boot-time XZ support This implements the API defined in <linux/decompress/generic.h> which is used for kernel, initramfs, and initrd decompression. This patch together with the first patch is enough for XZ-compressed initramfs and initrd; XZ-compressed kernel will need arch-specific changes. The buffering requirements described in decompress_unxz.c are stricter than with gzip, so the relevant changes should be done to the arch-specific code when adding support for XZ-compressed kernel. Similarly, the heap size in arch-specific pre-boot code may need to be increased (30 KiB is enough). The XZ decompressor needs memmove(), memeq() (memcmp() == 0), and memzero() (memset(ptr, 0, size)), which aren't available in all arch-specific pre-boot environments. I'm including simple versions in decompress_unxz.c, but a cleaner solution would naturally be nicer. Signed-off-by: Lasse Collin <lasse.collin@tukaani.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alain Knaff <alain@knaff.lu> Cc: Albin Tonnerre <albin.tonnerre@free-electrons.com> Cc: Phillip Lougher <phillip@lougher.demon.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-13 08:03:25 -08:00
Linus Torvalds	65b2074f84	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits) sched: Change wait_for_completion_*_timeout() to return a signed long sched, autogroup: Fix reference leak sched, autogroup: Fix potential access to freed memory sched: Remove redundant CONFIG_CGROUP_SCHED ifdef sched: Fix interactivity bug by charging unaccounted run-time on entity re-weight sched: Move periodic share updates to entity_tick() printk: Use this_cpu_{read\|write} api on printk_pending sched: Make pushable_tasks CONFIG_SMP dependant sched: Add 'autogroup' scheduling feature: automated per session task groups sched: Fix unregister_fair_sched_group() sched: Remove unused argument dest_cpu to migrate_task() mutexes, sched: Introduce arch_mutex_cpu_relax() sched: Add some clock info to sched_debug cpu: Remove incorrect BUG_ON cpu: Remove unused variable sched: Fix UP build breakage sched: Make task dump print all 15 chars of proc comm sched: Update tg->shares after cpu.shares write sched: Allow update_cfs_load() to update global load sched: Implement demand based update_cfs_load() ...	2011-01-06 10:23:33 -08:00
Ingo Molnar	394f4528c5	Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/rcu	2010-12-23 12:57:04 +01:00
Jim Cromie	43d547f9ce	init/Kconfig: fix typo Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-12-22 18:58:36 +01:00
Ingo Molnar	8e9255e6a2	Merge branch 'linus' into sched/core Merge reason: we want to queue up dependent cleanup Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-12-08 20:15:29 +01:00
Mike Galbraith	5091faa449	sched: Add 'autogroup' scheduling feature: automated per session task groups A recurring complaint from CFS users is that parallel kbuild has a negative impact on desktop interactivity. This patch implements an idea from Linus, to automatically create task groups. Currently, only per session autogroups are implemented, but the patch leaves the way open for enhancement. Implementation: each task's signal struct contains an inherited pointer to a refcounted autogroup struct containing a task group pointer, the default for all tasks pointing to the init_task_group. When a task calls setsid(), a new task group is created, the process is moved into the new task group, and a reference to the preveious task group is dropped. Child processes inherit this task group thereafter, and increase it's refcount. When the last thread of a process exits, the process's reference is dropped, such that when the last process referencing an autogroup exits, the autogroup is destroyed. At runqueue selection time, IFF a task has no cgroup assignment, its current autogroup is used. Autogroup bandwidth is controllable via setting it's nice level through the proc filesystem: cat /proc/<pid>/autogroup Displays the task's group and the group's nice level. echo <nice level> > /proc/<pid>/autogroup Sets the task group's shares to the weight of nice <level> task. Setting nice level is rate limited for !admin users due to the abuse risk of task group locking. The feature is enabled from boot by default if CONFIG_SCHED_AUTOGROUP=y is selected, but can be disabled via the boot option noautogroup, and can also be turned on/off on the fly via: echo [01] > /proc/sys/kernel/sched_autogroup_enabled ... which will automatically move tasks to/from the root task group. Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Paul Turner <pjt@google.com> Cc: Oleg Nesterov <oleg@redhat.com> [ Removed the task_group_path() debug code, and fixed !EVENTFD build failure. ] Signed-off-by: Ingo Molnar <mingo@elte.hu> LKML-Reference: <1290281700.28711.9.camel@maggy.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-30 16:03:35 +01:00
Paul E. McKenney	46fdb0937f	rcu: Make synchronize_srcu_expedited() fast if running readers The synchronize_srcu_expedited() function is currently quick if there are no active readers, but will delay a full jiffy if there are any. If these readers leave their SRCU read-side critical sections quickly, this is way too long to wait. So this commit first waits ten microseconds, and only then falls back to jiffy-at-a-time waiting. Reported-by: Avi Kivity <avi@redhat.com> Reported-by: Marcelo Tosatti <mtosatti@redhat.com> Tested-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-11-29 22:02:40 -08:00
Paul E. McKenney	9e571a82f0	rcu: add tracing for TINY_RCU and TINY_PREEMPT_RCU Add tracing for the tiny RCU implementations, including statistics on boosting in the case of TINY_PREEMPT_RCU and RCU_BOOST. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-11-29 22:01:55 -08:00
Paul E. McKenney	24278d1483	rcu: priority boosting for TINY_PREEMPT_RCU Add priority boosting, but only for TINY_PREEMPT_RCU. This is enabled by the default-off RCU_BOOST kernel parameter. The priority to which to boost preempted RCU readers is controlled by the RCU_BOOST_PRIO kernel parameter (defaulting to real-time priority 1) and the time to wait before boosting the readers blocking a given grace period is controlled by the RCU_BOOST_DELAY kernel parameter (defaulting to 500 milliseconds). Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-11-29 22:01:54 -08:00
Michal Hocko	a42c390cfa	cgroups: make swap accounting default behavior configurable Swap accounting can be configured by CONFIG_CGROUP_MEM_RES_CTLR_SWAP configuration option and then it is turned on by default. There is a boot option (noswapaccount) which can disable this feature. This makes it hard for distributors to enable the configuration option as this feature leads to a bigger memory consumption and this is a no-go for general purpose distribution kernel. On the other hand swap accounting may be very usuful for some workloads. This patch adds a new configuration option which controls the default behavior (CGROUP_MEM_RES_CTLR_SWAP_ENABLED). If the option is selected then the feature is turned on by default. It also adds a new boot parameter swapaccount[=1\|0] which enhances the original noswapaccount parameter semantic by means of enable/disable logic (defaults to 1 if no value is provided to be still consistent with noswapaccount). The default behavior is unchanged (if CONFIG_CGROUP_MEM_RES_CTLR_SWAP is enabled then CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED is enabled as well) Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-25 06:50:45 +09:00
Daniel Lezcano	7af37bec41	namespaces Kconfig: move namespace menu location after the cgroup We have the namespaces as a menuconfig like the cgroup. The cgroup and the namespace are two base bricks for the containers. It is more logical to put the namespace menu right after the cgroup menu. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-27 18:03:17 -07:00
Daniel Lezcano	eef691b36e	namespaces Kconfig: remove the cgroup device whitelist experimental tag This subsystem is merged since a long time now, I think we can consider it mature enough. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-27 18:03:17 -07:00
Daniel Lezcano	79ae9c29e4	namespaces Kconfig: remove pointless cgroup dependency The different cgroup subsystems are under the cgroup submenu. The dependency between the cgroups and the menu subsystems is pointless. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-27 18:03:16 -07:00
Daniel Lezcano	8dd2a82c29	namespaces Kconfig: make namespace a submenu Make the namespaces config option a submenu. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-27 18:03:16 -07:00
Daniel Lezcano	17a6d4411a	namespaces: default all the namespaces to 'yes' when CONFIG_NAMESPACES is selected As the different namespaces depend on 'CONFIG_NAMESPACES', it is logical to enable all the namespaces when we enable NAMESPACES. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: David Miller <davem@davemloft.net> Acked-By: Matt Helsley <matthltc@us.ibm.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-27 18:03:16 -07:00
Daniel Lezcano	9bd38c2cda	namespaces: remove pid_ns and net_ns experimental status The pid namespace is in the kernel since 2.6.27 and the net_ns since 2.6.29. They are enabled in the distro by default and used by userspace component. They are mature enough to remove the 'experimental' label. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-27 18:03:16 -07:00
Linus Torvalds	229aebb873	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) Update broken web addresses in arch directory. Update broken web addresses in the kernel. Revert "drivers/usb: Remove unnecessary return's from void functions" for musb gadget Revert "Fix typo: configuation => configuration" partially ida: document IDA_BITMAP_LONGS calculation ext2: fix a typo on comment in ext2/inode.c drivers/scsi: Remove unnecessary casts of private_data drivers/s390: Remove unnecessary casts of private_data net/sunrpc/rpc_pipe.c: Remove unnecessary casts of private_data drivers/infiniband: Remove unnecessary casts of private_data drivers/gpu/drm: Remove unnecessary casts of private_data kernel/pm_qos_params.c: Remove unnecessary casts of private_data fs/ecryptfs: Remove unnecessary casts of private_data fs/seq_file.c: Remove unnecessary casts of private_data arm: uengine.c: remove C99 comments arm: scoop.c: remove C99 comments Fix typo configue => configure in comments Fix typo: configuation => configuration Fix typo interrest[ing\|ed] => interest[ing\|ed] Fix various typos of valid in comments ... Fix up trivial conflicts in: drivers/char/ipmi/ipmi_si_intf.c drivers/usb/gadget/rndis.c net/irda/irnet/irnet_ppp.c	2010-10-24 13:41:39 -07:00
Linus Torvalds	b9da057105	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (31 commits) driver core: Display error codes when class suspend fails Driver core: Add section count to memory_block struct Driver core: Add mutex for adding/removing memory blocks Driver core: Move find_memory_block routine hpilo: Despecificate driver from iLO generation driver core: Convert link_mem_sections to use find_memory_block_hinted. driver core: Introduce find_memory_block_hinted which utilizes kset_find_obj_hinted. kobject: Introduce kset_find_obj_hinted. driver core: fix build for CONFIG_BLOCK not enabled driver-core: base: change to new flag variable sysfs: only access bin file vm_ops with the active lock sysfs: Fail bin file mmap if vma close is implemented. FW_LOADER: fix kconfig dependency warning on HOTPLUG uio: Statically allocate uio_class and use class .dev_attrs. uio: Support 2^MINOR_BITS minors uio: Cleanup irq handling. uio: Don't clear driver data uio: Fix lack of locking in init_uio_class SYSFS: Allow boot time switching between deprecated and modern sysfs layout driver core: remove CONFIG_SYSFS_DEPRECATED_V2 but keep it for block devices ...	2010-10-22 19:36:42 -07:00
Linus Torvalds	e9dd2b6837	Merge branch 'for-2.6.37/core' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.37/core' of git://git.kernel.dk/linux-2.6-block: (39 commits) cfq-iosched: Fix a gcc 4.5 warning and put some comments block: Turn bvec_k{un,}map_irq() into static inline functions block: fix accounting bug on cross partition merges block: Make the integrity mapped property a bio flag block: Fix double free in blk_integrity_unregister block: Ensure physical block size is unsigned int blkio-throttle: Fix possible multiplication overflow in iops calculations blkio-throttle: limit max iops value to UINT_MAX blkio-throttle: There is no need to convert jiffies to milli seconds blkio-throttle: Fix link failure failure on i386 blkio: Recalculate the throttled bio dispatch time upon throttle limit change blkio: Add root group to td->tg_list blkio: deletion of a cgroup was causes oops blkio: Do not export throttle files if CONFIG_BLK_DEV_THROTTLING=n block: set the bounce_pfn to the actual DMA limit rather than to max memory block: revert bad fix for memory hotplug causing bounces Fix compile error in blk-exec.c for !CONFIG_DETECT_HUNG_TASK block: set the bounce_pfn to the actual DMA limit rather than to max memory block: Prevent hang_check firing during long I/O cfq: improve fsync performance for small files ... Fix up trivial conflicts due to __rcu sparse annotation in include/linux/genhd.h	2010-10-22 17:00:32 -07:00
Linus Torvalds	5704e44d28	Merge branch 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl * 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: BKL: introduce CONFIG_BKL. dabusb: remove the BKL sunrpc: remove the big kernel lock init/main.c: remove BKL notations blktrace: remove the big kernel lock rtmutex-tester: make it build without BKL dvb-core: kill the big kernel lock dvb/bt8xx: kill the big kernel lock tlclk: remove big kernel lock fix rawctl compat ioctls breakage on amd64 and itanic uml: kill big kernel lock parisc: remove big kernel lock cris: autoconvert trivial BKL users alpha: kill big kernel lock isapnp: BKL removal s390/block: kill the big kernel lock hpet: kill BKL, add compat_ioctl	2010-10-22 10:43:11 -07:00
Andi Kleen	e52eec13cd	SYSFS: Allow boot time switching between deprecated and modern sysfs layout I have some systems which need legacy sysfs due to old tools that are making assumptions that a directory can never be a symlink to another directory, and it's a big hazzle to compile separate kernels for them. This patch turns CONFIG_SYSFS_DEPRECATED into a run time option that can be switched on/off the kernel command line. This way the same binary can be used in both cases with just a option on the command line. The old CONFIG_SYSFS_DEPRECATED_V2 option is still there to set the default. I kept the weird name to not break existing config files. Also the compat code can be still completely disabled by undefining CONFIG_SYSFS_DEPRECATED_SWITCH -- just the optimizer takes care of this now instead of lots of ifdefs. This makes the code look nicer. v2: This is an updated version on top of Kay's patch to only handle the block devices. I tested it on my old systems and that seems to work. Cc: axboe@kernel.dk Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-10-22 10:16:43 -07:00
Kay Sievers	39aba963d9	driver core: remove CONFIG_SYSFS_DEPRECATED_V2 but keep it for block devices This patch removes the old CONFIG_SYSFS_DEPRECATED_V2 config option, but it keeps the logic around to handle block devices in the old manner as some people like to run new kernel versions on old (pre 2007/2008) distros. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: "James E.J. Bottomley" <James.Bottomley@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-10-22 10:16:43 -07:00
Linus Torvalds	4a60cfa945	Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (96 commits) apic, x86: Use BIOS settings for IBS and MCE threshold interrupt LVT offsets apic, x86: Check if EILVT APIC registers are available (AMD only) x86: ioapic: Call free_irte only if interrupt remapping enabled arm: Use ARCH_IRQ_INIT_FLAGS genirq, ARM: Fix boot on ARM platforms genirq: Fix CONFIG_GENIRQ_NO_DEPRECATED=y build x86: Switch sparse_irq allocations to GFP_KERNEL genirq: Switch sparse_irq allocator to GFP_KERNEL genirq: Make sparse_lock a mutex x86: lguest: Use new irq allocator genirq: Remove the now unused sparse irq leftovers genirq: Sanitize dynamic irq handling genirq: Remove arch_init_chip_data() x86: xen: Sanitise sparse_irq handling x86: Use sane enumeration x86: uv: Clean up the direct access to irq_desc x86: Make io_apic.c local functions static genirq: Remove irq_2_iommu x86: Speed up the irq_remapped check in hot pathes intr_remap: Simplify the code further ... Fix up trivial conflicts in arch/x86/Kconfig	2010-10-21 14:11:46 -07:00
Linus Torvalds	5d70f79b5e	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (163 commits) tracing: Fix compile issue for trace_sched_wakeup.c [S390] hardirq: remove pointless header file includes [IA64] Move local_softirq_pending() definition perf, powerpc: Fix power_pmu_event_init to not use event->ctx ftrace: Remove recursion between recordmcount and scripts/mod/empty jump_label: Add COND_STMT(), reducer wrappery perf: Optimize sw events perf: Use jump_labels to optimize the scheduler hooks jump_label: Add atomic_t interface jump_label: Use more consistent naming perf, hw_breakpoint: Fix crash in hw_breakpoint creation perf: Find task before event alloc perf: Fix task refcount bugs perf: Fix group moving irq_work: Add generic hardirq context callbacks perf_events: Fix transaction recovery in group_sched_in() perf_events: Fix bogus AMD64 generic TLB events perf_events: Fix bogus context time tracking tracing: Remove parent recording in latency tracer graph options tracing: Use one prologue for the preempt irqs off tracer function tracers ...	2010-10-21 12:54:49 -07:00
Arnd Bergmann	6de5bd128d	BKL: introduce CONFIG_BKL. With all the patches we have queued in the BKL removal tree, only a few dozen modules are left that actually rely on the BKL, and even there are lots of low-hanging fruit. We need to decide what to do about them, this patch illustrates one of the options: Every user of the BKL is marked as 'depends on BKL' in Kconfig, and the CONFIG_BKL becomes a user-visible option. If it gets disabled, no BKL using module can be built any more and the BKL code itself is compiled out. The one exception is file locking, which is practically always enabled and does a 'select BKL' instead. This effectively forces CONFIG_BKL to be enabled until we have solved the fs/lockd mess and can apply the patch that removes the BKL from fs/locks.c. Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2010-10-21 15:44:13 +02:00
Peter Zijlstra	e360adbe29	irq_work: Add generic hardirq context callbacks Provide a mechanism that allows running code in IRQ context. It is most useful for NMI code that needs to interact with the rest of the system -- like wakeup a task to drain buffers. Perf currently has such a mechanism, so extract that and provide it as a generic feature, independent of perf so that others may also benefit. The IRQ context callback is generated through self-IPIs where possible, or on architectures like powerpc the decrementer (the built-in timer facility) is set to generate an interrupt immediately. Architectures that don't have anything like this get to do with a callback from the timer tick. These architectures can call irq_work_run() at the tail of any IRQ handlers that might enqueue such work (like the perf IRQ handler) to avoid undue latencies in processing the work. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [ various fixes ] Signed-off-by: Huang Ying <ying.huang@intel.com> LKML-Reference: <1287036094.7768.291.camel@yhuang-dev> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-10-18 19:58:50 +02:00
Thomas Gleixner	d9817ebeee	genirq: Provide Kconfig The generic irq Kconfig options are copied around all archs. Provide a generic Kconfig file which can be included. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20100927121843.217333624@linutronix.de> Reviewed-by: H. Peter Anvin <hpa@zytor.com> Reviewed-by: Ingo Molnar <mingo@elte.hu>	2010-10-04 11:01:05 +02:00
Vivek Goyal	e43473b7f2	blkio: Core implementation of throttle policy o Actual implementation of throttling policy in block layer. Currently it implements READ and WRITE bytes per second throttling logic. IOPS throttling comes in later patches. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-09-16 08:42:52 +02:00
Stephan Sperber	681b3049dd	Kconfig: delete duplicate word Deleted a word which apeared twice. Signed-off-by: Stephan Sperber <sperberstephan@googlemail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-08-23 15:35:15 +02:00
Paul E. McKenney	a57eb940d1	rcu: Add a TINY_PREEMPT_RCU Implement a small-memory-footprint uniprocessor-only implementation of preemptible RCU. This implementation uses but a single blocked-tasks list rather than the combinatorial number used per leaf rcu_node by TREE_PREEMPT_RCU, which reduces memory consumption and greatly simplifies processing. This version also takes advantage of uniprocessor execution to accelerate grace periods in the case where there are no readers. The general design is otherwise broadly similar to that of TREE_PREEMPT_RCU. This implementation is a step towards having RCU implementation driven off of the SMP and PREEMPT kernel configuration variables, which can happen once this implementation has accumulated sufficient experience. Removed ACCESS_ONCE() from __rcu_read_unlock() and added barrier() as suggested by Steve Rostedt in order to avoid the compiler-reordering issue noted by Mathieu Desnoyers (http://lkml.org/lkml/2010/8/16/183). As can be seen below, CONFIG_TINY_PREEMPT_RCU represents almost 5Kbyte savings compared to CONFIG_TREE_PREEMPT_RCU. Of course, for non-real-time workloads, CONFIG_TINY_RCU is even better. CONFIG_TREE_PREEMPT_RCU text data bss dec filename 13 0 0 13 kernel/rcupdate.o 6170 825 28 7023 kernel/rcutree.o ---- 7026 Total CONFIG_TINY_PREEMPT_RCU text data bss dec filename 13 0 0 13 kernel/rcupdate.o 2081 81 8 2170 kernel/rcutiny.o ---- 2183 Total CONFIG_TINY_RCU (non-preemptible) text data bss dec filename 13 0 0 13 kernel/rcupdate.o 719 25 0 744 kernel/rcutiny.o --- 757 Total Requested-by: Loïc Minier <loic.minier@canonical.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-08-20 08:55:00 -07:00
Paul E. McKenney	4d87ffadbb	rcu: Fix RCU_FANOUT help message Commit `cf244dc01b` added a fourth level to the TREE_RCU hierarchy, but the RCU_FANOUT help message still said "cube root". This commit fixes this to "fourth root" and also emphasizes that production systems are well-served by the default. (Stress-testing RCU itself uses small RCU_FANOUT values in order to test large-system code paths on small(er) systems.) Located-by: John Kacur <jkacur@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-08-19 17:18:04 -07:00
Paul E. McKenney	687d7a960a	rcu: restrict TREE_RCU to SMP builds with !PREEMPT Because both TINY_RCU and TREE_PREEMPT_RCU have been in mainline for several releases, it is time to restrict the use of TREE_RCU to SMP non-preemptible systems. This reduces testing/validation effort. This commit is a first step towards driving the selection of RCU implementation directly off of the SMP and PREEMPT configuration parameters. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-08-19 17:18:04 -07:00
KAMEZAWA Hiroyuki	65e0e81166	memcg: remove experimental from swap account config It's 11 months since we changed swap_map[] to indicates SWAP_HAS_CACHE. Since that, memcg's swap accounting has been very stable and it seems it can be maintained. So, I'd like to remove EXPERIMENTAL from the config. Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:18 -07:00
Linus Torvalds	8c8946f509	Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify * 'for-linus' of git://git.infradead.org/users/eparis/notify: (132 commits) fanotify: use both marks when possible fsnotify: pass both the vfsmount mark and inode mark fsnotify: walk the inode and vfsmount lists simultaneously fsnotify: rework ignored mark flushing fsnotify: remove global fsnotify groups lists fsnotify: remove group->mask fsnotify: remove the global masks fsnotify: cleanup should_send_event fanotify: use the mark in handler functions audit: use the mark in handler functions dnotify: use the mark in handler functions inotify: use the mark in handler functions fsnotify: send fsnotify_mark to groups in event handling functions fsnotify: Exchange list heads instead of moving elements fsnotify: srcu to protect read side of inode and vfsmount locks fsnotify: use an explicit flag to indicate fsnotify_destroy_mark has been called fsnotify: use _rcu functions for mark list traversal fsnotify: place marks on object in order of group memory address vfs/fsnotify: fsnotify_close can delay the final work in fput fsnotify: store struct file not struct path ... Fix up trivial delete/modify conflict in fs/notify/inotify/inotify.c.	2010-08-10 11:39:13 -07:00
Eric Paris	939a67fc4c	Audit: split audit watch Kconfig Audit watch should depend on CONFIG_AUDIT_SYSCALL and should select FSNOTIFY. This splits the spagetti like mixing of audit_watch and audit_filter code so they can be configured seperately. Signed-off-by: Eric Paris <eparis@redhat.com>	2010-07-28 09:58:19 -04:00
Eric Paris	67640b602f	Audit: audit watches depend on fsnotify CONFIG_AUDIT builds audit_watches which depend on fsnotify. Make CONFIG_AUDIT select fsnotify. Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Eric Paris <eparis@redhat.com>	2010-07-28 09:58:18 -04:00
Eric Paris	28a3a7eb3b	audit: reimplement audit_trees using fsnotify rather than inotify Simply switch audit_trees from using inotify to using fsnotify for it's inode pinning and disappearing act information. Signed-off-by: Eric Paris <eparis@redhat.com>	2010-07-28 09:58:17 -04:00
Tejun Heo	181a51f6e0	slow-work: kill it slow-work doesn't have any user left. Kill it. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David Howells <dhowells@redhat.com>	2010-07-23 13:14:49 +02:00
Linus Torvalds	1f73897861	Merge branch 'for-35' of git://repo.or.cz/linux-kbuild * 'for-35' of git://repo.or.cz/linux-kbuild: (81 commits) kbuild: Revert part of `e8d400a` to resolve a conflict kbuild: Fix checking of scm-identifier variable gconfig: add support to show hidden options that have prompts menuconfig: add support to show hidden options which have prompts gconfig: remove show_debug option gconfig: remove dbg_print_ptype() and dbg_print_stype() kconfig: fix zconfdump() kconfig: some small fixes add random binaries to .gitignore kbuild: Include gen_initramfs_list.sh and the file list in the .d file kconfig: recalc symbol value before showing search results .gitignore: ignore *.lzo files headerdep: perlcritic warning scripts/Makefile.lib: Align the output of LZO kbuild: Generate modules.builtin in make modules_install Revert "kbuild: specify absolute paths for cscope" kbuild: Do not unnecessarily regenerate modules.builtin headers_install: use local file handles headers_check: fix perl warnings export_report: fix perl warnings ...	2010-06-01 08:55:52 -07:00
Jens Axboe	ee9a3607fb	Merge branch 'master' into for-2.6.35 Conflicts: fs/ext3/fsync.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2010-05-21 21:27:26 +02:00
Vivek Goyal	afc24d49c1	blk-cgroup: config options re-arrangement This patch fixes few usability and configurability issues. o All the cgroup based controller options are configurable from "Genral Setup/Control Group Support/" menu. blkio is the only exception. Hence make this option visible in above menu and make it configurable from there to bring it inline with rest of the cgroup based controllers. o Get rid of CONFIG_DEBUG_CFQ_IOSCHED. This option currently does two things. - Enable printing of cgroup paths in blktrace - Enables CONFIG_DEBUG_BLK_CGROUP, which in turn displays additional stat files in cgroup. If we are using group scheduling, blktrace data is of not really much use if cgroup information is not present. To get this data, currently one has to also enable CONFIG_DEBUG_CFQ_IOSCHED, which in turn brings the overhead of all the additional debug stat files which is not desired. Hence, this patch moves printing of cgroup paths under CONFIG_CFQ_GROUP_IOSCHED. This allows us to get rid of CONFIG_DEBUG_CFQ_IOSCHED completely. Now all the debug stat files are controlled only by CONFIG_DEBUG_BLK_CGROUP which can be enabled through config menu. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Divyesh Shah <dpshah@google.com> Reviewed-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2010-04-26 19:27:56 +02:00
Li Zefan	32bd7eb5a7	sched: Remove remaining USER_SCHED code This is left over from commit `7c9414385e` ("sched: Remove USER_SCHED"") Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Dhaval Giani <dhaval.giani@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: David Howells <dhowells@redhat.com> LKML-Reference: <4BA9A05F.7010407@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-04-02 20:12:00 +02:00
Kirill A. Shutemov	0dea116876	cgroup: implement eventfd-based generic API for notifications This patchset introduces eventfd-based API for notifications in cgroups and implements memory notifications on top of it. It uses statistics in memory controler to track memory usage. Output of time(1) on building kernel on tmpfs: Root cgroup before changes: make -j2 506.37 user 60.93s system 193% cpu 4:52.77 total Non-root cgroup before changes: make -j2 507.14 user 62.66s system 193% cpu 4:54.74 total Root cgroup after changes (0 thresholds): make -j2 507.13 user 62.20s system 193% cpu 4:53.55 total Non-root cgroup after changes (0 thresholds): make -j2 507.70 user 64.20s system 193% cpu 4:55.70 total Root cgroup after changes (1 thresholds, never crossed): make -j2 506.97 user 62.20s system 193% cpu 4:53.90 total Non-root cgroup after changes (1 thresholds, never crossed): make -j2 507.55 user 64.08s system 193% cpu 4:55.63 total This patch: Introduce the write-only file "cgroup.event_control" in every cgroup. To register new notification handler you need: - create an eventfd; - open a control file to be monitored. Callbacks register_event() and unregister_event() must be defined for the control file; - write "<event_fd> <control_fd> <args>" to cgroup.event_control. Interpretation of args is defined by control file implementation; eventfd will be woken up by control file implementation or when the cgroup is removed. To unregister notification handler just close eventfd. If you need notification functionality for a control file you have to implement callbacks register_event() and unregister_event() in the struct cftype. [kamezawa.hiroyu@jp.fujitsu.com: Kconfig fix] Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Dan Malek <dan@embeddedalley.com> Cc: Vladislav Buzov <vbuzov@embeddedalley.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Alexander Shishkin <virtuoso@slind.org> Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-12 15:52:37 -08:00
Linus Torvalds	f66ffdedbf	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits) sched: Fix SCHED_MC regression caused by change in sched cpu_power sched: Don't use possibly stale sched_class kthread, sched: Remove reference to kthread_create_on_cpu sched: cpuacct: Use bigger percpu counter batch values for stats counters percpu_counter: Make __percpu_counter_add an inline function on UP sched: Remove member rt_se from struct rt_rq sched: Change usage of rt_rq->rt_se to rt_rq->tg->rt_se[cpu] sched: Remove unused update_shares_locked() sched: Use for_each_bit sched: Queue a deboosted task to the head of the RT prio queue sched: Implement head queueing for sched_rt sched: Extend enqueue_task to allow head queueing sched: Remove USER_SCHED sched: Fix the place where group powers are updated sched: Assume *balance is valid sched: Remove load_balance_newidle() sched: Unify load_balance{,_newidle}() sched: Add a lock break for PREEMPT=y sched: Remove from fwd decls sched: Remove rq_iterator from move_one_task ... Fix up trivial conflicts in kernel/sched.c	2010-02-28 10:31:01 -08:00
Linus Torvalds	6556a67435	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (172 commits) perf_event, amd: Fix spinlock initialization perf_event: Fix preempt warning in perf_clock() perf tools: Flush maps on COMM events perf_events, x86: Split PMU definitions into separate files perf annotate: Handle samples not at objdump output addr boundaries perf_events, x86: Remove superflous MSR writes perf_events: Simplify code by removing cpu argument to hw_perf_group_sched_in() perf_events, x86: AMD event scheduling perf_events: Add new start/stop PMU callbacks perf_events: Report the MMAP pgoff value in bytes perf annotate: Defer allocating sym_priv->hist array perf symbols: Improve debugging information about symtab origins perf top: Use a macro instead of a constant variable perf symbols: Check the right return variable perf/scripts: Tag syscall_name helper as not yet available perf/scripts: Add perf-trace-python Documentation perf/scripts: Remove unnecessary PyTuple resizes perf/scripts: Add syscall tracing scripts perf/scripts: Add Python scripting engine perf/scripts: Remove check-perf-trace from listed scripts ... Fix trivial conflict in tools/perf/util/probe-event.c	2010-02-28 10:20:25 -08:00
Linus Torvalds	d25e8dbdab	Merge branch 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: oprofile/x86: fix msr access to reserved counters oprofile/x86: use kzalloc() instead of kmalloc() oprofile/x86: fix perfctr nmi reservation for mulitplexing oprofile/x86: add comment to counter-in-use warning oprofile/x86: warn user if a counter is already active oprofile/x86: implement randomization for IBS periodic op counter oprofile/x86: implement lsfr pseudo-random number generator for IBS oprofile/x86: implement IBS cpuid feature detection oprofile/x86: remove node check in AMD IBS initialization oprofile/x86: remove OPROFILE_IBS config option oprofile: remove EXPERIMENTAL from the config option description oprofile: remove tracing build dependency	2010-02-28 10:15:31 -08:00
Linus Torvalds	642c4c75a7	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (44 commits) rcu: Fix accelerated GPs for last non-dynticked CPU rcu: Make non-RCU_PROVE_LOCKING rcu_read_lock_sched_held() understand boot rcu: Fix accelerated grace periods for last non-dynticked CPU rcu: Export rcu_scheduler_active rcu: Make rcu_read_lock_sched_held() take boot time into account rcu: Make lockdep_rcu_dereference() message less alarmist sched, cgroups: Fix module export rcu: Add RCU_CPU_STALL_VERBOSE to dump detailed per-task information rcu: Fix rcutorture mod_timer argument to delay one jiffy rcu: Fix deadlock in TREE_PREEMPT_RCU CPU stall detection rcu: Convert to raw_spinlocks rcu: Stop overflowing signed integers rcu: Use canonical URL for Mathieu's dissertation rcu: Accelerate grace period if last non-dynticked CPU rcu: Fix citation of Mathieu's dissertation rcu: Documentation update for CONFIG_PROVE_RCU security: Apply lockdep-based checking to rcu_dereference() uses idr: Apply lockdep-based diagnostics to rcu_dereference() uses radix-tree: Disable RCU lockdep checking in radix tree vfs: Abstract rcu_dereference_check for files-fdtable use ...	2010-02-28 10:13:16 -08:00

... 3 4 5 6 7 ...

755 Commits