linux

Author	SHA1	Message	Date
Linus Torvalds	ccb1ec95e9	Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "The itimer removal one is not strictly a fix, but I really wanted to avoid a rebase of the urgent ones." * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "clocksource: Load the ACPI PM clocksource asynchronously" clockevents: tTack broadcast device mode change in tick_broadcast_switch_to_oneshot() itimer: Use printk_once instead of WARN_ONCE nohz: Fix stale jiffies update in tick_nohz_restart() tick: Document TICK_ONESHOT config option proc: stats: Use arch_idle_time for idle and iowait times if available itimer: Schedule silent NULL pointer fixup in setitimer() for removal	2012-04-12 15:16:26 -07:00
Dave Jones	4edc2ca388	Btrfs: fix use-after-free in __btrfs_end_transaction `49b25e0540` introduced a use-after-free bug that caused spurious -EIO's to be returned. Do the check before we free the transaction. Cc: David Sterba <dsterba@suse.cz> Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2012-04-12 16:03:56 -04:00
Tsutomu Itoh	e627ee7bcd	Btrfs: check return value of bio_alloc() properly bio_alloc() has the possibility of returning NULL. So, it is necessary to check the return value. Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2012-04-12 16:03:56 -04:00
Ilya Dryomov	c6664b42c4	Btrfs: remove lock assert from get_restripe_target() This fixes a regression introduced by `fc67c450`. spin_is_locked() always returns 0 on UP kernels, which caused assert in get_restripe_target() to be fired on every call from btrfs_reduce_alloc_profile() on UP systems. Remove it completely for now, it's not clear if it's going to be needed in future. Reported-by: Bobby Powers <bobbypowers@gmail.com> Reported-by: Mitch Harder <mitch.harder@sabayonlinux.org> Tested-by: Mitch Harder <mitch.harder@sabayonlinux.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2012-04-12 16:03:56 -04:00
Liu Bo	b89203f74b	Btrfs: fix eof while discarding extents We miscalculate the length of extents we're discarding, and it leads to an eof of device. Reported-by: Daniel Blueman <daniel@quora.org> Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>	2012-04-12 16:03:56 -04:00
Chris Mason	d95603b262	Btrfs: fix uninit variable in repair_eb_io_failure We'd have to be passing bogus extent buffers for this uninit variable to actually be used, but set it to zero just in case. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2012-04-12 15:55:15 -04:00
Linus Torvalds	5ba7026b44	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "Regression fix in mtdchar_open(), fix for a really old leak (almost never hit in practice - it's a b0rken failure exit in simple_fill_super()) and a typo fix in vfs.txt (misspelled method type)." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: typo fix in Documentation/filesystems/vfs.txt dentry leak in simple_fill_super() failure exit fix breakage in mtdchar_open(), sanitize failure exits	2012-04-12 12:07:39 -07:00
Chris Mason	8e62c2de6e	Revert "Btrfs: increase the global block reserve estimates" This reverts commit `5500cdbe14`. We've had a number of complaints of early enospc that bisect down to this patch. We'll hae to fix the reservations differently. CC: stable@kernel.org Signed-off-by: Chris Mason <chris.mason@oracle.com>	2012-04-12 13:46:48 -04:00
Stanislav Kinsbursky	f69adb2fe2	nfsd: allocate id-to-name and name-to-id caches in per-net operations. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-12 09:12:11 -04:00
Stanislav Kinsbursky	9e75a4dee0	nfsd: make name-to-id cache allocated per network namespace context Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-12 09:12:10 -04:00
Stanislav Kinsbursky	c2e76ef5e0	nfsd: make id-to-name cache allocated per network namespace context Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-12 09:12:10 -04:00
Stanislav Kinsbursky	43ec1a20bf	nfsd: pass network context to idmap init/exit functions These functions will be called from per-net operations. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-12 09:12:10 -04:00
Stanislav Kinsbursky	5717e01284	nfsd: allocate export and expkey caches in per-net operations. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-12 09:11:47 -04:00
Stanislav Kinsbursky	e5f06f720e	nfsd: make expkey cache allocated per network namespace context This patch also changes svcauth_unix_purge() function: added network namespace as a parameter and thus loop over all networks was replaced by only one call for ip map cache purge. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-12 09:11:46 -04:00
Stanislav Kinsbursky	b3853e0ea1	nfsd: make export cache allocated per network namespace context This patch also changes prototypes of nfsd_export_flush() and exp_rootfh(): network namespace parameter added. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-12 09:11:11 -04:00
Stanislav Kinsbursky	2a75cfa64e	nfsd: pass pointer to export cache down to stack wherever possible. This cache will be per-net soon. And it's easier to get the pointer to desired per-net instance only once and then pass it down instead of discovering it in every place were required. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-12 09:10:19 -04:00
Sachin Prabhu	4fe9e9639d	Cleanup handling of NULL value passed for a mount option Allow blank user= and ip= mount option. Also clean up redundant checks for NULL values since the token parser will not actually match mount options with NULL values unless explicitly specified. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reported-by: Chris Clayton <chris2553@googlemail.com> Acked-by: Jeff Layton <jlayton@samba.org> Tested-by: Chris Clayton <chris2553@googlemail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2012-04-11 22:53:02 -05:00
Stanislav Kinsbursky	b89109bef4	nfsd: pass network context to export caches init/shutdown routines These functions will be called from per-net operations. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 18:01:33 -04:00
Stanislav Kinsbursky	e3f70eadb7	Lockd: pass network namespace to creation and destruction routines v2: dereference of most probably already released nlm_host removed in nlmclnt_done() and reclaimer(). These routines are called from locks reclaimer() kernel thread. This thread works in "init_net" network context and currently relays on persence on lockd thread and it's per-net resources. Thus lockd_up() and lockd_down() can't relay on current network context. So let's pass corrent one into them. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:06 -04:00
Stanislav Kinsbursky	f890edbbef	NFSd: remove hard-coded dereferences to name-to-id and id-to-name caches These dereferences to global static caches are redundant. They also prevents converting these caches into per-net ones. So this patch is cleanup + precursor of patch set,a which will make them per-net. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:05 -04:00
Stanislav Kinsbursky	c89172e36e	nfsd: pass pointer to expkey cache down to stack wherever possible. This cache will be per-net soon. And it's easier to get the pointer to desired per-net instance only once and then pass it down instead of discovering it in every place were required. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:05 -04:00
Stanislav Kinsbursky	83e0ed700d	nfsd: use hash table from cache detail in nfsd export seq ops Hard-code is redundant and will prevent from making caches per net ns. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:04 -04:00
Stanislav Kinsbursky	f2c7ea10f9	nfsd: pass svc_export_cache pointer as private data to "exports" seq file ops Global svc_export_cache cache is going to be replaced with per-net instance. So prepare the ground for it. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:03 -04:00
Stanislav Kinsbursky	a09581f294	nfsd: use exp_put() for svc_export_cache put This patch replaces cache_put() call for svc_export_cache by exp_put() call. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:02 -04:00
Stanislav Kinsbursky	db3a353263	nfsd: add link to owner cache detail to svc_export structure Without info about owner cache datail it won't be able to find out, which per-net cache detail have to be. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:01 -04:00
Stanislav Kinsbursky	d4bb527e9e	nfsd: use passed cache_detail pointer expkey_parse() Using of hard-coded svc_expkey_cache pointer in expkey_parse() looks redundant. Moreover, global cache will be replaced with per-net instance soon. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:00 -04:00
Jeff Layton	33dcc481ed	nfsd: don't use locks_in_grace to determine whether to call nfs4_grace_end It's possible that lockd or another lock manager might still be on the list after we call nfsd4_end_grace. If the laundromat thread runs again at that point, then we could end up calling nfsd4_end_grace more than once. That's not only inefficient, but calling nfsd4_recdir_purge_old more than once could be problematic. Fix this by adding a new global "grace_ended" flag and use that to determine whether we've already called nfsd4_grace_end. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:55:00 -04:00
Jeff Layton	03af42c59e	nfsd: trivial: remove unused variable from nfsd4_lock ..."fp" is set but never used. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:54:58 -04:00
J. Bruce Fields	9dc4e6c4d1	nfsd: don't fail unchecked creates of non-special files Allow a v3 unchecked open of a non-regular file succeed as if it were a lookup; typically a client in such a case will want to fall back on a local open, so succeeding and giving it the filehandle is more useful than failing with nfserr_exist, which makes it appear that nothing at all exists by that name. Similarly for v4, on an open-create, return the same errors we would on an attempt to open a non-regular file, instead of returning nfserr_exist. This fixes a problem found doing a v4 open of a symlink with O_RDONLY\|O_CREAT, which resulted in the current client returning EEXIST. Thanks also to Trond for analysis. Cc: stable@kernel.org Reported-by: Orion Poplawski <orion@cora.nwra.com> Tested-by: Orion Poplawski <orion@cora.nwra.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-04-11 17:49:52 -04:00
Linus Torvalds	1b6150fe82	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes Pull GFS2 fixes from Steven Whitehouse * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes: GFS2: Allow caching of rindex glock GFS2: Make sure rindex is uptodate before starting transactions GFS2: use depends instead of select in kconfig GFS2: put glock reference in error patch of read_rindex_entry	2012-04-11 11:04:45 -07:00
Miklos Szeredi	0a2da9b2ef	fuse: allow nanosecond granularity Derrik Pates reports that an utimensat with a NULL argument results in the current time being sent from the kernel with 1 second granularity. Reported-by: Derrik Pates <demon@now.ai> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2012-04-11 11:45:06 +02:00
Artem Bityutskiy	f72cf5e223	ext2: do not register write_super within VFS Jan Kara removed 'sb->s_dirt' VFS flag references, so we do not need to register the ext2 'ext2_write_super()' method in the VFS superblock operations, because 'sb->s_dirt' won't be ever set to 1 and VFS won't ever call '->write_super()' anyway. Thus, remove the method. Tested using xfstests. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz>	2012-04-11 11:12:46 +02:00
Jan Kara	b838ec2232	ext2: Remove s_dirt handling Places which modify superblock feature / state fields mark the superblock buffer dirty so it is written out by flusher thread. Thus there's no need to set s_dirt there. The only other fields changing in the superblock are the numbers of free blocks, free inodes and s_wtime. There's no real need to write (or even compute) these periodically. Free blocks / inodes counters are recomputed on every mount from group counters anyway and value of s_wtime is only informational and imprecise anyway. So it should be enough to write these opportunistically on mount, remount, umount, and sync_fs times. Signed-off-by: Jan Kara <jack@suse.cz>	2012-04-11 11:12:45 +02:00
Artem Bityutskiy	f2b2242081	ext2: write superblock only once on unmount Currently on unmount if we are mounted R/W, we first write the superblock to the media if it is dirty, and then write it again, which is not optimal. This patch makes ext2 write the superblock on unmount less times. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz>	2012-04-11 11:12:45 +02:00
Akira Fujita	ac0dd24748	ext3: remove max_debt in find_group_orlov() max_debt, involved variables and calculations are no longer needed, clean them up. Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com> Signed-off-by: Jan Kara <jack@suse.cz>	2012-04-11 11:12:44 +02:00
Jan Kara	2db938bee3	jbd: Refine commit writeout logic Currently we write out all journal buffers in WRITE_SYNC mode. This improves performance for fsync heavy workloads but hinders performance when writes are mostly asynchronous, most noticably it slows down readers and users complain about slow desktop response etc. So submit writes as asynchronous in the normal case and only submit writes as WRITE_SYNC if we detect someone is waiting for current transaction commit. I've gathered some numbers to back this change. The first is the read latency test. It measures time to read 1 MB after several seconds of sleeping in presence of streaming writes. Top 10 times (out of 90) in us: Before After 2131586 697473 1709932 557487 1564598 535642 1480462 347573 1478579 323153 1408496 222181 1388960 181273 1329565 181070 1252486 172832 1223265 172278 Average: 619377 82180 So the improvement in both maximum and average latency is massive. I've measured fsync throughput by: fs_mark -n 100 -t 1 -s 16384 -d /mnt/fsync/ -S 1 -L 4 in presence of streaming reader. The numbers (fsyncs/s) are: Before After 9.9 6.3 6.8 6.0 6.3 6.2 5.8 6.1 So fsync performance seems unharmed by this change. Signed-off-by: Jan Kara <jack@suse.cz>	2012-04-11 11:12:44 +02:00
Dan Williams	3a198886ab	sysfs: handle 'parent deleted before child added' In scsi at least two cases of the parent device being deleted before the child is added have been observed. 1/ scsi is performing async scans and the device is removed prior to the async can thread running (can happen with an in-opportune / unlikely unplug during initial scan). 2/ libsas discovery event running after the parent port has been torn down (this is a bug in libsas). Result in crash signatures like: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 IP: [<ffffffff8115e100>] sysfs_create_dir+0x32/0xb6 ... Process scsi_scan_8 (pid: 5417, threadinfo ffff88080bd16000, task ffff880801b8a0b0) Stack: 00000000fffffffe ffff880813470628 ffff88080bd17cd0 ffff88080614b7e8 ffff88080b45c108 00000000fffffffe ffff88080bd17d20 ffffffff8125e4a8 ffff88080bd17cf0 ffffffff81075149 ffff88080bd17d30 ffff88080614b7e8 Call Trace: [<ffffffff8125e4a8>] kobject_add_internal+0x120/0x1e3 [<ffffffff81075149>] ? trace_hardirqs_on+0xd/0xf [<ffffffff8125e641>] kobject_add_varg+0x41/0x50 [<ffffffff8125e70b>] kobject_add+0x64/0x66 [<ffffffff8131122b>] device_add+0x12d/0x63a In this scenario the parent is still valid (because we have a reference), but it has been device_del()'d which means its kobj->sd pointer is NULL'd via: device_del()->kobject_del()->sysfs_remove_dir() ...and then sysfs_create_dir() (without this fix) goes ahead and de-references parent_sd via sysfs_ns_type(): return (sd->s_flags & SYSFS_NS_TYPE_MASK) >> SYSFS_NS_TYPE_SHIFT; This scenario is being fixed in scsi/libsas, but if other subsystems present the same ordering the system need not immediately crash. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: James Bottomley <JBottomley@parallels.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-04-10 14:48:51 -07:00
Bruno Prémont	5631f2c18f	sysfs: Prevent crash on unset sysfs group attributes Do not let the kernel crash when a device is registered with sysfs while group attributes are not set (aka NULL). Warn about the offender with some information about the offending device. This would warn instead of trying NULL pointer deref like: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81152673>] internal_create_group+0x83/0x1a0 PGD 0 Oops: 0000 [#1] SMP CPU 0 Modules linked in: Pid: 1, comm: swapper/0 Not tainted 3.4.0-rc1-x86_64 #3 HP ProLiant DL360 G4 RIP: 0010:[<ffffffff81152673>] [<ffffffff81152673>] internal_create_group+0x83/0x1a0 RSP: 0018:ffff88019485fd70 EFLAGS: 00010202 RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001 RDX: ffff880192e99908 RSI: ffff880192e99630 RDI: ffffffff81a26c60 RBP: ffff88019485fdc0 R08: 0000000000000000 R09: 0000000000000000 R10: ffff880192e99908 R11: 0000000000000000 R12: ffffffff81a16a00 R13: ffff880192e99908 R14: ffffffff81a16900 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88019bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000001a0c000 CR4: 00000000000007f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper/0 (pid: 1, threadinfo ffff88019485e000, task ffff880194878000) Stack: ffff88019485fdd0 ffff880192da9d60 0000000000000000 ffff880192e99908 ffff880192e995d8 0000000000000001 ffffffff81a16a00 ffff880192da9d60 0000000000000000 0000000000000000 ffff88019485fdd0 ffffffff811527be Call Trace: [<ffffffff811527be>] sysfs_create_group+0xe/0x10 [<ffffffff81376ca6>] device_add_groups+0x46/0x80 [<ffffffff81377d3d>] device_add+0x46d/0x6a0 ... Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-04-10 14:48:51 -07:00
Bob Peterson	ca9248d833	GFS2: Allow caching of rindex glock This patch allows caching of the rindex glock. We were previously setting the GL_NOCACHE bit when the glock was released. That forced the rindex inode to be invalidated, which caused us to re-read rindex at the next access. However, it caused the glock to be unnecessarily bounced around the cluster. This patch allows the glock to remain cached, but it still causes the rindex to be re-read once it has been written to by gfs2_grow. Ben and I have tested single-node gfs2_grow cases and I've tested clustered gfs2_grow cases on my four-node cluster. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2012-04-10 13:49:53 +01:00
Tom Goff	70fa4a62e9	sysfs: Update the name hash for an entry after changing the namespace This is needed to allow renaming network devices that have been moved to another network namespace. Signed-off-by: Tom Goff <thomas.goff@boeing.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-04-09 15:33:00 -07:00
Eric Paris	83d498569e	SELinux: rename dentry_open to file_open dentry_open takes a file, rename it to file_open Signed-off-by: Eric Paris <eparis@redhat.com>	2012-04-09 12:22:50 -04:00
Al Viro	640946f203	dentry leak in simple_fill_super() failure exit d_genocide() does _not_ evict dentries; it just removes extra ref pinning each of those. Normally it's followed by shrinking the tree (it's done just before generic_shutdown_super() by kill_litter_super()), but in case of simple_fill_super() nothing of that kind will follow. Just do shrink_dcache_parent() manually. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-04-09 01:39:22 -04:00
Jiri Kosina	e75d660672	Merge branch 'master' into for-next Merge with latest Linus' tree, as I have incoming patches that fix code that is newer than current HEAD of for-next. Conflicts: drivers/net/ethernet/realtek/r8169.c	2012-04-08 21:48:52 +02:00
Eric W. Biederman	7b44ab978b	userns: Disassociate user_struct from the user_namespace. Modify alloc_uid to take a kuid and make the user hash table global. Stop holding a reference to the user namespace in struct user_struct. This simplifies the code and makes the per user accounting not care about which user namespace a uid happens to appear in. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-04-07 17:11:46 -07:00
Eric W. Biederman	1a48e2ac03	userns: Replace the hard to write inode_userns with inode_capable. This represents a change in strategy of how to handle user namespaces. Instead of tagging everything explicitly with a user namespace and bulking up all of the comparisons of uids and gids in the kernel, all uids and gids in use will have a mapping to a flat kuid and kgid spaces respectively. This allows much more of the existing logic to be preserved and in general allows for faster code. In this new and improved world we allow someone to utiliize capabilities over an inode if the inodes owner mapps into the capabilities holders user namespace and the user has capabilities in their user namespace. Which is simple and efficient. Moving the fs uid comparisons to be comparisons in a flat kuid space follows in later patches, something that is only significant if you are using user namespaces. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-04-07 17:02:46 -07:00
Eric W. Biederman	c4a4d60379	userns: Use cred->user_ns instead of cred->user->user_ns Optimize performance and prepare for the removal of the user_ns reference from user_struct. Remove the slow long walk through cred->user->user_ns and instead go straight to cred->user_ns. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-04-07 16:55:51 -07:00
Linus Torvalds	f68e556e23	Make the "word-at-a-time" helper functions more commonly usable I have a new optimized x86 "strncpy_from_user()" that will use these same helper functions for all the same reasons the name lookup code uses them. This is preparation for that. This moves them into an architecture-specific header file. It's architecture-specific for two reasons: - some of the functions are likely to want architecture-specific implementations. Even if the current code happens to be "generic" in the sense that it should work on any little-endian machine, it's likely that the "multiply by a big constant and shift" implementation is less than optimal for an architecture that has a guaranteed fast bit count instruction, for example. - I expect that if architectures like sparc want to start playing around with this, we'll need to abstract out a few more details (in particular the actual unaligned accesses). So we're likely to have more architecture-specific stuff if non-x86 architectures start using this. (and if it turns out that non-x86 architectures don't start using this, then having it in an architecture-specific header is still the right thing to do, of course) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-04-06 13:54:56 -07:00
Linus Torvalds	23f347ef63	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking updates from David Miller: 1) Fix inaccuracies in network driver interface documentation, from Ben Hutchings. 2) Fix handling of negative offsets in BPF JITs, from Jan Seiffert. 3) Compile warning, locking, and refcounting fixes in netfilter's xt_CT, from Pablo Neira Ayuso. 4) phonet sendmsg needs to validate user length just like any other datagram protocol, fix from Sasha Levin. 5) Ipv6 multicast code uses wrong loop index, from RongQing Li. 6) Link handling and firmware fixes in bnx2x driver from Yaniv Rosner and Yuval Mintz. 7) mlx4 erroneously allocates 4 pages at a time, regardless of page size, fix from Thadeu Lima de Souza Cascardo. 8) SCTP socket option wasn't extended in a backwards compatible way, fix from Thomas Graf. 9) Add missing address change event emissions to bonding, from Shlomo Pongratz. 10) /proc/net/dev regressed because it uses a private offset to track where we are in the hash table, but this doesn't track the offset pullback that the seq_file code does resulting in some entries being missed in large dumps. Fix from Eric Dumazet. 11) do_tcp_sendpage() unloads the send queue way too fast, because it invokes tcp_push() when it shouldn't. Let the natural sequence generated by the splice paths, and the assosciated MSG_MORE settings, guide the tcp_push() calls. Otherwise what goes out of TCP is spaghetti and doesn't batch effectively into GSO/TSO clusters. From Eric Dumazet. 12) Once we put a SKB into either the netlink receiver's queue or a socket error queue, it can be consumed and freed up, therefore we cannot touch it after queueing it like that. Fixes from Eric Dumazet. 13) PPP has this annoying behavior in that for every transmit call it immediately stops the TX queue, then calls down into the next layer to transmit the PPP frame. But if that next layer can take it immediately, it just un-stops the TX queue right before returning from the transmit method. Besides being useless work, it makes several facilities unusable, in particular things like the equalizers. Well behaved devices should only stop the TX queue when they really are full, and in PPP's case when it gets backlogged to the downstream device. David Woodhouse therefore fixed PPP to not stop the TX queue until it's downstream can't take data any more. 14) IFF_UNICAST_FLT got accidently lost in some recent stmmac driver changes, re-add. From Marc Kleine-Budde. 15) Fix link flaps in ixgbe, from Eric W. Multanen. 16) Descriptor writeback fixes in e1000e from Matthew Vick. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits) net: fix a race in sock_queue_err_skb() netlink: fix races after skb queueing doc, net: Update ndo_start_xmit return type and values doc, net: Remove instruction to set net_device::trans_start doc, net: Update netdev operation names doc, net: Update documentation of synchronisation for TX multiqueue doc, net: Remove obsolete reference to dev->poll ethtool: Remove exception to the requirement of holding RTNL lock MAINTAINERS: update for Marvell Ethernet drivers bonding: properly unset current_arp_slave on slave link up phonet: Check input from user before allocating tcp: tcp_sendpages() should call tcp_push() once ipv6: fix array index in ip6_mc_add_src() mlx4: allocate just enough pages instead of always 4 pages stmmac: re-add IFF_UNICAST_FLT for dwmac1000 bnx2x: Clear MDC/MDIO warning message bnx2x: Fix BCM57711+BCM84823 link issue bnx2x: Clear BCM84833 LED after fan failure bnx2x: Fix BCM84833 PHY FW version presentation bnx2x: Fix link issue for BCM8727 boards. ...	2012-04-06 10:37:38 -07:00
Jesper Juhl	9c017abc50	btrfs: assignment in write_dev_flush() doesn't need two semi-colons One is enough. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2012-04-05 17:07:14 -07:00
Eric Dumazet	35f9c09fe9	tcp: tcp_sendpages() should call tcp_push() once commit `2f53384424` (tcp: allow splice() to build full TSO packets) added a regression for splice() calls using SPLICE_F_MORE. We need to call tcp_flush() at the end of the last page processed in tcp_sendpages(), or else transmits can be deferred and future sends stall. Add a new internal flag, MSG_SENDPAGE_NOTLAST, acting like MSG_MORE, but with different semantic. For all sendpage() providers, its a transparent change. Only sock_sendpage() and tcp_sendpages() can differentiate the two different flags provided by pipe_to_sendpage() Reported-by: Tom Herbert <therbert@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: H.K. Jerry Chu <hkchu@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: Eric Dumazet <eric.dumazet@gmail>com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-05 19:04:27 -04:00

... 261 262 263 264 265 ...

39703 Commits