linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-04 10:01:41 +00:00

Author	SHA1	Message	Date
Simo Sorce	030d794bf4	SUNRPC: Use gssproxy upcall for server RPCGSS authentication. The main advantge of this new upcall mechanism is that it can handle big tickets as seen in Kerberos implementations where tickets carry authorization data like the MS-PAC buffer with AD or the Posix Authorization Data being discussed in IETF on the krbwg working group. The Gssproxy program is used to perform the accept_sec_context call on the kernel's behalf. The code is changed to also pass the input buffer straight to upcall mechanism to avoid allocating and copying many pages as tokens can be as big (potentially more in future) as 64KiB. Signed-off-by: Simo Sorce <simo@redhat.com> [bfields: containerization, negotiation api] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-26 11:41:28 -04:00
Simo Sorce	1d658336b0	SUNRPC: Add RPC based upcall mechanism for RPCGSS auth This patch implements a sunrpc client to use the services of the gssproxy userspace daemon. In particular it allows to perform calls in user space using an RPC call instead of custom hand-coded upcall/downcall messages. Currently only accept_sec_context is implemented as that is all is needed for the server case. File server modules like NFS and CIFS can use full gssapi services this way, once init_sec_context is also implemented. For the NFS server case this code allow to lift the limit of max 2k krb5 tickets. This limit is prevents legitimate kerberos deployments from using krb5 authentication with the Linux NFS server as they have normally ticket that are many kilobytes large. It will also allow to lift the limitation on the size of the credential set (uid,gid,gids) passed down from user space for users that have very many groups associated. Currently the downcall mechanism used by rpc.svcgssd is limited to around 2k secondary groups of the 65k allowed by kernel structures. Signed-off-by: Simo Sorce <simo@redhat.com> [bfields: containerization, concurrent upcalls, misc. fixes and cleanup] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-26 11:41:27 -04:00
Simo Sorce	400f26b542	SUNRPC: conditionally return endtime from import_sec_context We expose this parameter for a future caller. It will be used to extract the endtime from the gss-proxy upcall mechanism, in order to set the rsc cache expiration time. Signed-off-by: Simo Sorce <simo@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-26 11:41:27 -04:00
J. Bruce Fields	33d90ac058	SUNRPC: allow disabling idle timeout In the gss-proxy case we don't want to have to reconnect at random--we want to connect only on gss-proxy startup when we can steal gss-proxy's context to do the connect in the right namespace. So, provide a flag that allows the rpc_create caller to turn off the idle timeout. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-26 11:41:26 -04:00
J. Bruce Fields	7073ea8799	SUNRPC: attempt AF_LOCAL connect on setup In the gss-proxy case, setup time is when I know I'll have the right namespace for the connect. In other cases, it might be useful to get any connection errors earlier--though actually in practice it doesn't make any difference for rpcbind. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-26 11:41:25 -04:00
J. Bruce Fields	c85b03ab20	Merge Trond's nfs-for-next Merging Trond's nfs-for-next branch, mainly to get `b7993cebb8` "SUNRPC: Allow rpc_create() to request that TCP slots be unlimited", which a small piece of the gss-proxy work depends on.	2013-04-26 11:37:43 -04:00
Trond Myklebust	b7993cebb8	SUNRPC: Allow rpc_create() to request that TCP slots be unlimited This is mainly for use by NFSv4.1, where the session negotiation ultimately wants to decide how many RPC slots we can fill. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-04-14 12:26:03 -04:00
Trond Myklebust	ba60eb25ff	SUNRPC: Fix a livelock problem in the xprt->backlog queue This patch ensures that we throttle new RPC requests if there are requests already waiting in the xprt->backlog queue. The reason for doing this is to fix livelock issues that can occur when an existing (high priority) task is waiting in the backlog queue, gets woken up by xprt_free_slot(), but a new task then steals the slot. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-04-14 12:26:02 -04:00
Paul Bolle	cb60718747	sunrpc: drop "select NETVM" The Kconfig entry for SUNRPC_SWAP selects NETVM. That select statement was added in commit `a564b8f039` ("nfs: enable swap on NFS"). But there's no Kconfig symbol NETVM. It apparently was only in used in development versions of the swap over nfs functionality but never entered mainline. Anyhow, it is a nop and can safely be dropped. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-04-05 17:03:52 -04:00
Alexey Khoroshilov	a7823c797b	SUNRPC/cache: add module_put() on error path in cache_open() If kmalloc() fails in cache_open(), module cd->owner left locked. The patch adds module_put(cd->owner) on this path. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-04-03 15:32:32 -04:00
Trond Myklebust	3ed5e2a2c3	SUNRPC: Report network/connection errors correctly for SOFTCONN rpc tasks In the case of a SOFTCONN rpc task, we really want to ensure that it reports errors like ENETUNREACH back to the caller. Currently, only some of these errors are being reported back (connect errors are not), and they are being converted by the RPC layer into EIO. Reported-by: Jan Engelhardt <jengelh@inai.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-03-25 12:04:10 -04:00
Trond Myklebust	1166fde6a9	SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked We need to be careful when testing task->tk_waitqueue in rpc_wake_up_task_queue_locked, because it can be changed while we are holding the queue->lock. By adding appropriate memory barriers, we can ensure that it is safe to test task->tk_waitqueue for equality if the RPC_TASK_QUEUED bit is set. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org	2013-03-25 11:23:40 -04:00
Linus Torvalds	aea8b5d1e5	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace bugfixes from Eric Biederman: "This tree includes a partial revert for "fs: Limit sys_mount to only request filesystem modules." When I added the new style module aliases to the filesystems I deleted the old ones. A bad move. It turns out that distributions like Arch linux use module aliases when constructing ramdisks. Which meant ultimately that an ext3 filesystem mounted with ext4 would not result in the ext4 module being put into the ramdisk. The other change in this tree adds a handful of filesystem module alias I simply failed to add the first time. Which inconvinienced a few folks using cifs. I don't want to inconvinience folks any longer than I have to so here are these trivial fixes." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: fs: Readd the fs module aliases. fs: Limit sys_mount to only request filesystem modules. (Part 3)	2013-03-13 15:47:50 -07:00
Eric W. Biederman	fa7614ddd6	fs: Readd the fs module aliases. I had assumed that the only use of module aliases for filesystems prior to "fs: Limit sys_mount to only request filesystem modules." was in request_module. It turns out I was wrong. At least mkinitcpio in Arch linux uses these aliases. So readd the preexising aliases, to keep from breaking userspace. Userspace eventually will have to follow and use the same aliases the kernel does. So at some point we may be delete these aliases without problems. However that day is not today. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-03-12 18:55:21 -07:00
Linus Torvalds	368edaadc0	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fix from Sage Weil: "This fixes a bug in the new message decoding that just went in during the last window." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: libceph: fix decoding of pgids	2013-03-12 09:22:42 -07:00
Linus Torvalds	5b22b1848b	Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux Pull nfsd bugfixes from Bruce Fields: "Some minor fallout from the user-namespace work broke most krb5 mounts to nfsd, and I screwed up a change to the AF_LOCAL rpc code." * 'for-3.9' of git://linux-nfs.org/~bfields/linux: sunrpc: don't attempt to cancel unitialized work nfsd: fix krb5 handling of anonymous principals	2013-03-12 09:20:58 -07:00
Sage Weil	d6c0dd6b0c	libceph: fix decoding of pgids In `4f6a7e5ee1` we effectively dropped support for the legacy encoding for the OSDMap and incremental. However, we didn't fix the decoding for the pgid. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>	2013-03-11 14:31:00 -07:00
Linus Torvalds	0cb7750825	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Missing cancel of work items in mac80211 MLME, from Ben Greear. 2) Fix DMA mapping handling in iwlwifi by using coherent DMA for command headers, from Johannes Berg. 3) Decrease the amount of pressure on the page allocator by using order 1 pages less in iwlwifi, from Emmanuel Grumbach. 4) Fix mesh PS broadcast OOPS in mac80211, from Marco Porsch. 5) Don't forget to recalculate idle state in mac80211 monitor interface, from Felix Fietkau. 6) Fix varargs in netfilter conntrack handler, from Joe Perches. 7) Need to reset entire chip when command queue fills up in iwlwifi, from Emmanuel Grumbach. 8) The TX antenna value must be valid when calibrations are performed in iwlwifi, fix from Dor Shaish. 9) Don't generate netfilter audit log entries when audit is disabled, from Gao Feng. 10) Deal with DMA unit hang on e1000e during power state transitions, from Bruce Allan. 11) Remove BUILD_BUG_ON check from igb driver, from Alexander Duyck. 12) Fix lockdep warning on i2c handling of igb driver, from Carolyn Wyborny. 13) Fix several TTY handling issues in IRDA ircomm tty driver, from Peter Hurley. 14) Several QFQ packet scheduler fixes from Paolo Valente. 15) When VXLAN encapsulates on transmit, we have to reset the netfilter state. From Zang MingJie. 16) Fix jiffie check in net_rx_action() so that we really cap the processing at 2HZ. From Eric Dumazet. 17) Fix erroneous trigger of IP option space exhaustion, when routers are pre-specified and we are looking to see if we can insert a timestamp, we will have the space. From David Ward. 18) Fix various issues in benet driver wrt waiting for firmware to finish POST after resets or errors. From Gavin Shan and Sathya Perla. 19) Fix TX locking in SFC driver, from Ben Hutchings. 20) Like the VXLAN fix above, when we encap in a TUN device we have to reset the netfilter state. This should fix several strange crashes reported by Dave Jones and others. From Eric Dumazet. 21) Don't forget to clean up MAC address resources when shutting down a port in mlx4 driver, from Yan Burman. 22) Fix divide by zero in vmxnet3 driver, from Bhavesh Davda. 23) Fix device statistic regression in tg3 when the driver is using phylib, from Nithin Sujir. 24) Fix info leak in several netlink handlers, from Mathias Krause. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (79 commits) 6lowpan: Fix endianness issue in is_addr_link_local(). rrunner.c: fix possible memory leak in rr_init_one() dcbnl: fix various netlink info leaks rtnl: fix info leak on RTM_GETLINK request for VF devices bridge: fix mdb info leaks tg3: Update link_up flag for phylib devices ipv6: stop multicast forwarding to process interface scoped addresses bridging: fix rx_handlers return code netlabel: fix build problems when CONFIG_IPV6=n drivers/isdn: checkng length to be sure not memory overflow net/rds: zero last byte for strncpy bnx2x: Fix SFP+ misconfiguration in iSCSI boot scenario bnx2x: Fix intermittent long KR2 link up time macvlan: Set IFF_UNICAST_FLT flag to prevent unnecessary promisc mode. team: unsyc the devices addresses when port is removed bridge: add missing vid to br_mdb_get() Fix: sparse warning in inet_csk_prepare_forced_close afkey: fix a typo MAINTAINERS: Update qlcnic maintainers list netlabel: correctly list all the static label mappings ...	2013-03-11 07:51:59 -07:00
YOSHIFUJI Hideaki / 吉藤英明	9026c49272	6lowpan: Fix endianness issue in is_addr_link_local(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-10 16:49:35 -04:00
Mathias Krause	29cd8ae0e1	dcbnl: fix various netlink info leaks The dcb netlink interface leaks stack memory in various places: * perm_addr[] buffer is only filled at max with 12 of the 32 bytes but copied completely, * no in-kernel driver fills all fields of an IEEE 802.1Qaz subcommand, so we're leaking up to 58 bytes for ieee_ets structs, up to 136 bytes for ieee_pfc structs, etc., * the same is true for CEE -- no in-kernel driver fills the whole struct, Prevent all of the above stack info leaks by properly initializing the buffers/structures involved. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-10 05:19:26 -04:00
Mathias Krause	84d73cd3fb	rtnl: fix info leak on RTM_GETLINK request for VF devices Initialize the mac address buffer with 0 as the driver specific function will probably not fill the whole buffer. In fact, all in-kernel drivers fill only ETH_ALEN of the MAX_ADDR_LEN bytes, i.e. 6 of the 32 possible bytes. Therefore we currently leak 26 bytes of stack memory to userland via the netlink interface. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-10 05:19:26 -04:00
Mathias Krause	c085c49920	bridge: fix mdb info leaks The bridging code discloses heap and stack bytes via the RTM_GETMDB netlink interface and via the notify messages send to group RTNLGRP_MDB afer a successful add/del. Fix both cases by initializing all unset members/padding bytes with memset(0). Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-10 05:19:25 -04:00
Linus Torvalds	72932611b4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace bugfixes from Eric Biederman: "This is three simple fixes against 3.9-rc1. I have tested each of these fixes and verified they work correctly. The userns oops in key_change_session_keyring and the BUG_ON triggered by proc_ns_follow_link were found by Dave Jones. I am including the enhancement for mount to only trigger requests of filesystem modules here instead of delaying this for the 3.10 merge window because it is both trivial and the kind of change that tends to bit-rot if left untouched for two months." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: proc: Use nd_jump_link in proc_ns_follow_link fs: Limit sys_mount to only request filesystem modules (Part 2). fs: Limit sys_mount to only request filesystem modules. userns: Stop oopsing in key_change_session_keyring	2013-03-09 16:51:13 -08:00
J. Bruce Fields	190b1ecf25	sunrpc: don't attempt to cancel unitialized work As of `dc107402ae` "SUNRPC: make AF_LOCAL connect synchronous", we no longer initialize connect_worker in the AF_LOCAL case, resulting in warnings like: WARNING: at lib/debugobjects.c:261 debug_print_object+0x8c/0xb0() Hardware name: Bochs ODEBUG: assert_init not available (active state 0) object type: timer_list hint: stub_timer+0x0/0x20 Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nfsd auth_rpcgss nfs_acl lockd sunrpc Pid: 4816, comm: nfsd Tainted: G W 3.8.0-rc2-00049-gdc10740 #801 Call Trace: [<ffffffff8156ec00>] ? free_obj_work+0x60/0xa0 [<ffffffff81046aaf>] warn_slowpath_common+0x7f/0xc0 [<ffffffff81046ba6>] warn_slowpath_fmt+0x46/0x50 [<ffffffff8156eccc>] debug_print_object+0x8c/0xb0 [<ffffffff81055030>] ? timer_debug_hint+0x10/0x10 [<ffffffff8156f7e3>] debug_object_assert_init+0xe3/0x120 [<ffffffff81057ebb>] del_timer+0x2b/0x80 [<ffffffff8109c4e6>] ? mark_held_locks+0x86/0x110 [<ffffffff81065a29>] try_to_grab_pending+0xd9/0x150 [<ffffffff81065b57>] __cancel_work_timer+0x27/0xc0 [<ffffffff81065c03>] cancel_delayed_work_sync+0x13/0x20 [<ffffffffa0007067>] xs_destroy+0x27/0x80 [sunrpc] [<ffffffffa00040d8>] xprt_destroy+0x78/0xa0 [sunrpc] [<ffffffffa0006241>] xprt_put+0x21/0x30 [sunrpc] [<ffffffffa00030cf>] rpc_free_client+0x10f/0x1a0 [sunrpc] [<ffffffffa0002ff3>] ? rpc_free_client+0x33/0x1a0 [sunrpc] [<ffffffffa0002f7e>] rpc_release_client+0x6e/0xb0 [sunrpc] [<ffffffffa000325d>] rpc_shutdown_client+0xfd/0x1b0 [sunrpc] [<ffffffffa0017196>] rpcb_put_local+0x106/0x130 [sunrpc] ... Acked-by: "Myklebust, Trond" <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-03-09 12:43:42 -05:00
Arnd Bergmann	dc893e19b5	Revert parts of "hlist: drop the node parameter from iterators" Commit `b67bfe0d42` ("hlist: drop the node parameter from iterators") did a lot of nice changes but also contains two small hunks that seem to have slipped in accidentally and have no apparent connection to the intent of the patch. This reverts the two extraneous changes. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Peter Senna Tschudin <peter.senna@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-03-08 15:05:34 -08:00
Hannes Frederic Sowa	ddf64354af	ipv6: stop multicast forwarding to process interface scoped addresses v2: a) used struct ipv6_addr_props v3: a) reverted changes for ipv6_addr_props v4: a) do not use __ipv6_addr_needs_scope_id Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-08 12:28:20 -05:00
Cristian Bercaru	3bc1b1add7	bridging: fix rx_handlers return code The frames for which rx_handlers return RX_HANDLER_CONSUMED are no longer counted as dropped. They are counted as successfully received by 'netif_receive_skb'. This allows network interface drivers to correctly update their RX-OK and RX-DRP counters based on the result of 'netif_receive_skb'. Signed-off-by: Cristian Bercaru <B43982@freescale.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-08 12:19:59 -05:00
Paul Moore	a6a8fe950e	netlabel: fix build problems when CONFIG_IPV6=n My last patch to solve a problem where the static/fallback labels were not fully displayed resulted in build problems when IPv6 was disabled. This patch resolves the IPv6 build problems; sorry for the screw-up. Please queue for -stable or simply merge with the previous patch. Reported-by: Kbuild Test Robot <fengguang.wu@intel.com> Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-08 11:33:51 -05:00
Chen Gang	2e85d67690	net/rds: zero last byte for strncpy for NUL terminated string, need be always sure '\0' in the end. additional info: strncpy will pads with zeroes to the end of the given buffer. should initialise every bit of memory that is going to be copied to userland Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-08 00:35:44 -05:00
Cong Wang	fbca58a224	bridge: add missing vid to br_mdb_get() Obviously, vid should be considered when searching for multicast group. Cc: Vlad Yasevich <vyasevic@redhat.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Vlad Yasevich <vyasevich@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-07 16:32:19 -05:00
Christoph Paasch	c10cb5fc0f	Fix: sparse warning in inet_csk_prepare_forced_close In `e337e24d66` (inet: Fix kmemleak in tcp_v4/6_syn_recv_sock and dccp_v4/6_request_recv_sock) I introduced the function inet_csk_prepare_forced_close, which does a call to bh_unlock_sock(). This produces a sparse-warning. This patch adds the missing __releases. Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-07 16:31:29 -05:00
Junwei Zhang	d0d79c3fd7	afkey: fix a typo Signed-off-by: Martin Zhang <martinbj2008@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-07 16:26:45 -05:00
Paul Moore	0c1233aba1	netlabel: correctly list all the static label mappings When we have a large number of static label mappings that spill across the netlink message boundary we fail to properly save our state in the netlink_callback struct which causes us to repeat the same listings. This patch fixes this problem by saving the state correctly between calls to the NetLabel static label netlink "dumpit" routines. Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-07 16:20:23 -05:00
David S. Miller	43b18db8a2	Merge branch 'master' of git://1984.lsi.us.es/nf Pablo Neira Ayuso says: ==================== The following patchset contains Netfilter fixes for your net tree, they are: * Don't generate audit log message if audit is not enabled, from Gao Feng. * Fix logging formatting for packets dropped by helpers, by Joe Perches. * Fix a compilation warning in nfnetlink if CONFIG_PROVE_RCU is not set, from Paul Bolle. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-07 15:20:02 -05:00
John W. Linville	32cdd592b7	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-03-06 10:21:17 -05:00
J. Bruce Fields	3c34ae11fa	nfsd: fix krb5 handling of anonymous principals krb5 mounts started failing as of `683428fae8` "sunrpc: Update svcgss xdr handle to rpsec_contect cache". The problem is that mounts are usually done with some host principal which isn't normally mapped to any user, in which case svcgssd passes down uid -1, which the kernel is then expected to map to the export-specific anonymous uid or gid. The new uid_valid/gid_valid checks were therefore causing that downcall to fail. (Note the regression may not have been seen with older userspace that tended to map unknown principals to an anonymous id on their own rather than leaving it to the kernel.) Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-03-06 10:11:08 -05:00
David Ward	fa2b04f450	net/ipv4: Timestamp option cannot overflow with prespecified addresses When a router forwards a packet that contains the IPv4 timestamp option, if there is no space left in the option for the router to add its own timestamp, then the router increments the Overflow value in the option. However, if the addresses of the routers are prespecified in the option, then the overflow condition cannot happen: the option is structured so that each prespecified router has a place to write its timestamp. Other routers do not add a timestamp, so there will never be a lack of space. This fix ensures that the Overflow value in the IPv4 timestamp option is not incremented when the addresses of the routers are prespecified, even if the Pointer value is greater than the Length value. Signed-off-by: David Ward <david.ward@ll.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:06 -05:00
Eric Dumazet	d1f41b67ff	net: reduce net_rx_action() latency to 2 HZ We should use time_after_eq() to get maximum latency of two ticks, instead of three. Bug added in commit `24f8b2385` (net: increase receive packet quantum) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:06 -05:00
Randy Dunlap	691b3b7e13	net: fix new kernel-doc warnings in net core Fix new kernel-doc warnings in net/core/dev.c: Warning(net/core/dev.c:4788): No description found for parameter 'new_carrier' Warning(net/core/dev.c:4788): Excess function parameter 'new_carries' description in 'dev_change_carrier' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:06 -05:00
Paolo Valente	76e4cb0d3a	pkt_sched: sch_qfq: remove a useless invocation of qfq_update_eligible QFQ+ can select for service only 'eligible' aggregates, i.e., aggregates that would have started to be served also in the emulated ideal system. As a consequence, for QFQ+ to be work conserving, at least one of the active aggregates must be eligible when it is time to choose the next aggregate to serve. The set of eligible aggregates is updated through the function qfq_update_eligible(), which does guarantee that, after its invocation, at least one of the active aggregates is eligible. Because of this property, this function is invoked in qfq_deactivate_agg() to guarantee that at least one of the active aggregates is still eligible after an aggregate has been deactivated. In particular, the critical case is when there are other active aggregates, but the aggregate being deactivated happens to be the only one eligible. However, this precaution is not needed for QFQ+ to be work conserving, because update_eligible() is always invoked also at the beginning of qfq_choose_next_agg(). This patch removes the additional invocation of update_eligible() in qfq_deactivate_agg(). Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Reviewed-by: Fabio Checconi <fchecconi@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:05 -05:00
Paolo Valente	40dd2d5461	pkt_sched: sch_qfq: do not allow virtual time to jump if an aggregate is in service By definition of (the algorithm of) QFQ+, the system virtual time must be pushed up only if there is no 'eligible' aggregate, i.e. no aggregate that would have started to be served also in the ideal system emulated by QFQ+. QFQ+ serves only eligible aggregates, hence the aggregate currently in service is eligible. As a consequence, to decide whether there is no eligible aggregate, QFQ+ must also check whether there is no aggregate in service. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Reviewed-by: Fabio Checconi <fchecconi@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:05 -05:00
Paolo Valente	a0143efa96	pkt_sched: sch_qfq: prevent budget from wrapping around after a dequeue Aggregate budgets are computed so as to guarantee that, after an aggregate has been selected for service, that aggregate has enough budget to serve at least one maximum-size packet for the classes it contains. For this reason, after a new aggregate has been selected for service, its next packet is immediately dequeued, without any further control. The maximum packet size for a class, lmax, can be changed through qfq_change_class(). In case the user sets lmax to a lower value than the the size of some of the still-to-arrive packets, QFQ+ will automatically push up lmax as it enqueues these packets. This automatic push up is likely to happen with TSO/GSO. In any case, if lmax is assigned a lower value than the size of some of the packets already enqueued for the class, then the following problem may occur: the size of the next packet to dequeue for the class may happen to be larger than lmax, after the aggregate to which the class belongs has been just selected for service. In this case, even the budget of the aggregate, which is an unsigned value, may be lower than the size of the next packet to dequeue. After dequeueing this packet and subtracting its size from the budget, the latter would wrap around. This fix prevents the budget from wrapping around after any packet dequeue. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Reviewed-by: Fabio Checconi <fchecconi@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:05 -05:00
Paolo Valente	2f3b89a1fe	pkt_sched: sch_qfq: serve activated aggregates immediately if the scheduler is empty If no aggregate is in service, then the function qfq_dequeue() does not dequeue any packet. For this reason, to guarantee QFQ+ to be work conserving, a just-activated aggregate must be set as in service immediately if it happens to be the only active aggregate. This is done by the function qfq_enqueue(). Unfortunately, the function qfq_add_to_agg(), used to add a class to an aggregate, does not perform this important additional operation. In particular, if: 1) qfq_add_to_agg() is invoked to complete the move of a class from a source aggregate, becoming, for this move, inactive, to a destination aggregate, becoming instead active, and 2) the destination aggregate becomes the only active aggregate, then this aggregate is not however set as in service. QFQ+ remains then in a non-work-conserving state until a new invocation of qfq_enqueue() recovers the situation. This fix solves the problem by moving the logic for setting an aggregate as in service directly into the function qfq_activate_agg(). Hence, from whatever point qfq_activate_aggregate() is invoked, QFQ+ remains work conserving. Since the more-complex logic of this new version of activate_aggregate() is not necessary, in qfq_dequeue(), to reschedule an aggregate that finishes its budget, then the aggregate is now rescheduled by invoking directly the functions needed. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Reviewed-by: Fabio Checconi <fchecconi@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:05 -05:00
Paolo Valente	624b85fb96	pkt_sched: sch_qfq: fix the update of eligible-group sets Between two invocations of make_eligible, the system virtual time may happen to grow enough that, in its binary representation, a bit with higher order than 31 flips. This happens especially with TSO/GSO. Before this fix, the mask used in make_eligible was computed as (1UL<<index_of_last_flipped_bit)-1, whose value is well defined on a 64-bit architecture, because index_of_flipped_bit <= 63, but is in general undefined on a 32-bit architecture if index_of_flipped_bit > 31. The fix just replaces 1UL with 1ULL. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Reviewed-by: Fabio Checconi <fchecconi@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:05 -05:00
Paolo Valente	9b99b7e90b	pkt_sched: sch_qfq: properly cap timestamps in charge_actual_service QFQ+ schedules the active aggregates in a group using a bucket list (one list per group). The bucket in which each aggregate is inserted depends on the aggregate's timestamps, and the number of buckets in a group is enough to accomodate the possible (range of) values of the timestamps of all the aggregates in the group. For this property to hold, timestamps must however be computed correctly. One necessary condition for computing timestamps correctly is that the number of bits dequeued for each aggregate, while the aggregate is in service, does not exceed the maximum budget budgetmax assigned to the aggregate. For each aggregate, budgetmax is proportional to the number of classes in the aggregate. If the number of classes of the aggregate is decreased through qfq_change_class(), then budgetmax is decreased automatically as well. Problems may occur if the aggregate is in service when budgetmax is decreased, because the current remaining budget of the aggregate and/or the service already received by the aggregate may happen to be larger than the new value of budgetmax. In this case, when the aggregate is eventually deselected and its timestamps are updated, the aggregate may happen to have received an amount of service larger than budgetmax. This may cause the aggregate to be assigned a higher virtual finish time than the maximum acceptable value for the last bucket in the bucket list of the group. This fix introduces a cap that addresses this issue. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Reviewed-by: Fabio Checconi <fchecconi@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:05 -05:00
Peter Hurley	f74861ca87	net/irda: Raise dtr in non-blocking open DTR/RTS need to be raised, regardless of the open() mode, but not if the port has already shutdown. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:05 -05:00
Peter Hurley	0b176ce3a7	net/irda: Use barrier to set task state Without a memory and compiler barrier, the task state change can migrate relative to the condition testing in a blocking loop. However, the task state change must be visible across all cpus prior to testing those conditions. Failing to do this can result in the familiar 'lost wakeup' and this task will hang until killed. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:04 -05:00
Peter Hurley	2f7c069b96	net/irda: Hold port lock while bumping blocked_open Although tty_lock() already protects concurrent update to blocked_open, that fails to meet the separation-of-concerns between tty_port and tty. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:04 -05:00
Peter Hurley	a4ed2e737c	net/irda: Fix port open counts Saving the port count bump is unsafe. If the tty is hung up while this open was blocking, the port count is zeroed. Explicitly check if the tty was hung up while blocking, and correct the port count if not. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:04 -05:00
Linus Torvalds	9da060d0ed	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "A moderately sized pile of fixes, some specifically for merge window introduced regressions although others are for longer standing items and have been queued up for -stable. I'm kind of tired of all the RDS protocol bugs over the years, to be honest, it's way out of proportion to the number of people who actually use it. 1) Fix missing range initialization in netfilter IPSET, from Jozsef Kadlecsik. 2) ieee80211_local->tim_lock needs to use BH disabling, from Johannes Berg. 3) Fix DMA syncing in SFC driver, from Ben Hutchings. 4) Fix regression in BOND device MAC address setting, from Jiri Pirko. 5) Missing usb_free_urb in ISDN Hisax driver, from Marina Makienko. 6) Fix UDP checksumming in bnx2x driver for 57710 and 57711 chips, fix from Dmitry Kravkov. 7) Missing cfgspace_lock initialization in BCMA driver. 8) Validate parameter size for SCTP assoc stats getsockopt(), from Guenter Roeck. 9) Fix SCTP association hangs, from Lee A Roberts. 10) Fix jumbo frame handling in r8169, from Francois Romieu. 11) Fix phy_device memory leak, from Petr Malat. 12) Omit trailing FCS from frames received in BGMAC driver, from Hauke Mehrtens. 13) Missing socket refcount release in L2TP, from Guillaume Nault. 14) sctp_endpoint_init should respect passed in gfp_t, rather than use GFP_KERNEL unconditionally. From Dan Carpenter. 15) Add AISX AX88179 USB driver, from Freddy Xin. 16) Remove MAINTAINERS entries for drivers deleted during the merge window, from Cesar Eduardo Barros. 17) RDS protocol can try to allocate huge amounts of memory, check that the user's request length makes sense, from Cong Wang. 18) SCTP should use the provided KMALLOC_MAX_SIZE instead of it's own, bogus, definition. From Cong Wang. 19) Fix deadlocks in FEC driver by moving TX reclaim into NAPI poll, from Frank Li. Also, fix a build error introduced in the merge window. 20) Fix bogus purging of default routes in ipv6, from Lorenzo Colitti. 21) Don't double count RTT measurements when we leave the TCP receive fast path, from Neal Cardwell." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits) tcp: fix double-counted receiver RTT when leaving receiver fast path CAIF: fix sparse warning for caif_usb rds: simplify a warning message net: fec: fix build error in no MXC platform net: ipv6: Don't purge default router if accept_ra=2 net: fec: put tx to napi poll function to fix dead lock sctp: use KMALLOC_MAX_SIZE instead of its own MAX_KMALLOC_SIZE rds: limit the size allocated by rds_message_alloc() MAINTAINERS: remove eexpress MAINTAINERS: remove drivers/net/wan/cycx* MAINTAINERS: remove 3c505 caif_dev: fix sparse warnings for caif_flow_cb ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver sctp: use the passed in gfp flags instead GFP_KERNEL ipv[4\|6]: correct dropwatch false positive in local_deliver_finish l2tp: Restore socket refcount when sendmsg succeeds net/phy: micrel: Disable asymmetric pause for KSZ9021 bgmac: omit the fcs phy: Fix phy_device_free memory leak bnx2x: Fix KR2 work-around condition ...	2013-03-05 18:42:29 -08:00

1 2 3 4 5 ...

27162 Commits