linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-08 21:21:47 +00:00

Author	SHA1	Message	Date
Ollie Wild	56b49f4b8f	net: Move "struct net" declaration inside the __KERNEL__ macro guard This patch reduces namespace pollution by moving the "struct net" declaration out of the userspace-facing portion of linux/netlink.h. It has no impact on the kernel. (This came up because we have several C++ applications which use "net" as a namespace name.) Signed-off-by: Ollie Wild <aaw@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-22 13:21:05 -07:00
Jiri Olsa	cbdd769ab9	netfilter: nf_conntrack_defrag: check socket type before touching nodefrag flag we need to check proper socket type within ipv4_conntrack_defrag function before referencing the nodefrag flag. For example the tun driver receive path produces skbs with AF_UNSPEC socket type, and so current code is causing unwanted fragmented packets going out. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-22 13:13:34 -07:00
Patrick McHardy	d6120b8afa	netfilter: nf_nat_snmp: fix checksum calculation (v4) Fix checksum calculation in nf_nat_snmp_basic. Based on patches by Clark Wang <wtweeker@163.com> and Stephen Hemminger <shemminger@vyatta.com>. https://bugzilla.kernel.org/show_bug.cgi?id=17622 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-22 13:13:33 -07:00
Eric Dumazet	15cdeadaa5	netfilter: fix a race in nf_ct_ext_create() As soon as rcu_read_unlock() is called, there is no guarantee current thread can safely derefence t pointer, rcu protected. Fix is to copy t->alloc_size in a temporary variable. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-22 13:13:33 -07:00
Changli Gao	b46ffb8545	netfilter: fix ipt_REJECT TCP RST routing for indev == outdev ip_route_me_harder can't create the route cache when the outdev is the same with the indev for the skbs whichout a valid protocol set. __mkroute_input functions has this check: 1998 if (skb->protocol != htons(ETH_P_IP)) { 1999 /* Not IP (i.e. ARP). Do not create route, if it is 2000 * invalid for proxy arp. DNAT routes are always valid. 2001 * 2002 * Proxy arp feature have been extended to allow, ARP 2003 * replies back to the same interface, to support 2004 * Private VLAN switch technologies. See arp.c. 2005 */ 2006 if (out_dev == in_dev && 2007 IN_DEV_PROXY_ARP_PVLAN(in_dev) == 0) { 2008 err = -EINVAL; 2009 goto cleanup; 2010 } 2011 } This patch gives the new skb a valid protocol to bypass this check. In order to make ipt_REJECT work with bridges, you also need to enable ip_forward. This patch also fixes a regression. When we used skb_copy_expand(), we didn't have this issue stated above, as the protocol was properly set. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-22 13:13:32 -07:00
Simon Horman	7874896a26	netfilter: nf_ct_sip: default to NF_ACCEPT in sip_help_tcp() I initially noticed this because of the compiler warning below, but it does seem to be a valid concern in the case where ct_sip_get_header() returns 0 in the first iteration of the while loop. net/netfilter/nf_conntrack_sip.c: In function 'sip_help_tcp': net/netfilter/nf_conntrack_sip.c:1379: warning: 'ret' may be used uninitialized in this function Signed-off-by: Simon Horman <horms@verge.net.au> [Patrick: changed NF_DROP to NF_ACCEPT] Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-22 13:13:32 -07:00
Eric Dumazet	d485d500cf	netfilter: tproxy: nf_tproxy_assign_sock() can handle tw sockets transparent field of a socket is either inet_twsk(sk)->tw_transparent for timewait sockets, or inet_sk(sk)->transparent for other sockets (TCP/UDP). Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-22 13:13:31 -07:00
Eric Dumazet	3d13008e73	ip: fix truesize mismatch in ip fragmentation Special care should be taken when slow path is hit in ip_fragment() : When walking through frags, we transfert truesize ownership from skb to frags. Then if we hit a slow_path condition, we must undo this or risk uncharging frags->truesize twice, and in the end, having negative socket sk_wmem_alloc counter, or even freeing socket sooner than expected. Many thanks to Nick Bowler, who provided a very clean bug report and test program. Thanks to Jarek for reviewing my first patch and providing a V2 While Nick bisection pointed to commit `2b85a34e91` (net: No more expensive sock_hold()/sock_put() on each tx), underlying bug is older (2.6.12-rc5) A side effect is to extend work done in commit `b2722b1c3a` (ip_fragment: also adjust skb->truesize for packets not owned by a socket) to ipv6 as well. Reported-and-bisected-by: Nick Bowler <nbowler@elliptictech.com> Tested-by: Nick Bowler <nbowler@elliptictech.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Jarek Poplawski <jarkao2@gmail.com> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-21 15:05:50 -07:00
Eric Dumazet	7e96dc7045	netxen: dont set skb->truesize skb->truesize is set in core network. Dont change it unless dealing with fragments. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-21 13:04:04 -07:00
Eric Dumazet	8df8fd2712	qlcnic: dont set skb->truesize skb->truesize is set in core network. Dont change it unless dealing with fragments. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-21 13:03:24 -07:00
David S. Miller	2d813760d7	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	2010-09-21 12:26:07 -07:00
Tom Marshall	a4d258036e	tcp: Fix race in tcp_poll If a RST comes in immediately after checking sk->sk_err, tcp_poll will return POLLIN but not POLLOUT. Fix this by checking sk->sk_err at the end of tcp_poll. Additionally, ensure the correct order of operations on SMP machines with memory barriers. Signed-off-by: Tom Marshall <tdm.code@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-20 15:42:05 -07:00
David S. Miller	9828e6e6e3	rose: Fix signedness issues wrt. digi count. Just use explicit casts, since we really can't change the types of structures exported to userspace which have been around for 15 years or so. Reported-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-20 15:40:35 -07:00
David S. Miller	3779298b81	Merge branch 'vhost-net' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost	2010-09-20 11:13:34 -07:00
Thomas Egerer	8444cf712c	xfrm: Allow different selector family in temporary state The family parameter xfrm_state_find is used to find a state matching a certain policy. This value is set to the template's family (encap_family) right before xfrm_state_find is called. The family parameter is however also used to construct a temporary state in xfrm_state_find itself which is wrong for inter-family scenarios because it produces a selector for the wrong family. Since this selector is included in the xfrm_user_acquire structure, user space programs misinterpret IPv6 addresses as IPv4 and vice versa. This patch splits up the original init_tempsel function into a part that initializes the selector respectively the props and id of the temporary state, to allow for differing ip address families whithin the state. Signed-off-by: Thomas Egerer <thomas.egerer@secunet.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-20 11:11:38 -07:00
Johannes Berg	df6d02300f	wext: fix potential private ioctl memory content leak When a driver doesn't fill the entire buffer, old heap contents may remain, and if it also doesn't update the length properly, this old heap content will be copied back to userspace. It is very unlikely that this happens in any of the drivers using private ioctls since it would show up as junk being reported by iwpriv, but it seems better to be safe here, so use kzalloc. Reported-by: Jeff Mahoney <jeffm@suse.com> Cc: stable@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2010-09-20 13:41:40 -04:00
Eric Dumazet	842c74bffc	ip_gre: CONFIG_IPV6_MODULE support ipv6 can be a module, we should test CONFIG_IPV6 and CONFIG_IPV6_MODULE to enable ipv6 bits in ip_gre. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-20 10:06:12 -07:00
Eric Dumazet	04746ff128	qlcnic: dont assume NET_IP_ALIGN is 2 qlcnic driver allocates rx skbs and gives to hardware too bytes of extra storage, allowing for corruption of kernel data. NET_IP_ALIGN being 0 on some platforms (including x86), drivers should not assume it's 2. rds_ring->skb_size = rds_ring->dma_size + NET_IP_ALIGN; ... skb = dev_alloc_skb(rds_ring->skb_size); skb_reserve(skb, 2); pci_map_single(pdev, skb->data, rds_ring->dma_size, PCI_DMA_FROMDEVICE); (and rds_ring->skb_size == rds_ring->dma_size) -> bug Because of extra alignment (1500 + 32) -> four extra bytes are available before the struct skb_shared_info, so corruption is not noticed. Note: this driver could use netdev_alloc_skb_ip_align() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-17 22:58:08 -07:00
Sosnowski, Maciej	4e8cec269d	dca: disable dca on IOAT ver.3.0 multiple-IOH platforms Direct Cache Access is not supported on IOAT ver.3.0 multiple-IOH platforms. This patch blocks registering of dca providers when multiple IOH detected with IOAT ver.3.0. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-17 20:08:21 -07:00
Herbert Xu	f0f9deae9e	netpoll: Disable IRQ around RCU dereference in netpoll_rx We cannot use rcu_dereference_bh safely in netpoll_rx as we may be called with IRQs disabled. We could however simply disable IRQs as that too causes BH to be disabled and is safe in either case. Thanks to John Linville for discovering this bug and providing a patch. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-17 16:55:03 -07:00
Vlad Yasevich	4bdab43323	sctp: Do not reset the packet during sctp_packet_config(). sctp_packet_config() is called when getting the packet ready for appending of chunks. The function should not touch the current state, since it's possible to ping-pong between two transports when sending, and that can result packet corruption followed by skb overlfow crash. Reported-by: Thomas Dreibholz <dreibh@iem.uni-due.de> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-17 16:47:56 -07:00
Wey-Yi Guy	7acc7c683a	iwlwifi: do not perferm force reset while doing scan When uCode error condition detected, driver try to perform either rf reset or firmware reload in order bring device back to working condition. If rf reset is required and scan is in process, there is no need to issue rf reset since scan already reset the rf. If firmware reload is required and scan is in process, skip the reload request. There is a possibility firmware reload during scan cause problem. [ 485.804046] WARNING: at net/mac80211/main.c:310 ieee80211_restart_hw+0x28/0x62() [ 485.804049] Hardware name: Latitude E6400 [ 485.804052] ieee80211_restart_hw called with hardware scan in progress [ 485.804054] Modules linked in: iwlagn iwlcore bnep sco rfcomm l2cap crc16 bluetooth [last unloaded: iwlcore] [ 485.804069] Pid: 812, comm: kworker/u:3 Tainted: G W 2.6.36-rc3-wl+ #74 [ 485.804072] Call Trace: [ 485.804079] [<c103019a>] warn_slowpath_common+0x60/0x75 [ 485.804084] [<c1030213>] warn_slowpath_fmt+0x26/0x2a [ 485.804089] [<c145da67>] ieee80211_restart_hw+0x28/0x62 [ 485.804102] [<f8b35dc6>] iwl_bg_restart+0x113/0x150 [iwlagn] [ 485.804108] [<c10415d5>] process_one_work+0x181/0x25c [ 485.804119] [<f8b35cb3>] ? iwl_bg_restart+0x0/0x150 [iwlagn] [ 485.804124] [<c104190a>] worker_thread+0xf9/0x1f2 [ 485.804128] [<c1041811>] ? worker_thread+0x0/0x1f2 [ 485.804133] [<c10451b0>] kthread+0x64/0x69 [ 485.804137] [<c104514c>] ? kthread+0x0/0x69 [ 485.804141] [<c1002df6>] kernel_thread_helper+0x6/0x10 [ 485.804145] ---[ end trace 3d4ebdc02d524bbb ]--- [ 485.804148] WG> 1 [ 485.804153] Pid: 812, comm: kworker/u:3 Tainted: G W 2.6.36-rc3-wl+ #74 [ 485.804156] Call Trace: [ 485.804161] [<c145da9b>] ? ieee80211_restart_hw+0x5c/0x62 [ 485.804172] [<f8b35dcb>] iwl_bg_restart+0x118/0x150 [iwlagn] [ 485.804177] [<c10415d5>] process_one_work+0x181/0x25c [ 485.804188] [<f8b35cb3>] ? iwl_bg_restart+0x0/0x150 [iwlagn] [ 485.804192] [<c104190a>] worker_thread+0xf9/0x1f2 [ 485.804197] [<c1041811>] ? worker_thread+0x0/0x1f2 [ 485.804201] [<c10451b0>] kthread+0x64/0x69 [ 485.804205] [<c104514c>] ? kthread+0x0/0x69 [ 485.804209] [<c1002df6>] kernel_thread_helper+0x6/0x10 Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>	2010-09-17 13:03:35 -07:00
Dan Carpenter	2507136f74	net/llc: storing negative error codes in unsigned short If the alloc_skb() fails then we return 65431 instead of -ENOBUFS (-105). Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-16 22:38:23 -07:00
Chris Snook	e443e38324	MAINTAINERS: move atlx discussions to netdev The atlx drivers are sufficiently mature that we no longer need a separate mailing list for them. Move the discussion to netdev, so we can decommission atl1-devel, which is now mostly spam. Signed-off-by: Chris Snook <chris.snook@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-16 22:00:28 -07:00
Dan Rosenberg	49c37c0334	drivers/net/cxgb3/cxgb3_main.c: prevent reading uninitialized stack memory Fixed formatting (tabs and line breaks). The CHELSIO_GET_QSET_NUM device ioctl allows unprivileged users to read 4 bytes of uninitialized stack memory, because the "addr" member of the ch_reg struct declared on the stack in cxgb_extension_ioctl() is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-16 21:55:00 -07:00
Dan Rosenberg	44467187dc	drivers/net/eql.c: prevent reading uninitialized stack memory Fixed formatting (tabs and line breaks). The EQL_GETMASTRCFG device ioctl allows unprivileged users to read 16 bytes of uninitialized stack memory, because the "master_name" member of the master_config_t struct declared on the stack in eql_g_master_cfg() is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-16 21:54:59 -07:00
Dan Rosenberg	7011e66093	drivers/net/usb/hso.c: prevent reading uninitialized memory Fixed formatting (tabs and line breaks). The TIOCGICOUNT device ioctl allows unprivileged users to read uninitialized stack memory, because the "reserved" member of the serial_icounter_struct struct declared on the stack in hso_get_count() is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-16 21:54:59 -07:00
Eric Dumazet	e71895a1be	xfrm: dont assume rcu_read_lock in xfrm_output_one() ip_local_out() is called with rcu_read_lock() held from ip_queue_xmit() but not from other call sites. Reported-and-bisected-by: Nick Bowler <nbowler@elliptictech.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-16 21:46:15 -07:00
Matthew Garrett	801e147cde	r8169: Handle rxfifo errors on 8168 chips The Thinkpad X100e seems to have some odd behaviour when the display is powered off - the onboard r8169 starts generating rxfifo overflow errors. The root cause of this has not yet been identified and may well be a hardware design bug on the platform, but r8169 should be more resiliant to this. This patch enables the rxfifo interrupt on 8168 devices and removes the MAC version check in the interrupt handler, and the machine no longer crashes when under network load while the screen turns off. Signed-off-by: Matthew Garrett <mjg@redhat.com> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-15 19:32:59 -07:00
Denis Kirjanov	84176b7b56	3c59x: Remove atomic context inside vortex_{set\|get}_wol There is no need to use spinlocks in vortex_{set\|get}_wol. This also fixes a bug: [ 254.214993] 3c59x 0000:00:0d.0: PME# enabled [ 254.215021] BUG: sleeping function called from invalid context at kernel/mutex.c:94 [ 254.215030] in_atomic(): 0, irqs_disabled(): 1, pid: 4875, name: ethtool [ 254.215042] Pid: 4875, comm: ethtool Tainted: G W 2.6.36-rc3+ #7 [ 254.215049] Call Trace: [ 254.215050] [] __might_sleep+0xb1/0xb6 [ 254.215050] [] mutex_lock+0x17/0x30 [ 254.215050] [] acpi_enable_wakeup_device_power+0x2b/0xb1 [ 254.215050] [] acpi_pm_device_sleep_wake+0x42/0x7f [ 254.215050] [] acpi_pci_sleep_wake+0x5d/0x63 [ 254.215050] [] platform_pci_sleep_wake+0x1d/0x20 [ 254.215050] [] __pci_enable_wake+0x90/0xd0 [ 254.215050] [] acpi_set_WOL+0x8e/0xf5 [3c59x] [ 254.215050] [] vortex_set_wol+0x4e/0x5e [3c59x] [ 254.215050] [] dev_ethtool+0x1cf/0xb61 [ 254.215050] [] ? debug_mutex_free_waiter+0x45/0x4a [ 254.215050] [] ? __mutex_lock_common+0x204/0x20e [ 254.215050] [] ? __mutex_lock_slowpath+0x12/0x15 [ 254.215050] [] ? mutex_lock+0x23/0x30 [ 254.215050] [] dev_ioctl+0x42c/0x533 [ 254.215050] [] ? _cond_resched+0x8/0x1c [ 254.215050] [] ? lock_page+0x1c/0x30 [ 254.215050] [] ? page_address+0x15/0x7c [ 254.215050] [] ? filemap_fault+0x187/0x2c4 [ 254.215050] [] sock_ioctl+0x1d4/0x1e0 [ 254.215050] [] ? sock_ioctl+0x0/0x1e0 [ 254.215050] [] vfs_ioctl+0x19/0x33 [ 254.215050] [] do_vfs_ioctl+0x424/0x46f [ 254.215050] [] ? selinux_file_ioctl+0x3c/0x40 [ 254.215050] [] sys_ioctl+0x40/0x5a [ 254.215050] [] sysenter_do_call+0x12/0x22 vortex_set_wol protected with a spinlock, but nested acpi_set_WOL acquires a mutex inside atomic context. Ethtool operations are already serialized by RTNL mutex, so it is safe to drop the locks. Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-15 14:32:39 -07:00
Alexey Kuznetsov	01f83d6984	tcp: Prevent overzealous packetization by SWS logic. If peer uses tiny MSS (say, 75 bytes) and similarly tiny advertised window, the SWS logic will packetize to half the MSS unnecessarily. This causes problems with some embedded devices. However for large MSS devices we do want to half-MSS packetize otherwise we never get enough packets into the pipe for things like fast retransmit and recovery to work. Be careful also to handle the case where MSS > window, otherwise we'll never send until the probe timer. Reported-by: ツ Leandro Melo de Sales <leandroal@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-15 12:01:44 -07:00
David S. Miller	6dcbc12290	net: RPS needs to depend upon USE_GENERIC_SMP_HELPERS You cannot invoke __smp_call_function_single() unless the architecture sets this symbol. Reported-by: Daniel Hellstrom <daniel@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-14 21:42:22 -07:00
Simon Guinot	fddd91016d	phylib: fix PAL state machine restart on resume On resume, before starting the PAL state machine, check if the adjust_link() method is well supplied. If not, this would lead to a NULL pointer dereference in the phy_state_machine() function. This scenario can happen if the Ethernet driver call manually the PHY functions instead of using the PAL state machine. The mv643xx_eth driver is a such example. Signed-off-by: Simon Guinot <sguinot@lacie.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-14 14:31:03 -07:00
Eric Dumazet	ef885afbf8	net: use rcu_barrier() in rollback_registered_many netdev_wait_allrefs() waits that all references to a device vanishes. It currently uses a _very_ pessimistic 250 ms delay between each probe. Some users reported that no more than 4 devices can be dismantled per second, this is a pretty serious problem for some setups. Most of the time, a refcount is about to be released by an RCU callback, that is still in flight because rollback_registered_many() uses a synchronize_rcu() call instead of rcu_barrier(). Problem is visible if number of online cpus is one, because synchronize_rcu() is then a no op. time to remove 50 ipip tunnels on a UP machine : before patch : real 11.910s after patch : real 1.250s Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reported-by: Octavian Purdila <opurdila@ixiacom.com> Reported-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-14 14:27:29 -07:00
Andy Gospodarek	ab12811c89	bonding: correctly process non-linear skbs It was recently brought to my attention that 802.3ad mode bonds would no longer form when using some network hardware after a driver update. After snooping around I realized that the particular hardware was using page-based skbs and found that skb->data did not contain a valid LACPDU as it was not stored there. That explained the inability to form an 802.3ad-based bond. For balance-alb mode bonds this was also an issue as ARPs would not be properly processed. This patch fixes the issue in my tests and should be applied to 2.6.36 and as far back as anyone cares to add it to stable. Thanks to Alexander Duyck <alexander.h.duyck@intel.com> and Jesse Brandeburg <jesse.brandeburg@intel.com> for the suggestions on this one. Signed-off-by: Andy Gospodarek <andy@greyhouse.net> CC: Alexander Duyck <alexander.h.duyck@intel.com> CC: Jesse Brandeburg <jesse.brandeburg@intel.com> CC: stable@kerne.org Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-14 14:25:32 -07:00
Michael S. Tsirkin	ee05d6939e	vhost-net: fix range checking in mrg bufs case In mergeable buffer case, we use headcount, log_num and seg as indexes in same-size arrays, and we know that headcount <= seg and log_num equals either 0 or seg. Therefore, the right thing to do is range-check seg, not headcount as we do now: these will be different if guest chains s/g descriptors (this does not happen now, but we can not trust the guest). Long term, we should add BUG_ON checks to verify two other indexes are what we think they should be. Reported-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2010-09-14 15:22:41 +02:00
Michael Kerrisk	a89b47639f	ipv4: enable getsockopt() for IP_NODEFRAG While integrating your man-pages patch for IP_NODEFRAG, I noticed that this option is settable by setsockopt(), but not gettable by getsockopt(). I suppose this is not intended. The (untested, trivial) patch below adds getsockopt() support. Signed-off-by: Michael kerrisk <mtk.manpages@gmail.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-13 19:57:23 -07:00
Bob Arendt	7998156344	ipv4: force_igmp_version ignored when a IGMPv3 query received After all these years, it turns out that the /proc/sys/net/ipv4/conf//force_igmp_version parameter isn't fully implemented. Symptom: When set force_igmp_version to a value of 2, the kernel should only perform multicast IGMPv2 operations (IETF rfc2236). An host-initiated Join message will be sent as a IGMPv2 Join message. But if a IGMPv3 query message is received, the host responds with a IGMPv3 join message. Per rfc3376 and rfc2236, a IGMPv2 host should treat a IGMPv3 query as a IGMPv2 query and respond with an IGMPv2 Join message. Consequences: This is an issue when a IGMPv3 capable switch is the querier and will only issue IGMPv3 queries (which double as IGMPv2 querys) and there's an intermediate switch that is only IGMPv2 capable. The intermediate switch processes the initial v2 Join, but fails to recognize the IGMPv3 Join responses to the Query, resulting in a dropped connection when the intermediate v2-only switch times it out. Identifying issue in the kernel source: The issue is in this section of code (in net/ipv4/igmp.c), which is called when an IGMP query is received (from mainline 2.6.36-rc3 gitweb): ... A IGMPv3 query has a length >= 12 and no sources. This routine will exit after line 880, setting the general query timer (random timeout between 0 and query response time). This calls igmp_gq_timer_expire(): ... .. which only sends a v3 response. So if a v3 query is received, the kernel always sends a v3 response. IGMP queries happen once every 60 sec (per vlan), so the traffic is low. A IGMPv3 query is* a strict superset of a IGMPv2 query, so this patch properly short circuit's the v3 behaviour. One issue is that this does not address force_igmp_version=1. Then again, I've never seen any IGMPv1 multicast equipment in the wild. However there is a lot of v2-only equipment. If it's necessary to support the IGMPv1 case as well: 837 if (len == 8 \|\| IGMP_V2_SEEN(in_dev) \|\| IGMP_V1_SEEN(in_dev)) { Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-13 12:56:51 -07:00
Dan Carpenter	3429769bc6	ppp: potential NULL dereference in ppp_mp_explode() Smatch complains because we check whether "pch->chan" is NULL and then dereference it unconditionally on the next line. Partly the reason this bug was introduced is because code was too complicated. I've simplified it a little. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-13 12:44:11 -07:00
Dan Carpenter	339db11b21	net/llc: make opt unsigned in llc_ui_setsockopt() The members of struct llc_sock are unsigned so if we pass a negative value for "opt" it can cause a sign bug. Also it can cause an integer overflow when we multiply "opt * HZ". CC: stable@kernel.org Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-13 12:44:10 -07:00
David S. Miller	a505b3b30f	sch_atm: Fix potential NULL deref. The list_head conversion unearther an unnecessary flow check. Since flow is always NULL here we don't need to see if a matching flow exists already. Reported-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-12 11:56:44 -07:00
David S. Miller	053d8f6622	Merge branch 'vhost-net' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost	2010-09-09 21:59:51 -07:00
Dan Williams	c9cedbba0f	ipheth: remove incorrect devtype to WWAN The 'wwan' devtype is meant for devices that require preconfiguration and every time setup before the ethernet interface can be used, like cellular modems which require a series of setup commands on serial ports or other mechanisms before the ethernet interface will handle packets. As ipheth only requires one-per-hotplug pairing setup with no preconfiguration (like APN, phone #, etc) and the network interface is usable at any time after that initial setup, remove the incorrect devtype wwan. Signed-off-by: Dan Williams <dcbw@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-09 21:41:59 -07:00
Joe Perches	201b6bab67	MAINTAINERS: Add CAIF Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-09 21:41:59 -07:00
Joe Perches	123031c0ee	sctp: fix test for end of loop Add a list_has_sctp_addr function to simplify loop Based on a patches by Dan Carpenter and David Miller Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-09 15:00:29 -07:00
David S. Miller	e199e6136c	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6	2010-09-08 23:49:04 -07:00
Eric Dumazet	972c40b5be	KS8851: Correct RX packet allocation Use netdev_alloc_skb_ip_align() helper and do correct allocation Tested-by: Abraham Arce <x0066660@ti.com> Signed-off-by: Abraham Arce <x0066660@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-08 21:47:13 -07:00
Eric Dumazet	719f835853	udp: add rehash on connect() commit `30fff923` introduced in linux-2.6.33 (udp: bind() optimisation) added a secondary hash on UDP, hashed on (local addr, local port). Problem is that following sequence : fd = socket(...) connect(fd, &remote, ...) not only selects remote end point (address and port), but also sets local address, while UDP stack stored in secondary hash table the socket while its local address was INADDR_ANY (or ipv6 equivalent) Sequence is : - autobind() : choose a random local port, insert socket in hash tables [while local address is INADDR_ANY] - connect() : set remote address and port, change local address to IP given by a route lookup. When an incoming UDP frame comes, if more than 10 sockets are found in primary hash table, we switch to secondary table, and fail to find socket because its local address changed. One solution to this problem is to rehash datagram socket if needed. We add a new rehash(struct socket *) method in "struct proto", and implement this method for UDP v4 & v6, using a common helper. This rehashing only takes care of secondary hash table, since primary hash (based on local port only) is not changed. Reported-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-08 21:45:01 -07:00
Jianzhao Wang	ae2688d59b	net: blackhole route should always be recalculated Blackhole routes are used when xfrm_lookup() returns -EREMOTE (error triggered by IKE for example), hence this kind of route is always temporary and so we should check if a better route exists for next packets. Bug has been introduced by commit `d11a4dc18b`. Signed-off-by: Jianzhao Wang <jianzhao.wang@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-08 14:35:43 -07:00
Jarek Poplawski	f6b085b69d	ipv4: Suppress lockdep-RCU false positive in FIB trie (3) Hi, Here is one more of these warnings and a patch below: Sep 5 23:52:33 del kernel: [46044.244833] =================================================== Sep 5 23:52:33 del kernel: [46044.269681] [ INFO: suspicious rcu_dereference_check() usage. ] Sep 5 23:52:33 del kernel: [46044.277000] --------------------------------------------------- Sep 5 23:52:33 del kernel: [46044.285185] net/ipv4/fib_trie.c:1756 invoked rcu_dereference_check() without protection! Sep 5 23:52:33 del kernel: [46044.293627] Sep 5 23:52:33 del kernel: [46044.293632] other info that might help us debug this: Sep 5 23:52:33 del kernel: [46044.293634] Sep 5 23:52:33 del kernel: [46044.325333] Sep 5 23:52:33 del kernel: [46044.325335] rcu_scheduler_active = 1, debug_locks = 0 Sep 5 23:52:33 del kernel: [46044.348013] 1 lock held by pppd/1717: Sep 5 23:52:33 del kernel: [46044.357548] #0: (rtnl_mutex){+.+.+.}, at: [<c125dc1f>] rtnl_lock+0xf/0x20 Sep 5 23:52:33 del kernel: [46044.367647] Sep 5 23:52:33 del kernel: [46044.367652] stack backtrace: Sep 5 23:52:33 del kernel: [46044.387429] Pid: 1717, comm: pppd Not tainted 2.6.35.4.4a #3 Sep 5 23:52:33 del kernel: [46044.398764] Call Trace: Sep 5 23:52:33 del kernel: [46044.409596] [<c12f9aba>] ? printk+0x18/0x1e Sep 5 23:52:33 del kernel: [46044.420761] [<c1053969>] lockdep_rcu_dereference+0xa9/0xb0 Sep 5 23:52:33 del kernel: [46044.432229] [<c12b7235>] trie_firstleaf+0x65/0x70 Sep 5 23:52:33 del kernel: [46044.443941] [<c12b74d4>] fib_table_flush+0x14/0x170 Sep 5 23:52:33 del kernel: [46044.455823] [<c1033e92>] ? local_bh_enable_ip+0x62/0xd0 Sep 5 23:52:33 del kernel: [46044.467995] [<c12fc39f>] ? _raw_spin_unlock_bh+0x2f/0x40 Sep 5 23:52:33 del kernel: [46044.480404] [<c12b24d0>] ? fib_sync_down_dev+0x120/0x180 Sep 5 23:52:33 del kernel: [46044.493025] [<c12b069d>] fib_flush+0x2d/0x60 Sep 5 23:52:33 del kernel: [46044.505796] [<c12b06f5>] fib_disable_ip+0x25/0x50 Sep 5 23:52:33 del kernel: [46044.518772] [<c12b10d3>] fib_netdev_event+0x73/0xd0 Sep 5 23:52:33 del kernel: [46044.531918] [<c1048dfd>] notifier_call_chain+0x2d/0x70 Sep 5 23:52:33 del kernel: [46044.545358] [<c1048f0a>] raw_notifier_call_chain+0x1a/0x20 Sep 5 23:52:33 del kernel: [46044.559092] [<c124f687>] call_netdevice_notifiers+0x27/0x60 Sep 5 23:52:33 del kernel: [46044.573037] [<c124faec>] __dev_notify_flags+0x5c/0x80 Sep 5 23:52:33 del kernel: [46044.586489] [<c124fb47>] dev_change_flags+0x37/0x60 Sep 5 23:52:33 del kernel: [46044.599394] [<c12a8a8d>] devinet_ioctl+0x54d/0x630 Sep 5 23:52:33 del kernel: [46044.612277] [<c12aabb7>] inet_ioctl+0x97/0xc0 Sep 5 23:52:34 del kernel: [46044.625208] [<c123f6af>] sock_ioctl+0x6f/0x270 Sep 5 23:52:34 del kernel: [46044.638046] [<c109d2b0>] ? handle_mm_fault+0x420/0x6c0 Sep 5 23:52:34 del kernel: [46044.650968] [<c123f640>] ? sock_ioctl+0x0/0x270 Sep 5 23:52:34 del kernel: [46044.663865] [<c10c3188>] vfs_ioctl+0x28/0xa0 Sep 5 23:52:34 del kernel: [46044.676556] [<c10c38fa>] do_vfs_ioctl+0x6a/0x5c0 Sep 5 23:52:34 del kernel: [46044.688989] [<c1048676>] ? up_read+0x16/0x30 Sep 5 23:52:34 del kernel: [46044.701411] [<c1021376>] ? do_page_fault+0x1d6/0x3a0 Sep 5 23:52:34 del kernel: [46044.714223] [<c10b6588>] ? fget_light+0xf8/0x2f0 Sep 5 23:52:34 del kernel: [46044.726601] [<c1241f98>] ? sys_socketcall+0x208/0x2c0 Sep 5 23:52:34 del kernel: [46044.739140] [<c10c3eb3>] sys_ioctl+0x63/0x70 Sep 5 23:52:34 del kernel: [46044.751967] [<c12fca3d>] syscall_call+0x7/0xb Sep 5 23:52:34 del kernel: [46044.764734] [<c12f0000>] ? cookie_v6_check+0x3d0/0x630 --------------> This patch fixes the warning: =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- net/ipv4/fib_trie.c:1756 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 1 lock held by pppd/1717: #0: (rtnl_mutex){+.+.+.}, at: [<c125dc1f>] rtnl_lock+0xf/0x20 stack backtrace: Pid: 1717, comm: pppd Not tainted 2.6.35.4a #3 Call Trace: [<c12f9aba>] ? printk+0x18/0x1e [<c1053969>] lockdep_rcu_dereference+0xa9/0xb0 [<c12b7235>] trie_firstleaf+0x65/0x70 [<c12b74d4>] fib_table_flush+0x14/0x170 ... Allow trie_firstleaf() to be called either under rcu_read_lock() protection or with RTNL held. The same annotation is added to node_parent_rcu() to prevent a similar warning a bit later. Followup of commits `634a4b20` and `4eaa0e3c`. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-09-08 14:14:20 -07:00

1 2 3 4 5 ...

210409 Commits