When using l2tp over ipsec, the tunnel will hang when rekeying
occurs. Reason is that the transformer bundle attached to the dst entry
is now in STATE_DEAD and thus xfrm_output_one() drops all packets
(XfrmOutStateExpired increases).
Fix this by calling __sk_dst_check (which drops the stale dst
if xfrm dst->check callback finds that the bundle is no longer valid).
Cc: James Chapman <jchapman@katalix.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Better use sk_reset_timer() / sk_stop_timer() helpers to make sure we
dont access already freed/reused memory later.
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 88491d8103 ("drivers/net: Kconfig
& Makefile cleanup") changed the type of these options to bool, but
they select code that could (and still can) be built as modules.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The pmtu informations on the inetpeer are visible for output and
input routes. On packet forwarding, we might propagate a learned
pmtu to the sender. As we update the pmtu informations of the
inetpeer on demand, the original sender of the forwarded packets
might never notice when the pmtu to that inetpeer increases.
So use the mtu of the outgoing device on packet forwarding instead
of the pmtu to the final destination.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We move all mtu handling from dst_mtu() down to the protocol
layer. So each protocol can implement the mtu handling in
a different manner.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We plan to invoke the dst_opt->default_mtu() method unconditioally
from dst_mtu(). So rename the method to dst_opt->mtu() to match
the name with the new meaning.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As it is, we return null as the default mtu of blackhole routes.
This may lead to a propagation of a bogus pmtu if the default_mtu
method of a blackhole route is invoked. So return dst->dev->mtu
as the default mtu instead.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Skip entries from foreign network namespaces.
Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This was copy and pasted from the IPv4 code. We're calling the
ip4 version of that function and map4 is NULL.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can not update iph->daddr in ip_options_rcv_srr(), It is too early.
When some exception ocurred later (eg. in ip_forward() when goto
sr_failed) we need the ip header be identical to the original one as
ICMP need it.
Add a field 'nexthop' in struct ip_options to save nexthop of LSRR
or SSRR option.
Signed-off-by: Li Wei <lw@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use round_jiffies_relative to align the ehea workqueue and avoid
extra wakeups.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we enable multiqueue by default the ehea driver is using
quite a lot of memory for its buffer pools. With 4 queues we
consume 64MB in the jumbo packet ring, 16MB in the medium packet
ring and 16MB in the tiny packet ring.
We should only fill the jumbo ring once the MTU is increased but
for now halve it's size so it consumes 32MB. Also reduce the tiny
packet ring, with 4 queues we had 16k entries which is overkill.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When transmiting a fragmented skb, qlge fills a descriptor with the
fragment addresses, after DMA-mapping them. If there are more than eight
fragments, it will use the eighth descriptor as a pointer to an external
list. After mapping this external list, called OAL to a structure
containing more descriptors, it fills it with the extra fragments.
However, considering that systems with pages larger than 8KiB would have
less than 8 fragments, which was true before commit a715dea3c8, it
defined a macro for the OAL size as 0 in those cases.
Now, if a skb with more than 8 fragments (counting skb->data as one
fragment), this would start overwriting the list of addresses already
mapped and would make the driver fail to properly unmap the right
addresses on architectures with pages larger than 8KiB.
Besides that, the list of mappings was one size too small, since it must
have a mapping for the maxinum number of skb fragments plus one for
skb->data and another for the OAL. So, even on architectures with page
sizes 4KiB and 8KiB, a skb with the maximum number of fragments would
make the driver overwrite its counter for the number of mappings, which,
again, would make it fail to unmap the mapped DMA addresses.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix port identify test on 5461x PHY by driving LEDs through MDIO.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When add sources to interface failure, need to roll back the sfcount[MODE]
to before state. We need to match it corresponding.
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jun Zhao <mypopydev@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since linux 2.6.26 (commit c6aefafb7e : Add IPv6 support to TCP SYN
cookies), we can drop a SYN packet reusing a TIME_WAIT socket.
(As a matter of fact we fail to send the SYNACK answer)
As the client resends its SYN packet after a one second timeout, we
accept it, because first packet removed the TIME_WAIT socket before
being dropped.
This probably explains why nobody ever noticed or complained.
Reported-by: Jesse Young <jlyo@jlyo.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reported issues when using dev_kfree_skb() on UP systems and
systems with low numbers of cores. dev_kfree_skb_irq() will
properly save IRQ state before freeing the skb.
Tested on 3.1.1 and 3.2_rc2
Example of reproducible trace of kernel 3.1.1
------------[ cut here ]------------
WARNING: at kernel/softirq.c:159 local_bh_enable+0x32/0x79()
...
Pid: 0, comm: swapper Not tainted 3.1.1-gentoo #1
Call Trace:
[<c1022970>] warn_slowpath_common+0x65/0x7a
[<c102699e>] ? local_bh_enable+0x32/0x79
[<c1022994>] warn_slowpath_null+0xf/0x13
[<c102699e>] local_bh_enable+0x32/0x79
[<c134bfd8>] destroy_conntrack+0x7c/0x9b
[<c134890b>] nf_conntrack_destroy+0x1f/0x26
[<c132e3a6>] skb_release_head_state+0x74/0x83
[<c132e286>] __kfree_skb+0xb/0x6b
[<c132e30a>] consume_skb+0x24/0x26
[<c127c925>] b44_poll+0xaa/0x449
[<c1333ca1>] net_rx_action+0x3f/0xea
[<c1026a44>] __do_softirq+0x5f/0xd5
[<c10269e5>] ? local_bh_enable+0x79/0x79
<IRQ> [<c1026c32>] ? irq_exit+0x34/0x8d
[<c1003628>] ? do_IRQ+0x74/0x87
[<c13f5329>] ? common_interrupt+0x29/0x30
[<c1006e18>] ? default_idle+0x29/0x3e
[<c10015a7>] ? cpu_idle+0x2f/0x5d
[<c13e91c5>] ? rest_init+0x79/0x7b
[<c15c66a9>] ? start_kernel+0x297/0x29c
[<c15c60b0>] ? i386_start_kernel+0xb0/0xb7
---[ end trace 583f33bb1aa207a9 ]---
Signed-off-by: Xander Hover <LKML@hover.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Distributions are using this in their default scripts, so don't hide
them behind the advanced setting.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 72a3effaf6 ([NET]: Size listen hash tables using backlog
hint) added a bug allowing inet6_synq_hash() to return an out of bound
array index, because of u16 overflow.
Bug can happen if system admins set net.core.somaxconn &
net.ipv4.tcp_max_syn_backlog sysctls to values greater than 65536
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 4ba7d99978.
The original patch was a misguided attempt to improve performance on
some hardware that is apparently prone to spurious interrupt generation.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This reverts commit 23085d5796.
The original patch was a misguided attempt to improve performance on
some hardware that is apparently prone to spurious interrupt generation.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
when skb_shift, we want to shift paged data from skb to tgt frag area.
Original comments revert the shift order
Signed-off-by: Feng King <kinwin2008@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The "tmp" variable here is used to store the result of cpu_to_le16()
so it should be an __le16 instead of an int. We want the high bits
set and the current code works on little endian systems but not on
big endian systems.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The errcode is not updated when ip_route_newports() fails.
Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It was pointed out by "make versioncheck" that we do not need to include
version.h in drivers/net/can/sja1000/peak_pci.c
This patch removes the unneeded include.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to mask the MMC irq otherwise if we raise the mmc
interrupts that are not handled the driver loops in the
handler.
In fact, by default all mmc counters (only used for stats)
are managed in SW and registers are cleared on each READ.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"remote_list" is of type
struct dma_chunk remote_list[VETH_MAX_FRAMES_PER_MSG];
Probably a copy'n'paste error.
Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"dwrq->length" is the capped version of "essid->length".
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
By the time userspace returns with a response to
the regulatory domain request, the wiphy causing
the request might have gone away. If this is so,
reject the update but mark the request as having
been processed anyway.
Cc: Luis R. Rodriguez <lrodriguez@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
I intoduced this bug in commit a2fe816674
"mac80211: Build TX radiotap header dynamically"
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
It was flipped. See section 7.3.2.56 of the 802.11n
spec for details.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (86 commits)
ipv4: fix redirect handling
ping: dont increment ICMP_MIB_INERRORS
sky2: fix hang in napi_disable
sky2: enforce minimum ring size
bonding: Don't allow mode change via sysfs with slaves present
f_phonet: fix page offset of first received fragment
stmmac: fix pm functions avoiding sleep on spinlock
stmmac: remove spin_lock in stmmac_ioctl.
stmmac: parameters auto-tuning through HW cap reg
stmmac: fix advertising 1000Base capabilties for non GMII iface
stmmac: use mdelay on timeout of sw reset
sky2: version 1.30
sky2: used fixed RSS key
sky2: reduce default Tx ring size
sky2: rename up/down functions
sky2: pci posting issues
sky2: fix hang on shutdown (and other irq issues)
r6040: fix check against MCRO_HASHEN bit in r6040_multicast_list
MAINTAINERS: change email address for shemminger
pch_gbe: Move #include of module.h
...
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / Suspend: Fix bug in suspend statistics update
PM / Hibernate: Fix the early termination of test modes
PM / shmobile: Fix build of sh7372_pm_init() for CONFIG_PM unset
PM Sleep: Do not extend wakeup paths to devices with ignore_children set
PM / driver core: disable device's runtime PM during shutdown
PM / devfreq: correct Kconfig dependency
PM / devfreq: fix use after free in devfreq_remove_device
PM / shmobile: Avoid restoring the INTCS state during initialization
PM / devfreq: Remove compiler error after irq.h update
PM / QoS: Properly use the WARN() macro in dev_pm_qos_add_request()
PM / Clocks: Only disable enabled clocks in pm_clk_suspend()
ARM: mach-shmobile: sh7372 A3SP no_suspend_console fix
PM / shmobile: Don't skip debugging output in pd_power_up()
Prevent tracing of preempt_disable() in get_cpu_var() in
kvm_clock_read(). When CONFIG_DEBUG_PREEMPT is enabled,
preempt_disable/enable() are traced and this causes the function_graph
tracer to go into an infinite recursion. By open coding the
preempt_disable() around the get_cpu_var(), we can use the notrace
version which prevents preempt_disable/enable() from being traced and
prevents the recursion.
Based on a similar patch for Xen from Jeremy Fitzhardinge.
Tested-by: Gleb Natapov <gleb@redhat.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (exynos4_tmu) Fix Kconfig dependency
[ Merging code in-flight, just because I can. What timezone should I
use? - Linus ]
After commit 2a77c46de1
(PM / Suspend: Add statistics debugfs file for suspend to RAM)
a missing pair of braces inside the state_store() function causes even
invalid arguments to suspend to be wrongly treated as failed suspend
attempts. Fix this.
[rjw: Put the hash/subject of the buggy commit into the changelog.]
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
No longer at Citrix, still interested in Xen.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dummy, non-zero definitions for HPAGE_MASK and HPAGE_SIZE were added in
51c6f666fc ("mm: ZAP_BLOCK causes redundant work") to avoid a divide
by zero in generic kernel code.
That code has since been removed, but probably should never have been
added in the first place: we don't want HPAGE_SIZE to act like PAGE_SIZE
for code that is working with hugepages, for example, when the
dependency on CONFIG_HUGETLB_PAGE has not been fulfilled.
Because hugepage size can differ from architecture to architecture, each
is required to have their own definitions for both HPAGE_MASK and
HPAGE_SIZE. This is always done in arch/*/include/asm/page.h.
So, just remove the dummy and dangerous definitions since they are no
longer needed and reveals the correct dependencies. Tested on
architectures using the definitions with allyesconfig: x86 (even with
thp), hppa, mips, powerpc, s390, sh3, sh4, sparc, and sparc64, and with
defconfig on ia64.
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
new helper: mount_subtree()
switch create_mnt_ns() to saner calling conventions, fix double mntput() in nfs
btrfs: fix double mntput() in mount_subvol()
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
random: Fix handing of arch_get_random_long in get_random_bytes()
x86: Call stop_machine_text_poke() on all CPUs
x86, ioapic: Only print ioapic debug information for IRQs belonging to an ioapic chip
x86/mrst: Avoid reporting wrong nmi status
x86/mrst: Add support for Penwell clock calibration
x86/apic: Allow use of lapic timer early calibration result
x86/apic: Do not clear nr_irqs_gsi if no legacy irqs
x86/platform: Add a wallclock_init func to x86_platforms ops
x86/mce: Make mce_chrdev_ops 'static const'