In L2TPv3, we need to create/delete/modify/query L2TP tunnel and
session contexts. The number of parameters is significant. So let's
use netlink. Userspace uses this API to control L2TP tunnel/session
contexts in the kernel.
The previous pppol2tp driver was managed using [gs]etsockopt(). This
API is retained for backwards compatibility. Unlike L2TPv2 which
carries only PPP frames, L2TPv3 can carry raw ethernet frames or other
frame types and these do not always have an associated socket
family. Therefore, we need a way to use L2TP sessions that doesn't
require a socket type for each supported frame type. Hence netlink is
used.
Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This lets kernel modules which use genl netlink APIs serialize netlink
processing.
Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a new L2TPIP socket family and modifies the core to
handle the case where there is no UDP header in the L2TP
packet. L2TP/IP uses IP protocol 115. Since L2TP/UDP and L2TP/IP
packets differ in layout, the datapath packet handling code needs
changes too. Userspace uses an L2TPIP socket instead of a UDP socket
when IP encapsulation is required.
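As a rough illustration of the userspace side, the sketch below opens such a socket. It assumes the L2TP/IP family is exposed as a datagram socket on AF_INET with IP protocol number 115; demo_open_l2tpip() is a hypothetical helper name, not part of this patch.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Older libc headers may not define this; 115 is the IP protocol for L2TP. */
#ifndef IPPROTO_L2TP
#define IPPROTO_L2TP 115
#endif

/*
 * Hypothetical helper: open one L2TP/IP socket for a tunnel.
 * Assumes a SOCK_DGRAM socket on AF_INET. Returns the fd, or -1 on error
 * (for example when the kernel has no L2TP/IP support loaded).
 */
static int demo_open_l2tpip(void)
{
	return socket(AF_INET, SOCK_DGRAM, IPPROTO_L2TP);
}
```

The caller would then bind and connect the socket as it does for a UDP tunnel socket; those steps are omitted here.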
We can't use raw sockets for this because the semantics of raw sockets
don't lend themselves to the socket-per-tunnel model - we need to
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes changes to the L2TP PPP code for L2TPv3.
The existing code has some assumptions about the L2TP header which are
broken by L2TPv3. Also the sockaddr_pppol2tp structure of the original
code is too small to support the increased size of the L2TPv3 tunnel
and session id, so a new sockaddr_pppol2tpv3 structure is needed. In
the socket calls, the size of this structure is used to tell if the
operation is for L2TPv2 or L2TPv3.
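The size-based dispatch can be sketched as follows. The two structures are simplified, hypothetical stand-ins (they are not the real sockaddr_pppol2tp and sockaddr_pppol2tpv3 layouts); they only show how 16-bit versus 32-bit ids change the structure size, and how that size can select the protocol version.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical L2TPv2-style address: 16-bit tunnel and session ids. */
struct demo_addr_v2 {
	uint16_t tunnel_id, peer_tunnel_id;
	uint16_t session_id, peer_session_id;
};

/* Hypothetical L2TPv3-style address: 32-bit tunnel and session ids. */
struct demo_addr_v3 {
	uint32_t tunnel_id, peer_tunnel_id;
	uint32_t session_id, peer_session_id;
};

/* Returns 2 or 3 depending on the address length, or -1 if unknown. */
static int demo_l2tp_version(size_t addr_len)
{
	if (addr_len == sizeof(struct demo_addr_v2))
		return 2;
	if (addr_len == sizeof(struct demo_addr_v3))
		return 3;
	return -1;
}
```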
Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The L2TPv3 protocol changes the layout of the L2TP packet
header. Tunnel and session ids change from 16-bit to 32-bit values,
data sequence numbers change from 16-bit to 24-bit values and PPP-specific
fields are moved into protocol-specific subheaders.
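To make the new field widths concrete, here is a minimal parsing sketch. It assumes an L2TPv3 data packet carried directly over IP, with no cookie, followed by the default L2-specific sublayer (an S flag plus a 24-bit sequence number). The structure and function names are hypothetical and do not come from this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded view of the fields discussed above. */
struct demo_l2tpv3_hdr {
	uint32_t session_id;	/* 32-bit session id */
	int has_seq;		/* S bit of the default sublayer */
	uint32_t seq;		/* 24-bit sequence number */
};

/* Read a 32-bit big-endian (network order) value. */
static uint32_t demo_get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is too short. */
static int demo_l2tpv3_parse(const uint8_t *buf, size_t len,
			     struct demo_l2tpv3_hdr *out)
{
	uint32_t sub;

	if (len < 8)
		return -1;
	out->session_id = demo_get_be32(buf);
	sub = demo_get_be32(buf + 4);
	out->has_seq = (sub & 0x40000000u) != 0;	/* S flag */
	out->seq = sub & 0x00ffffffu;			/* low 24 bits */
	return 0;
}
```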
Although this patch introduces L2TPv3 protocol support, there are no
userspace interfaces to create L2TPv3 sessions yet.
Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When dumping L2TP PPP sessions using /proc/net/pppol2tp, get the
assigned PPP device name from PPP using ppp_dev_name().
Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ppp_dev_name() gives PPP users visibility of a ppp channel's device
name. This can be used by L2TP drivers to dump the assigned PPP
interface name.
Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch splits the pppol2tp driver into separate L2TP and PPP parts
to prepare for L2TPv3 support. In L2TPv3, protocols other than PPP can
be carried, so this split creates a common L2TP core that will handle
the common L2TP bits which protocol support modules such as PPP will
use.
Note that the existing pppol2tp module is split into l2tp_core and
l2tp_ppp by this change.
There are no feature changes here. Internally, however, there are
significant changes, mostly to handle the separation of PPP-specific
data from the L2TP session and to provide hooks in the core for
modules like PPP to access.
Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch moves the existing pppol2tp driver from drivers/net into a
new net/l2tp directory, which is where the upcoming L2TPv3 code will
live. The existing CONFIG_PPPOL2TP config option is left in its
current place to avoid "make oldconfig" issues when an existing
pppol2tp user takes this change. (This is the same approach used for
the pppoatm driver, which moved to net/atm.)
There are no code changes. The existing drivers/net/pppol2tp.c is
simply moved to net/l2tp.
Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Converts the multicast list, and the core code that manipulates it, to work
the same way as uc_list.
+uses two functions for adding/removing an mc address (normal and "global"
variant) instead of a function parameter.
+removes dev_mcast.c completely.
+exposes netdev_hw_addr_list_* macros along with __hw_addr_* functions for
manipulating lists on a sandbox (used in bonding and 80211 drivers)
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
+renames some unicast functions for consistency with the multicast ones
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cpu_to_le32 was missing and used improperly.
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Interface should be visible even if resource allocation fails.
netif_device_attach should be called for every netif_device_detach.
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add debug prints to the driver. Their verbosity can be tuned through the
ethtool msg level callback.
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Use the IDC-defined timeout value read from ROM.
o While resetting the chip, don't wait more than reset_ack_timeo seconds
for other PCI functions to respond.
Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix incorrect offset calculation and remove unnecessary remap
of the region in bar 0 to access onchip memory.
This caused debug tools to read incorrect values.
Signed-off-by: Dhananjay Phadke <dhananjay.phadke@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All QLogic converged NICs have 128-bit 128MB on card memory.
Fix the limit check from 64MB to 128MB and remove unnecessary
64-bit read/write checks.
Signed-off-by: Dhananjay Phadke <dhananjay.phadke@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check the access by tools for hardware queue engine and handle it
separately from other block registers, otherwise incorrect data
is returned.
Signed-off-by: Dhananjay Phadke <dhananjay.phadke@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In rare cases, the firmware file size is not a multiple of 8.
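One way to handle a trailing partial word is sketched below. This is an assumption about the general technique, not the driver's actual code: demo_load_fw() is a hypothetical helper that copies the image in 64-bit words and zero-pads the final word when the size is not a multiple of 8.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical loader: copy 'size' bytes of firmware into 'dst' as
 * 64-bit words. 'dst' must have room for (size + 7) / 8 words.
 * Returns the number of words written.
 */
static size_t demo_load_fw(const uint8_t *fw, size_t size, uint64_t *dst)
{
	size_t words = size / 8;
	size_t tail = size % 8;
	size_t i;

	for (i = 0; i < words; i++)
		memcpy(&dst[i], fw + i * 8, 8);

	if (tail) {
		uint64_t last = 0;	/* zero padding for the missing bytes */

		memcpy(&last, fw + words * 8, tail);
		dst[words++] = last;
	}
	return words;
}
```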
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't use the normal hotplug mechanism because it doesn't work. It will
load the module some time after the device appears, but that's not good
enough for us -- we need the driver loaded _immediately_ because otherwise
the NIC driver may just abort and then the phy 'device' goes away.
[bwh: s/phy/mdio/ in module alias, kerneldoc for struct mdio_device_id]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Followup to commit 5acbbd428d
(net: change illegal_highdma to use dma_mask)
If dev->dev.parent is NULL, we should not try to dereference it.
Don't force inline illegal_highdma() as it's pretty big now.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Adds support for enabling PCI SR-IOV in the driver, plus changes to handle initialization of PCI virtual functions.
- Adds a function handler to change the MAC address of a VF from its corresponding PF.
Signed-off-by: Sarveshwar Bandi <sarveshwarb@serverengines.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
BugLink: http://bugs.launchpad.net/bugs/457878
v2:
- remove duplicated phy_speed calculation
- fix the phy_speed calculation according to the datasheet
v1:
- removed old MII phy control code
- add phylib support
- add ethtool interface so that user space NetworkManager works
Tested on Freescale i.MX51 Babbage board.
This patch is based on a patch from Frederic Rodo <fred.rodo@gmail.com>
Cc: Frederic Rodo <fred.rodo@gmail.com>
Signed-off-by: Bryan Wu <bryan.wu@canonical.com>
Acked-by: Amit Kucheria <amit.kucheria@canonical.com>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Robert Hancock pointed out two problems about NETIF_F_HIGHDMA:
-Many drivers only set the flag when they detect they can use 64-bit DMA,
since otherwise they could receive DMA addresses that they can't handle
(which on platforms without IOMMU/SWIOTLB support is fatal). This means that if
64-bit support isn't available, even buffers located below 4GB will get copied
unnecessarily.
-Some drivers set the flag even though they can't actually handle 64-bit DMA,
which would mean that on platforms without IOMMU/SWIOTLB they would get a DMA
mapping error if the memory they received happened to be located above 4GB.
http://lkml.org/lkml/2010/3/3/530
We can use the dma_mask to decide whether we need bouncing here. Then we
can safely fix drivers that misuse NETIF_F_HIGHDMA.
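The dma_mask test can be modelled in plain C as below. This is a simplified sketch, not the kernel function: demo_needs_bounce() is a hypothetical name, physical addresses are passed in directly, and a missing mask is treated as "must bounce" by assumption.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the check. 'dma_mask' points to the highest
 * address the device can reach, or is NULL if no mask is set.
 * Returns 1 if the buffer [phys, phys + len) has to be bounced to
 * lower memory, 0 if the device can access it directly.
 * Assumes len >= 1.
 */
static int demo_needs_bounce(const uint64_t *dma_mask, uint64_t phys,
			     size_t len)
{
	if (!dma_mask)
		return 1;
	return phys + len - 1 > *dma_mask;
}
```

With a 32-bit mask, a buffer below 4GB is used in place and only a buffer that extends above 4GB is copied, which avoids both problems described above.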
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding GRO support on top of the HW LRO (TPA) support –
there is no measurable performance drawback of adding GRO
on top of it, and it allows better performance when LRO (TPA)
is turned off for virtualization or bridging.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Group all per-cpu data to one structure instead of having many
globals. Also prepare the internals so that we can have multiple
instances of the flow cache if needed.
Only the kmem_cache is left as a global as all flow caches share
the same element size, and benefit from using a common cache.
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
All of the code considers ->dead as a hint that the cached policy
needs to get refreshed. The read side can just drop the read lock
without any side effects.
The write side needs to make sure that it's written only exactly
once. Only possible race is at xfrm_policy_kill(). This is fixed
by checking result of __xfrm_policy_unlink() when needed. It will
always succeed if the policy object is looked up from the hash
list (so some checks are removed), but it needs to be checked if
we are trying to unlink policy via a reference (appropriate
checks added).
Since policy->walk.dead is written exactly once, it no longer
needs to be protected with a write lock.
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add missing check for policy direction verification. This is
especially important since without this xfrm_user may end up
deleting per-socket policy which is not allowed.
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The xfrm state genid only needs to be matched against the copy
saved in xfrm_dst. So we don't need a global genid at all. In
fact, we don't even need to initialise it.
Based on observation by Timo Teräs.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The head element of rt6_info{} is dst_entry{}, and
IPv6 specific elements follow.
Because elements at the end of dst_entry{} are frequently
updated, it is not good to put frequently-used static
elements, such as rt6i_idev, rt6i_dst or rt6i_flags in the
same cache line.
On the other hand, fib6_table, rt6i_node or rt6i_gateway are
rarely used, so it is okay to stay in the same cache line.
Let's rearrange rt6_info{}.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
keep the old behavior on SMP without rps
RPS introduces a lock operation on the per-cpu variable input_pkt_queue on
SMP whether RPS is enabled or not. On SMP without RPS, this lock isn't
needed at all.
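The idea can be sketched with lock helpers that compile to nothing when the feature is off. Everything here is a hypothetical userspace model: DEMO_RPS stands in for the kernel config option, and a counter stands in for the real spinlock operations.

```c
/* Hypothetical per-cpu queue model; lock_ops counts lock/unlock calls. */
struct demo_queue {
	int lock_ops;
	int len;
};

static inline void demo_rps_lock(struct demo_queue *q)
{
#ifdef DEMO_RPS
	q->lock_ops++;		/* stands in for taking the queue lock */
#else
	(void)q;		/* no lock needed without RPS */
#endif
}

static inline void demo_rps_unlock(struct demo_queue *q)
{
#ifdef DEMO_RPS
	q->lock_ops++;		/* stands in for releasing the queue lock */
#else
	(void)q;
#endif
}

/* Enqueue one packet; the lock calls vanish when DEMO_RPS is undefined. */
static void demo_enqueue(struct demo_queue *q)
{
	demo_rps_lock(q);
	q->len++;
	demo_rps_unlock(q);
}
```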
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
net/core/dev.c | 42 ++++++++++++++++++++++++++++--------------
1 file changed, 28 insertions(+), 14 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
In case a reset is performed, rtl8169_rx_interrupt() is called from
process context instead of softirq context. Special care must be taken
to call appropriate network core services (netif_rx() instead of
netif_receive_skb()). VLAN handling also corrected.
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Diagnosed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
One of my test machines got a deadlock during "tc" sessions,
adding/deleting classes & filters, using traffic estimators.
After some analysis, I believe we have a potential use after free case
in est_timer() :
spin_lock(e->stats_lock); << HERE >>
read_lock(&est_lock);
if (e->bstats == NULL) << TEST >>
goto skip;
The test is done a bit late: after the estimator is killed, and before the
RCU grace period has elapsed, we might already have freed/reused the memory
that e->stats_lock points to (some qdisc->q.lock).
A possible fix is to respect an RCU grace period at Qdisc dismantle time.
On 64bit, sizeof(struct Qdisc) is exactly 192 bytes. Adding 16 bytes to
it (for struct rcu_head) is a problem because it might change
performance, given QDISC_ALIGNTO is 32 bytes.
This is why I also change QDISC_ALIGNTO to 64 bytes, to satisfy most
current alignment requirements.
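The size arithmetic can be checked with the usual power-of-two round-up. demo_align_up() is a hypothetical helper used only to work through the figures quoted above; it is not claimed to be how the Qdisc allocation code is written.

```c
#include <stddef.h>

/* Round 'size' up to a multiple of 'align' (align must be a power of two). */
static size_t demo_align_up(size_t size, size_t align)
{
	return (size + align - 1) & ~(align - 1);
}
```

With these figures, 192 bytes is already a multiple of both 32 and 64. Adding 16 bytes gives 208, which rounds up to 224 with 32-byte alignment and to 256 with 64-byte alignment.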
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The check if error signaling is wanted (inet->recverr != 0) is done by
the caller: raw.c:raw_err() and udp.c:__udp4_lib_err(), so there is no
need to check this condition again.
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
acenic wrongly assumes that zero is an invalid dma address (calls
dma_unmap_page for only non zero dma addresses). Zero is a valid dma
address on some architectures. The dma length can be used here.
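The difference between the two checks can be shown with a small model. The ring entry structure and both helpers are hypothetical; they only illustrate why the recorded mapping length is a safer "is this slot mapped?" indicator than the address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring entry: what was mapped, and how many bytes. */
struct demo_ring_entry {
	uint64_t dma_addr;
	size_t dma_len;
};

/* Old-style check: wrongly treats DMA address 0 as "not mapped". */
static int demo_is_mapped_by_addr(const struct demo_ring_entry *e)
{
	return e->dma_addr != 0;
}

/* Length-based check: a mapped entry always has a non-zero length. */
static int demo_is_mapped(const struct demo_ring_entry *e)
{
	return e->dma_len != 0;
}
```

An entry mapped at address 0 with a non-zero length is skipped by the first helper, so it would never be unmapped, while the second helper handles it correctly.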
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove duplicate declaration of symbol: struct hlist_node *node was
already declared, the second declaration shadows the first one.
CC: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct _zone *tipc_zones has local scope and
should be defined with the correct scoping.
CC: Per Liden <per.liden@nospam.ericsson.com>
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
eth_type_trans(skb, netdev) does the "skb->dev = netdev;"
initialization, we can remove it from various network drivers.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chip model can now be selected directly by matching the modalias name
(instead of filling the .model field in platform_data), and allows the
module to be auto-loaded. Previous behaviour is of course still supported.
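The selection logic can be sketched as a table lookup with a fallback. The table, enum values and function name below are hypothetical and only model the behaviour described above: a matching modalias name wins, otherwise the model from platform data is used.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical chip models. */
enum demo_model {
	DEMO_MODEL_UNKNOWN = 0,
	DEMO_MODEL_2510 = 2510,
	DEMO_MODEL_2515 = 2515,
};

struct demo_id {
	const char *name;
	enum demo_model model;
};

/* Hypothetical id table, keyed by modalias name. */
static const struct demo_id demo_ids[] = {
	{ "mcp2510", DEMO_MODEL_2510 },
	{ "mcp2515", DEMO_MODEL_2515 },
};

/* Pick the model by modalias; fall back to the platform data value. */
static enum demo_model demo_pick_model(const char *modalias,
				       enum demo_model pdata_model)
{
	size_t i;

	for (i = 0; i < sizeof(demo_ids) / sizeof(demo_ids[0]); i++)
		if (strcmp(modalias, demo_ids[i].name) == 0)
			return demo_ids[i].model;
	return pdata_model;	/* previous behaviour */
}
```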
Convert the two in-tree users to this feature (icontrol & zeus).
Tested on a Zeus platform (mcp2515).
Signed-off-by: Marc Zyngier <maz@misterjones.org>
Acked-by: Christian Pellegrin <chripell@fsfe.org>
Cc: Edwin Peer <epeer@tmtservices.co.za>
Acked-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix coding style errors and warnings output while running checkpatch.pl
on the file net/core/dst.c.
Signed-off-by: chavey <chavey@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds an ethtool flag and a device feature flag to allow control
of receive hashing offload.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for per-packet timestamping for the
82580 adapter. The rx timestamp code is also pulled out of the
inlined rx hotpath and instead moved to a separate function.
This version adds a comment explaining the per-packet timestamping
code added to igb_hwtstamp_ioctl().
Signed-off-by: Nicholas Nunley <nicholasx.d.nunley@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some no longer valid IPG_DDEBUG_MSG uses are removed
Validate IPG_DDEBUG_MSG arguments when not #defined
Neaten #defines
marco/macro typo correction
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
addr_bit_test() is used in various places in IPv6 routing table
subsystem. It checks if the given fn_bit is set,
where fn_bit counts bits from MSB in words in network-order.
fn_bit : 0 .... 31 32 .... 63 64 .... 95 96 ....127
fn_bit >> 5 gives offset of word, and (~fn_bit & 0x1f) gives
count from LSB in the network-endian word in question.
fn_bit >> 5 : 0 1 2 3
~fn_bit & 0x1f: 31 .... 0 31 .... 0 31 .... 0 31 .... 0
Thus, the mask was generated as htonl(1 << (~fn_bit & 0x1f)).
This can be optimized by "swizzle" (See include/asm-generic/bitops/le.h).
In little-endian,
htonl(1 << bit) = 1 << (bit ^ BITOP_BE32_SWIZZLE)
where
BITOP_BE32_SWIZZLE is (0x1f & ~7)
So,
htonl(1 << (~fn_bit & 0x1f)) = 1 << ((~fn_bit & 0x1f) ^ (0x1f & ~7))
= 1 << ((~fn_bit ^ ~7) & 0x1f)
= 1 << ((~fn_bit ^ BITOP_BE32_SWIZZLE) & 0x1f)
In big-endian, BITOP_BE32_SWIZZLE is equal to 0.
1 << ((~fn_bit ^ BITOP_BE32_SWIZZLE) & 0x1f)
= 1 << ((~fn_bit) & 0x1f)
= htonl(1 << (~fn_bit & 0x1f))
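The identity above can be checked exhaustively in userspace. The sketch below is only a verification aid with hypothetical names; it assumes a GCC or Clang style __BYTE_ORDER__ macro to choose the swizzle constant and falls back to the little-endian value otherwise.

```c
#include <arpa/inet.h>	/* htonl() */
#include <stdint.h>

/* Assumption: byte order is detected through compiler-provided macros. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define DEMO_BE32_SWIZZLE 0u
#else
#define DEMO_BE32_SWIZZLE (0x1fu & ~7u)
#endif

/* Mask as originally generated, with a byte swap on little-endian. */
static uint32_t demo_mask_htonl(unsigned int fn_bit)
{
	return htonl(UINT32_C(1) << (~fn_bit & 0x1f));
}

/* Mask generated with the swizzle, without any byte swap. */
static uint32_t demo_mask_swizzle(unsigned int fn_bit)
{
	return UINT32_C(1) << ((~fn_bit ^ DEMO_BE32_SWIZZLE) & 0x1f);
}

/* Returns 1 if both forms agree for every bit of a 128-bit address. */
static int demo_masks_agree(void)
{
	unsigned int fn_bit;

	for (fn_bit = 0; fn_bit < 128; fn_bit++)
		if (demo_mask_htonl(fn_bit) != demo_mask_swizzle(fn_bit))
			return 0;
	return 1;
}
```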
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>