Adapt qeth l3 to:
commit c2e21293c0
(ipv6: convert addrconf list to hlist)
converted lst_next member of inet6_ifaddr struct to a hlist.
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We never actually use iph again so this assignment can be removed.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
add routines for 8, 16 and 32-bit access like in
drivers/i2c/busses/i2c-pca-platform.c
Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
Acked-by: Wolfgang Grandegger <wg@grandegger.com>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
extend the AND mask, so that IRQF_SHARED flag remains
Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
Acked-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here is patch to manipulate packet node allocation and implicitly
how packets are DMA'd etc.
The flag NODE_ALLOC enables the function and numa_node_id();
when enabled it can also be explicitly controlled via a new
node parameter
Tested this with 10 Intel 82599 ports w. TYAN S7025 E5520 CPU's.
Was able to TX/DMA ~80 Gbit/s to Ethernet wires.
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use RCU to avoid RTNL use in dev_getfirstbyhwtype()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no point to align or pad mibs to cache lines, they are per cpu
allocated with a 8 bytes alignment anyway.
This wastes space for no gain. This patch removes __SNMP_MIB_ALIGN__
Since SNMP mibs contain "unsigned long" fields only, we can relax the
allocation alignment from "unsigned long long" to "unsigned long"
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently force a synchronize_net() in netdev_set_master()
This seems necessary only when a slave had a master and we dismantle it.
In the other case ("ifenslave bond0 ethO"), we dont need this long
delay.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Its currently hard to diagnose when ACK frames are dropped because an
application set TCP_DEFER_ACCEPT on its listening socket.
See http://bugzilla.kernel.org/show_bug.cgi?id=15507
This patch adds a SNMP value, named TCPDeferAcceptDrop
netstat -s | grep TCPDeferAcceptDrop
TCPDeferAcceptDrop: 0
This counter is incremented every time we drop a pure ACK frame received
by a socket in SYN_RECV state because its SYNACK retrans count is lower
than defer_accept value.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the type change, addresses in unicast and multicast lists wouldn't make
sense, not to mention possible different lenghts. So flush both lists here.
Note "dev_addr_discard" will be very soon replaced by "dev_mc_flush" (once
mc_list conversion will be done).
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ignore the new NETDEV_PRE_TYPE_CHANGE event in rtnetlink_event() since
there have been no changes userspace needs to be notified of.
Also add a comment to the netdev notifier event definitions to remind
people to update the exclusion list when adding new event types.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
hlist_for_each_entry(p...) will not necessarily initialize 'p'
to anything if the hlist is empty. GCC notices this and emits
a warning.
Just return true explicitly when we hit a match, and return
false is we fall out of the loop without one.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch reduces timer events while keeping accuracy by rounding
our timer and/or batching several address validations in addrconf_verify().
addrconf_verify() is called at earliest timeout among interface addresses'
timeouts, but at maximum ADDR_CHECK_FREQUENCY (120 secs).
In most cases, all of timeouts of interface addresses are long enough
(e.g. several hours or days vs 2 minutes), this timer is usually called
every ADDR_CHECK_FREQUENCY, and it is okay to be lazy.
(Note this timer could be eliminated if all code paths which modifies
variables related to timeouts call us manually, but it is another story.)
However, in other least but important cases, we try keeping accuracy.
When the real interface address timeout is coming, and the timeout
is just before the rounded timeout, we accept some error.
When a timeout has been reached, we also try batching other several
events in very near future.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The variable regen_advance is only used in the privacy case.
Move it to simplify code and eliminate ifdef's
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some minor stuff, reformat comments and add whitespace for clarity
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert to list macro's for the list of addresses per interface
in IPv6.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing hash function has a couple of issues:
* it is hardwired to 16 for IN6_ADDR_HSIZE
* limited to 256 and callers using int
* use jhash2 rather than some old BSD algorithm
No need for random seed since this is local only (based on assigned
addresses) table.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert from reader/writer lock to RCU and spinlock for addrconf
hash list.
Adds an additional helper macro for hlist_for_each_entry_continue_rcu
to handle the continue case.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using hash list macros, simplifies code and helps later RCU.
This patch includes some initialization that is not strictly necessary,
since an empty hlist node/list is all zero; and list is in BSS
and node is allocated with kzalloc.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use list macros instead of open coded linked list.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a bug that allows to lose events when reliable
event delivery mode is used, ie. if NETLINK_BROADCAST_SEND_ERROR
and NETLINK_RECV_NO_ENOBUFS socket options are set.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, ENOBUFS errors are reported to the socket via
netlink_set_err() even if NETLINK_RECV_NO_ENOBUFS is set. However,
that should not happen. This fixes this problem and it changes the
prototype of netlink_set_err() to return the number of sockets that
have set the NETLINK_RECV_NO_ENOBUFS socket option. This return
value is used in the next patch in these bugfix series.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Under NET_DMA, data transfer can grind to a halt when userland issues a
large read on a socket with a high RCVLOWAT (i.e., 512 KB for both).
This appears to be because the NET_DMA design queues up lots of memcpy
operations, but doesn't issue or wait for them (and thus free the
associated skbs) until it is time for tcp_recvmesg() to return.
The socket hangs when its TCP window goes to zero before enough data is
available to satisfy the read.
Periodically issue asynchronous memcpy operations, and free skbs for ones
that have completed, to prevent sockets from going into zero-window mode.
Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a unaligned access in nla_get_be64() that was
introduced by myself in a17c859849.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
A packet is marked as lost in case packets == 0, although nothing should be done.
This results in a too early retransmitted packet during recovery in some cases.
This small patch fixes this issue by returning immediately.
Signed-off-by: Lennart Schulte <lennart.schulte@nets.rwth-aachen.de>
Signed-off-by: Arnd Hannemann <hannemann@nets.rwth-aachen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
mfc_parent of cache entries is used to index into the vif_table and is
initialised from mfcctl->mfcc_parent. This can take values of to 2^16-1,
while the vif_table has only MAXVIFS (32) entries. The same problem
affects ip6mr.
Refuse invalid values to fix a potential out-of-bounds access. Unlike
the other validity checks, this is checked in ipmr_mfc_add() instead of
the setsockopt handler since its unused in the delete path and might be
uninitialized.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need to adjust the next rx descriptor after each packet,
so do it only once at the end of the routine.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clean up some text output formatting.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recovery from PF reset works better when you shorten up the delay
until the watchdog task executes.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The counters in the 82599 Virtual Function are not clear on read. They
accumulate to the maximum value and then roll over. They are also not
cleared when the VF executes a soft reset, so it is possible they are
non-zero when the driver loads and starts. This has all been accounted
for in the code that keeps the stats up to date but there is one case
that is not. When the PF driver is reset the counters in the VF are
all reset to zero. This adds an additional accounting overhead into
the VF driver when the PF is reset under its feet. This patch adds
additional counters that are used by the VF driver to accumulate and
save stats after a PF reset has been detected. Prior to this patch
displaying the stats in the VF after the PF has reset would show
bogus data.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As per Simon Horman's feedback set IXGBE_RSC_CB(skb)->dma to zero
after unmapping HWRSC DMA address to avoid double freeing.
Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently netdev_features_change is called before fcoe tx queues
setup is done, so this patch moves calling of netdev_features_change
after tx queues setup is done in ixgbe_init_interrupt_scheme, so
that real_num_tx_queues is updated correctly on each fcoe enable
or disable.
This allows additional fcoe queues updated correctly in vlan driver
for their correct queue selection.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds RFC5082 checks for TTL on received ICMP packets.
It adds some security against spoofed ICMP packets
disrupting GTSM protected sessions.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As the only path leading to ip6_dst_check makes an indirect call
through dst->ops, dst cannot be NULL in ip6_dst_check.
This patch removes this check in case it misleads people who
come across this code.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xfrm_dst keeps a reference to ipv4 rtable entries on each
cached bundle. The only way to renew xfrm_dst when the underlying
route has changed, is to implement dst_check for this. This is
what ipv6 side does too.
The problems started after 87c1e12b5e
("ipsec: Fix bogus bundle flowi") which fixed a bug causing xfrm_dst
to not get reused, until that all lookups always generated new
xfrm_dst with new route reference and path mtu worked. But after the
fix, the old routes started to get reused even after they were expired
causing pmtu to break (well it would occationally work if the rtable
gc had run recently and marked the route obsolete causing dst_check to
get called).
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
TX checksum offload does not work properly when transmitting
UDP packets with 0, 1 or 2 bytes of data. This patch works
around the problem by calculating checksums for these packets
in the driver.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Advanced Power Management is disabled for 82599 KX4 connections by
clearing GRC.APME bit, causing it to not wake the system from an
improper system shutdown. By default GRC.APME is enabled and software
is not supposed to clear these settings during adapter probe.
Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix 82599 link issues during driver load and unload test using multi-speed
10G & 1G fiber modules. When connected back to back sometime 82599 multispeed
fiber modules would link at 1G speed instead of 10G highest speed, due to a
race condition in autotry process involving Tx laser flapping. Move autotry
autoneg-37 tx laser flapping process from multispeed module init setup
to driver unload. This will alert the link partner to restart its
autotry process when it tries to establish the link with the link partner
Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1) Change enic driver description to "Cisco VIC Ethernet NIC Driver"
2) Fix tab space
3) Update MAINTAINERS list
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add wrapper functions vnic_dev_notify_setcmd and vnic_dev_notify_unsetcmd
for firmware notify commands.
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hardware does not honor vlan filters from the host and so the driver does
not need to advertise this.
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The timeout for hardware Tx and Rx queue disable operations is increased to
work-around an erratum for "unnamed" chipset where a DMA completion may take
upto 10ms. We have to wait atleast this long for hardware to signal that Tx
and Rx queues are quiesced.
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The last bit written to a completion descriptor by hardware is the color
bit. Driver must read all other descriptor fields only after reading the
color bit to avoid reading stale descriptor fields. There is a rmb() after
reading the color bit to avoid any compiler/cpu reordering of the reads.
The color bit is the generation bit that toggles each pass through the
completion descriptor ring.
Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When doing "ifenslave -d bond0 eth0", there is chance to get NULL
dereference in netif_receive_skb(), because dev->master suddenly becomes
NULL after we tested it.
We should use ACCESS_ONCE() to avoid this (or rcu_dereference())
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>