As hypervior does not have the knowledge of guest network configuration, it's
better to ask guest to send gratuitous packets when needed.
This patch implements VIRTIO_NET_F_GUEST_ANNOUNCE feature: hypervisor would
notice the guest when it thinks it's time for guest to announce the link
presnece. Guest tests VIRTIO_NET_S_ANNOUNCE bit during config change interrupt
and woule send gratuitous packets through netif_notify_peers() and ack the
notification through ctrl vq.
We need to make sure the atomicy of read and ack in guest otherwise we may ack
more times than being notified. This is done through handling the whole config
change interrupt in an non-reentrant workqueue.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch replaces the code for getting an number from a
userspace buffer by a simple call to kstroul_from_user.
This makes it easier to read and less error prone.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_enter_quickack_mode() already calls tcp_incr_quickack() and sets
icsk->icsk_ack.ato to TCP_ATO_MIN. This patch removes the duplication.
Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Reviewed-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We must try harder to get unique (addr, port) pairs when
doing port autoselection for sockets with SO_REUSEADDR
option set.
We achieve this by adding a relaxation parameter to
inet_csk_bind_conflict. When 'relax' parameter is off
we return a conflict whenever the current searched
pair (addr, port) is not unique.
This tries to address the problems reported in patch:
8d238b25b1
Revert "tcp: bind() fix when many ports are bound"
Tests where ran for creating and binding(0) many sockets
on 100 IPs. The results are, on average:
* 60000 sockets, 600 ports / IP:
* 0.210 s, 620 (IP, port) duplicates without patch
* 0.219 s, no duplicates with patch
* 100000 sockets, 1000 ports / IP:
* 0.371 s, 1720 duplicates without patch
* 0.373 s, no duplicates with patch
* 200000 sockets, 2000 ports / IP:
* 0.766 s, 6900 duplicates without patch
* 0.768 s, no duplicates with patch
* 500000 sockets, 5000 ports / IP:
* 2.227 s, 41500 duplicates without patch
* 2.284 s, no duplicates with patch
Signed-off-by: Alex Copot <alex.mihai.c@gmail.com>
Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two struct request_sock_ops providers, tcp and dccp.
inet_csk_reqsk_queue_prune() can avoid testing syn_ack_timeout being
NULL if we make it non NULL like syn_ack_timeout
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Cc: dccp@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Updates some comments to track RFC6298
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch converts the drivers in drivers/net/wan/* to use
module_pci_driver() macro which makes the code smaller and a bit simpler.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch converts the drivers in drivers/net/tokenring/* to use
module_pci_driver() macro which makes the code smaller and a bit simpler.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert the per-cpu statistics kept for GRE, IPIP, and SIT tunnels
to use 64 bit statistics.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for I2C clock stretching which is required per
SFF-8636. Customers with passive DA cables implement clock stretching
would fail without this patch.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Replace occurrences of 'if (<bool expr> == <1|0>)' with
'if ([!]<bool expr>)'
Replace occurrences of '<bool var> = (<non-bool expr>) ? true : false'
with '<bool var> = <non-bool expr>'.
Replace occurrence of '<bool var> = <non-bool expr>' with
'<bool var> = !!<non-bool expr>'
While the latter replacement is not really necessary, it is done here for
consistency and clarity. No functional changes.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Now that split strings generate checkpatch warnings (per Chapter 2 of
Documentation/CodingStyle to make it easier to grep the code for the
string) cleanup the remaining instances of them in the driver.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch enables software (and phy device) transmit time stamping.
Tested on an old PIII laptop with built in NIC.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
net/core/rtnetlink.c: In function ‘rtnl_create_link’:
net/core/rtnetlink.c:1645:3: warning: passing argument 2 of ‘ops->get_tx_queues’ from incompatible pointer type [enabled by default]
net/core/rtnetlink.c:1645:3: note: expected ‘const struct nlattr **’ but argument is of type ‘struct nlattr **’
Signed-off-by: David S. Miller <davem@davemloft.net>
nokia.com MX does not cope well with kernel.org.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
neigh_table_init_no_netlink() is only used in net/core/neighbour.c file.
Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most machines dont use UDP encapsulation (L2TP)
Adds a static_key so that udp_queue_rcv_skb() doesnt have to perform a
test if L2TP never setup the encap_rcv on a socket.
Idea of this patch came after Simon Horman proposal to add a hook on TCP
as well.
If static_key is not yet enabled, the fast path does a single JMP .
When static_key is enabled, JMP destination is patched to reach the real
encap_type/encap_rcv logic, possibly adding cache misses.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: dev@openvswitch.org
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes possible null dereference in probe() function: when both
.mac_addr and .link_gpio are unknown, dev.platform_data may be NULL
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mike Sinkovsky <msink@permonline.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix spelling and references in rtnetlink.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change get_tx_queues, drop unsused arg/return value real_tx_queues,
and use return by value (with error) rather than call by reference.
Probably bonding should just change to LLTX and the whole get_tx_queues
API could disappear!
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In ancient times it was necessary to manually initialize the bus field of an
spi_driver to spi_bus_type. These days this is done in spi_driver_register() so
we can drop the manual assignment.
The patch was generated using the following coccinelle semantic patch:
// <smpl>
@@
identifier _driver;
@@
struct spi_driver _driver = {
.driver = {
- .bus = &spi_bus_type,
},
};
// </smpl>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Gabor Juhos <juhosg@openwrt.org>
Cc: Frederic Lambert <frdrc66@gmail.com>
Cc: netdev@vger.kernel.org
Acked-by: Gabor Juhos <juhosg@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The comparison operators were backwards in both garp_attr_lookup and
garp_attr_create, so the entire GID rbtree was in reverse order.
(There was no practical side effect to this though, except that PDUs
were sent with attributes listed in reverse order, which is still
valid by the protocol. This change is only for clarity.)
Signed-off-by: David Ward <david.ward@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
We discovered that PPPoATM has an excessively deep transmit queue. A
queue the size of the default socket send buffer (wmem_default) is
maintained between the PPP generic core and the ATM device.
Fix it to queue a maximum of *two* packets. The one the ATM device is
currently working on, and one more for the ATM driver to process
immediately in its TX done interrupt handler. The PPP core is designed
to feed packets to the channel with minimal latency, so that really
ought to be enough to keep the ATM device busy.
While we're at it, fix the fact that we were triggering the wakeup
tasklet on *every* pppoatm_pop() call. The comment saying "this is
inefficient, but doing it right is too hard" turns out to be overly
pessimistic... I think :)
On machines like the Traverse Geos, with a slow Geode CPU and two
high-speed ADSL2+ interfaces, there were reports of extremely high CPU
usage which could partly be attributed to the extra wakeups.
(The wakeup handling could actually be made a whole lot easier if we
stop checking sk->sk_sndbuf altogether. Given that we now only queue
*two* packets ever, one wonders what the point is. As it is, you could
already deadlock the thing by setting the sk_sndbuf to a value lower
than the MTU of the device, and it'd just block for ever.)
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do the initialization of the HSI interface when the
interface is opened, instead of upon registration.
When the interface is closed the HSI interface is
de-initialized, allowing other modules to use the
HSI interface.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CAIF HSI is currently a virtual device. Stopping/starting the
queues is wrong on a virtual device.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement aggregation algorithm, combining more data into a single
HSI transfer. 4 different traffic categories are supported:
1. TC_PRIO_CONTROL .. TC_PRIO_MAX (CTL)
2. TC_PRIO_INTERACTIVE (VO)
3. TC_PRIO_INTERACTIVE_BULK (VI)
4. TC_PRIO_BESTEFFORT, TC_PRIO_BULK, TC_PRIO_FILLER (BEBK)
Signed-off-by: Dmitry Tarnyagin <dmitry.tarnyagin@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set traffic class for CAIF packets, based on socket
priority, CAIF protocol type, or type of message.
Traffic class mapping for different packet types:
- control: TC_PRIO_CONTROL;
- flow control: TC_PRIO_CONTROL;
- at: TC_PRIO_CONTROL;
- rfm: TC_PRIO_INTERACTIVE_BULK;
- other sockets: equals to socket's TC;
- network data: no change.
Signed-off-by: Dmitry Tarnyagin <dmitry.tarnyagin@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull in the 'net' tree to get CAIF bug fixes upon which
the following set of CAIF feature patches depend.
Signed-off-by: David S. Miller <davem@davemloft.net>
Use dev_dbg instead of dev_err for reporting in cfhsi_wakeup_cb.
Signed-off-by: Kim Lilliestierna <kim.xx.lilliestierna@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix memory leak of RX flip-buffer.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added kfree_skb() calls in the chnk_net.c file on
the error paths.
Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Applications using L2TP/IP sockets want to be able to bind() an L2TP/IP
socket to set the local tunnel id while leaving the auto-assigned source
address alone. So if no source address is supplied, don't overwrite
the address already stored in the socket.
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The l2tp_ip socket close handler does not update the module refcount
correctly which prevents module unload after the first bind() call on
an L2TPv3 IP encapulation socket.
Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 2e57b79cce misplaced its
parenthesis and now tx_fifo_errors will only be incremented if an
ENOMEM error is not written to the syslog.
Correct the parenthesis and indentation to the original goal of
counting all non ENOMEM errors and ratelimiting only the messages.
Signed-of-by: Torsten Kaiser <just.for.lkml@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At the point of this error-handling code, alloc_skb has succeded, so free
the resulting skb by jumping to the err label.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recently an oops was reported in phonet if there was a failure during
network namespace creation.
[ 163.733755] ------------[ cut here ]------------
[ 163.734501] kernel BUG at include/net/netns/generic.h:45!
[ 163.734501] invalid opcode: 0000 [#1] PREEMPT SMP
[ 163.734501] CPU 2
[ 163.734501] Pid: 19145, comm: trinity Tainted: G W 3.4.0-rc1-next-20120405-sasha-dirty #57
[ 163.734501] RIP: 0010:[<ffffffff824d6062>] [<ffffffff824d6062>] phonet_pernet+0x182/0x1a0
[ 163.734501] RSP: 0018:ffff8800674d5ca8 EFLAGS: 00010246
[ 163.734501] RAX: 000000003fffffff RBX: 0000000000000000 RCX: ffff8800678c88d8
[ 163.734501] RDX: 00000000003f4000 RSI: ffff8800678c8910 RDI: 0000000000000282
[ 163.734501] RBP: ffff8800674d5cc8 R08: 0000000000000000 R09: 0000000000000000
[ 163.734501] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880068bec920
[ 163.734501] R13: ffffffff836b90c0 R14: 0000000000000000 R15: 0000000000000000
[ 163.734501] FS: 00007f055e8de700(0000) GS:ffff88007d000000(0000) knlGS:0000000000000000
[ 163.734501] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 163.734501] CR2: 00007f055e6bb518 CR3: 0000000070c16000 CR4: 00000000000406e0
[ 163.734501] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 163.734501] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 163.734501] Process trinity (pid: 19145, threadinfo ffff8800674d4000, task ffff8800678c8000)
[ 163.734501] Stack:
[ 163.734501] ffffffff824d5f00 ffffffff810e2ec1 ffff880067ae0000 00000000ffffffd4
[ 163.734501] ffff8800674d5cf8 ffffffff824d667a ffff880067ae0000 00000000ffffffd4
[ 163.734501] ffffffff836b90c0 0000000000000000 ffff8800674d5d18 ffffffff824d707d
[ 163.734501] Call Trace:
[ 163.734501] [<ffffffff824d5f00>] ? phonet_pernet+0x20/0x1a0
[ 163.734501] [<ffffffff810e2ec1>] ? get_parent_ip+0x11/0x50
[ 163.734501] [<ffffffff824d667a>] phonet_device_destroy+0x1a/0x100
[ 163.734501] [<ffffffff824d707d>] phonet_device_notify+0x3d/0x50
[ 163.734501] [<ffffffff810dd96e>] notifier_call_chain+0xee/0x130
[ 163.734501] [<ffffffff810dd9d1>] raw_notifier_call_chain+0x11/0x20
[ 163.734501] [<ffffffff821cce12>] call_netdevice_notifiers+0x52/0x60
[ 163.734501] [<ffffffff821cd235>] rollback_registered_many+0x185/0x270
[ 163.734501] [<ffffffff821cd334>] unregister_netdevice_many+0x14/0x60
[ 163.734501] [<ffffffff823123e3>] ipip_exit_net+0x1b3/0x1d0
[ 163.734501] [<ffffffff82312230>] ? ipip_rcv+0x420/0x420
[ 163.734501] [<ffffffff821c8515>] ops_exit_list+0x35/0x70
[ 163.734501] [<ffffffff821c911b>] setup_net+0xab/0xe0
[ 163.734501] [<ffffffff821c9416>] copy_net_ns+0x76/0x100
[ 163.734501] [<ffffffff810dc92b>] create_new_namespaces+0xfb/0x190
[ 163.734501] [<ffffffff810dca21>] unshare_nsproxy_namespaces+0x61/0x80
[ 163.734501] [<ffffffff810afd1f>] sys_unshare+0xff/0x290
[ 163.734501] [<ffffffff8187622e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 163.734501] [<ffffffff82665539>] system_call_fastpath+0x16/0x1b
[ 163.734501] Code: e0 c3 fe 66 0f 1f 44 00 00 48 c7 c2 40 60 4d 82 be 01 00 00 00 48 c7 c7 80 d1 23 83 e8 48 2a c4 fe e8 73 06 c8 fe 48 85 db 75 0e <0f> 0b 0f 1f 40 00 eb fe 66 0f 1f 44 00 00 48 83 c4 10 48 89 d8
[ 163.734501] RIP [<ffffffff824d6062>] phonet_pernet+0x182/0x1a0
[ 163.734501] RSP <ffff8800674d5ca8>
[ 163.861289] ---[ end trace fb5615826c548066 ]---
After investigation it turns out there were two issues.
1) Phonet was not implementing network devices but was using register_pernet_device
instead of register_pernet_subsys.
This was allowing there to be cases when phonenet was not initialized and
the phonet net_generic was not set for a network namespace when network
device events were being reported on the netdevice_notifier for a network
namespace leading to the oops above.
2) phonet_exit_net was implementing a confusing and special case of handling all
network devices from going away that it was hard to see was correct, and would
only occur when the phonet module was removed.
Now that unregister_netdevice_notifier has been modified to synthesize unregistration
events for the network devices that are extant when called this confusing special
case in phonet_exit_net is no longer needed.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We already synthesize events in register_netdevice_notifier and synthesizing
events in unregister_netdevice_notifier allows to us remove the need for
special case cleanup code.
This change should be safe as it adds no new cases for existing callers
of unregiser_netdevice_notifier to handle.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>