Commit Graph

402974 Commits

Author SHA1 Message Date
Eric Dumazet
3797d3e846 net: flow_dissector: small optimizations in IPv4 dissect
By moving code around, we avoid :

1) A reload of iph->ihl (bit field, so needs a mask)

2) A conditional test (replaced by a conditional mov on x86)
   Fast path loads iph->protocol anyway.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-08 13:30:02 -05:00
Baruch Siach
cdc4ead09d netdev: smc91x: enable for xtensa
Tested in VLAB Works Xtensa simulation.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-08 13:27:55 -05:00
David S. Miller
74ecd3d1dd Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
Here is one more pull request for the 3.13 window.  This is primarily
composed of downstream pull requests that were posted while I was
traveling during the last part of the 3.12 release.

For the mac80211 bits, Johannes says:

"I have two DFS fixes (ath9k already supports DFS) and a fix for a
pointer race."

And...

"In this round for mac80211-next I have:
 * mesh channel switch support
 * a CCM rewrite, using potential hardware offloads
 * SMPS for AP mode
 * RF-kill GPIO driver updates to make it usable as an ACPI driver
 * regulatory improvements
 * documentation fixes
 * DFS for IBSS mode
 * and a few small other fixes/improvements"

For the TI driver bits, Luca says:

"Some patches intended for 3.13.  Eliad continues upstreaming pending
patches from the internal tree."

For the iwlwifi bits, Emmanuel says:

"There are a few fixes from Johannes mostly clean up patches. We have
also a few other fixes that are relevant for the new firmware that has
not been released yet."

For the Bluetooth bits, Gustavo says:

"A last fix to the 3.12. I ended forgetting to send it before, I hope we can
still make the way to 3.12. It is a revert and it fixes an issue with bluetooth
suspend/hibernate that had many bug reports. Please pull or let me know of any
problems. Thanks!"  (Obviously, that one didn't make 3.12...)

Also...

"One more big pull request for 3.13. These are the patches we queued during
last week. Here you will find a lot of improvements to the HCI and L2CAP and
MGMT layers with the main ones being a better debugfs support and end of work
of splitting L2CAP into Core and Socket parts."

Additionally, there is one ath9k patch to enable DFS in IBSS mode for
that driver.

I appreciate your consideration for taking this extra pull request
this cycle.  Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-08 13:15:39 -05:00
John W. Linville
c1f3bb6bd3 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2013-11-08 09:03:10 -05:00
Eric Dumazet
dcd6077183 inet: fix a UFO regression
While testing virtio_net and skb_segment() changes, Hannes reported
that UFO was sending wrong frames.

It appears this was introduced by a recent commit :
8c3a897bfa ("inet: restore gso for vxlan")

The old condition to perform IP frag was :

tunnel = !!skb->encapsulation;
...
        if (!tunnel && proto == IPPROTO_UDP) {

So the new one should be :

udpfrag = !skb->encapsulation && proto == IPPROTO_UDP;
...
        if (udpfrag) {

Initialization of udpfrag must be done before call
to ops->callbacks.gso_segment(skb, features), as
skb_udp_tunnel_segment() clears skb->encapsulation

(We want udpfrag to be true for UFO, false for VXLAN)

With help from Alexei Starovoitov

Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-08 02:07:59 -05:00
David S. Miller
0b2e2d36d1 Merge branch 'pskb_put'
Mathias Krause says:

====================
move pskb_put (was: IPsec improvements)

This series moves pskb_put() to the core code, making the code
duplication in caif obsolete (patches 1 and 2).
Patch 3 fixes a few kernel-doc issues.

v2 of this series does no longer contain the skb_cow_data() patch and
therefore no performance improvements for IPsec. The change is still
under discussion, but otherwise independent from the above changes.

Please apply!

v2:
- kernel-doc fixes for pskb_put, as noticed by Ben
- dropped skb_cow_data patch as it's still discussed
- added a kernel-doc fixes patch (patch 3)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:29:06 -05:00
Mathias Krause
bc32383cd6 net: skbuff - kernel-doc fixes
Use "@" to refer to parameters in the kernel-doc description. According
to Documentation/kernel-doc-nano-HOWTO.txt "&" shall be used to refer to
structures only.

Signed-off-by: Mathias Krause <mathias.krause@secunet.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:28:59 -05:00
Mathias Krause
253c6daa34 caif: use pskb_put() instead of reimplementing its functionality
Also remove the warning for fragmented packets -- skb_cow_data() will
linearize the buffer, removing all fragments.

Signed-off-by: Mathias Krause <mathias.krause@secunet.com>
Cc: Dmitry Tarnyagin <dmitry.tarnyagin@lockless.no>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:28:59 -05:00
Mathias Krause
0c7ddf36c2 net: move pskb_put() to core code
This function has usage beside IPsec so move it to the core skbuff code.
While doing so, give it some documentation and change its return type to
'unsigned char *' to be in line with skb_put().

Signed-off-by: Mathias Krause <mathias.krause@secunet.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:28:58 -05:00
Andreas Herrmann
b5ad795e52 net: calxedaxgmac: Fix panic caused by MTU change of active interface
Changing MTU size of an xgmac network interface while it is active can
cause a panic like

  skbuff: skb_over_panic: text:c03bc62c len:1090 put:1090 head:edfb6900 data:edfb6942 tail:0xedfb6d84 end:0xedfb6bc0 dev:eth0
  ------------[ cut here ]------------
  kernel BUG at net/core/skbuff.c:126!
  Internal error: Oops - BUG: 0 [#1] SMP ARM
  Modules linked in:
  CPU: 0 PID: 762 Comm: python Tainted: G        W    3.10.0-00015-g3e33cd7 #309
  task: edcfe000 ti: ed67e000 task.ti: ed67e000
  PC is at skb_panic+0x64/0x70
  LR is at wake_up_klogd+0x5c/0x68

This happens because xgmac_change_mtu modifies dev->mtu before the
network interface is quiesced. And thus there still might be buffers
in use which have a buffer size based on the old MTU.

To fix this I moved the change of dev->mtu after the call to
xgmac_stop.

Another modification is required (in xgmac_stop) to ensure that
xgmac_xmit is really not called anymore (xgmac_tx_complete might wake
up the queue again).

I've tested the fix by switching MTU size every second between 600 and
1500 while network traffic was going on. The test box survived a test
of several hours (until I've stopped it) whereas w/o this fix above
panic occurs after several minutes (at most).

Change since v1:
- remove call to netif_stop_queue at beginning of xgmac_stop
- use netif_tx_disable instead of locking+netif_stop_queue

Signed-off-by: Andreas Herrmann <andreas.herrmann@calxeda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:25:53 -05:00
David S. Miller
e21dd863ac Merge branch 'mlx4'
Amir Vadai says:

====================
net/mlx4: Mellanox driver update 07-11-2013

This patchset contains some enhancements and bug fixes for the mlx4_* drivers.
Patchset was applied and tested against commit: "9bb8ca8 virtio-net: switch to
use XPS to choose txq"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:22:54 -05:00
Eugenia Emantayev
163561a4e2 net/mlx4_en: Datapath structures are allocated per NUMA node
For each RX/TX ring and its CQ, allocation is done on a NUMA node that
corresponds to the core that the data structure should operate on.
The assumption is that the core number is reflected by the ring index.
The affected allocations are the ring/CQ data structures,
the TX/RX info and the shared HW/SW buffer.
For TX rings, each core has rings of all UPs.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:22:48 -05:00
Eugenia Emantayev
6e7136ed77 net/mlx4_core: ICM pages are allocated on device NUMA node
This is done to optimize FW/HW access to host memory.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:22:48 -05:00
Eugenia Emantayev
41d942d56c net/mlx4_en: Datapath resources allocated dynamically
Currently all TX/RX rings and completion queues are part of the
netdev priv structure and are allocated statically. This patch
will change the priv to hold only arrays of pointers and therefore
all TX/RX rings and completetion queues will be allocated
dynamically. This is in preparation for NUMA aware allocations.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:22:48 -05:00
Rony Efraim
f0f829bf42 net/mlx4_core: Add immediate activate for VGT->VST->VGT
Allow immediate activate of VGT->VST and VST->VGT transitions, without
the need of rebinding in mlx4_master_immediate_activate_vlan_qos().

Also in struct res_qp: add qp parameters (vlan_index,fvl,vlan_cntrol..)
to the saved set, in order to restore when move to VGT.
 - Clear at mlx4_RST2INIT_QP_wrapper()
 - Save at mlx4_INIT2RTR_QP_wrapper()
 - Restore at mlx4_vf_immed_vlan_work_handler()

Update mlx4_vf_immed_vlan_work_handler() to support VGT.

Signed-off-by: Rony Efraim <ronye@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:22:47 -05:00
Jack Morgenstein
571b8b92c7 net/mlx4_core: Initialize all mailbox buffers to zero before use
To guarantee that all unused fields in all FW commands for both inboxes
and outboxes are zeroed out, initialize the mailbox buffer to all zeroes.

This is especially important for SRIOV comm-channel virtual commands
(such as QUERY_FUNC_CAP), where if new fields are added to support new
features, the driver can depend on older kernels passing zeroes in these
fields.

In addition to zeroing out the mailbox buffer at allocation time, all
(now unnecessary) calls to memset by the callers of
mlx4_alloc_cmd_mailbox() are removed.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:22:47 -05:00
Eyal Perry
75a353d476 net/mlx4_en: Add RFS support in UDP
Modify RFS code to support applying filters for incoming UDP streams.

Signed-off-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:22:47 -05:00
David S. Miller
3cdcf1334c Merge branch 'macvlan_hwaccel'
John Fastabend says:

====================
l2 hardware accelerated macvlans

This patch adds support to offload macvlan net_devices to the
hardware. With these patches packets are pushed to the macvlan
net_device directly and do not pass through the lower dev.

The patches here have made it through multiple iterations
each with a slightly different focus. First I tried to
push these as a new link type called "VMDQ". The patches
shown here,

http://comments.gmane.org/gmane.linux.network/237617

Following this implementation I renamed the link type
"VSI" and addressed various comments. Finally Neil
Horman picked up the patches and integrated the offload
into the macvlan code. Here,

http://permalink.gmane.org/gmane.linux.network/285658

The attached series is clean-up of his patches, with a
few fixes.

If folks find this series acceptable there are a few
items we can work on next. First broadcast and multicast
will use the hardware even for local traffic with this
series. It would be best (I think) to use the software
path for macvlan to macvlan traffic and save the PCIe
bus. This depends on how much you value CPU time vs
PCIE bandwidth. This will need another patch series
to flush out.

Also this series only allows for layer 2 mac forwarding
where some hardware supports more interesting forwarding
capabilities. Integrating with OVS may be useful here.

As always any comments/feedback welcome.

My basic I/O test is here but I've also done some link
testing, SRIOV/DCB with macvlans and others,

Changelog:
v2: two fixes to ixgbe when all features DCB, FCoE, SR-IOV
    are enabled with macvlans. A VMDQ_P() reference
    should have been accel->pool and do not set the offset
    of the ring index from dfwd add call. The offset is used
    by SR-IOV so clearing it can cause SR-IOV quue index's
    to go sideways. With these fixes testing macvlan's with
    SRIOV enabled was successful.
v3: addressed Neil's comments in ixgbe
    fixed error path on dfwd_add_station() in ixgbe
    fixed ixgbe to allow SRIOV and accelerated macvlans to
    coexist.
v4: Dave caught some strange indentation, fixed it here
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:11:58 -05:00
John Fastabend
2a47fa45d4 ixgbe: enable l2 forwarding acceleration for macvlans
Now that l2 acceleration ops are in place from the prior patch,
enable ixgbe to take advantage of these operations.  Allow it to
allocate queues for a macvlan so that when we transmit a frame,
we can do the switching in hardware inside the ixgbe card, rather
than in software.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:11:41 -05:00
John Fastabend
a6cc0cfa72 net: Add layer 2 hardware acceleration operations for macvlan devices
Add a operations structure that allows a network interface to export
the fact that it supports package forwarding in hardware between
physical interfaces and other mac layer devices assigned to it (such
as macvlans). This operaions structure can be used by virtual mac
devices to bypass software switching so that forwarding can be done
in hardware more efficiently.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:11:41 -05:00
Amir Vadai
1ec4864b10 net/mlx4_en: Fixed crash when port type is changed
timecounter_init() was was called only after first potential
timecounter_read().
Moved mlx4_en_init_timestamp() before mlx4_en_init_netdev()

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:11:13 -05:00
Dan Carpenter
5600090eb1 isdn: icn: NULL dereference printing error message
"card2" is NULL here so I have changed it to use "id2" instead of
"card2->interface.id".

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:11:13 -05:00
Dan Carpenter
df42153c59 net: make ndev->irq signed for error handling
There is a bug in cpsw_probe() where we do:

	ndev->irq = platform_get_irq(pdev, 0);
	if (ndev->irq < 0) {

The problem is that "ndev->irq" is unsigned so the error handling
doesn't work.  I have changed it to a regular int.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:11:13 -05:00
Dan Carpenter
78032f9b3e 6lowpan: release device on error path
We recently added a new error path and it needs a dev_put().

Fixes: 7adac1ec81 ('6lowpan: Only make 6lowpan links to IEEE802154 devices')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:11:13 -05:00
Eyal Perry
eb072c4b8d RDMA/cma: Set IBoE SL (user-priority) by egress map when using vlans
On top of commit 366cddb40 "IB/rdma_cm: TOS <=> UP mapping for IBoE", add
support for case vlan egress map is used.

When the IBoE session is being set over a vlan, inherit the socket priority
to vlan priority mapping which was configured for the vlan device egress map.

Signed-off-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:09:44 -05:00
Eyal Perry
d324353919 net/vlan: Provide read access to the vlan egress map
Provide a method for read-only access to the vlan device egress mapping.

Do this by refactoring vlan_dev_get_egress_qos_mask() such that now it
receives as an argument the skb priority instead of pointer to the skb.

Such an access is needed for the IBoE stack where the control plane
goes through the network stack. This is an add-on step on top of commit
d4a968658c "net/route: export symbol ip_tos2prio" which allowed the RDMA-CM
to use ip_tos2prio.

Signed-off-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:09:44 -05:00
Ivan Vecera
85aec73d59 tg3: avoid double-freeing of rx data memory
If build_skb fails the memory associated with the ring buffer is freed but
the ri->data member is not zeroed in this case. This causes a double-free
of this memory in tg3_free_rings->... path. The patch moves this block after
setting ri->data to NULL.
It would be nice to fix this bug also in stable >= v3.4 trees.

Cc: Nithin Nayak Sujir <nsujir@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:09:44 -05:00
Eilon Greenstein
28fb965524 MAINTAINERS: Update bnx2x maintainer
Ariel Elior will take over the bnx2x maintenance.

It's been a pleasure!

Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Acked-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:09:44 -05:00
Andrey Vagin
98bbc06aab net: x86: bpf: don't forget to free sk_filter (v2)
sk_filter isn't freed if bpf_func is equal to sk_run_filter.

This memory leak was introduced by v3.12-rc3-224-gd45ed4a4
"net: fix unsafe set_memory_rw from softirq".

Before this patch sk_filter was freed in sk_filter_release_rcu,
now it should be freed in bpf_jit_free.

Here is output of kmemleak:
unreferenced object 0xffff8800b774eab0 (size 128):
  comm "systemd", pid 1, jiffies 4294669014 (age 124.062s)
  hex dump (first 32 bytes):
    00 00 00 00 0b 00 00 00 20 63 7f b7 00 88 ff ff  ........ c......
    60 d4 55 81 ff ff ff ff 30 d9 55 81 ff ff ff ff  `.U.....0.U.....
  backtrace:
    [<ffffffff816444be>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff811845af>] __kmalloc+0xef/0x260
    [<ffffffff81534028>] sock_kmalloc+0x38/0x60
    [<ffffffff8155d4dd>] sk_attach_filter+0x5d/0x190
    [<ffffffff815378a1>] sock_setsockopt+0x991/0x9e0
    [<ffffffff81531bd6>] SyS_setsockopt+0xb6/0xd0
    [<ffffffff8165f3e9>] system_call_fastpath+0x16/0x1b
    [<ffffffffffffffff>] 0xffffffffffffffff

v2: add extra { } after else

Fixes: d45ed4a4e3 ("net: fix unsafe set_memory_rw from softirq")
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 19:06:52 -05:00
David S. Miller
95ed40196f Merge branch 'tipc_fragmentation'
Erik Hugne says:

====================
tipc: message reassembly using fragment chain

We introduce a new reassembly algorithm that improves performance
and eliminates the risk of causing out-of-memory situations.

v3: -Use skb_try_coalesce, and revert to fraglist if this does not succeed.
    -Make sure reassembly list head is uncloned.

v2: -Rebased on Ying's indentation fix.
    -Node unlock call in msg_fragmenter case moved from patch #2 to #1.
     ('continue' with this lock held would cause spinlock recursion if only
      patch #1 is used)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 18:30:35 -05:00
Erik Hugne
a715b49e79 tipc: reassembly failures should cause link reset
If appending a received fragment to the pending fragment chain
in a unicast link fails, the current code tries to force a retransmission
of the fragment by decrementing the 'next received sequence number'
field in the link. This is done under the assumption that the failure
is caused by an out-of-memory situation, an assumption that does
not hold true after the previous patch in this series.

A failure to append a fragment can now only be caused by a protocol
violation by the sending peer, and it must hence be assumed that it
is either malicious or buggy.  Either way, the correct behavior is now
to reset the link instead of trying to revert its sequence number.
So, this is what we do in this commit.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 18:30:11 -05:00
Erik Hugne
40ba3cdf54 tipc: message reassembly using fragment chain
When the first fragment of a long data data message is received on a link, a
reassembly buffer large enough to hold the data from this and all subsequent
fragments of the message is allocated. The payload of each new fragment is
copied into this buffer upon arrival. When the last fragment is received, the
reassembled message is delivered upwards to the port/socket layer.

Not only is this an inefficient approach, but it may also cause bursts of
reassembly failures in low memory situations. since we may fail to allocate
the necessary large buffer in the first place. Furthermore, after 100 subsequent
such failures the link will be reset, something that in reality aggravates the
situation.

To remedy this problem, this patch introduces a different approach. Instead of
allocating a big reassembly buffer, we now append the arriving fragments
to a reassembly chain on the link, and deliver the whole chain up to the
socket layer once the last fragment has been received. This is safe because
the retransmission layer of a TIPC link always delivers packets in strict
uninterrupted order, to the reassembly layer as to all other upper layers.
Hence there can never be more than one fragment chain pending reassembly at
any given time in a link, and we can trust (but still verify) that the
fragments will be chained up in the correct order.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 18:30:11 -05:00
Erik Hugne
528f6f4bf3 tipc: don't reroute message fragments
When a message fragment is received in a broadcast or unicast link,
the reception code will append the fragment payload to a big reassembly
buffer through a call to the function tipc_recv_fragm(). However, after
the return of that call, the logics goes on and passes the fragment
buffer to the function tipc_net_route_msg(), which will simply drop it.
This behavior is a remnant from the now obsolete multi-cluster
functionality, and has no relevance in the current code base.

Although currently harmless, this unnecessary call would be fatal
after applying the next patch in this series, which introduces
a completely new reassembly algorithm. So we change the code to
eliminate the redundant call.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 18:30:11 -05:00
Simon Wunderlich
3c57e865cf ath9k: enable DFS for IBSS mode
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-11-07 16:28:51 -05:00
Jonas Jensen
b0db7b0c21 phy: Add MOXA MDIO driver
The MOXA UC-711X hardware(s) has an ethernet controller that seem
to be developed internally. The IC used is "RTL8201CP".

This patch adds an MDIO driver which handles the MII bus.

Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 15:37:09 -05:00
Nikolay Aleksandrov
12465fb833 bonding: document the new packets_per_slave option
Add new documentation for the packets_per_slave option available
for balance-rr mode.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 15:10:21 -05:00
Nikolay Aleksandrov
73958329ea bonding: extend round-robin mode with packets_per_slave
This patch aims to extend round-robin mode with a new option called
packets_per_slave which can have the following values and effects:
0 - choose a random slave
1 (default) - standard round-robin, 1 packet per slave
 >1 - round-robin when >1 packets have been transmitted per slave
The allowed values are between 0 and 65535.
This patch also fixes the comment style in bond_xmit_roundrobin().

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 15:10:21 -05:00
Ursula Braun
6fb392b1a6 qeth: avoid buffer overflow in snmp ioctl
Check user-defined length in snmp ioctl request and allow request
only if it fits into a qeth command buffer.

Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reviewed-by: Heiko Carstens <heicars2@linux.vnet.ibm.com>
Reported-by: Nico Golde <nico@ngolde.de>
Reported-by: Fabian Yamaguchi <fabs@goesec.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 03:01:59 -05:00
Duan Jiong
17102f8be4 net:drivers/net: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO
This patch fixes coccinelle error regarding usage of IS_ERR and
PTR_ERR instead of PTR_ERR_OR_ZERO.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 03:01:59 -05:00
Duan Jiong
c1fcbaa57a smsc: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO
This patch fixes coccinelle error regarding usage of IS_ERR and
PTR_ERR instead of PTR_ERR_OR_ZERO.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 03:01:59 -05:00
Joe Perches
f9bddcdf1f udp: Remove unnecessary semicolon from do{}while (0) macro
Just an unnecessary semicolon that should be removed...

Whitespace neatening of macro too.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 02:14:33 -05:00
Joe Perches
acec6d75ac smsc9420: Use netif_<level>
Use a more standard logging style.

Convert smsc_<level> macros to use netif_<level>.
Remove unused #define PFX
Add pr_fmt and neaten pr_<level> uses.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 02:14:32 -05:00
Joe Perches
f5ba0b0eda jme: Remove unused #define PFX
It's unused, remove it.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-07 02:14:32 -05:00
Jason Wang
9bb8ca8607 virtio-net: switch to use XPS to choose txq
We used to use a percpu structure vq_index to record the cpu to queue
mapping, this is suboptimal since it duplicates the work of XPS and
loses all other XPS functionality such as allowing user to configure
their own transmission steering strategy.

So this patch switches to use XPS and suggest a default mapping when
the number of cpus is equal to the number of queues. With XPS support,
there's no need for keeping per-cpu vq_index and .ndo_select_queue(),
so they were removed also.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-05 22:20:29 -05:00
Duan Jiong
249a3630c4 ipv6: drop the judgement in rt6_alloc_cow()
Now rt6_alloc_cow() is only called by ip6_pol_route() when
rt->rt6i_flags doesn't contain both RTF_NONEXTHOP and RTF_GATEWAY,
and rt->rt6i_flags hasn't been changed in ip6_rt_copy().
So there is no neccessary to judge whether rt->rt6i_flags contains
RTF_GATEWAY or not.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-05 22:17:05 -05:00
Hannes Frederic Sowa
0e033e04c2 ipv6: fix headroom calculation in udp6_ufo_fragment
Commit 1e2bd517c1 ("udp6: Fix udp
fragmentation for tunnel traffic.") changed the calculation if
there is enough space to include a fragment header in the skb from a
skb->mac_header dervived one to skb_headroom. Because we already peeled
off the skb to transport_header this is wrong. Change this back to check
if we have enough room before the mac_header.

This fixes a panic Saran Neti reported. He used the tbf scheduler which
skb_gso_segments the skb. The offsets get negative and we panic in memcpy
because the skb was erroneously not expanded at the head.

Reported-by: Saran Neti <Saran.Neti@telus.com>
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-05 22:09:53 -05:00
Jason Gunthorpe
1cce16d37d net: mv643xx_eth: Add missing phy_addr_set in DT mode
Commit cc9d4598 'net: mv643xx_eth: use of_phy_connect if phy_node
present' made the call to phy_scan optional, if the DT has a link to
the phy node.

However phy_scan has the side effect of calling phy_addr_set, which
writes the phy MDIO address to the ethernet controller. If phy_addr_set
is not called, and the bootloader has not set the correct address then
the driver will fail to function.

Tested on Kirkwood.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-05 22:07:03 -05:00
Hannes Frederic Sowa
482fc6094a ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE
Sockets marked with IP_PMTUDISC_INTERFACE won't do path mtu discovery,
their sockets won't accept and install new path mtu information and they
will always use the interface mtu for outgoing packets. It is guaranteed
that the packet is not fragmented locally. But we won't set the DF-Flag
on the outgoing frames.

Florian Weimer had the idea to use this flag to ensure DNS servers are
never generating outgoing fragments. They may well be fragmented on the
path, but the server never stores or usees path mtu values, which could
well be forged in an attack.

(The root of the problem with path MTU discovery is that there is
no reliable way to authenticate ICMP Fragmentation Needed But DF Set
messages because they are sent from intermediate routers with their
source addresses, and the IMCP payload will not always contain sufficient
information to identify a flow.)

Recent research in the DNS community showed that it is possible to
implement an attack where DNS cache poisoning is feasible by spoofing
fragments. This work was done by Amir Herzberg and Haya Shulman:
<https://sites.google.com/site/hayashulman/files/fragmentation-poisoning.pdf>

This issue was previously discussed among the DNS community, e.g.
<http://www.ietf.org/mail-archive/web/dnsext/current/msg01204.html>,
without leading to fixes.

This patch depends on the patch "ipv4: fix DO and PROBE pmtu mode
regarding local fragmentation with UFO/CORK" for the enforcement of the
non-fragmentable checks. If other users than ip_append_page/data should
use this semantic too, we have to add a new flag to IPCB(skb)->flags to
suppress local fragmentation and check for this in ip_finish_output.

Many thanks to Florian Weimer for the idea and feedback while implementing
this patch.

Cc: David S. Miller <davem@davemloft.net>
Suggested-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-05 21:52:27 -05:00
John W. Linville
6b732323c1 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth 2013-11-05 15:58:21 -05:00
John W. Linville
c046555966 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next 2013-11-05 15:53:10 -05:00