Commit Graph

136 Commits

Author SHA1 Message Date
Gao Feng
e5dadc65f9 ppp: Fix false xmit recursion detect with two ppp devices
The global percpu variable ppp_xmit_recursion is used to detect the ppp
xmit recursion to avoid the deadlock, which is caused by one CPU tries to
lock the xmit lock twice. But it would report false recursion when one CPU
wants to send the skb from two different PPP devices, like one L2TP on the
PPPoE. It is a normal case actually.

Now use one percpu member of struct ppp instead of the gloable variable to
detect the xmit recursion of one ppp device.

Fixes: 55454a5658 ("ppp: avoid dealock on recursive xmit")
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Signed-off-by: Liu Jianying <jianying.liu@ikuai8.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18 11:20:33 -07:00
Matthias Schiffer
a8b8a889e3 net: add netlink_ext_ack argument to rtnl_link_ops.validate
Add support for extended error reporting.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26 23:13:22 -04:00
Matthias Schiffer
7a3f4a1851 net: add netlink_ext_ack argument to rtnl_link_ops.newlink
Add support for extended error reporting.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26 23:13:21 -04:00
yuan linyu
b952f4dff2 net: manual clean code which call skb_put_[data:zero]
Signed-off-by: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20 13:30:15 -04:00
Christos Gkekas
47b3e2f701 pptp: Remove unused variable in pptp_release()
Variable opt in pptp_release() is set but never used, thus needs to be
removed.

Signed-off-by: Christos Gkekas <chris.gekas@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-18 23:53:56 -04:00
Johannes Berg
d58ff35122 networking: make skb_push & __skb_push return void pointers
It seems like a historic accident that these return unsigned char *,
and in many places that means casts are required, more often than not.

Make these functions return void * and remove all the casts across
the tree, adding a (u8 *) cast only where the unsigned char pointer
was used directly, all done with the following spatch:

    @@
    expression SKB, LEN;
    typedef u8;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    @@
    - *(fn(SKB, LEN))
    + *(u8 *)fn(SKB, LEN)

    @@
    expression E, SKB, LEN;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    type T;
    @@
    - E = ((T *)(fn(SKB, LEN)))
    + E = fn(SKB, LEN)

    @@
    expression SKB, LEN;
    identifier fn = { skb_push, __skb_push, skb_push_rcsum };
    @@
    - fn(SKB, LEN)[0]
    + *(u8 *)fn(SKB, LEN)

Note that the last part there converts from push(...)[0] to the
more idiomatic *(u8 *)push(...).

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16 11:48:40 -04:00
Johannes Berg
4df864c1d9 networking: make skb_put & friends return void pointers
It seems like a historic accident that these return unsigned char *,
and in many places that means casts are required, more often than not.

Make these functions (skb_put, __skb_put and pskb_put) return void *
and remove all the casts across the tree, adding a (u8 *) cast only
where the unsigned char pointer was used directly, all done with the
following spatch:

    @@
    expression SKB, LEN;
    typedef u8;
    identifier fn = { skb_put, __skb_put };
    @@
    - *(fn(SKB, LEN))
    + *(u8 *)fn(SKB, LEN)

    @@
    expression E, SKB, LEN;
    identifier fn = { skb_put, __skb_put };
    type T;
    @@
    - E = ((T *)(fn(SKB, LEN)))
    + E = fn(SKB, LEN)

which actually doesn't cover pskb_put since there are only three
users overall.

A handful of stragglers were converted manually, notably a macro in
drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many
instances in net/bluetooth/hci_sock.c. In the former file, I also
had to fix one whitespace problem spatch introduced.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16 11:48:39 -04:00
Johannes Berg
59ae1d127a networking: introduce and use skb_put_data()
A common pattern with skb_put() is to just want to memcpy()
some data into the new space, introduce skb_put_data() for
this.

An spatch similar to the one for skb_put_zero() converts many
of the places using it:

    @@
    identifier p, p2;
    expression len, skb, data;
    type t, t2;
    @@
    (
    -p = skb_put(skb, len);
    +p = skb_put_data(skb, data, len);
    |
    -p = (t)skb_put(skb, len);
    +p = skb_put_data(skb, data, len);
    )
    (
    p2 = (t2)p;
    -memcpy(p2, data, len);
    |
    -memcpy(p, data, len);
    )

    @@
    type t, t2;
    identifier p, p2;
    expression skb, data;
    @@
    t *p;
    ...
    (
    -p = skb_put(skb, sizeof(t));
    +p = skb_put_data(skb, data, sizeof(t));
    |
    -p = (t *)skb_put(skb, sizeof(t));
    +p = skb_put_data(skb, data, sizeof(t));
    )
    (
    p2 = (t2)p;
    -memcpy(p2, data, sizeof(*p));
    |
    -memcpy(p, data, sizeof(*p));
    )

    @@
    expression skb, len, data;
    @@
    -memcpy(skb_put(skb, len), data, len);
    +skb_put_data(skb, data, len);

(again, manually post-processed to retain some comments)

Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16 11:48:37 -04:00
Joe Perches
4f5a98410d ppp: mppe: Use vsnprintf extension %phN
Using this extension reduces the object size.

$ size drivers/net/ppp/ppp_mppe.o*
   text	   data	    bss	    dec	    hex	filename
   5683	    216	      8	   5907	   1713	drivers/net/ppp/ppp_mppe.o.new
   5808	    216	      8	   6032	   1790	drivers/net/ppp/ppp_mppe.o.old

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-06 15:16:33 -04:00
Gao Feng
97fcc193f6 ppp: remove unnecessary bh disable in xmit path
Since the commit 55454a5658 ("ppp: avoid dealock on recursive xmit"),
the PPP xmit path is protected by wrapper functions which disable the
bh already. So it is unnecessary to disable the bh again in the real
xmit path.

Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Acked-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-01 11:57:36 -04:00
Ingo Molnar
174cd4b1e5 sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h>
Fix up affected files that include this signal functionality via sched.h.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:32 +01:00
stephen hemminger
bc1f44709c net: make ndo_get_stats64 a void function
The network device operation for reading statistics is only called
in one place, and it ignores the return value. Having a structure
return value is potentially confusing because some future driver could
incorrectly assume that the return value was used.

Fix all drivers with ndo_get_stats64 to have a void function.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 17:51:44 -05:00
Linus Torvalds
7c0f6ba682 Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-24 11:46:01 -08:00
Alexey Dobriyan
c7d03a00b5 netns: make struct pernet_operations::id unsigned int
Make struct pernet_operations::id unsigned.

There are 2 reasons to do so:

1)
This field is really an index into an zero based array and
thus is unsigned entity. Using negative value is out-of-bound
access by definition.

2)
On x86_64 unsigned 32-bit data which are mixed with pointers
via array indexing or offsets added or subtracted to pointers
are preffered to signed 32-bit data.

"int" being used as an array index needs to be sign-extended
to 64-bit before being used.

	void f(long *p, int i)
	{
		g(p[i]);
	}

  roughly translates to

	movsx	rsi, esi
	mov	rdi, [rsi+...]
	call 	g

MOVSX is 3 byte instruction which isn't necessary if the variable is
unsigned because x86_64 is zero extending by default.

Now, there is net_generic() function which, you guessed it right, uses
"int" as an array index:

	static inline void *net_generic(const struct net *net, int id)
	{
		...
		ptr = ng->ptr[id - 1];
		...
	}

And this function is used a lot, so those sign extensions add up.

Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
messing with code generation):

	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

Unfortunately some functions actually grow bigger.
This is a semmingly random artefact of code generation with register
allocator being used differently. gcc decides that some variable
needs to live in new r8+ registers and every access now requires REX
prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
used which is longer than [r8]

However, overall balance is in negative direction:

	add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
	function                                     old     new   delta
	nfsd4_lock                                  3886    3959     +73
	tipc_link_build_proto_msg                   1096    1140     +44
	mac80211_hwsim_new_radio                    2776    2808     +32
	tipc_mon_rcv                                1032    1058     +26
	svcauth_gss_legacy_init                     1413    1429     +16
	tipc_bcbase_select_primary                   379     392     +13
	nfsd4_exchange_id                           1247    1260     +13
	nfsd4_setclientid_confirm                    782     793     +11
		...
	put_client_renew_locked                      494     480     -14
	ip_set_sockfn_get                            730     716     -14
	geneve_sock_add                              829     813     -16
	nfsd4_sequence_done                          721     703     -18
	nlmclnt_lookup_host                          708     686     -22
	nfsd4_lockt                                 1085    1063     -22
	nfs_get_client                              1077    1050     -27
	tcf_bpf_init                                1106    1076     -30
	nfsd4_encode_fattr                          5997    5930     -67
	Total: Before=154856051, After=154854321, chg -0.00%

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18 10:59:15 -05:00
Guillaume Nault
077127705a ppp: declare PPP devices as LLTX
ppp_xmit_process() already locks the xmit path. If HARD_TX_LOCK() tries
to hold the _xmit_lock we can get lock inversion.

[  973.726130] ======================================================
[  973.727311] [ INFO: possible circular locking dependency detected ]
[  973.728546] 4.8.0-rc2 #1 Tainted: G           O
[  973.728986] -------------------------------------------------------
[  973.728986] accel-pppd/1806 is trying to acquire lock:
[  973.728986]  (&qdisc_xmit_lock_key){+.-...}, at: [<ffffffff8146f6fe>] sch_direct_xmit+0x8d/0x221
[  973.728986]
[  973.728986] but task is already holding lock:
[  973.728986]  (l2tp_sock){+.-...}, at: [<ffffffffa0202c4a>] l2tp_xmit_skb+0x1e8/0x5d7 [l2tp_core]
[  973.728986]
[  973.728986] which lock already depends on the new lock.
[  973.728986]
[  973.728986]
[  973.728986] the existing dependency chain (in reverse order) is:
[  973.728986]
-> #3 (l2tp_sock){+.-...}:
[  973.728986]        [<ffffffff810b3130>] lock_acquire+0x150/0x217
[  973.728986]        [<ffffffff815752f4>] _raw_spin_lock+0x2d/0x3c
[  973.728986]        [<ffffffffa0202c4a>] l2tp_xmit_skb+0x1e8/0x5d7 [l2tp_core]
[  973.728986]        [<ffffffffa01b2466>] pppol2tp_xmit+0x1f2/0x25e [l2tp_ppp]
[  973.728986]        [<ffffffffa0184f59>] ppp_channel_push+0xb5/0x14a [ppp_generic]
[  973.728986]        [<ffffffffa01853ed>] ppp_write+0x104/0x11c [ppp_generic]
[  973.728986]        [<ffffffff811b2ec6>] __vfs_write+0x56/0x120
[  973.728986]        [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b
[  973.728986]        [<ffffffff811b4cb2>] SyS_write+0x5e/0x96
[  973.728986]        [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8
[  973.728986]
-> #2 (&(&pch->downl)->rlock){+.-...}:
[  973.728986]        [<ffffffff810b3130>] lock_acquire+0x150/0x217
[  973.728986]        [<ffffffff81575334>] _raw_spin_lock_bh+0x31/0x40
[  973.728986]        [<ffffffffa01808e2>] ppp_push+0xa7/0x82d [ppp_generic]
[  973.728986]        [<ffffffffa0184675>] __ppp_xmit_process+0x48/0x877 [ppp_generic]
[  973.728986]        [<ffffffffa018505b>] ppp_xmit_process+0x4b/0xaf [ppp_generic]
[  973.728986]        [<ffffffffa01853f7>] ppp_write+0x10e/0x11c [ppp_generic]
[  973.728986]        [<ffffffff811b2ec6>] __vfs_write+0x56/0x120
[  973.728986]        [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b
[  973.728986]        [<ffffffff811b4cb2>] SyS_write+0x5e/0x96
[  973.728986]        [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8
[  973.728986]
-> #1 (&(&ppp->wlock)->rlock){+.-...}:
[  973.728986]        [<ffffffff810b3130>] lock_acquire+0x150/0x217
[  973.728986]        [<ffffffff81575334>] _raw_spin_lock_bh+0x31/0x40
[  973.728986]        [<ffffffffa0184654>] __ppp_xmit_process+0x27/0x877 [ppp_generic]
[  973.728986]        [<ffffffffa018505b>] ppp_xmit_process+0x4b/0xaf [ppp_generic]
[  973.728986]        [<ffffffffa01852da>] ppp_start_xmit+0x21b/0x22a [ppp_generic]
[  973.728986]        [<ffffffff8143f767>] dev_hard_start_xmit+0x1a9/0x43d
[  973.728986]        [<ffffffff8146f747>] sch_direct_xmit+0xd6/0x221
[  973.728986]        [<ffffffff814401e4>] __dev_queue_xmit+0x62a/0x912
[  973.728986]        [<ffffffff814404d7>] dev_queue_xmit+0xb/0xd
[  973.728986]        [<ffffffff81449978>] neigh_direct_output+0xc/0xe
[  973.728986]        [<ffffffff8150e62b>] ip6_finish_output2+0x5a9/0x623
[  973.728986]        [<ffffffff81512128>] ip6_output+0x15e/0x16a
[  973.728986]        [<ffffffff8153ef86>] dst_output+0x76/0x7f
[  973.728986]        [<ffffffff8153f737>] mld_sendpack+0x335/0x404
[  973.728986]        [<ffffffff81541c61>] mld_send_initial_cr.part.21+0x99/0xa2
[  973.728986]        [<ffffffff8154441d>] ipv6_mc_dad_complete+0x42/0x71
[  973.728986]        [<ffffffff8151c4bd>] addrconf_dad_completed+0x1cf/0x2ea
[  973.728986]        [<ffffffff8151e4fa>] addrconf_dad_work+0x453/0x520
[  973.728986]        [<ffffffff8107a393>] process_one_work+0x365/0x6f0
[  973.728986]        [<ffffffff8107aecd>] worker_thread+0x2de/0x421
[  973.728986]        [<ffffffff810816fb>] kthread+0x121/0x130
[  973.728986]        [<ffffffff81575dbf>] ret_from_fork+0x1f/0x40
[  973.728986]
-> #0 (&qdisc_xmit_lock_key){+.-...}:
[  973.728986]        [<ffffffff810b28d6>] __lock_acquire+0x1118/0x1483
[  973.728986]        [<ffffffff810b3130>] lock_acquire+0x150/0x217
[  973.728986]        [<ffffffff815752f4>] _raw_spin_lock+0x2d/0x3c
[  973.728986]        [<ffffffff8146f6fe>] sch_direct_xmit+0x8d/0x221
[  973.728986]        [<ffffffff814401e4>] __dev_queue_xmit+0x62a/0x912
[  973.728986]        [<ffffffff814404d7>] dev_queue_xmit+0xb/0xd
[  973.728986]        [<ffffffff81449978>] neigh_direct_output+0xc/0xe
[  973.728986]        [<ffffffff81487811>] ip_finish_output2+0x5db/0x609
[  973.728986]        [<ffffffff81489590>] ip_finish_output+0x152/0x15e
[  973.728986]        [<ffffffff8148a0d4>] ip_output+0x8c/0x96
[  973.728986]        [<ffffffff81489652>] ip_local_out+0x41/0x4a
[  973.728986]        [<ffffffff81489e7d>] ip_queue_xmit+0x5a5/0x609
[  973.728986]        [<ffffffffa0202fe4>] l2tp_xmit_skb+0x582/0x5d7 [l2tp_core]
[  973.728986]        [<ffffffffa01b2466>] pppol2tp_xmit+0x1f2/0x25e [l2tp_ppp]
[  973.728986]        [<ffffffffa0184f59>] ppp_channel_push+0xb5/0x14a [ppp_generic]
[  973.728986]        [<ffffffffa01853ed>] ppp_write+0x104/0x11c [ppp_generic]
[  973.728986]        [<ffffffff811b2ec6>] __vfs_write+0x56/0x120
[  973.728986]        [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b
[  973.728986]        [<ffffffff811b4cb2>] SyS_write+0x5e/0x96
[  973.728986]        [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8
[  973.728986]
[  973.728986] other info that might help us debug this:
[  973.728986]
[  973.728986] Chain exists of:
  &qdisc_xmit_lock_key --> &(&pch->downl)->rlock --> l2tp_sock

[  973.728986]  Possible unsafe locking scenario:
[  973.728986]
[  973.728986]        CPU0                    CPU1
[  973.728986]        ----                    ----
[  973.728986]   lock(l2tp_sock);
[  973.728986]                                lock(&(&pch->downl)->rlock);
[  973.728986]                                lock(l2tp_sock);
[  973.728986]   lock(&qdisc_xmit_lock_key);
[  973.728986]
[  973.728986]  *** DEADLOCK ***
[  973.728986]
[  973.728986] 6 locks held by accel-pppd/1806:
[  973.728986]  #0:  (&(&pch->downl)->rlock){+.-...}, at: [<ffffffffa0184efa>] ppp_channel_push+0x56/0x14a [ppp_generic]
[  973.728986]  #1:  (l2tp_sock){+.-...}, at: [<ffffffffa0202c4a>] l2tp_xmit_skb+0x1e8/0x5d7 [l2tp_core]
[  973.728986]  #2:  (rcu_read_lock){......}, at: [<ffffffff81486981>] rcu_lock_acquire+0x0/0x20
[  973.728986]  #3:  (rcu_read_lock_bh){......}, at: [<ffffffff81486981>] rcu_lock_acquire+0x0/0x20
[  973.728986]  #4:  (rcu_read_lock_bh){......}, at: [<ffffffff814340e3>] rcu_lock_acquire+0x0/0x20
[  973.728986]  #5:  (dev->qdisc_running_key ?: &qdisc_running_key#2){+.....}, at: [<ffffffff8144011e>] __dev_queue_xmit+0x564/0x912
[  973.728986]
[  973.728986] stack backtrace:
[  973.728986] CPU: 2 PID: 1806 Comm: accel-pppd Tainted: G           O    4.8.0-rc2 #1
[  973.728986] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[  973.728986]  ffff7fffffffffff ffff88003436f850 ffffffff812a20f4 ffffffff82156e30
[  973.728986]  ffffffff82156920 ffff88003436f890 ffffffff8115c759 ffff88003344ae00
[  973.728986]  ffff88003344b5c0 0000000000000002 0000000000000006 ffff88003344b5e8
[  973.728986] Call Trace:
[  973.728986]  [<ffffffff812a20f4>] dump_stack+0x67/0x90
[  973.728986]  [<ffffffff8115c759>] print_circular_bug+0x22e/0x23c
[  973.728986]  [<ffffffff810b28d6>] __lock_acquire+0x1118/0x1483
[  973.728986]  [<ffffffff810b3130>] lock_acquire+0x150/0x217
[  973.728986]  [<ffffffff810b3130>] ? lock_acquire+0x150/0x217
[  973.728986]  [<ffffffff8146f6fe>] ? sch_direct_xmit+0x8d/0x221
[  973.728986]  [<ffffffff815752f4>] _raw_spin_lock+0x2d/0x3c
[  973.728986]  [<ffffffff8146f6fe>] ? sch_direct_xmit+0x8d/0x221
[  973.728986]  [<ffffffff8146f6fe>] sch_direct_xmit+0x8d/0x221
[  973.728986]  [<ffffffff814401e4>] __dev_queue_xmit+0x62a/0x912
[  973.728986]  [<ffffffff814404d7>] dev_queue_xmit+0xb/0xd
[  973.728986]  [<ffffffff81449978>] neigh_direct_output+0xc/0xe
[  973.728986]  [<ffffffff81487811>] ip_finish_output2+0x5db/0x609
[  973.728986]  [<ffffffff81486853>] ? dst_mtu+0x29/0x2e
[  973.728986]  [<ffffffff81489590>] ip_finish_output+0x152/0x15e
[  973.728986]  [<ffffffff8148a0bc>] ? ip_output+0x74/0x96
[  973.728986]  [<ffffffff8148a0d4>] ip_output+0x8c/0x96
[  973.728986]  [<ffffffff81489652>] ip_local_out+0x41/0x4a
[  973.728986]  [<ffffffff81489e7d>] ip_queue_xmit+0x5a5/0x609
[  973.728986]  [<ffffffff814c559e>] ? udp_set_csum+0x207/0x21e
[  973.728986]  [<ffffffffa0202fe4>] l2tp_xmit_skb+0x582/0x5d7 [l2tp_core]
[  973.728986]  [<ffffffffa01b2466>] pppol2tp_xmit+0x1f2/0x25e [l2tp_ppp]
[  973.728986]  [<ffffffffa0184f59>] ppp_channel_push+0xb5/0x14a [ppp_generic]
[  973.728986]  [<ffffffffa01853ed>] ppp_write+0x104/0x11c [ppp_generic]
[  973.728986]  [<ffffffff811b2ec6>] __vfs_write+0x56/0x120
[  973.728986]  [<ffffffff8124c11d>] ? fsnotify_perm+0x27/0x95
[  973.728986]  [<ffffffff8124d41d>] ? security_file_permission+0x4d/0x54
[  973.728986]  [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b
[  973.728986]  [<ffffffff811b4cb2>] SyS_write+0x5e/0x96
[  973.728986]  [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8
[  973.728986]  [<ffffffff810ae0fa>] ? trace_hardirqs_off_caller+0x121/0x12f

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31 14:33:09 -07:00
Guillaume Nault
55454a5658 ppp: avoid dealock on recursive xmit
In case of misconfiguration, a virtual PPP channel might send packets
back to their parent PPP interface. This typically happens in
misconfigured L2TP setups, where PPP's peer IP address is set with the
IP of the L2TP peer.
When that happens the system hangs due to PPP trying to recursively
lock its xmit path.

[  243.332155] BUG: spinlock recursion on CPU#1, accel-pppd/926
[  243.333272]  lock: 0xffff880033d90f18, .magic: dead4ead, .owner: accel-pppd/926, .owner_cpu: 1
[  243.334859] CPU: 1 PID: 926 Comm: accel-pppd Not tainted 4.8.0-rc2 #1
[  243.336010] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[  243.336018]  ffff7fffffffffff ffff8800319a77a0 ffffffff8128de85 ffff880033d90f18
[  243.336018]  ffff880033ad8000 ffff8800319a77d8 ffffffff810ad7c0 ffffffff0000039e
[  243.336018]  ffff880033d90f18 ffff880033d90f60 ffff880033d90f18 ffff880033d90f28
[  243.336018] Call Trace:
[  243.336018]  [<ffffffff8128de85>] dump_stack+0x4f/0x65
[  243.336018]  [<ffffffff810ad7c0>] spin_dump+0xe1/0xeb
[  243.336018]  [<ffffffff810ad7f0>] spin_bug+0x26/0x28
[  243.336018]  [<ffffffff810ad8b9>] do_raw_spin_lock+0x5c/0x160
[  243.336018]  [<ffffffff815522aa>] _raw_spin_lock_bh+0x35/0x3c
[  243.336018]  [<ffffffffa01a88e2>] ? ppp_push+0xa7/0x82d [ppp_generic]
[  243.336018]  [<ffffffffa01a88e2>] ppp_push+0xa7/0x82d [ppp_generic]
[  243.336018]  [<ffffffff810adada>] ? do_raw_spin_unlock+0xc2/0xcc
[  243.336018]  [<ffffffff81084962>] ? preempt_count_sub+0x13/0xc7
[  243.336018]  [<ffffffff81552438>] ? _raw_spin_unlock_irqrestore+0x34/0x49
[  243.336018]  [<ffffffffa01ac657>] ppp_xmit_process+0x48/0x877 [ppp_generic]
[  243.336018]  [<ffffffff81084962>] ? preempt_count_sub+0x13/0xc7
[  243.336018]  [<ffffffff81408cd3>] ? skb_queue_tail+0x71/0x7c
[  243.336018]  [<ffffffffa01ad1c5>] ppp_start_xmit+0x21b/0x22a [ppp_generic]
[  243.336018]  [<ffffffff81426af1>] dev_hard_start_xmit+0x15e/0x32c
[  243.336018]  [<ffffffff81454ed7>] sch_direct_xmit+0xd6/0x221
[  243.336018]  [<ffffffff814273a8>] __dev_queue_xmit+0x52a/0x820
[  243.336018]  [<ffffffff814276a9>] dev_queue_xmit+0xb/0xd
[  243.336018]  [<ffffffff81430a3c>] neigh_direct_output+0xc/0xe
[  243.336018]  [<ffffffff8146b5d7>] ip_finish_output2+0x4d2/0x548
[  243.336018]  [<ffffffff8146a8e6>] ? dst_mtu+0x29/0x2e
[  243.336018]  [<ffffffff8146d49c>] ip_finish_output+0x152/0x15e
[  243.336018]  [<ffffffff8146df84>] ? ip_output+0x74/0x96
[  243.336018]  [<ffffffff8146df9c>] ip_output+0x8c/0x96
[  243.336018]  [<ffffffff8146d55e>] ip_local_out+0x41/0x4a
[  243.336018]  [<ffffffff8146dd15>] ip_queue_xmit+0x531/0x5c5
[  243.336018]  [<ffffffff814a82cd>] ? udp_set_csum+0x207/0x21e
[  243.336018]  [<ffffffffa01f2f04>] l2tp_xmit_skb+0x582/0x5d7 [l2tp_core]
[  243.336018]  [<ffffffffa01ea458>] pppol2tp_xmit+0x1eb/0x257 [l2tp_ppp]
[  243.336018]  [<ffffffffa01acf17>] ppp_channel_push+0x91/0x102 [ppp_generic]
[  243.336018]  [<ffffffffa01ad2d8>] ppp_write+0x104/0x11c [ppp_generic]
[  243.336018]  [<ffffffff811a3c1e>] __vfs_write+0x56/0x120
[  243.336018]  [<ffffffff81239801>] ? fsnotify_perm+0x27/0x95
[  243.336018]  [<ffffffff8123ab01>] ? security_file_permission+0x4d/0x54
[  243.336018]  [<ffffffff811a4ca4>] vfs_write+0xbd/0x11b
[  243.336018]  [<ffffffff811a5a0a>] SyS_write+0x5e/0x96
[  243.336018]  [<ffffffff81552a1b>] entry_SYSCALL_64_fastpath+0x13/0x94

The main entry points for sending packets over a PPP unit are the
.write() and .ndo_start_xmit() callbacks (simplified view):

.write(unit fd) or .ndo_start_xmit()
       \
        CALL ppp_xmit_process()
               \
                LOCK unit's xmit path (ppp->wlock)
                |
                CALL ppp_push()
                       \
                        LOCK channel's xmit path (chan->downl)
                        |
                        CALL lower layer's .start_xmit() callback
                               \
                                ... might recursively call .ndo_start_xmit() ...
                               /
                        RETURN from .start_xmit()
                        |
                        UNLOCK channel's xmit path
                       /
                RETURN from ppp_push()
                |
                UNLOCK unit's xmit path
               /
        RETURN from ppp_xmit_process()

Packets can also be directly sent on channels (e.g. LCP packets):

.write(channel fd) or ppp_output_wakeup()
       \
        CALL ppp_channel_push()
               \
                LOCK channel's xmit path (chan->downl)
                |
                CALL lower layer's .start_xmit() callback
                       \
                        ... might call .ndo_start_xmit() ...
                       /
                RETURN from .start_xmit()
                |
                UNLOCK channel's xmit path
               /
        RETURN from ppp_channel_push()

Key points about the lower layer's .start_xmit() callback:

  * It can be called directly by a channel fd .write() or by
    ppp_output_wakeup() or indirectly by a unit fd .write() or by
    .ndo_start_xmit().

  * In any case, it's always called with chan->downl held.

  * It might route the packet back to its parent unit using
    .ndo_start_xmit() as entry point.

This patch detects and breaks recursion in ppp_xmit_process(). This
function is a good candidate for the task because it's called early
enough after .ndo_start_xmit(), it's always part of the recursion
loop and it's on the path of whatever entry point is used to send
a packet on a PPP unit.

Recursion detection is done using the per-cpu ppp_xmit_recursion
variable.

Since ppp_channel_push() too locks the channel's xmit path and calls
the lower layer's .start_xmit() callback, we need to also increment
ppp_xmit_recursion there. However there's no need to check for
recursion, as it's out of the recursion loop.

Reported-by: Feng Gao <gfree.wind@gmail.com>
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-31 14:33:08 -07:00
Gao Feng
03459345bc pptp: Refactor the struct and macros of PPTP codes
1. Use struct gre_base_hdr directly in pptp_gre_header instead of
duplicated members;
2. Use existing macros like GRE_KEY, GRE_SEQ, and so on instead of
duplicated macros defined by PPTP;
3. Add new macros like GRE_IS_ACK/SEQ and so on instead of
PPTP_GRE_IS_A/S and so on;

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Reviewed-by: Philip Prindeville <philipp@redfish-solutions.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-15 10:55:53 -07:00
Gao Feng
ab10dccb11 rps: Inspect PPTP encapsulated by GRE to get flow hash
The PPTP is encapsulated by GRE header with that GRE_VERSION bits
must contain one. But current GRE RPS needs the GRE_VERSION must be
zero. So RPS does not work for PPTP traffic.

In my test environment, there are four MIPS cores, and all traffic
are passed through by PPTP. As a result, only one core is 100% busy
while other three cores are very idle. After this patch, the usage
of four cores are balanced well.

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Reviewed-by: Philip Prindeville <philipp@redfish-solutions.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-10 17:22:14 -07:00
Guillaume Nault
bb8082f691 ppp: build ifname using unit identifier for rtnl based devices
Userspace programs generally need to know the name of the ppp devices
they create. Both ioctl and rtnl interfaces use the ppp<suffix> sheme
to name them. But although the suffix used by the ioctl interface can
be known by userspace (it's the PPP unit identifier returned by the
PPPIOCGUNIT ioctl), the one used by the rtnl is only known by the
kernel.

This patch brings more consistency between ioctl and rtnl based ppp
devices by generating device names using the PPP unit identifer as
suffix in both cases. This way, userspace can always infer the name of
the devices they create.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-09 14:56:21 -07:00
David S. Miller
de0ba9a0d8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Just several instances of overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-24 00:53:32 -04:00
WANG Cong
205e1e255c ppp: defer netns reference release for ppp channel
Matt reported that we have a NULL pointer dereference
in ppp_pernet() from ppp_connect_channel(),
i.e. pch->chan_net is NULL.

This is due to that a parallel ppp_unregister_channel()
could happen while we are in ppp_connect_channel(), during
which pch->chan_net set to NULL. Since we need a reference
to net per channel, it makes sense to sync the refcnt
with the life time of the channel, therefore we should
release this reference when we destroy it.

Fixes: 1f461dcdd2 ("ppp: take reference on channels netns")
Reported-by: Matt Bennett <Matt.Bennett@alliedtelesis.co.nz>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linux-ppp@vger.kernel.org
Cc: Guillaume Nault <g.nault@alphalink.fr>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-08 23:46:37 -04:00
Eric Dumazet
d3fff6c443 net: add netdev_lockdep_set_classes() helper
It is time to add netdev_lockdep_set_classes() helper
so that lockdep annotations per device type are easier to manage.

This removes a lot of copies and missing annotations.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 13:28:37 -07:00
Eric Dumazet
f9eb8aea2a net_sched: transform qdisc running bit into a seqcount
Instead of using a single bit (__QDISC___STATE_RUNNING)
in sch->__state, use a seqcount.

This adds lockdep support, but more importantly it will allow us
to sample qdisc/class statistics without having to grab qdisc root lock.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-07 16:37:13 -07:00
Guillaume Nault
96d934c70d ppp: add rtnetlink device creation support
Define PPP device handler for use with rtnetlink.
The only PPP specific attribute is IFLA_PPP_DEV_FD. It is mandatory and
contains the file descriptor of the associated /dev/ppp instance (the
file descriptor which would have been used for ioctl(PPPIOCNEWUNIT) in
the ioctl-based API). The PPP device is removed when this file
descriptor is released (same behaviour as with ioctl based PPP
devices).

PPP devices created with the rtnetlink API behave like the ones created
with ioctl(PPPIOCNEWUNIT). In particular existing ioctls work the same
way, no matter how the PPP device was created.
The rtnl callbacks are also assigned to ioctl based PPP devices. This
way, rtnl messages have the same effect on any PPP devices.
The immediate effect is that all PPP devices, even ioctl-based
ones, can now be removed with "ip link del".

A minor difference still exists between ioctl and rtnl based PPP
interfaces: in the device name, the number following the "ppp" prefix
corresponds to the PPP unit number for ioctl based devices, while it is
just an unrelated incrementing index for rtnl ones.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:09:44 -04:00
Guillaume Nault
7d9f0b4874 ppp: define reusable device creation functions
Move PPP device initialisation and registration out of
ppp_create_interface().
This prepares code for device registration with rtnetlink.

While there, simplify the prototype of ppp_create_interface():

  * Since ppp_dev_configure() takes care of setting file->private_data,
    there's no need to return a ppp structure to ppp_unattached_ioctl()
    anymore.

  * The unit parameter is made read/write so that ppp_create_interface()
    can tell which unit number has been assigned.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:09:44 -04:00
Guillaume Nault
1f461dcdd2 ppp: take reference on channels netns
Let channels hold a reference on their network namespace.
Some channel types, like ppp_async and ppp_synctty, can have their
userspace controller running in a different namespace. Therefore they
can't rely on them to preclude their netns from being removed from
under them.

==================================================================
BUG: KASAN: use-after-free in ppp_unregister_channel+0x372/0x3a0 at
addr ffff880064e217e0
Read of size 8 by task syz-executor/11581
=============================================================================
BUG net_namespace (Not tainted): kasan: bad access detected
-----------------------------------------------------------------------------

Disabling lock debugging due to kernel taint
INFO: Allocated in copy_net_ns+0x6b/0x1a0 age=92569 cpu=3 pid=6906
[<      none      >] ___slab_alloc+0x4c7/0x500 kernel/mm/slub.c:2440
[<      none      >] __slab_alloc+0x4c/0x90 kernel/mm/slub.c:2469
[<     inline     >] slab_alloc_node kernel/mm/slub.c:2532
[<     inline     >] slab_alloc kernel/mm/slub.c:2574
[<      none      >] kmem_cache_alloc+0x23a/0x2b0 kernel/mm/slub.c:2579
[<     inline     >] kmem_cache_zalloc kernel/include/linux/slab.h:597
[<     inline     >] net_alloc kernel/net/core/net_namespace.c:325
[<      none      >] copy_net_ns+0x6b/0x1a0 kernel/net/core/net_namespace.c:360
[<      none      >] create_new_namespaces+0x2f6/0x610 kernel/kernel/nsproxy.c:95
[<      none      >] copy_namespaces+0x297/0x320 kernel/kernel/nsproxy.c:150
[<      none      >] copy_process.part.35+0x1bf4/0x5760 kernel/kernel/fork.c:1451
[<     inline     >] copy_process kernel/kernel/fork.c:1274
[<      none      >] _do_fork+0x1bc/0xcb0 kernel/kernel/fork.c:1723
[<     inline     >] SYSC_clone kernel/kernel/fork.c:1832
[<      none      >] SyS_clone+0x37/0x50 kernel/kernel/fork.c:1826
[<      none      >] entry_SYSCALL_64_fastpath+0x16/0x7a kernel/arch/x86/entry/entry_64.S:185

INFO: Freed in net_drop_ns+0x67/0x80 age=575 cpu=2 pid=2631
[<      none      >] __slab_free+0x1fc/0x320 kernel/mm/slub.c:2650
[<     inline     >] slab_free kernel/mm/slub.c:2805
[<      none      >] kmem_cache_free+0x2a0/0x330 kernel/mm/slub.c:2814
[<     inline     >] net_free kernel/net/core/net_namespace.c:341
[<      none      >] net_drop_ns+0x67/0x80 kernel/net/core/net_namespace.c:348
[<      none      >] cleanup_net+0x4e5/0x600 kernel/net/core/net_namespace.c:448
[<      none      >] process_one_work+0x794/0x1440 kernel/kernel/workqueue.c:2036
[<      none      >] worker_thread+0xdb/0xfc0 kernel/kernel/workqueue.c:2170
[<      none      >] kthread+0x23f/0x2d0 kernel/drivers/block/aoe/aoecmd.c:1303
[<      none      >] ret_from_fork+0x3f/0x70 kernel/arch/x86/entry/entry_64.S:468
INFO: Slab 0xffffea0001938800 objects=3 used=0 fp=0xffff880064e20000
flags=0x5fffc0000004080
INFO: Object 0xffff880064e20000 @offset=0 fp=0xffff880064e24200

CPU: 1 PID: 11581 Comm: syz-executor Tainted: G    B           4.4.0+
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
 00000000ffffffff ffff8800662c7790 ffffffff8292049d ffff88003e36a300
 ffff880064e20000 ffff880064e20000 ffff8800662c77c0 ffffffff816f2054
 ffff88003e36a300 ffffea0001938800 ffff880064e20000 0000000000000000
Call Trace:
 [<     inline     >] __dump_stack kernel/lib/dump_stack.c:15
 [<ffffffff8292049d>] dump_stack+0x6f/0xa2 kernel/lib/dump_stack.c:50
 [<ffffffff816f2054>] print_trailer+0xf4/0x150 kernel/mm/slub.c:654
 [<ffffffff816f875f>] object_err+0x2f/0x40 kernel/mm/slub.c:661
 [<     inline     >] print_address_description kernel/mm/kasan/report.c:138
 [<ffffffff816fb0c5>] kasan_report_error+0x215/0x530 kernel/mm/kasan/report.c:236
 [<     inline     >] kasan_report kernel/mm/kasan/report.c:259
 [<ffffffff816fb4de>] __asan_report_load8_noabort+0x3e/0x40 kernel/mm/kasan/report.c:280
 [<     inline     >] ? ppp_pernet kernel/include/linux/compiler.h:218
 [<ffffffff83ad71b2>] ? ppp_unregister_channel+0x372/0x3a0 kernel/drivers/net/ppp/ppp_generic.c:2392
 [<     inline     >] ppp_pernet kernel/include/linux/compiler.h:218
 [<ffffffff83ad71b2>] ppp_unregister_channel+0x372/0x3a0 kernel/drivers/net/ppp/ppp_generic.c:2392
 [<     inline     >] ? ppp_pernet kernel/drivers/net/ppp/ppp_generic.c:293
 [<ffffffff83ad6f26>] ? ppp_unregister_channel+0xe6/0x3a0 kernel/drivers/net/ppp/ppp_generic.c:2392
 [<ffffffff83ae18f3>] ppp_asynctty_close+0xa3/0x130 kernel/drivers/net/ppp/ppp_async.c:241
 [<ffffffff83ae1850>] ? async_lcp_peek+0x5b0/0x5b0 kernel/drivers/net/ppp/ppp_async.c:1000
 [<ffffffff82c33239>] tty_ldisc_close.isra.1+0x99/0xe0 kernel/drivers/tty/tty_ldisc.c:478
 [<ffffffff82c332c0>] tty_ldisc_kill+0x40/0x170 kernel/drivers/tty/tty_ldisc.c:744
 [<ffffffff82c34943>] tty_ldisc_release+0x1b3/0x260 kernel/drivers/tty/tty_ldisc.c:772
 [<ffffffff82c1ef21>] tty_release+0xac1/0x13e0 kernel/drivers/tty/tty_io.c:1901
 [<ffffffff82c1e460>] ? release_tty+0x320/0x320 kernel/drivers/tty/tty_io.c:1688
 [<ffffffff8174de36>] __fput+0x236/0x780 kernel/fs/file_table.c:208
 [<ffffffff8174e405>] ____fput+0x15/0x20 kernel/fs/file_table.c:244
 [<ffffffff813595ab>] task_work_run+0x16b/0x200 kernel/kernel/task_work.c:115
 [<     inline     >] exit_task_work kernel/include/linux/task_work.h:21
 [<ffffffff81307105>] do_exit+0x8b5/0x2c60 kernel/kernel/exit.c:750
 [<ffffffff813fdd20>] ? debug_check_no_locks_freed+0x290/0x290 kernel/kernel/locking/lockdep.c:4123
 [<ffffffff81306850>] ? mm_update_next_owner+0x6f0/0x6f0 kernel/kernel/exit.c:357
 [<ffffffff813215e6>] ? __dequeue_signal+0x136/0x470 kernel/kernel/signal.c:550
 [<ffffffff8132067b>] ? recalc_sigpending_tsk+0x13b/0x180 kernel/kernel/signal.c:145
 [<ffffffff81309628>] do_group_exit+0x108/0x330 kernel/kernel/exit.c:880
 [<ffffffff8132b9d4>] get_signal+0x5e4/0x14f0 kernel/kernel/signal.c:2307
 [<     inline     >] ? kretprobe_table_lock kernel/kernel/kprobes.c:1113
 [<ffffffff8151d355>] ? kprobe_flush_task+0xb5/0x450 kernel/kernel/kprobes.c:1158
 [<ffffffff8115f7d3>] do_signal+0x83/0x1c90 kernel/arch/x86/kernel/signal.c:712
 [<ffffffff8151d2a0>] ? recycle_rp_inst+0x310/0x310 kernel/include/linux/list.h:655
 [<ffffffff8115f750>] ? setup_sigcontext+0x780/0x780 kernel/arch/x86/kernel/signal.c:165
 [<ffffffff81380864>] ? finish_task_switch+0x424/0x5f0 kernel/kernel/sched/core.c:2692
 [<     inline     >] ? finish_lock_switch kernel/kernel/sched/sched.h:1099
 [<ffffffff81380560>] ? finish_task_switch+0x120/0x5f0 kernel/kernel/sched/core.c:2678
 [<     inline     >] ? context_switch kernel/kernel/sched/core.c:2807
 [<ffffffff85d794e9>] ? __schedule+0x919/0x1bd0 kernel/kernel/sched/core.c:3283
 [<ffffffff81003901>] exit_to_usermode_loop+0xf1/0x1a0 kernel/arch/x86/entry/common.c:247
 [<     inline     >] prepare_exit_to_usermode kernel/arch/x86/entry/common.c:282
 [<ffffffff810062ef>] syscall_return_slowpath+0x19f/0x210 kernel/arch/x86/entry/common.c:344
 [<ffffffff85d88022>] int_ret_from_sys_call+0x25/0x9f kernel/arch/x86/entry/entry_64.S:281
Memory state around the buggy address:
 ffff880064e21680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff880064e21700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff880064e21780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                       ^
 ffff880064e21800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff880064e21880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

Fixes: 273ec51dd7 ("net: ppp_generic - introduce net-namespace functionality v2")
Reported-by: Baozeng Ding <sploving1@gmail.com>
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-23 14:35:31 -04:00
Linus Torvalds
1200b6809d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
 "Highlights:

   1) Support more Realtek wireless chips, from Jes Sorenson.

   2) New BPF types for per-cpu hash and arrap maps, from Alexei
      Starovoitov.

   3) Make several TCP sysctls per-namespace, from Nikolay Borisov.

   4) Allow the use of SO_REUSEPORT in order to do per-thread processing
   of incoming TCP/UDP connections.  The muxing can be done using a
   BPF program which hashes the incoming packet.  From Craig Gallek.

   5) Add a multiplexer for TCP streams, to provide a messaged based
      interface.  BPF programs can be used to determine the message
      boundaries.  From Tom Herbert.

   6) Add 802.1AE MACSEC support, from Sabrina Dubroca.

   7) Avoid factorial complexity when taking down an inetdev interface
      with lots of configured addresses.  We were doing things like
      traversing the entire address less for each address removed, and
      flushing the entire netfilter conntrack table for every address as
      well.

   8) Add and use SKB bulk free infrastructure, from Jesper Brouer.

   9) Allow offloading u32 classifiers to hardware, and implement for
      ixgbe, from John Fastabend.

  10) Allow configuring IRQ coalescing parameters on a per-queue basis,
      from Kan Liang.

  11) Extend ethtool so that larger link mode masks can be supported.
      From David Decotigny.

  12) Introduce devlink, which can be used to configure port link types
      (ethernet vs Infiniband, etc.), port splitting, and switch device
      level attributes as a whole.  From Jiri Pirko.

  13) Hardware offload support for flower classifiers, from Amir Vadai.

  14) Add "Local Checksum Offload".  Basically, for a tunneled packet
      the checksum of the outer header is 'constant' (because with the
      checksum field filled into the inner protocol header, the payload
      of the outer frame checksums to 'zero'), and we can take advantage
      of that in various ways.  From Edward Cree"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits)
  bonding: fix bond_get_stats()
  net: bcmgenet: fix dma api length mismatch
  net/mlx4_core: Fix backward compatibility on VFs
  phy: mdio-thunder: Fix some Kconfig typos
  lan78xx: add ndo_get_stats64
  lan78xx: handle statistics counter rollover
  RDS: TCP: Remove unused constant
  RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket
  net: smc911x: convert pxa dma to dmaengine
  team: remove duplicate set of flag IFF_MULTICAST
  bonding: remove duplicate set of flag IFF_MULTICAST
  net: fix a comment typo
  ethernet: micrel: fix some error codes
  ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it
  bpf, dst: add and use dst_tclassid helper
  bpf: make skb->tc_classid also readable
  net: mvneta: bm: clarify dependencies
  cls_bpf: reset class and reuse major in da
  ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c
  ldmvsw: Add ldmvsw.c driver code
  ...
2016-03-19 10:05:34 -07:00
Linus Torvalds
70477371dc Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 "Here is the crypto update for 4.6:

  API:
   - Convert remaining crypto_hash users to shash or ahash, also convert
     blkcipher/ablkcipher users to skcipher.
   - Remove crypto_hash interface.
   - Remove crypto_pcomp interface.
   - Add crypto engine for async cipher drivers.
   - Add akcipher documentation.
   - Add skcipher documentation.

  Algorithms:
   - Rename crypto/crc32 to avoid name clash with lib/crc32.
   - Fix bug in keywrap where we zero the wrong pointer.

  Drivers:
   - Support T5/M5, T7/M7 SPARC CPUs in n2 hwrng driver.
   - Add PIC32 hwrng driver.
   - Support BCM6368 in bcm63xx hwrng driver.
   - Pack structs for 32-bit compat users in qat.
   - Use crypto engine in omap-aes.
   - Add support for sama5d2x SoCs in atmel-sha.
   - Make atmel-sha available again.
   - Make sahara hashing available again.
   - Make ccp hashing available again.
   - Make sha1-mb available again.
   - Add support for multiple devices in ccp.
   - Improve DMA performance in caam.
   - Add hashing support to rockchip"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits)
  crypto: qat - remove redundant arbiter configuration
  crypto: ux500 - fix checks of error code returned by devm_ioremap_resource()
  crypto: atmel - fix checks of error code returned by devm_ioremap_resource()
  crypto: qat - Change the definition of icp_qat_uof_regtype
  hwrng: exynos - use __maybe_unused to hide pm functions
  crypto: ccp - Add abstraction for device-specific calls
  crypto: ccp - CCP versioning support
  crypto: ccp - Support for multiple CCPs
  crypto: ccp - Remove check for x86 family and model
  crypto: ccp - memset request context to zero during import
  lib/mpi: use "static inline" instead of "extern inline"
  lib/mpi: avoid assembler warning
  hwrng: bcm63xx - fix non device tree compatibility
  crypto: testmgr - allow rfc3686 aes-ctr variants in fips mode.
  crypto: qat - The AE id should be less than the maximal AE number
  lib/mpi: Endianness fix
  crypto: rockchip - add hash support for crypto engine in rk3288
  crypto: xts - fix compile errors
  crypto: doc - add skcipher API documentation
  crypto: doc - update AEAD AD handling
  ...
2016-03-17 11:22:54 -07:00
Guillaume Nault
e8e56ffd9d ppp: ensure file->private_data can't be overridden
Locking ppp_mutex must be done before dereferencing file->private_data,
otherwise it could be modified before ppp_unattached_ioctl() takes the
lock. This could lead ppp_unattached_ioctl() to override ->private_data,
thus leaking reference to the ppp_file previously pointed to.

v2: lock all ppp_ioctl() instead of just checking private_data in
    ppp_unattached_ioctl(), to avoid ambiguous behaviour.

Fixes: f3ff8a4d80 ("ppp: push BKL down into the driver")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-16 19:35:06 -04:00
David S. Miller
810813c47a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several cases of overlapping changes, as well as one instance
(vxlan) of a bug fix in 'net' overlapping with code movement
in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-08 12:34:12 -05:00
Guillaume Nault
6faac63a69 ppp: release rtnl mutex when interface creation fails
Add missing rtnl_unlock() in the error path of ppp_create_interface().

Fixes: 58a89ecaca ("ppp: fix lockdep splat in ppp_dev_uninit()")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-07 16:11:31 -05:00
Guillaume Nault
edffc2178d ppp: lock ppp->flags in ppp_read() and ppp_poll()
ppp_read() and ppp_poll() can be called concurrently with ppp_ioctl().
In this case, ppp_ioctl() might call ppp_ccp_closed(), which may update
ppp->flags while ppp_read() or ppp_poll() is reading it.
The update done by ppp_ccp_closed() isn't atomic due to the bit mask
operation ('ppp->flags &= ~(SC_CCP_OPEN | SC_CCP_UP)'), so concurrent
readers might get transient values.
Reading incorrect ppp->flags may disturb the 'ppp->flags & SC_LOOP_TRAFFIC'
test in ppp_read() and ppp_poll(), which in turn can lead to improper
decision on whether the PPP unit file is ready for reading or not.

Since ppp_ccp_closed() is protected by the Rx and Tx locks (with
ppp_lock()), taking the Rx lock is enough for ppp_read() and ppp_poll()
to guarantee that ppp_ccp_closed() won't update ppp->flags
concurrently.

The same reasoning applies to ppp->n_channels. The 'n_channels' field
can also be written to concurrently by ppp_ioctl() (through
ppp_connect_channel() or ppp_disconnect_channel()). These writes aren't
atomic (simple increment/decrement), but are protected by both the Rx
and Tx locks (like in the ppp->flags case). So holding the Rx lock
before reading ppp->n_channels also prevents concurrent writes.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-01 16:15:07 -05:00
Guillaume Nault
555d5b70f1 ppp: clarify parsing of user supplied data in ppp_set_compress()
* Split big conditional statement.
  * Check (data.length <= CCP_MAX_OPTION_LENGTH) only once.
  * Don't read ccp_option[1] if not initialised.

Reading uninitialised ccp_option[1] was harmless, because this could
only happen when data.length was 0 or 1. So even then, we couldn't pass
the (ccp_option[1] < 2 || ccp_option[1] > data.length) test anyway.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-24 23:52:51 -05:00
Guillaume Nault
29e73269aa pppoe: fix reference counting in PPPoE proxy
Drop reference on the relay_po socket when __pppoe_xmit() succeeds.
This is already handled correctly in the error path.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17 16:02:01 -05:00
Herbert Xu
fdb89b1b8f ppp_mppe: Use skcipher and ahash
This patch replaces uses of blkcipher with skcipher, and the long
obsolete hash interface with ahash.  This is a bug-for-bug conversion
and no attempt has been made to fix bugs such as the ignored return
values of the crypto operations.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-01-27 20:35:45 +08:00
Hannes Frederic Sowa
9a368aff9c pptp: fix illegal memory access caused by multiple bind()s
Several times already this has been reported as kasan reports caused by
syzkaller and trinity and people always looked at RCU races, but it is
much more simple. :)

In case we bind a pptp socket multiple times, we simply add it to
the callid_sock list but don't remove the old binding. Thus the old
socket stays in the bucket with unused call_id indexes and doesn't get
cleaned up. This causes various forms of kasan reports which were hard
to pinpoint.

Simply don't allow multiple binds and correct error handling in
pptp_bind. Also keep sk_state bits in place in pptp_connect.

Fixes: 00959ade36 ("PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol)")
Cc: Dmitry Kozlov <xeb@mail.ru>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-24 22:18:26 -08:00
David S. Miller
b3e0d3d7ba Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/geneve.c

Here we had an overlapping change, where in 'net' the extraneous stats
bump was being removed whilst in 'net-next' the final argument to
udp_tunnel6_xmit_skb() was being changed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-17 22:08:28 -05:00
WANG Cong
09ccfd238e pptp: verify sockaddr_len in pptp_bind() and pptp_connect()
Reported-by: Dmitry Vyukov <dvyukov@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15 00:29:34 -05:00
Guillaume Nault
69d9728d00 ppp: declare ppp devices as enumerated interfaces
Let user space be aware of the naming scheme used by ppp interfaces
(visible in /sys/class/net/<iface>/name_assign_type).

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14 16:20:57 -05:00
Guillaume Nault
94dbffe16e ppp: define "ppp" device type
Let PPP devices be identified as such in /sys/class/net/<iface>/uevent.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14 16:20:57 -05:00
Guillaume Nault
fe53985aaa pppoe: fix memory corruption in padt work structure
pppoe_connect() mustn't touch the padt_work field of pppoe sockets
because that work could be already pending.

[   21.473147] BUG: unable to handle kernel NULL pointer dereference at 00000004
[   21.474523] IP: [<c1043177>] process_one_work+0x29/0x31c
[   21.475164] *pde = 00000000
[   21.475513] Oops: 0000 [#1] SMP
[   21.475910] Modules linked in: pppoe pppox ppp_generic slhc crc32c_intel aesni_intel virtio_net xts aes_i586 lrw gf128mul ablk_helper cryptd evdev acpi_cpufreq processor serio_raw button ext4 crc16 mbcache jbd2 virtio_blk virtio_pci virtio_ring virtio
[   21.476168] CPU: 2 PID: 164 Comm: kworker/2:2 Not tainted 4.4.0-rc1 #1
[   21.476168] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[   21.476168] task: f5f83c00 ti: f5e28000 task.ti: f5e28000
[   21.476168] EIP: 0060:[<c1043177>] EFLAGS: 00010046 CPU: 2
[   21.476168] EIP is at process_one_work+0x29/0x31c
[   21.484082] EAX: 00000000 EBX: f678b2a0 ECX: 00000004 EDX: 00000000
[   21.484082] ESI: f6c69940 EDI: f5e29ef0 EBP: f5e29f0c ESP: f5e29edc
[   21.484082]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[   21.484082] CR0: 80050033 CR2: 000000a4 CR3: 317ad000 CR4: 00040690
[   21.484082] Stack:
[   21.484082]  00000000 f6c69950 00000000 f6c69940 c0042338 f5e29f0c c1327945 00000000
[   21.484082]  00000008 f678b2a0 f6c69940 f678b2b8 f5e29f30 c1043984 f5f83c00 f6c69970
[   21.484082]  f678b2a0 c10437d3 f6775e80 f678b2a0 c10437d3 f5e29fac c1047059 f5e29f74
[   21.484082] Call Trace:
[   21.484082]  [<c1327945>] ? _raw_spin_lock_irq+0x28/0x30
[   21.484082]  [<c1043984>] worker_thread+0x1b1/0x244
[   21.484082]  [<c10437d3>] ? rescuer_thread+0x229/0x229
[   21.484082]  [<c10437d3>] ? rescuer_thread+0x229/0x229
[   21.484082]  [<c1047059>] kthread+0x8f/0x94
[   21.484082]  [<c1327a32>] ? _raw_spin_unlock_irq+0x22/0x26
[   21.484082]  [<c1327ee9>] ret_from_kernel_thread+0x21/0x38
[   21.484082]  [<c1046fca>] ? kthread_parkme+0x19/0x19
[   21.496082] Code: 5d c3 55 89 e5 57 56 53 89 c3 83 ec 24 89 d0 89 55 e0 8d 7d e4 e8 6c d8 ff ff b9 04 00 00 00 89 45 d8 8b 43 24 89 45 dc 8b 45 d8 <8b> 40 04 8b 80 e0 00 00 00 c1 e8 05 24 01 88 45 d7 8b 45 e0 8d
[   21.496082] EIP: [<c1043177>] process_one_work+0x29/0x31c SS:ESP 0068:f5e29edc
[   21.496082] CR2: 0000000000000004
[   21.496082] ---[ end trace e362cc9cf10dae89 ]---

Reported-by: Andrew <nitr0@seti.kr.ua>
Fixes: 287f3a943f ("pppoe: Use workqueue to die properly when a PADT is received")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-04 16:48:52 -05:00
Guillaume Nault
681b4d88ad pppox: use standard module auto-loading feature
* Register PF_PPPOX with pppox module rather than with pppoe,
    so that pppoe doesn't get loaded for any PF_PPPOX socket.

  * Register PX_PROTO_* with standard MODULE_ALIAS_NET_PF_PROTO()
    instead of using pppox's own naming scheme.

  * While there, add auto-loading feature for pptp.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 15:12:54 -05:00
Guillaume Nault
a8acce6aa5 ppp: remove PPPOX_ZOMBIE socket state
PPPOX_ZOMBIE is never set anymore.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-20 11:31:26 -05:00
Guillaume Nault
8734e485fe ppp: don't set sk_state to PPPOX_ZOMBIE in pppoe_disc_rcv()
Since 287f3a943f ("pppoe: Use workqueue to die properly when a PADT
is received"), pppoe_disc_rcv() disconnects the socket by scheduling
pppoe_unbind_sock_work(). This is enough to stop socket transmission
and makes the PPPOX_ZOMBIE state uncessary.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-20 11:31:26 -05:00
David S. Miller
73186df8d7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor overlapping changes in net/ipv4/ipmr.c, in 'net' we were
fixing the "BH-ness" of the counter bumps whilst in 'net-next'
the functions were modified to take an explicit 'net' parameter.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-03 13:41:45 -05:00
Ben Hutchings
4ab42d78e3 ppp, slip: Validate VJ compression slot parameters completely
Currently slhc_init() treats out-of-range values of rslots and tslots
as equivalent to 0, except that if tslots is too large it will
dereference a null pointer (CVE-2015-7799).

Add a range-check at the top of the function and make it return an
ERR_PTR() on error instead of NULL.  Change the callers accordingly.

Compile-tested only.

Reported-by: 郭永刚 <guoyonggang@360.cn>
References: http://article.gmane.org/gmane.comp.security.oss.general/17908
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-02 16:25:00 -05:00
David S. Miller
ba3e2084f2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/ipv6/xfrm6_output.c
	net/openvswitch/flow_netlink.c
	net/openvswitch/vport-gre.c
	net/openvswitch/vport-vxlan.c
	net/openvswitch/vport.c
	net/openvswitch/vport.h

The openvswitch conflicts were overlapping changes.  One was
the egress tunnel info fix in 'net' and the other was the
vport ->send() op simplification in 'net-next'.

The xfrm6_output.c conflicts was also a simplification
overlapping a bug fix.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-24 06:54:12 -07:00
Guillaume Nault
1acea4f6ce ppp: fix pppoe_dev deletion condition in pppoe_release()
We can't rely on PPPOX_ZOMBIE to decide whether to clear po->pppoe_dev.
PPPOX_ZOMBIE can be set by pppoe_disc_rcv() even when po->pppoe_dev is
NULL. So we have no guarantee that (sk->sk_state & PPPOX_ZOMBIE) implies
(po->pppoe_dev != NULL).
Since we're releasing a PPPoE socket, we want to release the pppoe_dev
if it exists and reset sk_state to PPPOX_DEAD, no matter the previous
value of sk_state. So we can just check for po->pppoe_dev and avoid any
assumption on sk->sk_state.

Fixes: 2b018d57ff ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-23 03:30:01 -07:00
David S. Miller
26440c835f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/asix_common.c
	net/ipv4/inet_connection_sock.c
	net/switchdev/switchdev.c

In the inet_connection_sock.c case the request socket hashing scheme
is completely different in net-next.

The other two conflicts were overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-20 06:08:27 -07:00
Eric W. Biederman
33224b16ff ipv4, ipv6: Pass net into ip_local_out and ip6_local_out
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-08 04:27:02 -07:00