Commit Graph

664680 Commits

Author SHA1 Message Date
Tyler Baker
75eb5e1e7b irqchip/irq-imx-gpcv2: Fix spinlock initialization
The raw_spinlock in the IMX GPCV2 interupt chip is not initialized before
usage. That results in a lockdep splat:

  INFO: trying to register non-static key.
  the code is fine but needs lockdep annotation.
  turning off the locking correctness validator.

Add the missing raw_spin_lock_init() to the setup code.

Fixes: e324c4dc4a ("irqchip/imx-gpcv2: IMX GPCv2 driver for wakeup sources")
Signed-off-by: Tyler Baker <tyler.baker@linaro.org>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Cc: jason@lakedaemon.net
Cc: marc.zyngier@arm.com
Cc: shawnguo@kernel.org
Cc: andrew.smirnov@gmail.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170413222731.5917-1-tyler.baker@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-14 10:55:05 +02:00
Peter Zijlstra
f2200ac311 perf/x86: Avoid exposing wrong/stale data in intel_pmu_lbr_read_32()
When the perf_branch_entry::{in_tx,abort,cycles} fields were added,
intel_pmu_lbr_read_32() wasn't updated to initialize them.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org>
Fixes: 135c5612c4 ("perf/x86/intel: Support Haswell/v4 LBR format")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-04-14 10:18:00 +02:00
Linus Torvalds
a232591ba2 Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "11 fixes.

  The presence of 'thp: reduce indentation level in change_huge_pmd()'
  is unfortunate. But the patchset had been decently reviewed and tested
  before we decided it was needed in -stable and I felt it best not to
  churn things at the last minute"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mailmap: add Martin Kepplinger's email
  zsmalloc: expand class bit
  zram: do not use copy_page with non-page aligned address
  zram: fix operator precedence to get offset
  hugetlbfs: fix offset overflow in hugetlbfs mmap
  thp: fix MADV_DONTNEED vs clear soft dirty race
  thp: fix MADV_DONTNEED vs. MADV_FREE race
  mm: drop unused pmdp_huge_get_and_clear_notify()
  thp: fix MADV_DONTNEED vs. numa balancing race
  thp: reduce indentation level in change_huge_pmd()
  z3fold: fix page locking in z3fold_alloc()
2017-04-13 20:08:33 -07:00
Martin Kepplinger
5714320d31 mailmap: add Martin Kepplinger's email
Set the partly deprecated companies' email addresses as alias for the
personal one.

Link: http://lkml.kernel.org/r/1491984622-17321-1-git-send-email-martin.kepplinger@ginzinger.com
Signed-off-by: Martin Kepplinger <martin.kepplinger@ginzinger.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-13 18:24:21 -07:00
Minchan Kim
85d492f28d zsmalloc: expand class bit
Now 64K page system, zsamlloc has 257 classes so 8 class bit is not
enough.  With that, it corrupts the system when zsmalloc stores
65536byte data(ie, index number 256) so that this patch increases class
bit for simple fix for stable backport.  We should clean up this mess
soon.

  index	size
  0	32
  1	288
  ..
  ..
  204	52256
  256	65536

Fixes: 3783689a1 ("zsmalloc: introduce zspage structure")
Link: http://lkml.kernel.org/r/1492042622-12074-3-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-13 18:24:21 -07:00
Minchan Kim
d72e9a7a93 zram: do not use copy_page with non-page aligned address
The copy_page is optimized memcpy for page-alinged address.  If it is
used with non-page aligned address, it can corrupt memory which means
system corruption.  With zram, it can happen with

1. 64K architecture
2. partial IO
3. slub debug

Partial IO need to allocate a page and zram allocates it via kmalloc.
With slub debug, kmalloc(PAGE_SIZE) doesn't return page-size aligned
address.  And finally, copy_page(mem, cmem) corrupts memory.

So, this patch changes it to memcpy.

Actuaully, we don't need to change zram_bvec_write part because zsmalloc
returns page-aligned address in case of PAGE_SIZE class but it's not
good to rely on the internal of zsmalloc.

Note:
 When this patch is merged to stable, clear_page should be fixed, too.
 Unfortunately, recent zram removes it by "same page merge" feature so
 it's hard to backport this patch to -stable tree.

I will handle it when I receive the mail from stable tree maintainer to
merge this patch to backport.

Fixes: 42e99bd ("zram: optimize memory operations with clear_page()/copy_page()")
Link: http://lkml.kernel.org/r/1492042622-12074-2-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-13 18:24:21 -07:00
Minchan Kim
4ca82dabc9 zram: fix operator precedence to get offset
In zram_rw_page, the logic to get offset is wrong by operator precedence
(i.e., "<<" is higher than "&").  With wrong offset, zram can corrupt
the user's data.  This patch fixes it.

Fixes: 8c7f01025 ("zram: implement rw_page operation of zram")
Link: http://lkml.kernel.org/r/1492042622-12074-1-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-13 18:24:21 -07:00
Mike Kravetz
045c7a3f53 hugetlbfs: fix offset overflow in hugetlbfs mmap
If mmap() maps a file, it can be passed an offset into the file at which
the mapping is to start.  Offset could be a negative value when
represented as a loff_t.  The offset plus length will be used to update
the file size (i_size) which is also a loff_t.

Validate the value of offset and offset + length to make sure they do
not overflow and appear as negative.

Found by syzcaller with commit ff8c0c53c4 ("mm/hugetlb.c: don't call
region_abort if region_chg fails") applied.  Prior to this commit, the
overflow would still occur but we would luckily return ENOMEM.

To reproduce:

   mmap(0, 0x2000, 0, 0x40021, 0xffffffffffffffffULL, 0x8000000000000000ULL);

Resulted in,

  kernel BUG at mm/hugetlb.c:742!
  Call Trace:
   hugetlbfs_evict_inode+0x80/0xa0
   evict+0x24a/0x620
   iput+0x48f/0x8c0
   dentry_unlink_inode+0x31f/0x4d0
   __dentry_kill+0x292/0x5e0
   dput+0x730/0x830
   __fput+0x438/0x720
   ____fput+0x1a/0x20
   task_work_run+0xfe/0x180
   exit_to_usermode_loop+0x133/0x150
   syscall_return_slowpath+0x184/0x1c0
   entry_SYSCALL_64_fastpath+0xab/0xad

Fixes: ff8c0c53c4 ("mm/hugetlb.c: don't call region_abort if region_chg fails")
Link: http://lkml.kernel.org/r/1491951118-30678-1-git-send-email-mike.kravetz@oracle.com
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-13 18:24:21 -07:00
Kirill A. Shutemov
5b7abeae3a thp: fix MADV_DONTNEED vs clear soft dirty race
Yet another instance of the same race.

Fix is identical to change_huge_pmd().

See "thp: fix MADV_DONTNEED vs.  numa balancing race" for more details.

Link: http://lkml.kernel.org/r/20170302151034.27829-5-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-13 18:24:21 -07:00
Kirill A. Shutemov
58ceeb6bec thp: fix MADV_DONTNEED vs. MADV_FREE race
Both MADV_DONTNEED and MADV_FREE handled with down_read(mmap_sem).

It's critical to not clear pmd intermittently while handling MADV_FREE
to avoid race with MADV_DONTNEED:

	CPU0:				CPU1:
				madvise_free_huge_pmd()
				 pmdp_huge_get_and_clear_full()
madvise_dontneed()
 zap_pmd_range()
  pmd_trans_huge(*pmd) == 0 (without ptl)
  // skip the pmd
				 set_pmd_at();
				 // pmd is re-established

It results in MADV_DONTNEED skipping the pmd, leaving it not cleared.
It violates MADV_DONTNEED interface and can result is userspace
misbehaviour.

Basically it's the same race as with numa balancing in
change_huge_pmd(), but a bit simpler to mitigate: we don't need to
preserve dirty/young flags here due to MADV_FREE functionality.

[kirill.shutemov@linux.intel.com: Urgh... Power is special again]
  Link: http://lkml.kernel.org/r/20170303102636.bhd2zhtpds4mt62a@black.fi.intel.com
Link: http://lkml.kernel.org/r/20170302151034.27829-4-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-13 18:24:21 -07:00
Kirill A. Shutemov
c0c379e293 mm: drop unused pmdp_huge_get_and_clear_notify()
Dave noticed that after fixing MADV_DONTNEED vs numa balancing race the
last pmdp_huge_get_and_clear_notify() user is gone.

Let's drop the helper.

Link: http://lkml.kernel.org/r/20170306112047.24809-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-13 18:24:21 -07:00
Kirill A. Shutemov
ced108037c thp: fix MADV_DONTNEED vs. numa balancing race
In case prot_numa, we are under down_read(mmap_sem).  It's critical to
not clear pmd intermittently to avoid race with MADV_DONTNEED which is
also under down_read(mmap_sem):

	CPU0:				CPU1:
				change_huge_pmd(prot_numa=1)
				 pmdp_huge_get_and_clear_notify()
madvise_dontneed()
 zap_pmd_range()
  pmd_trans_huge(*pmd) == 0 (without ptl)
  // skip the pmd
				 set_pmd_at();
				 // pmd is re-established

The race makes MADV_DONTNEED miss the huge pmd and don't clear it
which may break userspace.

Found by code analysis, never saw triggered.

Link: http://lkml.kernel.org/r/20170302151034.27829-3-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-13 18:24:20 -07:00
Kirill A. Shutemov
0a85e51d37 thp: reduce indentation level in change_huge_pmd()
Patch series "thp: fix few MADV_DONTNEED races"

For MADV_DONTNEED to work properly with huge pages, it's critical to not
clear pmd intermittently unless you hold down_write(mmap_sem).

Otherwise MADV_DONTNEED can miss the THP which can lead to userspace
breakage.

See example of such race in commit message of patch 2/4.

All these races are found by code inspection.  I haven't seen them
triggered.  I don't think it's worth to apply them to stable@.

This patch (of 4):

Restructure code in preparation for a fix.

Link: http://lkml.kernel.org/r/20170302151034.27829-2-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-13 18:24:20 -07:00
Vitaly Wool
76e32a2a08 z3fold: fix page locking in z3fold_alloc()
Stress testing of the current z3fold implementation on a 8-core system
revealed it was possible that a z3fold page deleted from its unbuddied
list in z3fold_alloc() would be put on another unbuddied list by
z3fold_free() while z3fold_alloc() is still processing it.  This has
been introduced with commit 5a27aa822 ("z3fold: add kref refcounting")
due to the removal of special handling of a z3fold page not on any list
in z3fold_free().

To fix this, the z3fold page lock should be taken in z3fold_alloc()
before the pool lock is released.  To avoid deadlocking, we just try to
lock the page as soon as we get a hold of it, and if trylock fails, we
drop this page and take the next one.

Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: <Oleksiy.Avramchenko@sony.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-13 18:24:20 -07:00
Jan Beulich
d8a6e3aed9 ia64: restore symbol versions for symbols defined in assembly
The ia64 build generates many warnings like this:

   WARNING: EXPORT symbol "empty_zero_page" [vmlinux] version generation failed, symbol will not be versioned.

Besides adding the necessary header this also requires fiddling with
some explicit .S -> .o rules.

Cc: IA64-ML <linux-ia64@vger.kernel.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-13 18:19:49 -07:00
Keith Busch
3412386b53 irq/affinity: Fix extra vecs calculation
This fixes a math error calculating the extra_vecs. The error assumed
only 1 cpu per vector, but the value needs to account for the actual
number of cpus per vector in order to get the correct remainder for
extra CPU assignment.

Fixes: 7bf8222b9b ("irq/affinity: Fix CPU spread for unbalanced nodes")
Reported-by: Xiaolong Ye <xiaolong.ye@intel.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Link: http://lkml.kernel.org/r/1492104492-19943-1-git-send-email-keith.busch@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-04-13 23:41:00 +02:00
Gao Feng
fe50543c19 netfilter: ipt_CLUSTERIP: Fix wrong conntrack netns refcnt usage
Current codes invoke wrongly nf_ct_netns_get in the destroy routine,
it should use nf_ct_netns_put, not nf_ct_netns_get.
It could cause some modules could not be unloaded.

Fixes: ecb2421b5d ("netfilter: add and use nf_ct_netns_get/put")
Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-13 23:21:40 +02:00
Liping Zhang
79e09ef96b netfilter: nft_hash: do not dump the auto generated seed
This can prevent the nft utility from printing out the auto generated
seed to the user, which is unnecessary and confusing.

Fixes: cb1b69b0b1 ("netfilter: nf_tables: add hash expression")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-04-13 23:20:13 +02:00
David S. Miller
ce07183282 Merge branch 'netlink_ext_ACK'
Johannes Berg says:

====================
netlink extended ACK reporting

Changes since v4:
 * use __NLMSGERR_ATTR_MAX instead of NUM_NLMSGERR_ATTRS

Changes since v3:
 * Add NLM_F_CAPPED and NLM_F_ACK_TLVS flags, to allow entirely
   stateless parsing of the ACK messages by looking at the new
   flags. Need to check NLM_F_ACK_TLVS first, since capping can
   be done in kernels before this patchset without setting the
   flag.
 * Remove "missing_attr" functionality - this can obviously be
   added back rather easily, but I'd rather have more discussion
   about the nesting problem there.
 * Improve documentation of NLMSGERR_ATTR_OFFS
 * Improve message structure documentation, documenting that the
   request message is always capped for success cases
 * fix nlmsg_len of the outer message by calling nlmsg_end()
 * fix memcpy() of the request in success cases, going back to
   the original code that I'd changed before due to the payload
   adjustments that I reverted when introducing tlvlen

Changes since v2:
 * add NUM_NLMSGERR_ATTRS, NLMSGERR_ATTR_MAX
 * fix cookie length to 20 (sha-1 length)
 * move struct members for cookie to patch 3 where they should be
 * another cleanup suggested by David Ahern

Changes since v1:
 * credit Pablo and Jamal
 * incorporate suggestion from David Ahern
 * fix compilation in decnet
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:58:23 -04:00
Johannes Berg
fe52145f91 netlink: pass extended ACK struct where available
This is an add-on to the previous patch that passes the extended ACK
structure where it's already available by existing genl_info or extack
function arguments.

This was done with this spatch (with some manual adjustment of
indentation):

@@
expression A, B, C, D, E;
identifier fn, info;
@@
fn(..., struct genl_info *info, ...) {
...
-nlmsg_parse(A, B, C, D, E, NULL)
+nlmsg_parse(A, B, C, D, E, info->extack)
...
}

@@
expression A, B, C, D, E;
identifier fn, info;
@@
fn(..., struct genl_info *info, ...) {
<...
-nla_parse_nested(A, B, C, D, NULL)
+nla_parse_nested(A, B, C, D, info->extack)
...>
}

@@
expression A, B, C, D, E;
identifier fn, extack;
@@
fn(..., struct netlink_ext_ack *extack, ...) {
<...
-nlmsg_parse(A, B, C, D, E, NULL)
+nlmsg_parse(A, B, C, D, E, extack)
...>
}

@@
expression A, B, C, D, E;
identifier fn, extack;
@@
fn(..., struct netlink_ext_ack *extack, ...) {
<...
-nla_parse(A, B, C, D, E, NULL)
+nla_parse(A, B, C, D, E, extack)
...>
}

@@
expression A, B, C, D, E;
identifier fn, extack;
@@
fn(..., struct netlink_ext_ack *extack, ...) {
...
-nlmsg_parse(A, B, C, D, E, NULL)
+nlmsg_parse(A, B, C, D, E, extack)
...
}

@@
expression A, B, C, D;
identifier fn, extack;
@@
fn(..., struct netlink_ext_ack *extack, ...) {
<...
-nla_parse_nested(A, B, C, D, NULL)
+nla_parse_nested(A, B, C, D, extack)
...>
}

@@
expression A, B, C, D;
identifier fn, extack;
@@
fn(..., struct netlink_ext_ack *extack, ...) {
<...
-nlmsg_validate(A, B, C, D, NULL)
+nlmsg_validate(A, B, C, D, extack)
...>
}

@@
expression A, B, C, D;
identifier fn, extack;
@@
fn(..., struct netlink_ext_ack *extack, ...) {
<...
-nla_validate(A, B, C, D, NULL)
+nla_validate(A, B, C, D, extack)
...>
}

@@
expression A, B, C;
identifier fn, extack;
@@
fn(..., struct netlink_ext_ack *extack, ...) {
<...
-nla_validate_nested(A, B, C, NULL)
+nla_validate_nested(A, B, C, extack)
...>
}

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:58:22 -04:00
Johannes Berg
fceb6435e8 netlink: pass extended ACK struct to parsing functions
Pass the new extended ACK reporting struct to all of the generic
netlink parsing functions. For now, pass NULL in almost all callers
(except for some in the core.)

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:58:22 -04:00
Johannes Berg
ba0dc5f6e0 netlink: allow sending extended ACK with cookie on success
Now that we have extended error reporting and a new message format for
netlink ACK messages, also extend this to be able to return arbitrary
cookie data on success.

This will allow, for example, nl80211 to not send an extra message for
cookies identifying newly created objects, but return those directly
in the ACK message.

The cookie data size is currently limited to 20 bytes (since Jamal
talked about using SHA1 for identifiers.)

Thanks to Jamal Hadi Salim for bringing up this idea during the
discussions.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:58:21 -04:00
Johannes Berg
7ab606d160 genetlink: pass extended ACK report down
Pass the extended ACK reporting struct down from generic netlink to
the families, using the existing struct genl_info for simplicity.

Also add support to set the extended ACK information from generic
netlink users.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:58:21 -04:00
Johannes Berg
2d4bc93368 netlink: extended ACK reporting
Add the base infrastructure and UAPI for netlink extended ACK
reporting. All "manual" calls to netlink_ack() pass NULL for now and
thus don't get extended ACK reporting.

Big thanks goes to Pablo Neira Ayuso for not only bringing up the
whole topic at netconf (again) but also coming up with the nlattr
passing trick and various other ideas.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:58:20 -04:00
Mahesh Bandewar
fb9eb899a6 bonding: handle link transition from FAIL to UP correctly
When link transitions from LINK_FAIL to LINK_UP, the commit phase is
not called. This leads to an erroneous state causing slave-link state to
get stuck in "going down" state while its speed and duplex are perfectly
fine. This issue is a side-effect of splitting link-set into propose and
commit phases introduced by de77ecd4ef ("bonding: improve link-status
update in mii-monitoring")

This patch fixes these issues by calling commit phase whenever link
state change is proposed.

Fixes: de77ecd4ef ("bonding: improve link-status update in mii-monitoring")
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:48:59 -04:00
Jie Deng
d4d49bc145 net: dwc-xlgmac: add the initial ethtool support
It is necessary to provide ethtool support for displaying and
modifying parameters of dwc-xlgmac.

Signed-off-by: Jie Deng <jiedeng@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:46:38 -04:00
Gao Feng
7ed14d973f net: ipv4: Refine the ipv4_default_advmss
1. Don't get the metric RTAX_ADVMSS of dst.
There are two reasons.
1) Its caller dst_metric_advmss has already invoke dst_metric_advmss
before invoke default_advmss.
2) The ipv4_default_advmss is used to get the default mss, it should
not try to get the metric like ip6_default_advmss.

2. Use sizeof(tcphdr)+sizeof(iphdr) instead of literal 40.

3. Define one new macro IPV4_MAX_PMTU instead of 65535 according to
RFC 2675, section 5.1.

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:19:48 -04:00
David S. Miller
29b904a0dd Merge branch 'rtnetlink-cleanup-user-notifications'
David Ahern says:

====================
rtnetlink: Cleanup user notifications for netdev events

Vlad's recent patch to add the event type to rtnetlink notifications
points out a number of redundant or unnecessary notifications sent to
userspace for events that are essentially internal to the kernel. Trim
the list to put a dent in the notification storm.

v2
- rebased to top of net-next with IFLA_EVENT patch reverted
- dropped removal NETDEV_CHANGEINFODATA since it is intentionally
  only to send a message to userspace
- dropped NOTIFY_PEERS since Vlad's says it is needed for macvlans
- add patches to remove NETDEV_CHANGEUPPER and NETDEV_CHANGE_TX_QUEUE_LEN
  from the event list
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:16:35 -04:00
David Ahern
27b3b551d8 rtnetlink: Do not generate notifications for NETDEV_CHANGE_TX_QUEUE_LEN event
Changing tx queue length generates identical messages:

[LINK]22: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether 02:04:f4:b7:5c:d2 brd ff:ff:ff:ff:ff:ff promiscuity 0
    dummy numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
[LINK]22: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether 02:04:f4:b7:5c:d2 brd ff:ff:ff:ff:ff:ff promiscuity 0
    dummy numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

Remove NETDEV_CHANGE_TX_QUEUE_LEN from the list of notifiers that generate
notifications.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:16:34 -04:00
David Ahern
b6b36eb23a rtnetlink: Do not generate notifications for NETDEV_CHANGEUPPER event
NETDEV_CHANGEUPPER is an internal event; do not generate userspace
notifications.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:16:34 -04:00
David Ahern
aed0735909 rtnetlink: Do not generate notifications for CHANGELOWERSTATE event
CHANGELOWERSTATE is an internal event; do not generate userspace
notifications.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:16:33 -04:00
David Ahern
bf2c2984d3 rtnetlink: Do not generate notifications for PRECHANGEUPPER event
PRECHANGEUPPER is an internal event; do not generate userspace
notifications.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:16:33 -04:00
David Ahern
aef091ae58 rtnetlink: Do not generate notifications for POST_TYPE_CHANGE event
Changing the master device for a link generates many messages; the one
generated for POST_TYPE_CHANGE is redundant:

[LINK]11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master br1 state UNKNOWN group default
    link/ether 02:02:02:02:02:03 brd ff:ff:ff:ff:ff:ff
[LINK]11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master br1 state UNKNOWN group default
    link/ether 02:02:02:02:02:03 brd ff:ff:ff:ff:ff:ff

Remove POST_TYPE_CHANGE from the list of notifiers that generate
notifications.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:16:33 -04:00
David Ahern
cd8966e75e rtnetlink: Do not generate notifications for CHANGEADDR event
Changing hardware address generates redundant messages:

[LINK]11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether 02:02:02:02:02:02 brd ff:ff:ff:ff:ff:ff

[LINK]11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether 02:02:02:02:02:02 brd ff:ff:ff:ff:ff:ff

Do not send a notification for the CHANGEADDR notifier.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:16:33 -04:00
David Ahern
46ede612c7 rtnetlink: Do not generate notification for UDP_TUNNEL_PUSH_INFO
NETDEV_UDP_TUNNEL_PUSH_INFO is an internal notifier; nothing userspace
can do so don't generate a netlink notification.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:16:33 -04:00
David Ahern
085e1a65f0 rtnetlink: Do not generate notifications for MTU events
Changing MTU on a link currently causes 3 messages to be sent to userspace:

[LINK]11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1490 qdisc noqueue state UNKNOWN group default
    link/ether f2:52:5c:6d:21:f3 brd ff:ff:ff:ff:ff:ff

[LINK]11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether f2:52:5c:6d:21:f3 brd ff:ff:ff:ff:ff:ff

[LINK]11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether f2:52:5c:6d:21:f3 brd ff:ff:ff:ff:ff:ff

Remove the messages sent for PRE_CHANGE_MTU and CHANGE_MTU netdev events.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:16:32 -04:00
David Daney
b6518e6a00 tools: bpf_jit_disasm: Add option to dump JIT image to a file.
When debugging the JIT on an embedded platform or cross build
environment, libbfd may not be available, making it impossible to run
bpf_jit_disasm natively.

Add an option to emit a binary image of the JIT code to a file.  This
file can then be disassembled off line.  Typical usage in this case
might be (pasting mips64 dmesg output to cat command):

   $ cat > jit.raw
   $ bpf_jit_disasm -f jit.raw -O jit.bin
   $ mips64-linux-gnu-objdump -D -b binary -m mips:isa64r2 -EB jit.bin

Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 13:04:03 -04:00
Niklas Cassel
fe6af0e122 net: stmmac: set total length of the packet to be transmitted in TDES3
Field FL/TPL in register TDES3 is not correctly set on GMAC4.
TX appears to be functional on GMAC 4.10a even if this field is not set,
however, to avoid relying on undefined behavior, set the length in TDES3.

The field has a different meaning depending on if the TSE bit in TDES3
is set or not (TSO). However, regardless of the TSE bit, the field is
not optional. The field is already set correctly when the TSE bit is set.

Since there is no limit for the number of descriptors that can be
used for a single packet, the field should be set to the sum of
the buffers contained in:
[<desc with First Descriptor bit set> ... <desc n> ...
<desc with Last Descriptor bit set>], which should be equal to skb->len.

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 12:40:09 -04:00
Ganesh Goudar
6b254afd2e cxgb4: save tid while creating server filter
Save the filter tid while creating the server filter, which is used
later to retrieve the corresponding filter instance while handling
the filter reply.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 12:37:57 -04:00
Daniele Palmas
14cf4a771b drivers: net: usb: qmi_wwan: add QMI_QUIRK_SET_DTR for Telit PID 0x1201
Telit LE920A4 uses the same pid 0x1201 of LE920, but modem
implementation is different, since it requires DTR to be set for
answering to qmi messages.

This patch replaces QMI_FIXED_INTF with QMI_QUIRK_SET_DTR: tests on
LE920 have been performed in order to verify backward compatibility.

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 12:36:31 -04:00
Rafael J. Wysocki
1315f01632 Revert "ACPICA: Resources: Not a valid resource if buffer length too long"
Revert commit 57707a9a77 (ACPICA: Resources: Not a valid resource if
buffer length too long) as it is reported to prevent the TPM module
from loading on Lenovo X60 with Coreboot.

It also causes new confusing warnings to show up in the kernel log.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=195311
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-04-13 18:23:46 +02:00
Linus Torvalds
2760078203 Two pin control fixes arriving late:
- Make the Acer Chromebook keyboard work again with the Intel
   Cherryview driver.
 - Fix a merge error in the Exynos 5433 driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJY72FJAAoJEEEQszewGV1zF3sQAKA6RQnoWStOcjfN/oYBkMB1
 5oqY6hAVSKy7lEvSkRM4BSDGIA149UoeDcVcRRRcXqTviuHz4bj8Dp5srqkwqtmX
 lFxcwvzvQUjt91UGkez6o06fbKqyCldYsTweSACEoiTjFkbaDeEI0phIUDm053bf
 RCIOxreOsRJxnPIYPflWV/ylgAe57FNYu2VPmBF9Tl/oqf6YmxtuJ8j9GiDvo4Us
 sHof+Ir1Up4fTOAZPkyjtlWERDyiqKbuXa8tGsWXg1niKdHc8aLDa8vnM53+0v4z
 3KqcqA8nPzn+K7BQBk5bIJZXASMx03c6BoctYASO5bret9uukTRXy7EuYc9mkY7N
 7RG+6BKN39JqZEhVugiU6Qd6LXQqGwz5HNVto4mPKBkg6+LjybZhBu7he2F6ggFt
 mdO8xeg25pZMyM3pfsxOBHY9u7AvahndcxB0civZ6OB6hugK+NQCZYRxsq1hMS2v
 EmGtFC4z3X0Ju/tnVumELWt0FQfU7sSR7yRhsvZWNcEgR0c/FZmOvsJ7LpAaoB1k
 y64/2OFcWBssslgHb4u7x6cEXsf3MWuxTLiUy7+QmVKlAhWDhPVbnA8XZIVvozD4
 fSID6zB7nOWw+WqB4aJ594Fo4Zbfw9Rwf9KU9rp3Yl5pYAJs6+w2ZvJP4gniu3LW
 MoSuV9P8P7YRmRuck2NL
 =Uh95
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
 "Two pin control fixes arriving late, these are hopefully the last pin
  control fixes I send this kernel cycle. A Chromebook and an Exynos SoC
  thingie.

  The Exynos patch is pretty big, it is fixing unbroken a breakage
  caused by yours truly when trying to figure out the merge mess with
  the different Samsung platforms for this merge window. Sorry about
  that. We have countered this situation by assigning a Samsung pin
  control submaintainer to catch stuff earlier.

  Summary:

   - Make the Acer Chromebook keyboard work again with the Intel
     Cherryview driver.

   - Fix a merge error in the Exynos 5433 driver"

* tag 'pinctrl-v4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: cherryview: Add a quirk to make Acer Chromebook keyboard work again
  pinctrl: samsung: Add missing part for PINCFG_TYPE_DRV of Exynos5433
2017-04-13 09:08:29 -07:00
Pavel Shilovsky
67dbea2ce6 CIFS: Fix SMB3 mount without specifying a security mechanism
Commit ef65aaede2 ("smb2: Enforce sec= mount option") changed the
behavior of a mount command to enforce a specified security mechanism
during mounting. On another hand according to the spec if SMB3 server
doesn't respond with a security context it implies that it supports
NTLMSSP. The current code doesn't keep it in mind and fails a mount
for such servers if no security mechanism is specified. Fix this by
indicating that a server supports NTLMSSP if a security context isn't
returned during negotiate phase. This allows the code to use NTLMSSP
by default for SMB3 mounts.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
2017-04-13 10:03:26 -05:00
David S. Miller
19ec50839d Merge branch 'mvmdio-updates'
Russell King says:

====================
mvmdio updates

This series of patches update mvmdio for Armada 8k CP110.  A number of
issues were found:

1. The driver fails to disable an interrupt when something goes wrong
   in the probe function.

2. The interrupt is specified in DT to be optional, but the driver
   unconditionally writes to the interrupt mask register, which may
   not exist.

3. The DT binding specifies
    "reg: address and length of the SMI register"
   however, when supporting the interrupt, the size must cover the
   interrupt register as well.  Update the binding documentation
   with this information that was previously omitted.

4. If the register size is too small, have the driver print an error
   and disable use of the interrupt.

5. Armada 8k needs three clocks for the MDIO interface, otherwise the
   SoC hangs (since it is part of one of the ethernet interfaces.)
   GOP clock, MG core clock and MG clock are needed on 8k. Augment the
   binding and driver to allow three clocks to be specified.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 10:59:12 -04:00
Russell King
96cb434238 net: mvmdio: allow up to three clocks to be specified for orion-mdio
Allow up to three clocks to be specified and enabled for the orion-mdio
interface, which are required for this interface to be accessible on
Armada 8k platforms.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 10:59:12 -04:00
Russell King
6d6a331f44 dt-bindings: allow up to three clocks for orion-mdio
Armada 8040 needs three clocks to be enabled for MDIO accesses to work.
Update the binding to allow the extra clocks to be specified.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 10:59:11 -04:00
Russell King
a51e2c9da4 net: mvmdio: disable interrupt if resource size is too small
Disable the MDIO interrupt, falling back to polled mode, if the resource
size does not allow us to access the interrupt registers.  All current
DT bindings use a size of 0x84, which allows access, but verifying it is
good practice.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 10:59:11 -04:00
Russell King
2c26122e2c dt-bindings: correct marvell orion MDIO binding document
Correct the Marvell Orion MDIO binding document to properly reflect the
cases where an interrupt is present.  Augment the examples to show this.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 10:59:11 -04:00
Russell King
7093a9702e net: mvmdio: fix interrupt disable in remove path
The pre-existing write to disable interrupts on the remove path happens
whether we have an interrupt or not.  While this may seem to be a good
idea, this driver is re-used in many different implementations, some
where the binding only specifies four bytes of register space.  This
access causes us to access registers outside of the binding.

Make it conditional on the interrupt being present, which is the same
condition used when enabling the interrupt in the first place.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 10:59:11 -04:00
Russell King
37282485dd net: mvmdio: disable interrupts in driver failure path
When the mvmdio driver has an interrupt, it enables the "done" interrupt
after requesting its interrupt handler.  However, probe failure results
in the interrupt being left enabled.  Disable it on the failure path.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 10:59:11 -04:00