Commit Graph

691501 Commits

Author SHA1 Message Date
Hans de Goede
3a4991a986 i2c: acpi: Do not create i2c-clients for LNXVIDEO ACPI devices
ACPI video devices get tagged by the kernel with the custom LNXVIDEO
HID so that normal pnp-id matching can be used and are handled by the
acpi-video driver.

Sometimes the ACPI nodes describing these contain a SERIAL_TYPE_I2C ACPI
resource. Before this commit the presence of this resource would cause the
i2c-core to create a /sys/bus/i2c/devices/i2c-LNXVIDEO:00 device for this
with a modalias of: "i2c:LNXVIDEO:00".

There is no i2c driver for this custom HID, the acpi-video driver binds
directly to the ACPI device /sys/bus/acpi/devices/LNXVIDEO\:00 which has
a modalias of "acpi:LNXVIDEO:" .

Not only is the creation of an i2c-client for this undesirable, it is
actually causing problems. This weird pseudo-resource claims an i2c
speed of 100KHz and typically points to the i2c bus which is used by the
touchscreen controller. Some touchscreen controllers only work properly at
400KHz, at 100KHz they cause errors like these:

i2c_designware 80860F41:03: i2c_dw_handle_tx_abort: lost arbitration
i2c_designware 80860F41:03: i2c_dw_handle_tx_abort: lost arbitration
i2c_designware 80860F41:03: i2c_dw_handle_tx_abort: lost arbitration
i2c_designware 80860F41:03: i2c_dw_handle_tx_abort: lost arbitration
silead_ts i2c-MSSL1680:00: Registers clear error -11

This commit makes the i2c-core ignore LNXVIDEO compatible ACPI devices
which has 2 positive results:

1) The bogus i2c-client for these is no longer created.
2) i2c_acpi_lookup_speed now ignores the 100KHz speed from the pseudo
i2c-resouce and properly returns 400KHz as speed for the touchscreen
i2c bus, fixing the touchscreen not working on various devies.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-07-04 15:57:34 +02:00
Colin Ian King
5122daa017 x86/platform/uv/BAU: Minor cleanup, make some local functions static
The functions handle_uv2_busy, uv_flush_send_and_wait and
find_another_by_swack are local to the source, so make them static.

Also remove normal_busy as it is no longer used.

Fixes various smatch warnings, such as:
"symbol 'find_another_by_swack' was not declared. Should it be static?"
"symbol 'handle_uv2_busy' was not declared. Should it be static?"

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Andrew Banman <abanman@hpe.com>
Cc: kernel-janitors@vger.kernel.org
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/20170704083129.10559-1-colin.king@canonical.com
2017-07-04 15:52:37 +02:00
Cornelia Huck
1372324b32 Update my email address
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 14:03:02 +02:00
Thomas Gleixner
2343877fbd genirq/timings: Move free timings out of spinlocked region
No point to do memory management from a interrupt disabled spin locked
region.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Julia Cartwright <julia@ni.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: linux-rockchip@lists.infradead.org
Cc: John Keeping <john@metanate.com>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214344.196130646@linutronix.de
2017-07-04 12:46:16 +02:00
Thomas Gleixner
46e48e2573 genirq: Move irq resource handling out of spinlocked region
Aside of being conceptually wrong, there is also an actual (hard to
trigger and mostly theoretical) problem.

CPU0				CPU1
free_irq(X)			interrupt X
				spin_lock(desc->lock)
				wake irq thread()
				spin_unlock(desc->lock)
spin_lock(desc->lock)
remove action()
shutdown_irq()			
release_resources()		thread_handler()
spin_unlock(desc->lock)		  access released resources.

synchronize_irq()

Move the release resources invocation after synchronize_irq() so it's
guaranteed that the threaded handler has finished.

Move the resource request call out of the desc->lock held region as well,
so the invocation context is the same for both request and release.

This solves the problems with those functions on RT as well.
 
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Julia Cartwright <julia@ni.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: linux-rockchip@lists.infradead.org
Cc: John Keeping <john@metanate.com>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214344.117028181@linutronix.de
2017-07-04 12:46:16 +02:00
Thomas Gleixner
9114014cf4 genirq: Add mutex to irq desc to serialize request/free_irq()
The irq_request/release_resources() callbacks ar currently invoked under
desc->lock with interrupts disabled. This is a source of problems on RT and
conceptually not required.

Add a seperate mutex to struct irq_desc which allows to serialize
request/free_irq(), which can be used to move the resource functions out of
the desc->lock held region.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Julia Cartwright <julia@ni.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: linux-rockchip@lists.infradead.org
Cc: John Keeping <john@metanate.com>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214344.039220922@linutronix.de
2017-07-04 12:46:16 +02:00
Thomas Gleixner
3a90795e1e genirq: Move bus locking into __setup_irq()
There is no point in having the irq_bus_lock() protection around all
callers to __setup_irq().

Move it into __setup_irq(). This is also a preparatory patch for addressing
the issues with the irq resource callbacks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: Julia Cartwright <julia@ni.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: linux-rockchip@lists.infradead.org
Cc: John Keeping <john@metanate.com>
Cc: linux-gpio@vger.kernel.org
Link: http://lkml.kernel.org/r/20170629214343.960949031@linutronix.de
2017-07-04 12:46:15 +02:00
Geert Uytterhoeven
2372a519f6 genirq: Force inlining of __irq_startup_managed to prevent build failure
If CONFIG_SMP=n, and gcc (e.g. 4.1.2) decides not to inline
__irq_startup_managed(), the build fails with:

    kernel/built-in.o: In function `irq_startup':
    (.text+0x38ed8): undefined reference to `irq_set_affinity_locked'

Fix this by forcing inlining of __irq_startup_managed().

Fixes: 761ea388e8 ("genirq: Handle managed irqs gracefully in irq_startup()")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: http://lkml.kernel.org/r/1499162761-12398-1-git-send-email-geert@linux-m68k.org
2017-07-04 12:36:44 +02:00
Sebastian Ott
e5682b4eec genirq/debugfs: Fix build for !CONFIG_IRQ_DOMAIN
Fix this build error:

kernel/irq/internals.h:440:20: error: inlining failed in call to always_inline
  'irq_domain_debugfs_init': function body not available
kernel/irq/debugfs.c:202:2: note: called from here
  irq_domain_debugfs_init(root_dir);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/alpine.LFD.2.20.1707041124000.1712@schleppi
2017-07-04 12:36:43 +02:00
Ingo Molnar
3b9c08ae3d Revert "sched/cputime: Refactor the cputime_adjust() code"
This reverts commit 72298e5c92.

As Peter explains:

> Argh, no... That code was perfectly fine. The new code otoh is
> convoluted.
>
> The old code had the following form:
>
>         if (exception1)
>           deal with exception1
>
>         if (execption2)
>           deal with exception2
>
>         do normal stuff
>
> Which is as simple and straight forward as it gets.
>
> The new code otoh reads like:
>
>         if (!exception1) {
>                 if (exception2)
>                   deal with exception 2
>                 else
>                   do normal stuff
>         }

So restore the old form.

Also fix the comment describing the logic, as it was confusing.

Requested-by: Peter Zijlstra <peterz@infradead.org>
Cc: Gustavo A. R. Silva <garsilva@embeddedor.com>
Cc: Frans Klaver <fransklaver@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-04 11:58:05 +02:00
Haozhong Zhang
691bd4340b kvm: vmx: allow host to access guest MSR_IA32_BNDCFGS
It's easier for host applications, such as QEMU, if they can always
access guest MSR_IA32_BNDCFGS in VMCS, even though MPX is disabled in
guest cpuid.

Cc: stable@vger.kernel.org
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 11:30:52 +02:00
Jaegeuk Kim
34dc77ad74 f2fs: add ioctl to do gc with target block address
This patch adds f2fs_ioc_gc_range() to move blocks located in the given
range.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:50 -07:00
Jaegeuk Kim
a9bcf9bcd0 f2fs: don't need to check encrypted inode for partial truncation
The cache_only is always false, if inode is encrypted.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:49 -07:00
Chao Yu
0eb0adadf2 f2fs: measure inode.i_blocks as generic filesystem
Both in memory or on disk, generic filesystems record i_blocks with
512bytes sized sector count, also VFS sub module such as disk quota
follows this rule, but f2fs records it with 4096bytes sized block
count, this difference leads to that once we use dquota's function
which inc/dec iblocks, it will make i_blocks of f2fs being inconsistent
between in memory and on disk.

In order to resolve this issue, this patch changes to make in-memory
i_blocks of f2fs recording sector count instead of block count,
meanwhile leaving on-disk i_blocks recording block count.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:48 -07:00
Chao Yu
663f387b71 f2fs: set CP_TRIMMED_FLAG correctly
Don't set CP_TRIMMED_FLAG for non-zoned block device or discard
unsupported device, it can avoid to trigger unneeded checkpoint for
that kind of device.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:47 -07:00
Eric Biggers
67773a1fbd f2fs: require key for truncate(2) of encrypted file
Currently, filesystems allow truncate(2) on an encrypted file without
the encryption key.  However, it's impossible to correctly handle the
case where the size being truncated to is not a multiple of the
filesystem block size, because that would require decrypting the final
block, zeroing the part beyond i_size, then encrypting the block.

As other modifications to encrypted file contents are prohibited without
the key, just prohibit truncate(2) as well, making it fail with ENOKEY.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:46 -07:00
Chao Yu
8ceffcb29e f2fs: move sysfs code from super.c to fs/f2fs/sysfs.c
Codes related to sysfs and procfs are dispersive and mixed with sb
related codes, but actually these codes are independent from others,
so split them from super.c, and reorgnize and manger them in sysfs.c.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:45 -07:00
Chao Yu
a398101aa1 f2fs: clean up sysfs codes
Just cleanup.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:45 -07:00
Chao Yu
56412894b3 f2fs: fix to document fault injection option and sysfs file
Commit 73faec4d99 ("f2fs: add mount option to select fault injection
ratio") and Commit 087968974f ("f2fs: add fault injection to sysfs")
forget to document mount option and sysfs file.

This patch fixes to document them.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:44 -07:00
Chao Yu
1727f31721 f2fs: fix wrong error number of fill_super
This patch fixes incorrect error number in error path of fill_super.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:43 -07:00
Chao Yu
6f6d9fe2ab f2fs: fix incorrect document of batched_trim_sections
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:42 -07:00
Chao Yu
44529f8975 f2fs: fix to show injection rate in ->show_options
If fault injection functionality is enabled, show additional injection
rate in ->show_options.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:41 -07:00
Christophe JAILLET
b63def9112 f2fs: Fix a return value in case of error in 'f2fs_fill_super'
err must be set to -ENOMEM, otherwise we return 0.

Fixes: a912b54d3a ("f2fs: split bio cache")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:40 -07:00
Tiezhu Yang
a005774c8d f2fs: use proper variable name
It is better to use variable name "inline_dentry" instead of "dentry_blk"
when data type is "struct f2fs_inline_dentry". This patch has no functional
changes, just to make code more readable especially when call the function
make_dentry_ptr_inline() and f2fs_convert_inline_dir().

Signed-off-by: Tiezhu Yang <kernelpatch@126.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:40 -07:00
Chao Yu
1f258ec13b f2fs: fix to avoid panic when encountering corrupt node
With fault_injection option, generic/361 of fstests will complain us
with below message:

Call Trace:
 get_node_page+0x12/0x20 [f2fs]
 f2fs_iget+0x92/0x7d0 [f2fs]
 f2fs_fill_super+0x10fb/0x15e0 [f2fs]
 mount_bdev+0x184/0x1c0
 f2fs_mount+0x15/0x20 [f2fs]
 mount_fs+0x39/0x150
 vfs_kern_mount+0x67/0x110
 do_mount+0x1bb/0xc70
 SyS_mount+0x83/0xd0
 do_syscall_64+0x6e/0x160
 entry_SYSCALL64_slow_path+0x25/0x25

Since mkfs loop device in f2fs partition can be failed silently due to
checkpoint error injection, so root inode page can be corrupted, in order
to avoid needless panic, in get_node_page, it's better to leave message
and return error to caller, and let fsck repaire it later.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:39 -07:00
Chao Yu
febeca6d37 f2fs: don't track newly allocated nat entry in list
We will never persist newly allocated nat entries during checkpoint(), so
we don't need to track such nat entries in nat dirty list in order to
avoid:
- more latency during traversing dirty list;
- sorting nat sets incorrectly due to recording wrong entry_cnt in nat
entry set.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:38 -07:00
Chao Yu
d9703d9097 f2fs: add f2fs_bug_on in __remove_discard_cmd
Recently, discard related codes have changed a lot, so add f2fs_bug_on to
detect potential bug.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:37 -07:00
Chao Yu
2a510c005c f2fs: introduce __wait_one_discard_bio
In order to avoid copied codes.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:36 -07:00
Qiuyang Sun
5a3a2d83cd f2fs: dax: fix races between page faults and truncating pages
Currently in F2FS, page faults and operations that truncate the pagecahe
or data blocks, are completely unsynchronized. This can result in page
fault faulting in a page into a range that we are changing after
truncating, and thus we can end up with a page mapped to disk blocks that
will be shortly freed. Filesystem corruption will shortly follow.

This patch fixes the problem by creating new rw semaphore i_mmap_sem in
f2fs_inode_info and grab it for functions removing blocks from extent tree
and for read over page faults. The mechanism is similar to that in ext4.

Signed-off-by: Qiuyang Sun <sunqiuyang@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:35 -07:00
Fan Li
72fdbe2efe f2fs: simplify the way of calulating next nat address
The index of segment which the next nat block is in has only one different
bit than the current one, so to get the next nat address, we can simply
alter that one bit.

Signed-off-by: Fan Li <fanofcode.li@samsung.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:34 -07:00
Jin Qian
21d3f8e1c3 f2fs: sanity check size of nat and sit cache
Make sure number of entires doesn't exceed max journal size.

Cc: stable@vger.kernel.org
Signed-off-by: Jin Qian <jinqian@android.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:34 -07:00
Yunlei He
d4fdf8ba0e f2fs: fix a panic caused by NULL flush_cmd_control
Mount fs with option noflush_merge, boot failed for illegal address
fcc in function f2fs_issue_flush:

        if (!test_opt(sbi, FLUSH_MERGE)) {
                ret = submit_flush_wait(sbi);
                atomic_inc(&fcc->issued_flush);   ->  Here, fcc illegal
                return ret;
        }

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:33 -07:00
Zhang Shengju
68390dd9bd f2fs: remove the unnecessary cast for PTR_ERR
It's not necessary to specify 'int' casting for PTR_ERR.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:32 -07:00
Jaegeuk Kim
d8c4256c17 f2fs: remove false-positive bug_on
For example,

f2fs_create
 - new_node_page is failed
 - handle_failed_inode
  - skip to add it into orphan list, since ni.blk_addr == NULL_ADDR
   : set_inode_flag(inode, FI_FREE_NID)

f2fs_evict_inode
 - EIO due to fault injection
 - f2fs_bug_on() is triggered

So, we don't need to call f2fs_bug_on in this case.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:11:31 -07:00
Damien Le Moal
acfd2810c7 f2fs: Do not issue small discards in LFS mode
clear_prefree_segments() issues small discards after discarding full
segments. These small discards may not be section aligned, so not zone
aligned on a zoned block device, causing __f2fs_iissue_discard_zone() to fail.
Fix this by not issuing small discards for a volume mounted with the BLKZONED
feature enabled.

Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-04 02:10:42 -07:00
Roopa Prabhu
397fc9e5ce mpls: route get support
This patch adds RTM_GETROUTE doit handler for mpls routes.

Input:
RTA_DST - input label
RTA_NEWDST - labels in packet for multipath selection

By default the getroute handler returns matched
nexthop label, via and oif

With RTM_F_FIB_MATCH flag, full matched route is
returned.

example (with patched iproute2):
$ip -f mpls route show
101
        nexthop as to 102/103 via inet 172.16.2.2 dev virt1-2
        nexthop as to 302/303 via inet 172.16.12.2 dev virt1-12
201
        nexthop as to 202/203 via inet6 2001:db8:2::2 dev virt1-2
        nexthop as to 402/403 via inet6 2001:db8:12::2 dev virt1-12

$ip -f mpls route get 103
RTNETLINK answers: Network is unreachable

$ip -f mpls route get 101
101 as to 102/103 via inet 172.16.2.2 dev virt1-2

$ip -f mpls route get as to 302/303 101
101 as to 302/303 via inet 172.16.12.2 dev virt1-12

$ip -f mpls route get fibmatch 103
RTNETLINK answers: Network is unreachable

$ip -f mpls route get fibmatch 101
101
        nexthop as to 102/103 via inet 172.16.2.2 dev virt1-2
        nexthop as to 302/303 via inet 172.16.12.2 dev virt1-12

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-04 01:50:38 -07:00
Nikolay Aleksandrov
7597b266c5 bridge: allow ext learned entries to change ports
current code silently ignores change of port in the request
message. This patch makes sure the port is modified and
notification is sent to userspace.

Fixes: cf6b8e1eed ("bridge: add API to notify bridge driver of learned FBD on offloaded device")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-04 01:49:44 -07:00
Jamal Hadi Salim
e05a90ec9e net: reflect mark on tcp syn ack packets
SYN-ACK responses on a server in response to a SYN from a client
did not get the injected skb mark that was tagged on the SYN packet.

Fixes: 84f39b08d7 ("net: support marking accepting TCP sockets")
Reviewed-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-04 01:46:10 -07:00
Sean Wang
8d32e06243 net: ethernet: mediatek: fixed deadlock captured by lockdep
Lockdep found an inconsistent lock state when mtk_get_stats64 is called
in user context while NAPI updates MAC statistics in softirq.

Use spin_trylock_bh/spin_unlock_bh fix following lockdep warning.

[   81.321030] WARNING: inconsistent lock state
[   81.325266] 4.12.0-rc1-00035-gd9dda65 #32 Not tainted
[   81.330273] --------------------------------
[   81.334505] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[   81.340464] ksoftirqd/0/7 [HC0[0]:SC1[1]:HE1:SE0] takes:
[   81.345731]  (&syncp->seq#2){+.?...}, at: [<c054ba3c>] mtk_handle_status_irq.part.6+0x70/0x84
[   81.354219] {SOFTIRQ-ON-W} state was registered at:
[   81.359062]   lock_acquire+0xfc/0x2b0
[   81.362696]   mtk_stats_update_mac+0x60/0x2c0
[   81.367017]   mtk_get_stats64+0x17c/0x18c
[   81.370995]   dev_get_stats+0x48/0xbc
[   81.374628]   rtnl_fill_stats+0x48/0x128
[   81.378520]   rtnl_fill_ifinfo+0x4ac/0xd1c
[   81.382584]   rtmsg_ifinfo_build_skb+0x7c/0xe0
[   81.386991]   rtmsg_ifinfo.part.5+0x24/0x54
[   81.391139]   rtmsg_ifinfo+0x24/0x28
[   81.394685]   __dev_notify_flags+0xa4/0xac
[   81.398749]   dev_change_flags+0x50/0x58
[   81.402640]   devinet_ioctl+0x768/0x85c
[   81.406444]   inet_ioctl+0x1a4/0x1d0
[   81.409990]   sock_ioctl+0x16c/0x33c
[   81.413538]   do_vfs_ioctl+0xb4/0xa34
[   81.417169]   SyS_ioctl+0x44/0x6c
[   81.420458]   ret_fast_syscall+0x0/0x1c
[   81.424260] irq event stamp: 3354692
[   81.427806] hardirqs last  enabled at (3354692): [<c0678168>] net_rx_action+0xc0/0x504
[   81.435660] hardirqs last disabled at (3354691): [<c0678134>] net_rx_action+0x8c/0x504
[   81.443515] softirqs last  enabled at (3354106): [<c0101944>] __do_softirq+0x4b4/0x614
[   81.451370] softirqs last disabled at (3354109): [<c012f0c4>] run_ksoftirqd+0x44/0x80
[   81.459134]
[   81.459134] other info that might help us debug this:
[   81.465608]  Possible unsafe locking scenario:
[   81.465608]
[   81.471478]        CPU0
[   81.473900]        ----
[   81.476321]   lock(&syncp->seq#2);
[   81.479701]   <Interrupt>
[   81.482294]     lock(&syncp->seq#2);
[   81.485847]
[   81.485847]  *** DEADLOCK ***
[   81.485847]
[   81.491720] 1 lock held by ksoftirqd/0/7:
[   81.495693]  #0:  (&(&mac->hw_stats->stats_lock)->rlock){+.+...}, at: [<c054ba14>] mtk_handle_status_irq.part.6+0x48/0x84
[   81.506579]
[   81.506579] stack backtrace:
[   81.510904] CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.12.0-rc1-00035-gd9dda65 #32
[   81.518668] Hardware name: Mediatek Cortex-A7 (Device Tree)
[   81.524208] [<c0113dc4>] (unwind_backtrace) from [<c010e3f0>] (show_stack+0x20/0x24)
[   81.531899] [<c010e3f0>] (show_stack) from [<c03f9c64>] (dump_stack+0xb4/0xe0)
[   81.539072] [<c03f9c64>] (dump_stack) from [<c017e970>] (print_usage_bug+0x234/0x2e0)
[   81.546846] [<c017e970>] (print_usage_bug) from [<c017f058>] (mark_lock+0x63c/0x7bc)
[   81.554532] [<c017f058>] (mark_lock) from [<c017fe90>] (__lock_acquire+0x654/0x1bfc)
[   81.562217] [<c017fe90>] (__lock_acquire) from [<c0181d04>] (lock_acquire+0xfc/0x2b0)
[   81.569990] [<c0181d04>] (lock_acquire) from [<c054b76c>] (mtk_stats_update_mac+0x60/0x2c0)
[   81.578283] [<c054b76c>] (mtk_stats_update_mac) from [<c054ba3c>] (mtk_handle_status_irq.part.6+0x70/0x84)
[   81.587865] [<c054ba3c>] (mtk_handle_status_irq.part.6) from [<c054c2b8>] (mtk_napi_tx+0x358/0x37c)
[   81.596845] [<c054c2b8>] (mtk_napi_tx) from [<c06782ec>] (net_rx_action+0x244/0x504)
[   81.604533] [<c06782ec>] (net_rx_action) from [<c01015c4>] (__do_softirq+0x134/0x614)
[   81.612306] [<c01015c4>] (__do_softirq) from [<c012f0c4>] (run_ksoftirqd+0x44/0x80)
[   81.619907] [<c012f0c4>] (run_ksoftirqd) from [<c0154680>] (smpboot_thread_fn+0x14c/0x25c)
[   81.628110] [<c0154680>] (smpboot_thread_fn) from [<c014f8cc>] (kthread+0x150/0x180)
[   81.635798] [<c014f8cc>] (kthread) from [<c0109290>] (ret_from_fork+0x14/0x24)

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-04 01:43:38 -07:00
David S. Miller
2671e9fc62 Merge branch 'ipv4-ipv6-refcount_t'
Elena Reshetova says:

====================
v2 ipv4/ipv6 refcount conversions

Changes in v2:
 * rebase on top of net-next
 * currently by default refcount_t = atomic_t (*) and uses all
   atomic standard operations unless CONFIG_REFCOUNT_FULL is enabled.
   This is a compromise for the systems that are critical on
   performance (such as net) and cannot accept even slight delay
   on the refcounter operations.

This series, for ipv4/ipv6 network components, replaces atomic_t reference
counters with the new refcount_t type and API (see include/linux/refcount.h).
By doing this we prevent intentional or accidental
underflows or overflows that can led to use-after-free vulnerabilities.

The patches are fully independent and can be cherry-picked separately.
In order to try with refcount functionality enabled in run-time,
CONFIG_REFCOUNT_FULL must be enabled.

NOTE: automatic kernel builder for some reason doesn't like all my
network branches and regularly times out the builds on these branches.
Suggestion for "waiting a day for a good coverage" doesn't work, as
we have seen with generic network conversions. So please wait for the
full report from kernel test rebot before merging further up.
This has been compile-tested in 116 configs, but 71 timed out (including
all s390-related configs again). I am trying to see if they can fix
build coverage for me in meanwhile.

* The respective change is currently merged into -next as
  "locking/refcount: Create unchecked atomic_t implementation".
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-04 01:29:05 -07:00
Reshetova, Elena
0029c0deb5 net, ipv4: convert fib_info.fib_clntref from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-04 01:29:04 -07:00
Reshetova, Elena
f6a6fede28 net, ipv4: convert cipso_v4_doi.refcount from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-04 01:29:04 -07:00
Reshetova, Elena
87078f26b6 net, ipv6: convert ip6addrlbl_entry.refcnt from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-04 01:29:04 -07:00
Reshetova, Elena
d12f3827e0 net, ipv6: convert xfrm6_tunnel_spi.refcnt from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-04 01:29:04 -07:00
Reshetova, Elena
affa78bc6a net, ipv6: convert ifacaddr6.aca_refcnt from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-04 01:29:04 -07:00
Reshetova, Elena
d3981bc615 net, ipv6: convert ifmcaddr6.mca_refcnt from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-04 01:29:04 -07:00
Reshetova, Elena
271201c09c net, ipv6: convert inet6_ifaddr.refcnt from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-04 01:29:04 -07:00
Reshetova, Elena
1be9246077 net, ipv6: convert inet6_dev.refcnt from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-04 01:29:04 -07:00
Reshetova, Elena
0aeea21ada net, ipv6: convert ipv6_txoptions.refcnt from atomic_t to refcount_t
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-04 01:29:03 -07:00
Sagi Grimberg
e5859d3a0e nvme-fc: use blk_mq_delay_run_hw_queue instead of open-coding it
Cc: James Smart <james.smart@broadcom.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2017-07-04 10:44:46 +03:00