The input compat code should work like all other compat code: for 32-bit
syscalls, use the 32-bit ABI and for 64-bit syscalls, use the 64-bit
ABI. We have a helper for that (in_compat_syscall()): just use it.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
amdkfd wants to know syscall type, not task type. Check directly.
Unfortunately, amdkfd is making nasty assumptions that a process'
bitness is a well-defined constant thing. This isn't the case on x86.
I don't know how much this matters, but this patch has no effect on
generated code on x86, so amdkfd is equally broken with and without this
patch.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This should make no difference on any architecture, as x86's historical
is_compat_task behavior really did check whether the calling syscall was
a compat syscall. x86's is_compat_task is going away, though.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Firewire was using is_compat_task to check whether it was in a compat
ioctl or a non-compat ioctl. Use is_compat_syscall instead so it works
properly on all architectures.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The code wants to prevent compat code from receiving messages. Use
in_compat_syscall for this.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SCTP unfortunately has a different ABI for SCTP_SOCKOPT_CONNECTX3 for
32-bit and 64-bit callers. Use in_compat_syscall to correctly
distinguish them on all architectures.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ext4 treats directory offsets differently for 32-bit and 64-bit callers.
Check the caller type using in_compat_syscall, not is_compat_task. This
changes behavior on SPARC slightly.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
AFAICT, lustre is trying to determine syscall bitness. Use the new
accessor.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Except on SPARC, this is what the code always did. SPARC compat seccomp
was buggy, although the impact of the bug was limited because SPARC
32-bit and 64-bit syscall numbers are the same.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Users of the 32-bit ptrace() ABI expect the full 32-bit ABI. siginfo
translation should check ptrace() ABI, not caller task ABI.
This is an ABI change on SPARC. Let's hope that no one relied on the
old buggy ABI.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Seccomp wants to know the syscall bitness, not the caller task bitness,
when it selects the syscall whitelist.
As far as I know, this makes no difference on any architecture, so it's
not a security problem. (It generates identical code everywhere except
sparc, and, on sparc, the syscall numbering is the same for both ABIs.)
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sparc's syscall_get_arch was buggy: it returned the task arch, not the
syscall arch. This could confuse seccomp and audit.
I don't think this is as bad for seccomp as it looks: sparc's 32-bit and
64-bit syscalls are numbered the same.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On sparc64 compat-enabled kernels, any task can make 32-bit and 64-bit
syscalls. is_compat_task returns true in 32-bit tasks, which does not
necessarily imply that the current syscall is 32-bit.
Provide an in_compat_syscall implementation that checks whether the
current syscall is compat.
As far as I know, sparc is the only architecture on which is_compat_task
checks the compat status of the task and on which the compat status of a
syscall can differ from the compat status of the task. On x86,
is_compat_task checks the syscall type, not the task type.
[akpm@linux-foundation.org: add comment, per Sam]
[akpm@linux-foundation.org: update comment, per Andy]
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A lot of code currently abuses is_compat_task to determine this.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Airlie <airlied@linux.ie>
Cc: David Herrmann <dh.herrmann@googlemail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When new timeout is written to /proc/sys/kernel/hung_task_timeout_secs,
khungtaskd is interrupted and again sleeps for full timeout duration.
This means that hang task will not be checked if new timeout is written
periodically within old timeout duration and/or checking of hang task
will be delayed for up to previous timeout duration. Fix this by
remembering last time khungtaskd checked hang task.
This change will allow other watchdog tasks (if any) to share khungtaskd
by sleeping for minimal timeout diff of all watchdog tasks. Doing more
watchdog tasks from khungtaskd will reduce the possibility of printk()
collisions by multiple watchdog threads.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit b430e9d1c6 ("remove compressed copy from zram in-memory")
applied swap_slot_free_notify call in *end_swap_bio_read* to remove
duplicated memory between zram and memory.
However, with the introduction of rw_page in zram: 8c7f01025f ("zram:
implement rw_page operation of zram"), it became void because rw_page
doesn't need bio.
Memory footprint is really important in embedded platforms which have
small memory, for example, 512M) recently because it could start to kill
processes if memory footprint exceeds some threshold by LMK or some
similar memory management modules.
This patch restores the function for rw_page, thereby eliminating this
duplication.
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: karam.lee <karam.lee@lge.com>
Cc: <sangseok.lee@lge.com>
Cc: Chan Jeong <chan.jeong@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This document will describe OCFS2 online file check feature. OCFS2 is
often used in high-availaibility systems. However, OCFS2 usually
converts the filesystem to read-only when encounters an error. This may
not be necessary, since turning the filesystem read-only would affect
other running processes as well, decreasing availability.
Then, a mount option (errors=continue) is introduced, which would return
the -EIO errno to the calling process and terminate furhter processing
so that the filesystem is not corrupted further. The filesystem is not
converted to read-only, and the problematic file's inode number is
reported in the kernel log. The user can try to check/fix this file via
online filecheck feature.
Signed-off-by: Gang He <ghe@suse.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement online check or fix inode block during reading a inode block
to memory.
Signed-off-by: Gang He <ghe@suse.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Create online file check sysfile when ocfs2 mount, remove the related
sysfile when ocfs2 umount.
Signed-off-by: Gang He <ghe@suse.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement online file check sysfile interfaces, e.g. how to create the
related sysfile according to device name, how to display/handle file
check request from the sysfile.
Signed-off-by: Gang He <ghe@suse.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When there are errors in the ocfs2 filesystem, they are usually
accompanied by the inode number which caused the error. This inode
number would be the input to fixing the file. One of these options
could be considered:
A file in the sys filesytem which would accept inode numbers. This
could be used to communication back what has to be fixed or is fixed.
You could write:
$# echo "<inode>" > /sys/fs/ocfs2/devname/filecheck/check
or
$# echo "<inode>" > /sys/fs/ocfs2/devname/filecheck/fix
Compare with second version, I re-design filecheck sysfs interfaces,
there are three sysfs files (check, fix and set) under filecheck
directory (see above), sysfs will accept only one argument <inode>.
Second, I adjust some code in ocfs2_filecheck_repair_inode_block()
function according to upstream feedback, we cannot just add VALID_FL
flag back as a inode block fix, then we will not fix this field
corruption currently until having a complete solution. Compare with
first version, I use strncasecmp instead of double strncmp functions.
Second, update the source file contribution vendor.
This patch (of 4):
Export ocfs2_kset object from ocfs2_stackglue kernel module, then online
file check code will create the related sysfiles under ocfs2_kset
object. We're exporting this because it's built in ocfs2_stackglue.ko.
Signed-off-by: Gang He <ghe@suse.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The latency tracer format has a nice column to indicate IRQ state, but
this is not able to tell us about NMI state.
When tracing perf interrupt handlers (which often run in NMI context)
it is very useful to see how the events nest.
Link: http://lkml.kernel.org/r/20160318153022.105068893@infradead.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The trace_printk() code will allocate extra buffers if the compile detects
that a trace_printk() is used. To do this, the format of the trace_printk()
is saved to the __trace_printk_fmt section, and if that section is bigger
than zero, the buffers are allocated (along with a message that this has
happened).
If trace_printk() uses a format that is not a constant, and thus something
not guaranteed to be around when the print happens, the compiler optimizes
the fmt out, as it is not used, and the __trace_printk_fmt section is not
filled. This means the kernel will not allocate the special buffers needed
for the trace_printk() and the trace_printk() will not write anything to the
tracing buffer.
Adding a "__used" to the variable in the __trace_printk_fmt section will
keep it around, even though it is set to NULL. This will keep the string
from being printed in the debugfs/tracing/printk_formats section as it is
not needed.
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: 07d777fe8c "tracing: Add percpu buffers for trace_printk()"
Cc: stable@vger.kernel.org # v3.5+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Claudio Imbrenda says:
====================
AF_VSOCK: Shrink the area influenced by prepare_to_wait
This patchset applies on net-next.
I think I found a problem with the patch submitted by Laura Abbott
( https://lkml.org/lkml/2016/2/4/711 ): we might miss wakeups.
Since the condition is not checked between the prepare_to_wait and the
schedule(), if a wakeup happens after the condition is checked but before
the sleep happens, and we miss it. ( A description of the problem can be
found here: http://www.makelinux.net/ldd3/chp-6-sect-2 ).
The first patch reverts the previous broken patch, while the second patch
properly fixes the sleep-while-waiting issue.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When a thread is prepared for waiting by calling prepare_to_wait, sleeping
is not allowed until either the wait has taken place or finish_wait has
been called. The existing code in af_vsock imposed unnecessary no-sleep
assumptions to a broad list of backend functions.
This patch shrinks the influence of prepare_to_wait to the area where it
is strictly needed, therefore relaxing the no-sleep restriction there.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 5988818008 ("vsock: Fix
blocking ops call in prepare_to_wait")
The commit reverted with this patch caused us to potentially miss wakeups.
Since the condition is not checked between the prepare_to_wait and the
schedule(), if a wakeup happens after the condition is checked but before
the sleep happens, we will miss it. ( A description of the problem can be
found here: http://www.makelinux.net/ldd3/chp-6-sect-2 ).
By reverting the patch, the behaviour is still incorrect (since we
shouldn't sleep between the prepare_to_wait and the schedule) but at least
it will not miss wakeups.
The next patch in the series actually fixes the behaviour.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Highlights include:
Features:
- Add support for multiple NFSv4.1 callbacks in flight
- Initial patchset for RPC multipath support
- Adapt RPC/RDMA to use the new completion queue API
Bugfixes and cleanups:
- nfs4: nfs4_ff_layout_prepare_ds should return NULL if connection failed
- Cleanups to remove nfs_inode_dio_wait and nfs4_file_fsync
- Fix RPC/RDMA credit accounting
- Properly handle RDMA_ERROR replies
- xprtrdma: Do not wait if ib_post_send() fails
- xprtrdma: Segment head and tail XDR buffers on page boundaries
- xprtrdma cleanups for dprintk, physical_op_map and unused macros
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJW8Y7MAAoJEGcL54qWCgDyMsMP+we8JSgfVqI5X1lKpU9aPWkI
D912ybtV58Kv0elKwYvQMqm+mRvdMNz1hZNJa4sAEaPVBOGfFjyZLy3xlDlr0HTf
M+Juh0FNLTcUh1obxJamjsbpNxfg4b6f/Z29KWRzahv/MlpMJVS3hLjpAEzCcTYr
WfOOovV6mragtsBINegGl/6jk/x2D22JDnKcTU+8ltVZGJtZe+HoqTFhUOrLO5qm
wR3YO22fbOuiZxCPoST06kMNiksYnYXnOju8RwlKwFYq3bWke0jWstQtIC4vKs6K
4u5o74aTBL5zMkJPnJuIfN2if4LJPptSr1n7pItbv3MLmgY1mWjE6N2BATpijfhQ
p+Gt/GHTAvswuWrmwySZKLj/Q8EkBuw4ohPFwLQ9eFHl2USoV3G9KQw7H0odR4d1
IyQPCag+suN2lWBreFkPIV48dZyeCVk6JmJmy3SN+d0L1t3jd6gwSO2UBgG7S2Gd
qVbdxYRiU/zYP6E5wFouLhIc1beSfe4vnJqvnuWZrIId+haTE2+OLi7772WGIkSe
xoZVTg7AX4Wu79ZyWoH+e9FnDvEsRkRVv7HQfpsMq2gynBWj70/KemEoeZnjqWaB
UOWcH8/vNLrnwlXTh0VHG6I8t3s0EXgqQB4//tYRLI42oIj35W2pIMnjYt52DeVB
Mo5mbYghtR9bgeoRQ6V4
=kC3t
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-4.6-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
Features:
- Add support for multiple NFSv4.1 callbacks in flight
- Initial patchset for RPC multipath support
- Adapt RPC/RDMA to use the new completion queue API
Bugfixes and cleanups:
- nfs4: nfs4_ff_layout_prepare_ds should return NULL if connection failed
- Cleanups to remove nfs_inode_dio_wait and nfs4_file_fsync
- Fix RPC/RDMA credit accounting
- Properly handle RDMA_ERROR replies
- xprtrdma: Do not wait if ib_post_send() fails
- xprtrdma: Segment head and tail XDR buffers on page boundaries
- xprtrdma cleanups for dprintk, physical_op_map and unused macros"
* tag 'nfs-for-4.6-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (35 commits)
nfs/blocklayout: make sure making a aligned read request
nfs4: nfs4_ff_layout_prepare_ds should return NULL if connection failed
nfs: remove nfs_inode_dio_wait
nfs: remove nfs4_file_fsync
xprtrdma: Use new CQ API for RPC-over-RDMA client send CQs
xprtrdma: Use an anonymous union in struct rpcrdma_mw
xprtrdma: Use new CQ API for RPC-over-RDMA client receive CQs
xprtrdma: Serialize credit accounting again
xprtrdma: Properly handle RDMA_ERROR replies
rpcrdma: Add RPCRDMA_HDRLEN_ERR
xprtrdma: Do not wait if ib_post_send() fails
xprtrdma: Segment head and tail XDR buffers on page boundaries
xprtrdma: Clean up dprintk format string containing a newline
xprtrdma: Clean up physical_op_map()
xprtrdma: Clean up unused RPCRDMA_INLINE_PAD_THRESH macro
NFS add callback_ops to nfs4_proc_bind_conn_to_session_callback
pnfs/NFSv4.1: Add multipath capabilities to pNFS flexfiles servers over NFSv3
SUNRPC: Allow addition of new transports to a struct rpc_clnt
NFSv4.1: nfs4_proc_bind_conn_to_session must iterate over all connections
SUNRPC: Make NFS swap work with multipath
...
Pull overlayfs updates from Miklos Szeredi:
"Various fixes and tweaks"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: cleanup unused var in rename2
ovl: rename is_merge to is_lowest
ovl: fixed coding style warning
ovl: Ensure upper filesystem supports d_type
ovl: Warn on copy up if a process has a R/O fd open to the lower file
ovl: honor flag MS_SILENT at mount
ovl: verify upper dentry before unlink and rename
The driver calls gpiod_set_value() with GPIOD_OUT_* instead of 0 and 1, as
a result the PHY isn't really put back into reset state in macb_remove().
Moreover, the driver assumes that something else has set the GPIO direction
to output, so if it has not, the PHY may not be taken out of reset in
macb_probe() either...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull fuse update from Miklos Szeredi:
"This contains direct I/O fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: return patrial success from fuse_direct_io()
fuse: Add reference counting for fuse_io_priv
fuse: do not use iocb after it may have been freed
Field fl4.flowi4_flags is not initialized in fib_compute_spec_dst()
before calling fib_lookup(), which means fib_table_lookup() is
using non-deterministic data at this line:
if (!(flp->flowi4_flags & FLOWI_FLAG_SKIP_NH_OIF)) {
Fix by initializing the entire fl4 structure, which will prevent
similar issues as fields are added in the future by ensuring that
all fields are initialized to zero unless explicitly initialized
to another value.
Fixes: 58189ca7b2 ("net: Fix vti use case with oif in dst lookups")
Suggested-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Errata A-007273 (For FMan V3 devices only):
FMan soft reset is not finished properly if one
of the Ethernet MAC clocks is disabled
Workaround:
Re-enable all disabled MAC clocks through the DCFG_CCSR_DEVDISR2
register prior to issuing an FMAN soft reset.
Re-disable the MAC clocks after the FMAN soft reset is done.
Signed-off-by: Igal Liberman <igal.liberman@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Make earlyprintk=xen work for HVM guests.
- Remove module support for things never built as modules.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJW8YhlAAoJEFxbo/MsZsTRiCgH/3GnL93s/BsRTrA/DYykYqA4
rPPkDd6WMlEbrko6FY7o4IxUAZ50LWU8wjmDNVTT+R/VKlWCVh2+ksxjbH8e0nzr
0QohAYTdG1VGmhkrwTjKj0wC3Nr8vaDqoBXAFRz4DiQdbobn12wzXjaVrbl8RRNc
oHDqR+DfZj6ontqx8oFkkTk7CFKIGEOtm/uBhCAV2fNkQSUCzuQyKLlarxM6jq/2
EnLR1bhnyoy5zNFzXxmaJgZOit8QFMKvPdDRknlp9yDHYJ5zolJkv+6FPiAG6SZs
A8yAUPQoAD+ekAiV2DIhb8qoNNNdZXdRCFBpoudwX3GyyU4O2P5Intl0xEg17Nk=
=L8S4
-----END PGP SIGNATURE-----
Merge tag 'for-linus-4.6-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from David Vrabel:
"Features and fixes for 4.6:
- Make earlyprintk=xen work for HVM guests
- Remove module support for things never built as modules"
* tag 'for-linus-4.6-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
drivers/xen: make platform-pci.c explicitly non-modular
drivers/xen: make sys-hypervisor.c explicitly non-modular
drivers/xen: make xenbus_dev_[front/back]end explicitly non-modular
drivers/xen: make [xen-]ballon explicitly non-modular
xen: audit usages of module.h ; remove unnecessary instances
xen/x86: Drop mode-selecting ifdefs in startup_xen()
xen/x86: Zero out .bss for PV guests
hvc_xen: make early_printk work with HVM guests
hvc_xen: fix xenboot for DomUs
hvc_xen: add earlycon support
Currently, ingress ipv4 broadcast datagrams are dropped since,
in udp_v4_early_demux(), ip_check_mc_rcu() is invoked even on
bcast packets.
This patch addresses the issue, invoking ip_check_mc_rcu()
only for mcast packets.
Fixes: 6e54030932 ("ipv4/udp: Verify multicast group is ours in upd_v4_early_demux()")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull i2c updates from Wolfram Sang:
"Mostly usual driver updates and improvements. The changelog should
give an idea. Standing out is the i2c-qup driver with lots of new
capabilities and we also have now an i2c-demuxer.
I'd especially like to welcome Peter Rosin as the i2c-mux maintainer.
He has an interesting series for muxes in the queue and agreed to look
after this part of the subsystem. Thank you, Peter, and welcome
again!
The octeon changes were applied pretty recently before the merge
window. I am aware. They are the first (and relatively simple)
patches of a larger overhaul to this driver. In case something goes
wrong with them, they are easy to fix (or revert). The advantage I
see is that they are out of the way, and I can concentrate on the next
block of patches. I really would like to apply the overhaul in
smaller batches to avoid regressions. And waiting a cycle for the
introductory patches seemed too much of a delay for me"
* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (39 commits)
i2c: octeon: Support I2C_M_RECV_LEN
i2c: octeon: Cleanup resource allocation code
i2c: octeon: Cleanup i2c-octeon driver
MAINTAINERS: add Peter Rosin as i2c mux maintainer
dt-bindings: i2c: Spelling s/propoerty/property/
i2c: immediately mark ourselves as registered
i2c: i801: sort IDs alphabetically
MAINTAINERS: Mika and me are designated reviewers for I2C DESIGNWARE
i2c: octeon: Cleanup kerneldoc comments
i2c: do not use internal data from driver core
i2c: cadence: Fix the kernel-doc warnings
i2c: imx: remove extra spaces.
i2c: rcar: don't open code of_device_get_match_data()
i2c: qup: Fix fifo handling after adding V2 support
i2c: xiic: Implement power management
i2c: piix4: Pre-shift the port number
i2c: piix4: Always use the same type for port
i2c: piix4: Support alternative port selection register
i2c: tegra: don't open code of_device_get_match_data()
i2c: riic, sh_mobile, rcar: Use ARCH_RENESAS
...
Yisen Zhuang says:
====================
net: hns: bugs fixed for hns
This series includes some bug fixes and updates for hns driver.
>from Daode, one fix about mss.
>from Kejian, one fix about ping6 issue, one fix about mac address setting,
two fix for RSS setting, two fix about mtu setting.
>from qianqian, fixed HNS v2 xge statistic reg issue.
>from Sheng, one fix about manage packets sending, one fix about GMACs mac
setting.
For more details, please see individual patches.
Thanks a lot!
---
change log:
Series V2:
- fix the comments as below:
1) modifies the wrong charator "whick" to "which" in commit log
2) use the "eth_hdr()" help to get source mac of packets
3) fix the wrong cast
4) use tabs instead of spaces to indent the value
Series V1:
- first submit
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When set MTU to the minimum value 68, there are increasing number
of error packets occur, which is caused by the overflowed value of
mss. This patch fix the bug.
Signed-off-by: Daode Huang <huangdaode@hisilicon.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If mtu for debug port is set more than 1500, it may cause that packets
are dropped by ppe. So maximum value for debug port should be 1500.
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In chip V1, the maximum mtu value is 9600. But in chip V2, it is 9728.
And it is always configurates as 9600 before this patch.
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If trying to get receive flow hash indirection table by ethtool, it needs
to call .get_rxnfc to get ring number first. So this patch implements the
.get_rxnfc of ethtool. And the data type of rss_indir_table is u32, it has
to be multiply by the width of data type when using memcpy.
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Both .get_rxfh and .set_rxfh are always return 0, it should return result
from hardware when getting or setting rss. And the rss function should
return the correct data type.
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As the user manual of HNS V2 describs, XGE_DFX_CTRL_CFG.xge_dfx_ctrl_cfg
should be configed as zero if we want xge statistic reg to be read only.
But HNS V1 gets the other meanings. It needs to be identified the process
and then config it rightly.
Signed-off-by: Qianqian Xie <xieqianqian@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When sending a pause frame out from GMACs, the packets' source MAC address
does not match the GMACs' MAC address. It causes by the condition before
the mac address setting routine for GMACs, the mac address cannot be set
into loacal mac table for service ports. It obviously the condition needs
to be deleted.
Signed-off-by: Sheng Li <lisheng011@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Debug ports receives lots of packets with dest mac addr does not match
local mac addr, because the filter is close, and it does not drop the
useless packets. This patch adds ON/OFF switch of filtering the packets
whose dest mac addr do not match the local addr in mac table. And the
switch is ON in initialization.
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In chip V2, the default value of port id in tx BD is Zero. If it is not
configurated to the other value, all management packets will be sent out
from port0. So port_id in the tx BD needs to be updated when sending a
management packet.
In V2 chip, when sending mamagement packets, the driver should
config the port id to BD descs.
Signed-off-by: Sheng Li <lisheng011@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current upstreaming code fails to ping other IPv6 net device, because
the enet receives the multicast packets with the src mac addr which is the
same as its mac addr. These packets need to be dropped.
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct in6_addr isn't used anymore in inet6_connection_sock.h, removing
the forward declaration.
Fixes: 1b33bc3e9e ("ipv6: remove obsolete inet6 functions")
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
By returning -ENOIOCTLCMD, sock_do_ioctl() falls back to calling
dev_ioctl(), which provides support for NIC driver ioctls, which
includes ethtool support. This is similar to the way ioctls are handled
in udp.c or tcp.c.
This removes the requirement that ethtool for example be tied to the
support of a specific L3 protocol (ethtool uses an AF_INET socket
today).
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Updates: commit 793cf87de9 ("ethtool: Set cmd field in
ETHTOOL_GLINKSETTINGS response to wrong nwords")
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull SCSI target updates from Nicholas Bellinger:
"The highlights this round include:
- Add target_alloc_session() w/ callback helper for doing se_session
allocation + tag + se_node_acl lookup. (HCH + nab)
- Tree-wide fabric driver conversion to use target_alloc_session()
- Convert sbp-target to use percpu_ida tag pre-allocation, and
TARGET_SCF_ACK_KREF I/O krefs (Chris Boot + nab)
- Convert usb-gadget to use percpu_ida tag pre-allocation, and
TARGET_SCF_ACK_KREF I/O krefs (Andrzej Pietrasiewicz + nab)
- Convert xen-scsiback to use percpu_ida tag pre-allocation, and
TARGET_SCF_ACK_KREF I/O krefs (Juergen Gross + nab)
- Convert tcm_fc to use TARGET_SCF_ACK_KREF I/O + TMR krefs
- Convert ib_srpt to use percpu_ida tag pre-allocation
- Add DebugFS node for qla2xxx target sess list (Quinn)
- Rework iser-target connection termination (Jenny + Sagi)
- Convert iser-target to new CQ API (HCH)
- Add pass-through WRITE_SAME support for IBLOCK (Mike Christie)
- Introduce data_bitmap for asynchronous access of data area (Sheng
Yang + Andy)
- Fix target_release_cmd_kref shutdown comp leak (Himanshu Madhani)
Also, there is a separate PULL request coming for cxgb4 NIC driver
prerequisites for supporting hw iscsi segmentation offload (ISO), that
will be the base for a number of v4.7 developments involving
iscsi-target hw offloads"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (36 commits)
target: Fix target_release_cmd_kref shutdown comp leak
target: Avoid DataIN transfers for non-GOOD SAM status
target/user: Report capability of handling out-of-order completions to userspace
target/user: Fix size_t format-spec build warning
target/user: Don't free expired command when time out
target/user: Introduce data_bitmap, replace data_length/data_head/data_tail
target/user: Free data ring in unified function
target/user: Use iovec[] to describe continuous area
target: Remove enum transport_lunflags_table
target/iblock: pass WRITE_SAME to device if possible
iser-target: Kill the ->isert_cmd back pointer in struct iser_tx_desc
iser-target: Kill struct isert_rdma_wr
iser-target: Convert to new CQ API
iser-target: Split and properly type the login buffer
iser-target: Remove ISER_RECV_DATA_SEG_LEN
iser-target: Remove impossible condition from isert_wait_conn
iser-target: Remove redundant wait in release_conn
iser-target: Rework connection termination
iser-target: Separate flows for np listeners and connections cma events
iser-target: Add new state ISER_CONN_BOUND to isert_conn
...