If an anonymous mapping is not allowed to fault thp memory and then
madvise(MADV_HUGEPAGE) is used after fault, khugepaged will never
collapse this memory into thp memory.
This occurs because the madvise(2) handler for thp, hugepage_madvise(),
clears VM_NOHUGEPAGE on the stack and it isn't stored in vma->vm_flags
until the final action of madvise_behavior(). This causes the
khugepaged_enter_vma_merge() to be a no-op in hugepage_madvise() when
the vma had previously had VM_NOHUGEPAGE set.
Fix this by passing the correct vma flags to the khugepaged mm slot
handler. There's no chance khugepaged can run on this vma until after
madvise_behavior() returns since we hold mm->mmap_sem.
It would be possible to clear VM_NOHUGEPAGE directly from vma->vm_flags
in hugepage_advise(), but I didn't want to introduce special case
behavior into madvise_behavior(). I think it's best to just let it
always set vma->vm_flags itself.
Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Suleiman Souhlal <suleiman@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds a new hardware flag which indicate that the transceiver
doesn't support check for bad checksum via hardware. Also add a handling of
this while receive.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch splits the IEEE802154_HW_OMIT_CKSUM hardware flag into
IEEE802154_HW_TX_OMIT_CKSUM and IEEE802154_HW_RX_OMIT_CKSUM. This is
useful to deliver the received crc from the driver layer to the monitor
interface. At the moment we can't do that without change the xmit
handling.
The received checksum should be visible in monitor mode only.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch adds a new driver operation to bring the transceiver into
promiscuous mode.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Overdue introduction for address filtering hardware flag. Furthermore we
will check and set address filtering on interface up. This patch
prepares that we can check if an transceiver supports address filtering
option. Currently all mainline driver supports hardware address filtering.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Cc: Alan Ott <alan@signal11.us>
Cc: Varka Bhadram <varkabhadram@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch adds a new IEEE802154_HW_ARET hardware flag for indicating
that the transceiver supports ARET handling. Also remove the
IEEE802154_HW_FRAME_RETRIES from IEEE802154_HW_CSMA flag. Frame retries
handling is part of ARET.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch removes tabs after define in hardware flags declarations.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
In neigh_parms_release() we loop over all entries to find the entry given in
argument and being able to remove it from the list. By using a double linked
list, we can avoid this loop.
Here are some numbers with 30 000 dummy interfaces configured:
Before the patch:
$ time rmmod dummy
real 2m0.118s
user 0m0.000s
sys 1m50.048s
After the patch:
$ time rmmod dummy
real 1m9.970s
user 0m0.000s
sys 0m47.976s
Suggested-by: Thierry Herbelot <thierry.herbelot@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
napi_schedule() can be called from any context and has to mask hard
irqs.
Add a variant that can only be called from hard interrupts handlers
or when irqs are already masked.
Many NIC drivers can use it from their hard IRQ handler instead of
generic variant.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a sysctl that causes an interface's optimistic addresses
to be considered equivalent to other non-deprecated addresses
for source address selection purposes. Preferred addresses
will still take precedence over optimistic addresses, subject
to other ranking in the source address selection algorithm.
This is useful where different interfaces are connected to
different networks from different ISPs (e.g., a cell network
and a home wifi network).
The current behaviour complies with RFC 3484/6724, and it
makes sense if the host has only one interface, or has
multiple interfaces on the same network (same or cooperating
administrative domain(s), but not in the multiple distinct
networks case.
For example, if a mobile device has an IPv6 address on an LTE
network and then connects to IPv6-enabled wifi, while the wifi
IPv6 address is undergoing DAD, IPv6 connections will try use
the wifi default route with the LTE IPv6 address, and will get
stuck until they time out.
Also, because optimistic nodes can receive frames, issue
an RTM_NEWADDR as soon as DAD starts (with the IFA_F_OPTIMSTIC
flag appropriately set). A second RTM_NEWADDR is sent if DAD
completes (the address flags have changed), otherwise an
RTM_DELADDR is sent.
Also: add an entry in ip-sysctl.txt for optimistic_dad.
Signed-off-by: Erik Kline <ek@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
While testing upcoming Yaogong patch (converting out of order queue
into an RB tree), I hit the max reordering level of linux TCP stack.
Reordering level was limited to 127 for no good reason, and some
network setups [1] can easily reach this limit and get limited
throughput.
Allow a new max limit of 300, and add a sysctl to allow admins to even
allow bigger (or lower) values if needed.
[1] Aggregation of links, per packet load balancing, fabrics not doing
deep packet inspections, alternative TCP congestion modules...
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yaogong Wang <wygivan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull block layer fixes from Jens Axboe:
"A small collection of fixes for the current kernel. This contains:
- Two error handling fixes from Jan Kara. One for null_blk on
failure to add a device, and the other for the block/scsi_ioctl
SCSI_IOCTL_SEND_COMMAND fixing up the error jump point.
- A commit added in the merge window for the bio integrity bits
unfortunately disabled merging for all requests if
CONFIG_BLK_DEV_INTEGRITY wasn't set. Reverse the logic, so that
integrity checking wont disallow merges when not enabled.
- A fix from Ming Lei for merging and generating too many segments.
This caused a BUG in virtio_blk.
- Two error handling printk() fixups from Robert Elliott, improving
the information given when we rate limit.
- Error handling fixup on elevator_init() failure from Sudip
Mukherjee.
- A fix from Tony Battersby, fixing up a memory leak in the
scatterlist handling with scsi-mq"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: Fix merge logic when CONFIG_BLK_DEV_INTEGRITY is not defined
lib/scatterlist: fix memory leak with scsi-mq
block: fix wrong error return in elevator_init()
scsi: Fix error handling in SCSI_IOCTL_SEND_COMMAND
null_blk: Cleanup error recovery in null_add_dev()
blk-merge: recaculate segment if it isn't less than max segments
fs: clarify rate limit suppressed buffer I/O errors
fs: merge I/O error prints into one line
Pull HID fixes from Jiri Kosina:
- workarounds for a couple of misbehaving Elan Touchscreens, by Adel
Gadllah
- fix for TransducerSerialNumber field implementation, by Jason Gerecke
- a couple of new HID usages (added by HUT), by Olivier Gay
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: input: Fix TransducerSerialNumber implementation
HID: add keyboard input assist hid usages
HID: usbhid: enable always-poll quirk for Elan Touchscreen 016f
HID: usbhid: enable always-poll quirk for Elan Touchscreen 009b
PREEMPT_RCU and TREE_PREEMPT_RCU serve the same function after
TINY_PREEMPT_RCU has been removed. This patch removes TREE_PREEMPT_RCU
and uses PREEMPT_RCU config option in its place.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Drivers can now tell blk-mq if they take advantage of the deferred
issue through 'last' or not. If they do, don't do queue-direct
for sync IO. This is a preparation patch for the nvme conversion.
Signed-off-by: Jens Axboe <axboe@fb.com>
Since we have the notion of a 'last' request in a chain, we can use
this to have the hardware optimize the issuing of requests. Add
a list_head parameter to queue_rq that the driver can use to
temporarily store hw commands for issue when 'last' is true. If we
are doing a chain of requests, pass in a NULL list for the first
request to force issue of that immediately, then batch the remainder
for deferred issue until the last request has been sent.
Instead of adding yet another argument to the hot ->queue_rq path,
encapsulate the passed arguments in a blk_mq_queue_data structure.
This is passed as a constant, and has been tested as faster than
passing 4 (or even 3) args through ->queue_rq. Update drivers for
the new ->queue_rq() prototype. There are no functional changes
in this patch for drivers - if they don't use the passed in list,
then they will just queue requests individually like before.
Signed-off-by: Jens Axboe <axboe@fb.com>
This is the counterpart to get_seconds() based on CLOCK_MONOTONIC. The
use case for this interface are kernel internal coarse grained
timestamps which do neither require the nanoseconds fraction of
current time nor the CLOCK_REALTIME properties. Such timestamps can
currently only retrieved by calling ktime_get_ts64() and using the
tv_sec field of the returned timespec64. That's inefficient as it
involves the read of the clocksource, math operations and must be
protected by the timekeeper sequence counter.
To avoid the sequence counter protection we restrict the return value
to unsigned 32bit on 32bit machines. This covers ~136 years of uptime
and therefor an overflow is not expected to hit anytime soon.
To avoid math in the function we calculate the current seconds portion
of CLOCK_MONOTONIC when the timekeeper gets updated in
tk_update_ktime_data() similar to the CLOCK_REALTIME counterpart
xtime_sec.
[ tglx: Massaged changelog, simplified and commented the update
function, added docbook comment ]
Signed-off-by: Heena Sirwani <heenasirwani@gmail.com>
Reviewed-by: Arnd Bergman <arnd@arndb.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: opw-kernel@googlegroups.com
Link: http://lkml.kernel.org/r/da0b63f4bdf3478909f92becb35861197da3a905.1414578445.git.heenasirwani@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Several benefits here:
- we can drop the macro is_dj_device: I never been really conviced by
this macro as we could fall into a null pointer anytime. Anyway time
showed that this never happened.
- we can simplify the hid driver logitech-djdevice, and make it aware
of any new receiver VID/PID.
- we can use the Wireless PID of the DJ device as the product id of the
hid device, this way the sysfs will differentiate between different
DJ devices.
Signed-off-by: Benjamin Tisssoires <benjamin.tissoires@redhat.com>
Tested-by: Andrew de los Reyes <adlr@chromium.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
While merging wacom from the input to the hid tree, some
comments have been duplicated. We can also integrate the
test for Synaptics devices in the switch case below, so
it is clear that there will be only one place for such
quirks.
No functional changes are expected in this commit.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
so that make htmldocs works properly.
Since kerneldoc can't handle noname enum properly, name enum
sndrv_compress_encoder.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Commit 4eaf99bead switched to returning bool and as a result reversed
the logic of the integrity merge checks. However, the empty stubs used
when the block integrity code is compiled out were still returning
0. Make these stubs return "true".
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Michael L. Semon <mlsemon35@gmail.com>
Tested-by: Michael L. Semon <mlsemon35@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
In an overlay directory that shadows an empty lower directory, say
/mnt/a/empty102, do:
touch /mnt/a/empty102/x
unlink /mnt/a/empty102/x
rmdir /mnt/a/empty102
It's actually harmless, but needs another level of nesting between
I_MUTEX_CHILD and I_MUTEX_NORMAL.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Although rcu_dereference() and friends can be used in situations where
object lifetimes are being managed by something other than RCU, the
resulting sparse and lockdep-RCU noise can be annoying. This commit
therefore supplies a lockless_dereference(), which provides the
protection for dereferences without the RCU-related debugging noise.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This patch removes the driver ops callbacks inside of wpan_phy struct.
It was used to check if a phy supports this driver ops call. We do this
now via hardware flags.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The range of channel and page fits into an unsigned byte range. This
patch changes the set_channel parameter definitions for channel and
page to u8.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Cc: Alan Ott <alan@signal11.us>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The ieee802154_ops structure should be never changed during runtime.
This patch declare this structure as const to avoid a runtime change.
Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Cc: Alan Ott <alan@signal11.us>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Added 100M, 20G and 56G ethtool speed reporting support.
Update mlx4_en_test_speed self test with the new speeds.
Defined new link speeds in include/uapi/linux/ethtool.h:
+#define SPEED_20000 20000
+#define SPEED_40000 40000
+#define SPEED_56000 56000
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding ACCESS REG mlx4 command and use it to implement Query method for
PTYS (Port Type and Speed Register).
Query and store eth_prot_ctrl dev cap.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added support for get_module_info/get_module_eeprom ethtool support for cable info reading.
Added new cable types enum in include/uapi/linux/ethtool.h for ethtool use.
+#define ETH_MODULE_SFF_8636 0x3
+#define ETH_MODULE_SFF_8636_LEN 256
+#define ETH_MODULE_SFF_8436 0x4
+#define ETH_MODULE_SFF_8436_LEN 256
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added new MAD_IFC command to read cable module info with attribute id (0xFF60).
Update include/linux/mlx4/device.h with function declaration (mlx4_get_module_info)
and the needed defines/enums for future use.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The original motivation for this change was to allow the helper to be used
in files other than actions.c as part of work on an odp select group
action.
It was as pointed out by Thomas Graf that this helper would be best off
living in netlink.h. Furthermore, I think that the generic nature of this
helper means it is best off in netlink.h regardless of if it is used more
than one .c file or not. Thus, I would like it considered independent of
the work on an odp select group action.
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Pravin Shelar <pshelar@nicira.com>
Cc: Andy Zhou <azhou@nicira.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix kernel-doc warning in <linux/skbuff.h> by making both headers_start
and headers_end private fields.
Warning(..//include/linux/skbuff.h:654): No description found for parameter 'headers_end[0]'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 35ce7f29a4 (rcu: Create rcuo kthreads only for onlined CPUs)
avoids creating rcuo kthreads for CPUs that never come online. This
fixes a bug in many instances of firmware: Instead of lying about their
age, these systems instead lie about the number of CPUs that they have.
Before commit 35ce7f29a4, this could result in huge numbers of useless
rcuo kthreads being created.
It appears that experience indicates that I should have told the
people suffering from this problem to fix their broken firmware, but
I instead produced what turned out to be a partial fix. The missing
piece supplied by this commit makes sure that rcu_barrier() knows not to
post callbacks for no-CBs CPUs that have not yet come online, because
otherwise rcu_barrier() will hang on systems having firmware that lies
about the number of CPUs.
It is tempting to simply have rcu_barrier() refuse to post a callback on
any no-CBs CPU that does not have an rcuo kthread. This unfortunately
does not work because rcu_barrier() is required to wait for all pending
callbacks. It is therefore required to wait even for those callbacks
that cannot possibly be invoked. Even if doing so hangs the system.
Given that posting a callback to a no-CBs CPU that does not yet have an
rcuo kthread can hang rcu_barrier(), It is tempting to report an error
in this case. Unfortunately, this will result in false positives at
boot time, when it is perfectly legal to post callbacks to the boot CPU
before the scheduler has started, in other words, before it is legal
to invoke rcu_barrier().
So this commit instead has rcu_barrier() avoid posting callbacks to
CPUs having neither rcuo kthread nor pending callbacks, and has it
complain bitterly if it finds CPUs having no rcuo kthread but some
pending callbacks. And when rcu_barrier() does find CPUs having no rcuo
kthread but pending callbacks, as noted earlier, it has no choice but
to hang indefinitely.
Reported-by: Yanko Kaneti <yaneti@declera.com>
Reported-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Reported-by: Meelis Roos <mroos@linux.ee>
Reported-by: Eric B Munson <emunson@akamai.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Eric B Munson <emunson@akamai.com>
Tested-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Tested-by: Yanko Kaneti <yaneti@declera.com>
Tested-by: Kevin Fenzi <kevin@scrye.com>
Tested-by: Meelis Roos <mroos@linux.ee>
this patch uses vb2_is_busy() helper to check if streaming is
actually started, instead of driver managing it.
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
This function actually operates on a gpio_chip, so its prefix should
reflect that fact for consistency with other functions defined in
gpio/driver.h.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>