If an attacker can cause a controlled kernel stack overflow, overwriting
the restart block is a very juicy exploit target. This is because the
restart_block is held in the same memory allocation as the kernel stack.
Moving the restart block to struct task_struct prevents this exploit by
making the restart_block harder to locate.
Note that there are other fields in thread_info that are also easy
targets, at least on some architectures.
It's also a decent simplification, since the restart code is more or less
identical on all architectures.
[james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Miller <davem@davemloft.net>
Acked-by: Richard Weinberger <richard@nod.at>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking updates from David Miller:
1) More iov_iter conversion work from Al Viro.
[ The "crypto: switch af_alg_make_sg() to iov_iter" commit was
wrong, and this pull actually adds an extra commit on top of the
branch I'm pulling to fix that up, so that the pre-merge state is
ok. - Linus ]
2) Various optimizations to the ipv4 forwarding information base trie
lookup implementation. From Alexander Duyck.
3) Remove sock_iocb altogether, from CHristoph Hellwig.
4) Allow congestion control algorithm selection via routing metrics.
From Daniel Borkmann.
5) Make ipv4 uncached route list per-cpu, from Eric Dumazet.
6) Handle rfs hash collisions more gracefully, also from Eric Dumazet.
7) Add xmit_more support to r8169, e1000, and e1000e drivers. From
Florian Westphal.
8) Transparent Ethernet Bridging support for GRO, from Jesse Gross.
9) Add BPF packet actions to packet scheduler, from Jiri Pirko.
10) Add support for uniqu flow IDs to openvswitch, from Joe Stringer.
11) New NetCP ethernet driver, from Muralidharan Karicheri and Wingman
Kwok.
12) More sanely handle out-of-window dupacks, which can result in
serious ACK storms. From Neal Cardwell.
13) Various rhashtable bug fixes and enhancements, from Herbert Xu,
Patrick McHardy, and Thomas Graf.
14) Support xmit_more in be2net, from Sathya Perla.
15) Group Policy extensions for vxlan, from Thomas Graf.
16) Remove Checksum Offload support for vxlan, from Tom Herbert.
17) Like ipv4, support lockless transmit over ipv6 UDP sockets. From
Vlad Yasevich.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1494+1 commits)
crypto: fix af_alg_make_sg() conversion to iov_iter
ipv4: Namespecify TCP PMTU mechanism
i40e: Fix for stats init function call in Rx setup
tcp: don't include Fast Open option in SYN-ACK on pure SYN-data
openvswitch: Only set TUNNEL_VXLAN_OPT if VXLAN-GBP metadata is set
ipv6: Make __ipv6_select_ident static
ipv6: Fix fragment id assignment on LE arches.
bridge: Fix inability to add non-vlan fdb entry
net: Mellanox: Delete unnecessary checks before the function call "vunmap"
cxgb4: Add support in cxgb4 to get expansion rom version via ethtool
ethtool: rename reserved1 memeber in ethtool_drvinfo for expansion ROM version
net: dsa: Remove redundant phy_attach()
IB/mlx4: Reset flow support for IB kernel ULPs
IB/mlx4: Always use the correct port for mirrored multicast attachments
net/bonding: Fix potential bad memory access during bonding events
tipc: remove tipc_snprintf
tipc: nl compat add noop and remove legacy nl framework
tipc: convert legacy nl stats show to nl compat
tipc: convert legacy nl net id get to nl compat
tipc: convert legacy nl net id set to nl compat
...
Pull timer updates from Ingo Molnar:
"The main changes in this cycle were:
- rework hrtimer expiry calculation in hrtimer_interrupt(): the
previous code had a subtle bug where expiry caching would miss an
expiry, resulting in occasional bogus (late) expiry of hrtimers.
- continuing Y2038 fixes
- ktime division optimization
- misc smaller fixes and cleanups"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
hrtimer: Make __hrtimer_get_next_event() static
rtc: Convert rtc_set_ntp_time() to use timespec64
rtc: Remove redundant rtc_valid_tm() from rtc_hctosys()
rtc: Modify rtc_hctosys() to address y2038 issues
rtc: Update rtc-dev to use y2038-safe time interfaces
rtc: Update interface.c to use y2038-safe time interfaces
time: Expose get_monotonic_boottime64 for in-kernel use
time: Expose getboottime64 for in-kernel uses
ktime: Optimize ktime_divns for constant divisors
hrtimer: Prevent stale expiry time in hrtimer_interrupt()
ktime.h: Introduce ktime_ms_delta
I noticed some CLOCK_TAI timer test failures on one of my
less-frequently used configurations. And after digging in I
found in 76f4108892 (Cleanup hrtimer accessors to the
timekepeing state), the hrtimer_get_softirq_time tai offset
calucation was incorrectly rewritten, as the tai offset we
return shold be from CLOCK_MONOTONIC, and not CLOCK_REALTIME.
This results in CLOCK_TAI timers expiring early on non-highres
capable machines.
This patch fixes the issue, calculating the tai time properly
from the monotonic base.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable <stable@vger.kernel.org> # 3.17+
Link: http://lkml.kernel.org/r/1423097126-10236-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Conflicts:
arch/arm/boot/dts/imx6sx-sdb.dts
net/sched/cls_bpf.c
Two simple sets of overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull timer fixes from Thomas Gleixner:
"A set of small fixes:
- regression fix for exynos_mct clocksource
- trivial build fix for kona clocksource
- functional one liner fix for the sh_tmu clocksource
- two validation fixes to prevent (root only) data corruption in the
kernel via settimeofday and adjtimex. Tagged for stable"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: adjtimex: Validate the ADJ_FREQUENCY values
time: settimeofday: Validate the values of tv from user
clocksource: sh_tmu: Set cpu_possible_mask to fix SMP broadcast
clocksource: kona: fix __iomem annotation
clocksource: exynos_mct: Fix bitmask regression for exynos4_mct_write
kernel/time/hrtimer.c:444:9: sparse: symbol '__hrtimer_get_next_event' was not declared. Should it be static?
Fixes: 9bc7491906 hrtimer: Prevent stale expiry time in hrtimer_interrupt()
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: kbuild-all@01.org
Link: http://lkml.kernel.org/r/20150123121206.GA4766@snb
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* ktime division optimization
* Expose a few more y2038-safe timekeeping interfaces
* RTC core changes to address y2038
Signed-off-by: John Stultz <john.stultz@linaro.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJUwvXJAAoJEK8vClot3jMxTAoH/1DMT3fuVx6RFjKJ/P1abIB+
+w3cfEgEWgkSwYmuS0XHq1WppnQ0p0n1GOJcWUPiP9tTGrKcTdp5uG5qMprcga3q
XoeR8wefkyEKyH4ukStdGKQKot2Vj117TauDtVNPf2eOOBS5pqOw1dYUlwjlMtOj
45poW5ORNKmBMn90e22k8nlNSI9PebvMh9w6nzeYJWEibdyk96z2TOk1puPTvws/
ppyNzlhnKckpNb49JVxE8B4DNRpXsUV+aUxRNyRPN4OdqCGzHwIJCyEKi6+nbRyb
4HMUhfl8eRB2Iu7zHF2a2XEOqJdOjl8i1DsTwr3Vwd3crf4XkXD6WtTtGl2YKkU=
=YhDu
-----END PGP SIGNATURE-----
Merge tag 'fortglx-3.20-time' of https://git.linaro.org/people/john.stultz/linux into timers/core
Pull time updates from John Stultz for 3.20:
* ktime division optimization
* Expose a few more y2038-safe timekeeping interfaces
* RTC core changes to address y2038
rtc_set_ntp_time() uses timespec which is y2038-unsafe,
so modify to use timespec64 which is y2038-safe, then
replace rtc_time_to_tm() with rtc_time64_to_tm().
Also adjust all its call sites(only NTP uses it) accordingly.
Cc: pang.xunlei <pang.xunlei@linaro.org>
Cc: Arnd Bergmann <arnd.bergmann@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Adds a timespec64 based getboottime64() implementation
that can be used as we convert internal users of
getboottime away from using timespecs.
Cc: pang.xunlei <pang.xunlei@linaro.org>
Cc: Arnd Bergmann <arnd.bergmann@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
At least on ARM, do_div() is optimized to turn constant divisors into
an inline multiplication by the reciprocal value at compile time.
However this optimization is missed entirely whenever ktime_divns() is
used and the slow out-of-line division code is used all the time.
Let ktime_divns() use do_div() inline whenever the divisor is constant
and small enough. This will make things like ktime_to_us() and
ktime_to_ms() much faster.
Cc: Arnd Bergmann <arnd.bergmann@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Nicolas Pitre <nico@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
hrtimer_interrupt() has the following subtle issue:
hrtimer_interrupt()
lock(cpu_base);
expires_next = KTIME_MAX;
expire_timers(CLOCK_MONOTONIC);
expires = get_next_timer(CLOCK_MONOTONIC);
if (expires < expires_next)
expires_next = expires;
expire_timers(CLOCK_REALTIME);
unlock(cpu_base);
wakeup()
hrtimer_start(CLOCK_MONOTONIC, newtimer);
lock(cpu_base();
expires = get_next_timer(CLOCK_REALTIME);
if (expires < expires_next)
expires_next = expires;
So because we already evaluated the next expiring timer of
CLOCK_MONOTONIC we ignore that the expiry time of newtimer might be
earlier than the overall next expiry time in hrtimer_interrupt().
To solve this, remove the caching of the next expiry value from
hrtimer_interrupt() and reevaluate all active clock bases for the next
expiry value. To avoid another code duplication, create a shared
evaluation function and use it for hrtimer_get_next_event(),
hrtimer_force_reprogram() and hrtimer_interrupt().
There is another subtlety in this mechanism:
While hrtimer_interrupt() is running, we want to avoid to touch the
hardware device because we will reprogram it anyway at the end of
hrtimer_interrupt(). This works nicely for hrtimers which get rearmed
via the HRTIMER_RESTART mechanism, because we drop out when the
callback on that CPU is running. But that fails, if a new timer gets
enqueued like in the example above.
This has another implication: While hrtimer_interrupt() is running we
refuse remote enqueueing of timers - see hrtimer_interrupt() and
hrtimer_check_target().
hrtimer_interrupt() tries to prevent this by setting cpu_base->expires
to KTIME_MAX, but that fails if a new timer gets queued.
Prevent both the hardware access and the remote enqueue
explicitely. We can loosen the restriction on the remote enqueue now
due to reevaluation of the next expiry value, but that needs a
seperate patch.
Folded in a fix from Vignesh Radhakrishnan.
Reported-and-tested-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Based-on-patch-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: vigneshr@codeaurora.org
Cc: john.stultz@linaro.org
Cc: viresh.kumar@linaro.org
Cc: fweisbec@gmail.com
Cc: cl@linux.com
Cc: stuart.w.hayes@gmail.com
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1501202049190.5526@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Verify that the frequency value from userspace is valid and makes sense.
Unverified values can cause overflows later on.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
[jstultz: Fix up bug for negative values and drop redunent cap check]
Signed-off-by: John Stultz <john.stultz@linaro.org>
An unvalidated user input is multiplied by a constant, which can result in
an undefined behaviour for large values. While this is validated later,
we should avoid triggering undefined behaviour.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
[jstultz: include trivial milisecond->microsecond correction noticed
by Andy]
Signed-off-by: John Stultz <john.stultz@linaro.org>
The current timecounter implementation will drop a variable amount
of resolution, depending on the magnitude of the time delta. In
other words, reading the clock too often or too close to a time
stamp conversion will introduce errors into the time values. This
patch fixes the issue by introducing a fractional nanosecond field
that accumulates the low order bits.
Reported-by: Janusz Użycki <j.uzycki@elproma.com.pl>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The timecounter code has almost nothing to do with the clocksource
code. Let it live in its own file. This will help isolate the
timecounter users from the clocksource users in the source tree.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull NOHZ update from Thomas Gleixner:
"Remove the call into the nohz idle code from the fake 'idle' thread in
the powerclamp driver along with the export of those functions which
was smuggeled in via the thermal tree. People have tried to hack
around it in the nohz core code, but it just violates all rightful
assumptions of that code about the only valid calling context (i.e.
the proper idle task).
The powerclamp trainwreck will still work, it just wont get the
benefit of long idle sleeps"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick/powerclamp: Remove tick_nohz_idle abuse
commit 4dbd27711c "tick: export nohz tick idle symbols for module
use" was merged via the thermal tree without an explicit ack from the
relevant maintainers.
The exports are abused by the intel powerclamp driver which implements
a fake idle state from a sched FIFO task. This causes all kinds of
wreckage in the NOHZ core code which rightfully assumes that
tick_nohz_idle_enter/exit() are only called from the idle task itself.
Recent changes in the NOHZ core lead to a failure of the powerclamp
driver and now people try to hack completely broken and backwards
workarounds into the NOHZ core code. This is completely unacceptable
and just papers over the real problem. There are way more subtle
issues lurking around the corner.
The real solution is to fix the powerclamp driver by rewriting it with
a sane concept, but that's beyond the scope of this.
So the only solution for now is to remove the calls into the core NOHZ
code from the powerclamp trainwreck along with the exports.
Fixes: d6d71ee4a1 "PM: Introduce Intel PowerClamp Driver"
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Pan Jacob jun <jacob.jun.pan@intel.com>
Cc: LKP <lkp@01.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412181110110.17382@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull drm updates from Dave Airlie:
"Highlights:
- AMD KFD driver merge
This is the AMD HSA interface for exposing a lowlevel interface for
GPGPU use. They have an open source userspace built on top of this
interface, and the code looks as good as it was going to get out of
tree.
- Initial atomic modesetting work
The need for an atomic modesetting interface to allow userspace to
try and send a complete set of modesetting state to the driver has
arisen, and been suffering from neglect this past year. No more,
the start of the common code and changes for msm driver to use it
are in this tree. Ongoing work to get the userspace ioctl finished
and the code clean will probably wait until next kernel.
- DisplayID 1.3 and tiled monitor exposed to userspace.
Tiled monitor property is now exposed for userspace to make use of.
- Rockchip drm driver merged.
- imx gpu driver moved out of staging
Other stuff:
- core:
panel - MIPI DSI + new panels.
expose suggested x/y properties for virtual GPUs
- i915:
Initial Skylake (SKL) support
gen3/4 reset work
start of dri1/ums removal
infoframe tracking
fixes for lots of things.
- nouveau:
tegra k1 voltage support
GM204 modesetting support
GT21x memory reclocking work
- radeon:
CI dpm fixes
GPUVM improvements
Initial DPM fan control
- rcar-du:
HDMI support added
removed some support for old boards
slave encoder driver for Analog Devices adv7511
- exynos:
Exynos4415 SoC support
- msm:
a4xx gpu support
atomic helper conversion
- tegra:
iommu support
universal plane support
ganged-mode DSI support
- sti:
HDMI i2c improvements
- vmwgfx:
some late fixes.
- qxl:
use suggested x/y properties"
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (969 commits)
drm: sti: fix module compilation issue
drm/i915: save/restore GMBUS freq across suspend/resume on gen4
drm: sti: correctly cleanup CRTC and planes
drm: sti: add HQVDP plane
drm: sti: add cursor plane
drm: sti: enable auxiliary CRTC
drm: sti: fix delay in VTG programming
drm: sti: prepare sti_tvout to support auxiliary crtc
drm: sti: use drm_crtc_vblank_{on/off} instead of drm_vblank_{on/off}
drm: sti: fix hdmi avi infoframe
drm: sti: remove event lock while disabling vblank
drm: sti: simplify gdp code
drm: sti: clear all mixer control
drm: sti: remove gpio for HDMI hot plug detection
drm: sti: allow to change hdmi ddc i2c adapter
drm/doc: Document drm_add_modes_noedid() usage
drm/i915: Remove '& 0xffff' from the mask given to WA_REG()
drm/i915: Invert the mask and val arguments in wa_add() and WA_REG()
drm: Zero out DRM object memory upon cleanup
drm/i915/bdw: Fix the write setting up the WIZ hashing mode
...
Pull percpu updates from Tejun Heo:
"Nothing interesting. A patch to convert the remaining __get_cpu_var()
users, another to fix non-critical off-by-one in an assertion and a
cosmetic conversion to lockless_dereference() in percpu-ref.
The back-merge from mainline is to receive lockless_dereference()"
* 'for-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: Replace smp_read_barrier_depends() with lockless_dereference()
percpu: Convert remaining __get_cpu_var uses in 3.18-rcX
percpu: off by one in BUG_ON()
Pull more 2038 timer work from Thomas Gleixner:
"Two more patches for the ongoing 2038 work:
- New accessors to clock MONOTONIC and REALTIME seconds
This is a seperate branch as Arnd has follow up work depending on
this"
* 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping: Provide y2038 safe accessor to the seconds portion of CLOCK_REALTIME
timekeeping: Provide fast accessor to the seconds part of CLOCK_MONOTONIC
Pull timer core updates from Thomas Gleixner:
"The time(r) departement provides:
- more infrastructure work on the year 2038 issue
- a few fixes in the Armada SoC timers
- the usual pile of fixlets and improvements"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: armada-370-xp: Use the reference clock on A375 SoC
watchdog: orion: Use the reference clock on Armada 375 SoC
clocksource: armada-370-xp: Add missing clock enable
time: Fix sign bug in NTP mult overflow warning
time: Remove timekeeping_inject_sleeptime()
rtc: Update suspend/resume timing to use 64bit time
rtc/lib: Provide y2038 safe rtc_tm_to_time()/rtc_time_to_tm() replacement
time: Fixup comments to reflect usage of timespec64
time: Expose get_monotonic_coarse64() for in-kernel uses
time: Expose getrawmonotonic64 for in-kernel uses
time: Provide y2038 safe mktime() replacement
time: Provide y2038 safe timekeeping_inject_sleeptime() replacement
time: Provide y2038 safe do_settimeofday() replacement
time: Complete NTP adjustment threshold judging conditions
time: Avoid possible NTP adjustment mult overflow.
time: Rename udelay_test.c to test_udelay.c
clocksource: sirf: Remove hard-coded clock rate
Pull RCU updates from Ingo Molnar:
"These are the main changes in this cycle:
- Streamline RCU's use of per-CPU variables, shifting from "cpu"
arguments to functions to "this_"-style per-CPU variable
accessors.
- signal-handling RCU updates.
- real-time updates.
- torture-test updates.
- miscellaneous fixes.
- documentation updates"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
rcu: Fix FIXME in rcu_tasks_kthread()
rcu: More info about potential deadlocks with rcu_read_unlock()
rcu: Optimize cond_resched_rcu_qs()
rcu: Add sparse check for RCU_INIT_POINTER()
documentation: memory-barriers.txt: Correct example for reorderings
documentation: Add atomic_long_t to atomic_ops.txt
documentation: Additional restriction for control dependencies
documentation: Document RCU self test boot params
rcutorture: Fix rcu_torture_cbflood() memory leak
rcutorture: Remove obsolete kversion param in kvm.sh
rcutorture: Remove stale test configurations
rcutorture: Enable RCU self test in configs
rcutorture: Add early boot self tests
torture: Run Linux-kernel binary out of results directory
cpu: Avoid puts_pending overflow
rcu: Remove "cpu" argument to rcu_cleanup_after_idle()
rcu: Remove "cpu" argument to rcu_prepare_for_idle()
rcu: Remove "cpu" argument to rcu_needs_cpu()
rcu: Remove "cpu" argument to rcu_note_context_switch()
rcu: Remove "cpu" argument to rcu_preempt_check_callbacks()
...
We've lost the +1 required for correct timeouts in
commit 5ed0bdf21a
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Wed Jul 16 21:05:06 2014 +0000
drm: i915: Use nsec based interfaces
Use ktime_get_raw_ns() and get rid of the back and forth timespec
conversions.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: John Stultz <john.stultz@linaro.org>
So fix this up by reinstating our handrolled _timeout function. While
at it bother with handling MAX_JIFFIES.
v2: Convert to usecs (we don't care about the accuracy anyway) first
to avoid overflow issues Dave Gordon spotted.
v3: Drop the explicit MAX_JIFFY_OFFSET check, usecs_to_jiffies should
take care of that already. It might be a bit too enthusiastic about it
though.
v4: Chris has a much nicer color, so use his implementation.
This requires to export nsec_to_jiffies from time.c.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82749
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Acked-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
In commit 6067dc5a8c ("time: Avoid possible NTP adjustment
mult overflow") a new check was added to watch for adjustments
that could cause a mult overflow.
Unfortunately the check compares a signed with unsigned value
and ignored the case where the adjustment was negative, which
causes spurious warn-ons on some systems (and seems like it
would result in problematic time adjustments there as well, due
to the early return).
Thus this patch adds a check to make sure the adjustment is
positive before we check for an overflow, and resovles the issue
in my testing.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Debugged-by: pang.xunlei <pang.xunlei@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/1416890145-30048-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fix up a few comments that weren't updated when the
functions were converted to use timespec64 structures.
Acked-by: Arnd Bergmann <arnd.bergmann@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Adds a timespec64 based get_monotonic_coarse64() implementation
that can be used as we convert internal users of
get_monotonic_coarse away from using timespecs.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Adds a timespec64 based getrawmonotonic64() implementation
that can be used as we convert internal users of
getrawmonotonic away from using timespecs.
Signed-off-by: John Stultz <john.stultz@linaro.org>
As part of addressing "y2038 problem" for in-kernel uses, this
patch adds safe mktime64() using time64_t.
After this patch, mktime() is deprecated and all its call sites
will be fixed using mktime64(), after that it can be removed.
Signed-off-by: pang.xunlei <pang.xunlei@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
As part of addressing "y2038 problem" for in-kernel uses, this
patch adds timekeeping_inject_sleeptime64() using timespec64.
After this patch, timekeeping_inject_sleeptime() is deprecated
and all its call sites will be fixed using the new interface,
after that it can be removed.
NOTE: timekeeping_inject_sleeptime() is safe actually, but we
want to eliminate timespec eventually, so comes this patch.
Signed-off-by: pang.xunlei <pang.xunlei@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
The kernel uses 32-bit signed value(time_t) for seconds elapsed
1970-01-01:00:00:00, thus it will overflow at 2038-01-19 03:14:08
on 32-bit systems. This is widely known as the y2038 problem.
As part of addressing "y2038 problem" for in-kernel uses, this patch
adds safe do_settimeofday64() using timespec64.
After this patch, do_settimeofday() is deprecated and all its call
sites will be fixed using do_settimeofday64(), after that it can be
removed.
Signed-off-by: pang.xunlei <pang.xunlei@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
The clocksource mult-adjustment threshold is [mult-maxadj, mult+maxadj],
timekeeping_adjust() only deals with the upper threshold, but misses the
lower threshold.
This patch adds the lower threshold judging condition.
Signed-off-by: pang.xunlei <pang.xunlei@linaro.org>
[jstultz: Minor fix for > 80 char line]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Ideally, __clocksource_updatefreq_scale, selects the largest shift
value possible for a clocksource. This results in the mult memember of
struct clocksource being particularly large, although not so large
that NTP would adjust the clock to cause it to overflow.
That said, nothing actually prohibits an overflow from occuring, its
just that it "shouldn't" occur.
So while very unlikely, and so far never observed, the value of
(cs->mult+cs->maxadj) may have a chance to reach very near 0xFFFFFFFF,
so there is a possibility it may overflow when doing NTP positive
adjustment
See the following detail: When NTP slewes the clock, kernel goes
through update_wall_time()->...->timekeeping_apply_adjustment():
tk->tkr.mult += mult_adj;
Since there is no guard against it, its possible tk->tkr.mult may
overflow during this operation.
This patch avoids any possible mult overflow by judging the overflow
case before adding mult_adj to mult, also adds the WARNING message
when capturing such case.
Signed-off-by: pang.xunlei <pang.xunlei@linaro.org>
[jstultz: Reworded commit message]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Kees requested that this test module be renamed for consistency sake,
so this patch renames the udelay_test.c file (recently added to
tip/timers/core for 3.17) to test_udelay.c
Cc: Kees Cook <keescook@chromium.org>
Cc: Greg KH <greg@kroah.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Linux-Next <linux-next@vger.kernel.org>
Cc: David Riley <davidriley@chromium.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
While looking over the cpu-timer code I found that we appear to add
the delta for the calling task twice, through:
cpu_timer_sample_group()
thread_group_cputimer()
thread_group_cputime()
times->sum_exec_runtime += task_sched_runtime();
*sample = cputime.sum_exec_runtime + task_delta_exec();
Which would make the sample run ahead, making the sleep short.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/20141112113737.GI10476@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The "cpu" argument to rcu_needs_cpu() is always the current CPU, so drop
it. This in turn allows the "cpu" argument to rcu_cpu_has_callbacks()
to be removed, which allows the uses of "cpu" in both functions to be
replaced with a this_cpu_ptr(). Again, the anticipated cross-CPU uses
of these functions has been replaced by NO_HZ_FULL.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
The "cpu" argument was kept around on the off-chance that RCU might
offload scheduler-clock interrupts. However, this offload approach
has been replaced by NO_HZ_FULL, which offloads -all- RCU processing
from qualifying CPUs. It is therefore time to remove the "cpu" argument
to rcu_check_callbacks(), which this commit does.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
During the 3.18 merge period additional __get_cpu_var uses were
added. The patch converts these to this_cpu_ptr().
Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
ktime_get_real_seconds() is the replacement function for get_seconds()
returning the seconds portion of CLOCK_REALTIME in a time64_t. For
64bit the function is equivivalent to get_seconds(), but for 32bit it
protects the readout with the timekeeper sequence count. This is
required because 32-bit machines cannot access 64-bit tk->xtime_sec
variable atomically.
[tglx: Massaged changelog and added docbook comment ]
Signed-off-by: Heena Sirwani <heenasirwani@gmail.com>
Reviewed-by: Arnd Bergman <arnd@arndb.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: opw-kernel@googlegroups.com
Link: http://lkml.kernel.org/r/7adcfaa8962b8ad58785d9a2456c3f77d93c0ffb.1414578445.git.heenasirwani@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This is the counterpart to get_seconds() based on CLOCK_MONOTONIC. The
use case for this interface are kernel internal coarse grained
timestamps which do neither require the nanoseconds fraction of
current time nor the CLOCK_REALTIME properties. Such timestamps can
currently only retrieved by calling ktime_get_ts64() and using the
tv_sec field of the returned timespec64. That's inefficient as it
involves the read of the clocksource, math operations and must be
protected by the timekeeper sequence counter.
To avoid the sequence counter protection we restrict the return value
to unsigned 32bit on 32bit machines. This covers ~136 years of uptime
and therefor an overflow is not expected to hit anytime soon.
To avoid math in the function we calculate the current seconds portion
of CLOCK_MONOTONIC when the timekeeper gets updated in
tk_update_ktime_data() similar to the CLOCK_REALTIME counterpart
xtime_sec.
[ tglx: Massaged changelog, simplified and commented the update
function, added docbook comment ]
Signed-off-by: Heena Sirwani <heenasirwani@gmail.com>
Reviewed-by: Arnd Bergman <arnd@arndb.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: opw-kernel@googlegroups.com
Link: http://lkml.kernel.org/r/da0b63f4bdf3478909f92becb35861197da3a905.1414578445.git.heenasirwani@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Andrey reported that on a kernel with UBSan enabled he found:
UBSan: Undefined behaviour in ../kernel/time/clockevents.c:75:34
I guess it should be 1ULL here instead of 1U:
(!ismax || evt->mult <= (1U << evt->shift)))
That's indeed the correct solution because shift might be 32.
Reported-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If userland creates a timer without specifying a sigevent info, we'll
create one ourself, using a stack local variable. Particularly will we
use the timer ID as sival_int. But as sigev_value is a union containing
a pointer and an int, that assignment will only partially initialize
sigev_value on systems where the size of a pointer is bigger than the
size of an int. On such systems we'll copy the uninitialized stack bytes
from the timer_create() call to userland when the timer actually fires
and we're going to deliver the signal.
Initialize sigev_value with 0 to plug the stack info leak.
Found in the PaX patch, written by the PaX Team.
Fixes: 5a9fa73072 ("posix-timers: kill ->it_sigev_signo and...")
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: PaX Team <pageexec@freemail.hu>
Cc: <stable@vger.kernel.org> # v2.6.28+
Link: http://lkml.kernel.org/r/1412456799-32339-1-git-send-email-minipli@googlemail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull percpu consistent-ops changes from Tejun Heo:
"Way back, before the current percpu allocator was implemented, static
and dynamic percpu memory areas were allocated and handled separately
and had their own accessors. The distinction has been gone for many
years now; however, the now duplicate two sets of accessors remained
with the pointer based ones - this_cpu_*() - evolving various other
operations over time. During the process, we also accumulated other
inconsistent operations.
This pull request contains Christoph's patches to clean up the
duplicate accessor situation. __get_cpu_var() uses are replaced with
with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().
Unfortunately, the former sometimes is tricky thanks to C being a bit
messy with the distinction between lvalues and pointers, which led to
a rather ugly solution for cpumask_var_t involving the introduction of
this_cpu_cpumask_var_ptr().
This converts most of the uses but not all. Christoph will follow up
with the remaining conversions in this merge window and hopefully
remove the obsolete accessors"
* 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
irqchip: Properly fetch the per cpu offset
percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
Revert "powerpc: Replace __get_cpu_var uses"
percpu: Remove __this_cpu_ptr
clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
sparc: Replace __get_cpu_var uses
avr32: Replace __get_cpu_var with __this_cpu_write
blackfin: Replace __get_cpu_var uses
tile: Use this_cpu_ptr() for hardware counters
tile: Replace __get_cpu_var uses
powerpc: Replace __get_cpu_var uses
alpha: Replace __get_cpu_var
ia64: Replace __get_cpu_var uses
s390: cio driver &__get_cpu_var replacements
s390: Replace __get_cpu_var uses
mips: Replace __get_cpu_var uses
MIPS: Replace __get_cpu_var uses in FPU emulator.
arm: Replace __this_cpu_ptr with raw_cpu_ptr
...
Pull s390 updates from Martin Schwidefsky:
"This patch set contains the main portion of the changes for 3.18 in
regard to the s390 architecture. It is a bit bigger than usual,
mainly because of a new driver and the vector extension patches.
The interesting bits are:
- Quite a bit of work on the tracing front. Uprobes is enabled and
the ftrace code is reworked to get some of the lost performance
back if CONFIG_FTRACE is enabled.
- To improve boot time with CONFIG_DEBIG_PAGEALLOC, support for the
IPTE range facility is added.
- The rwlock code is re-factored to improve writer fairness and to be
able to use the interlocked-access instructions.
- The kernel part for the support of the vector extension is added.
- The device driver to access the CD/DVD on the HMC is added, this
will hopefully come in handy to improve the installation process.
- Add support for control-unit initiated reconfiguration.
- The crypto device driver is enhanced to enable the additional AP
domains and to allow the new crypto hardware to be used.
- Bug fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (39 commits)
s390/ftrace: simplify enabling/disabling of ftrace_graph_caller
s390/ftrace: remove 31 bit ftrace support
s390/kdump: add support for vector extension
s390/disassembler: add vector instructions
s390: add support for vector extension
s390/zcrypt: Toleration of new crypto hardware
s390/idle: consolidate idle functions and definitions
s390/nohz: use a per-cpu flag for arch_needs_cpu
s390/vtime: do not reset idle data on CPU hotplug
s390/dasd: add support for control unit initiated reconfiguration
s390/dasd: fix infinite loop during format
s390/mm: make use of ipte range facility
s390/setup: correct 4-level kernel page table detection
s390/topology: call set_sched_topology early
s390/uprobes: architecture backend for uprobes
s390/uprobes: common library for kprobes and uprobes
s390/rwlock: use the interlocked-access facility 1 instructions
s390/rwlock: improve writer fairness
s390/rwlock: remove interrupt-enabling rwlock variant.
s390/mm: remove change bit override support
...