There's not much changes for the tracing system this release.
Mostly small clean ups and fixes.
The biggest change is to how bprintf works. bprintf is used by
trace_printk() to just save the format and args of a printf call,
and the formatting is done when the trace buffer is read. This is
done to keep the formatting out of the fast path (this was recommended
by you). The issue is when arguments are de-referenced.
If a pointer is saved, and the format has something like "%*pbl",
when the buffer is read, it will de-reference the argument then.
The problem is if the data no longer exists. This can cause the
kernel to oops.
The fix for this was to make these de-reference pointes do
the formatting at the time it is called (the fast path), as
this guarantees that the data exists (and doesn't change later)
-----BEGIN PGP SIGNATURE-----
iQHIBAABCgAyFiEEPm6V/WuN2kyArTUe1a05Y9njSUkFAlpzUWIUHHJvc3RlZHRA
Z29vZG1pcy5vcmcACgkQ1a05Y9njSUk6FQwAqvrOE8AO9cbwB0bxm82aZj7UyYkG
D9w4goPiykcNR31wxvM2jhrQn6uqG04g2YFCZntZvZMmGFyu6y/tDxGNGe0cHmVR
1dZkFcst6IAHybY70cIkAcNNJXFFsccTaVcuniidWYwq2K3z2eqK5s3YfA+ZMr0q
nfUvv0cIgdGTduM4SuoKr0ewxb779I0tHhQXUnMTeLOWG3HQhOh5LcjdBk/N4pTj
axfXIlx8lkHSPOvvUmpJ/R51v7HzfidibEnUrrx46ng3DV81hFdEhwbVl+GO44fP
9Wq7vRCvTThvTZVCg8gaYsuo3+CPISmlkpdq373bCqnKuGynGmXzkFNzXjDHl8pL
GPbji/Gu9kSMZaZoBihleaX4G4UW386nqDC2YlLTe4rx6f8phYLxMOC2/8nZ+A3L
JKB4PTWKUSSNCfaPgiT6dv/CeaXalcYtgVJ2xM2jFhoOEyM5jwb0f6tdvXsuj5Fn
ZlFNTuKWPn0siQIng7EZan3LmIFkukM5sfVF
=bmm/
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"There's not much changes for the tracing system this release. Mostly
small clean ups and fixes.
The biggest change is to how bprintf works. bprintf is used by
trace_printk() to just save the format and args of a printf call, and
the formatting is done when the trace buffer is read. This is done to
keep the formatting out of the fast path (this was recommended by
you). The issue is when arguments are de-referenced.
If a pointer is saved, and the format has something like "%*pbl", when
the buffer is read, it will de-reference the argument then. The
problem is if the data no longer exists. This can cause the kernel to
oops.
The fix for this was to make these de-reference pointes do the
formatting at the time it is called (the fast path), as this
guarantees that the data exists (and doesn't change later)"
* tag 'trace-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
vsprintf: Do not have bprintf dereference pointers
ftrace: Mark function tracer test functions noinline/noclone
trace_uprobe: Display correct offset in uprobe_events
tracing: Make sure the parsed string always terminates with '\0'
tracing: Clear parser->idx if only spaces are read
tracing: Detect the string nul character when parsing user input string
Merge KASAN word-at-a-time fixups from Andrey Ryabinin.
The word-at-a-time optimizations have caused headaches for KASAN, since
the whole point is that we access byte streams in bigger chunks, and
KASAN can be unhappy about the potential extra access at the end of the
string.
We used to have a horrible hack in dcache, and then people got
complaints from the strscpy() case. This fixes it all up properly, by
adding an explicit helper for the "access byte stream one word at a
time" case.
* emailed patches from Andrey Ryabinin <aryabinin@virtuozzo.com>:
fs: dcache: Revert "manually unpoison dname after allocation to shut up kasan's reports"
fs/dcache: Use read_word_at_a_time() in dentry_string_cmp()
lib/strscpy: Shut up KASAN false-positives in strscpy()
compiler.h: Add read_word_at_a_time() function.
compiler.h, kasan: Avoid duplicating __read_once_size_nocheck()
strscpy() performs the word-at-a-time optimistic reads. So it may may
access the memory past the end of the object, which is perfectly fine
since strscpy() doesn't use that (past-the-end) data and makes sure the
optimistic read won't cross a page boundary.
Use new read_word_at_a_time() to shut up the KASAN.
Note that this potentially could hide some bugs. In example bellow,
stscpy() will copy more than we should (1-3 extra uninitialized bytes):
char dst[8];
char *src;
src = kmalloc(5, GFP_KERNEL);
memset(src, 0xff, 5);
strscpy(dst, src, 8);
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here is the set of "big" driver core patches for 4.16-rc1.
The majority of the work here is in the firmware subsystem, with reworks
to try to attempt to make the code easier to handle in the long run, but
no functional change. There's also some tree-wide sysfs attribute
fixups with lots of acks from the various subsystem maintainers, as well
as a handful of other normal fixes and changes.
And finally, some license cleanups for the driver core and sysfs code.
All have been in linux-next for a while with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWnLvPw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynNzACgkzjPoBytJWbpWFt6SR6L33/u4kEAnRFvVCGL
s6ygQPQhZIjKk2Lxa2hC
=Zihy
-----END PGP SIGNATURE-----
Merge tag 'driver-core-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of "big" driver core patches for 4.16-rc1.
The majority of the work here is in the firmware subsystem, with
reworks to try to attempt to make the code easier to handle in the
long run, but no functional change. There's also some tree-wide sysfs
attribute fixups with lots of acks from the various subsystem
maintainers, as well as a handful of other normal fixes and changes.
And finally, some license cleanups for the driver core and sysfs code.
All have been in linux-next for a while with no reported issues"
* tag 'driver-core-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (48 commits)
device property: Define type of PROPERTY_ENRTY_*() macros
device property: Reuse property_entry_free_data()
device property: Move property_entry_free_data() upper
firmware: Fix up docs referring to FIRMWARE_IN_KERNEL
firmware: Drop FIRMWARE_IN_KERNEL Kconfig option
USB: serial: keyspan: Drop firmware Kconfig options
sysfs: remove DEBUG defines
sysfs: use SPDX identifiers
drivers: base: add coredump driver ops
sysfs: add attribute specification for /sysfs/devices/.../coredump
test_firmware: fix missing unlock on error in config_num_requests_store()
test_firmware: make local symbol test_fw_config static
sysfs: turn WARN() into pr_warn()
firmware: Fix a typo in fallback-mechanisms.rst
treewide: Use DEVICE_ATTR_WO
treewide: Use DEVICE_ATTR_RO
treewide: Use DEVICE_ATTR_RW
sysfs.h: Use octal permissions
component: add debugfs support
bus: simple-pm-bus: convert bool SIMPLE_PM_BUS to tristate
...
documentation, errseq documentation, kernel-doc support for nested
structure definitions, the removal of lots of crufty kernel-doc support for
unused formats, SPDX tag documentation, the beginnings of a manual for
subsystem maintainers, and lots of fixes and updates.
As usual, some of the changesets reach outside of Documentation/ to effect
kerneldoc comment fixes. It also adds the new LICENSES directory, of which
Thomas promises I do not need to be the maintainer.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJab11TAAoJEI3ONVYwIuV6i1UP/1LgGPHW9Ygq5qaLFbReZd/u
Mx/orrhHX0PdkbCCE+CbL8Vm1m4UKFDTBdlpk3s542zxeeG0ZBXuTnvq4Kyk+cTN
p4/vsIEzk/Ih13/glGE5MlV+EjiEK+8hK69TIUj7bAyuHmpzofjRz9/1M6RLDGDC
HY6UI58AXG0yOQWMWCGRMYpQAFUGij2equ7Doe1ugXRq14dx7V4RsOhI140iRk7t
bquAq1rS2fXniiuPFmLBUe4dWW28isVa/Vl/aXcaWQDKMyT0OLhjOMW36wWKqtPi
WdVCpHv1NLZNyZZr9S3kvfOwW+BUqpEzfVwssyBLW4h0tsnIx0U0HVhSTY8/TvFZ
QD9yCSana4LB/e5CHXIX5lBHbjHxf+rETXqVV4MgwDaMvM3mCo4X6WUTJDmZADo6
vQISEKeb4su5uWAbc9T9xwRSLhZnFVdJ/QuYdNQ5+EpFJYLhzQ9eBvEz6JstSIXL
p9ASBiPNY3ulpVZ8q0JOHJRBhq5mHJH6Dy8achzbILy2l/ZI4b8lJ53mw9II04cp
puF96E6HpvuZ8Tgjjrg9U3ZdxXNrUgc/tjk2ZDkyTglk1XF2jKSq2tiNSZ3oLrJm
XqJPnpCeyJM5UDvwkIBzgC41WEHwe8uvoNbUnc4X7UJSZegFzcSLQXf5qaprHS5k
XeQ7sbd+S+jzVVjFi0W5
=Z15Z
-----END PGP SIGNATURE-----
Merge tag 'docs-4.16' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"Documentation updates for 4.16.
New stuff includes refcount_t documentation, errseq documentation,
kernel-doc support for nested structure definitions, the removal of
lots of crufty kernel-doc support for unused formats, SPDX tag
documentation, the beginnings of a manual for subsystem maintainers,
and lots of fixes and updates.
As usual, some of the changesets reach outside of Documentation/ to
effect kerneldoc comment fixes. It also adds the new LICENSES
directory, of which Thomas promises I do not need to be the
maintainer"
* tag 'docs-4.16' of git://git.lwn.net/linux: (65 commits)
linux-next: docs-rst: Fix typos in kfigure.py
linux-next: DOC: HWPOISON: Fix path to debugfs in hwpoison.txt
Documentation: Fix misconversion of #if
docs: add index entry for networking/msg_zerocopy
Documentation: security/credentials.rst: explain need to sort group_list
LICENSES: Add MPL-1.1 license
LICENSES: Add the GPL 1.0 license
LICENSES: Add Linux syscall note exception
LICENSES: Add the MIT license
LICENSES: Add the BSD-3-clause "Clear" license
LICENSES: Add the BSD 3-clause "New" or "Revised" License
LICENSES: Add the BSD 2-clause "Simplified" license
LICENSES: Add the LGPL-2.1 license
LICENSES: Add the LGPL 2.0 license
LICENSES: Add the GPL 2.0 license
Documentation: Add license-rules.rst to describe how to properly identify file licenses
scripts: kernel_doc: better handle show warnings logic
fs/*/Kconfig: drop links to 404-compliant http://acl.bestbits.at
doc: md: Fix a file name to md-fault.c in fault-injection.txt
errseq: Add to documentation tree
...
Pull networking updates from David Miller:
1) Significantly shrink the core networking routing structures. Result
of http://vger.kernel.org/~davem/seoul2017_netdev_keynote.pdf
2) Add netdevsim driver for testing various offloads, from Jakub
Kicinski.
3) Support cross-chip FDB operations in DSA, from Vivien Didelot.
4) Add a 2nd listener hash table for TCP, similar to what was done for
UDP. From Martin KaFai Lau.
5) Add eBPF based queue selection to tun, from Jason Wang.
6) Lockless qdisc support, from John Fastabend.
7) SCTP stream interleave support, from Xin Long.
8) Smoother TCP receive autotuning, from Eric Dumazet.
9) Lots of erspan tunneling enhancements, from William Tu.
10) Add true function call support to BPF, from Alexei Starovoitov.
11) Add explicit support for GRO HW offloading, from Michael Chan.
12) Support extack generation in more netlink subsystems. From Alexander
Aring, Quentin Monnet, and Jakub Kicinski.
13) Add 1000BaseX, flow control, and EEE support to mvneta driver. From
Russell King.
14) Add flow table abstraction to netfilter, from Pablo Neira Ayuso.
15) Many improvements and simplifications to the NFP driver bpf JIT,
from Jakub Kicinski.
16) Support for ipv6 non-equal cost multipath routing, from Ido
Schimmel.
17) Add resource abstration to devlink, from Arkadi Sharshevsky.
18) Packet scheduler classifier shared filter block support, from Jiri
Pirko.
19) Avoid locking in act_csum, from Davide Caratti.
20) devinet_ioctl() simplifications from Al viro.
21) More TCP bpf improvements from Lawrence Brakmo.
22) Add support for onlink ipv6 route flag, similar to ipv4, from David
Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1925 commits)
tls: Add support for encryption using async offload accelerator
ip6mr: fix stale iterator
net/sched: kconfig: Remove blank help texts
openvswitch: meter: Use 64-bit arithmetic instead of 32-bit
tcp_nv: fix potential integer overflow in tcpnv_acked
r8169: fix RTL8168EP take too long to complete driver initialization.
qmi_wwan: Add support for Quectel EP06
rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK
ipmr: Fix ptrdiff_t print formatting
ibmvnic: Wait for device response when changing MAC
qlcnic: fix deadlock bug
tcp: release sk_frag.page in tcp_disconnect
ipv4: Get the address of interface correctly.
net_sched: gen_estimator: fix lockdep splat
net: macb: Handle HRESP error
net/mlx5e: IPoIB, Fix copy-paste bug in flow steering refactoring
ipv6: addrconf: break critical section in addrconf_verify_rtnl()
ipv6: change route cache aging logic
i40e/i40evf: Update DESC_NEEDED value to reflect larger value
bnxt_en: cleanup DIM work on device shutdown
...
Pull crypto updates from Herbert Xu:
"API:
- Enforce the setting of keys for keyed aead/hash/skcipher
algorithms.
- Add multibuf speed tests in tcrypt.
Algorithms:
- Improve performance of sha3-generic.
- Add native sha512 support on arm64.
- Add v8.2 Crypto Extentions version of sha3/sm3 on arm64.
- Avoid hmac nesting by requiring underlying algorithm to be unkeyed.
- Add cryptd_max_cpu_qlen module parameter to cryptd.
Drivers:
- Add support for EIP97 engine in inside-secure.
- Add inline IPsec support to chelsio.
- Add RevB core support to crypto4xx.
- Fix AEAD ICV check in crypto4xx.
- Add stm32 crypto driver.
- Add support for BCM63xx platforms in bcm2835 and remove bcm63xx.
- Add Derived Key Protocol (DKP) support in caam.
- Add Samsung Exynos True RNG driver.
- Add support for Exynos5250+ SoCs in exynos PRNG driver"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (166 commits)
crypto: picoxcell - Fix error handling in spacc_probe()
crypto: arm64/sha512 - fix/improve new v8.2 Crypto Extensions code
crypto: arm64/sm3 - new v8.2 Crypto Extensions implementation
crypto: arm64/sha3 - new v8.2 Crypto Extensions implementation
crypto: testmgr - add new testcases for sha3
crypto: sha3-generic - export init/update/final routines
crypto: sha3-generic - simplify code
crypto: sha3-generic - rewrite KECCAK transform to help the compiler optimize
crypto: sha3-generic - fixes for alignment and big endian operation
crypto: aesni - handle zero length dst buffer
crypto: artpec6 - remove select on non-existing CRYPTO_SHA384
hwrng: bcm2835 - Remove redundant dev_err call in bcm2835_rng_probe()
crypto: stm32 - remove redundant dev_err call in stm32_cryp_probe()
crypto: axis - remove unnecessary platform_get_resource() error check
crypto: testmgr - test misuse of result in ahash
crypto: inside-secure - make function safexcel_try_push_requests static
crypto: aes-generic - fix aes-generic regression on powerpc
crypto: chelsio - Fix indentation warning
crypto: arm64/sha1-ce - get rid of literal pool
crypto: arm64/sha2-ce - move the round constant table to .rodata section
...
- Misc small driver fixups to
bnxt_re/hfi1/qib/hns/ocrdma/rdmavt/vmw_pvrdma/nes
- Several major feature adds to bnxt_re driver: SRIOV VF RoCE support,
HugePages support, extended hardware stats support, and SRQ support
- A notable number of fixes to the i40iw driver from debugging scale up
testing
- More work to enable the new hip08 chip in the hns driver
- Misc small ULP fixups to srp/srpt//ipoib
- Preparation for srp initiator and target to support the RDMA-CM
protocol for connections
- Add RDMA-CM support to srp initiator, srp target is still a WIP
- Fixes for a couple of places where ipoib could spam the dmesg log
- Fix encode/decode of FDR/EDR data rates in the core
- Many patches from Parav with ongoing work to clean up inconsistencies
and bugs in RoCE support around the rdma_cm
- mlx5 driver support for the userspace features 'thread domain', 'wallclock
timestamps' and 'DV Direct Connected transport'. Support for the firmware
dual port rocee capability
- Core support for more than 32 rdma devices in the char dev allocation
- kernel doc updates from Randy Dunlap
- New netlink uAPI for inspecting RDMA objects similar in spirit to 'ss'
- One minor change to the kobject code acked by GKH
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCgAGBQJacfljAAoJEDht9xV+IJsaUnwP+QFJvfIDEfRlfU2rTmcfymPs
Rz9bW1KLgETcJx/XOE2ba2DOaqdFr56TLflsDfEfOSIL8AtzBQqH3vTqEj49bBP7
4JZAkzWllUS/qoYD2XmvOM0IrIfFXzZtLM/lzLi+5dwK26x3GAB9hHXpKzUrJ1vj
I1Naq14qOFXoNBndEtZJqtIKOhR/Pnd6YtxAiNCmViZGdqm3DIU3D4VJhU5B7pO9
j6ovJs16wfJl/gV1iiz9xO49ViVFpwzSIzYE/Q2ZCegcrsF3EEVN2J4vZHkKgDuN
0/Ar/WOvkPzKBFR8hJ7M4kwp0Fy/69/U49s7kpGNxdhML9sU3+Qfse6JYGj0M9L8
01gTM0SShyAZMNAvjVFbIKLQPg806OAit4cooMwlObbwJ6b7B8K0uN17/uVIkIqp
gXqertyl1BLhUtTOby/8Fox/f/oEvaZksKiwcTKSb7D1Y5jGZZUPRknJ5SwAFWQB
RiTPJ6mY7BUsM9zuYQtRE8x2mpgIezYXFcrAz7iT76WuoZQgo1QLIyYRM1+MlhnC
wNrp5BtqoVfW2Ps0CbSdxJ9vDtDf3cwLg0RzcCB8+NJJccsRD9IVMDev/TDY5k9U
M9LxxtW3WuulRWgliU0Q9VaswUQoIao16vBMVL7GwUm+ClLvbRVoPe8jxgtfk+W3
GAANAI7Kv/vUoV/6CFfP
=sMXV
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull RDMA subsystem updates from Jason Gunthorpe:
"Overall this cycle did not have any major excitement, and did not
require any shared branch with netdev.
Lots of driver updates, particularly of the scale-up and performance
variety. The largest body of core work was Parav's patches fixing and
restructing some of the core code to make way for future RDMA
containerization.
Summary:
- misc small driver fixups to
bnxt_re/hfi1/qib/hns/ocrdma/rdmavt/vmw_pvrdma/nes
- several major feature adds to bnxt_re driver: SRIOV VF RoCE
support, HugePages support, extended hardware stats support, and
SRQ support
- a notable number of fixes to the i40iw driver from debugging scale
up testing
- more work to enable the new hip08 chip in the hns driver
- misc small ULP fixups to srp/srpt//ipoib
- preparation for srp initiator and target to support the RDMA-CM
protocol for connections
- add RDMA-CM support to srp initiator, srp target is still a WIP
- fixes for a couple of places where ipoib could spam the dmesg log
- fix encode/decode of FDR/EDR data rates in the core
- many patches from Parav with ongoing work to clean up
inconsistencies and bugs in RoCE support around the rdma_cm
- mlx5 driver support for the userspace features 'thread domain',
'wallclock timestamps' and 'DV Direct Connected transport'. Support
for the firmware dual port rocee capability
- core support for more than 32 rdma devices in the char dev
allocation
- kernel doc updates from Randy Dunlap
- new netlink uAPI for inspecting RDMA objects similar in spirit to 'ss'
- one minor change to the kobject code acked by Greg KH"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (259 commits)
RDMA/nldev: Provide detailed QP information
RDMA/nldev: Provide global resource utilization
RDMA/core: Add resource tracking for create and destroy PDs
RDMA/core: Add resource tracking for create and destroy CQs
RDMA/core: Add resource tracking for create and destroy QPs
RDMA/restrack: Add general infrastructure to track RDMA resources
RDMA/core: Save kernel caller name when creating PD and CQ objects
RDMA/core: Use the MODNAME instead of the function name for pd callers
RDMA: Move enum ib_cq_creation_flags to uapi headers
IB/rxe: Change RDMA_RXE kconfig to use select
IB/qib: remove qib_keys.c
IB/mthca: remove mthca_user.h
RDMA/cm: Fix access to uninitialized variable
RDMA/cma: Use existing netif_is_bond_master function
IB/core: Avoid SGID attributes query while converting GID from OPA to IB
RDMA/mlx5: Avoid memory leak in case of XRCD dealloc failure
IB/umad: Fix use of unprotected device pointer
IB/iser: Combine substrings for three messages
IB/iser: Delete an unnecessary variable initialisation in iser_send_data_out()
IB/iser: Delete an error message for a failed memory allocation in iser_send_data_out()
...
This pull requests contains a consolidation of the generic no-IOMMU code,
a well as the glue code for swiotlb. All the code is based on the x86
implementation with hooks to allow all architectures that aren't cache
coherent to use it. The x86 conversion itself has been deferred because
the x86 maintainers were a little busy in the last months.
-----BEGIN PGP SIGNATURE-----
iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlpxcVoLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYN/Lw/+Je9teM4NPQ8lU/ncbJN/bUzCFGJ6dFt2eVX/6xs3
sfl8vBdeHt6CBM02rRNecEr31z3+orjQes5JnlEJFYeG3jumV0zCPw/zbxqjzbJ1
3n6cckLxbxzy8Ca1G/BVjHLAUX5eWp1ujn/Q4d03VKVQZhJvFYlqDbP3TrNVx7xn
k86u37p/o+ngjwX66UdZ3C4iIBF8zqy6n2kkpv4HUQtHHzPwEvliN39eNilovb56
iGOzjDX1UWHAu4xCTVnPHSG4fA4XU41NWzIN3DIVPE25lYSISSl9TFAdR8GeZA0G
0Yj6sW53pRSoUwco1ocoS44/FgrPOB5/vHIL06pABvicXBiomje1QylqcK7zAczk
esjkfPEZrmZuu99GtqFyDNKEvKKdy+aBGaTZ3y+NxsuBs+0xS2Owz1IE4Tk28xaw
xh7zn+CVdk2fJh6ZIdw5Eu9b9VN08UriqDmDzO/ylDlcNGcDi7wcxiSTEkHJ1ON/
g9nletV6f3egL0wljDcOnhCJCHTvmWEeq3z8lE55QzPzSH0hHpnGQ2WD0tKrroxz
kjOZp0TdXa4F5iysOHe2xl2sftOH0zIkBQJ+oBcK12mTaLu21+yeuCggQXJ/CBdk
1Ol7l9g9T0TDuZPfiTHt5+6jmECQs92LElWA8x7uF7Fpix3BpnafWaaSMSsosF3F
D1Y=
=Nrl9
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-4.16' of git://git.infradead.org/users/hch/dma-mapping
Pull dma mapping updates from Christoph Hellwig:
"Except for a runtime warning fix from Christian this is all about
consolidation of the generic no-IOMMU code, a well as the glue code
for swiotlb.
All the code is based on the x86 implementation with hooks to allow
all architectures that aren't cache coherent to use it.
The x86 conversion itself has been deferred because the x86
maintainers were a little busy in the last months"
* tag 'dma-mapping-4.16' of git://git.infradead.org/users/hch/dma-mapping: (57 commits)
MAINTAINERS: add the iommu list for swiotlb and xen-swiotlb
arm64: use swiotlb_alloc and swiotlb_free
arm64: replace ZONE_DMA with ZONE_DMA32
mips: use swiotlb_{alloc,free}
mips/netlogic: remove swiotlb support
tile: use generic swiotlb_ops
tile: replace ZONE_DMA with ZONE_DMA32
unicore32: use generic swiotlb_ops
ia64: remove an ifdef around the content of pci-dma.c
ia64: clean up swiotlb support
ia64: use generic swiotlb_ops
ia64: replace ZONE_DMA with ZONE_DMA32
swiotlb: remove various exports
swiotlb: refactor coherent buffer allocation
swiotlb: refactor coherent buffer freeing
swiotlb: wire up ->dma_supported in swiotlb_dma_ops
swiotlb: add common swiotlb_map_ops
swiotlb: rename swiotlb_free to swiotlb_exit
x86: rename swiotlb_dma_ops
powerpc: rename swiotlb_dma_ops
...
Pull misc vfs updates from Al Viro:
"All kinds of misc stuff, without any unifying topic, from various
people.
Neil's d_anon patch, several bugfixes, introduction of kvmalloc
analogue of kmemdup_user(), extending bitfield.h to deal with
fixed-endians, assorted cleanups all over the place..."
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits)
alpha: osf_sys.c: use timespec64 where appropriate
alpha: osf_sys.c: fix put_tv32 regression
jffs2: Fix use-after-free bug in jffs2_iget()'s error handling path
dcache: delete unused d_hash_mask
dcache: subtract d_hash_shift from 32 in advance
fs/buffer.c: fold init_buffer() into init_page_buffers()
fs: fold __inode_permission() into inode_permission()
fs: add RWF_APPEND
sctp: use vmemdup_user() rather than badly open-coding memdup_user()
snd_ctl_elem_init_enum_names(): switch to vmemdup_user()
replace_user_tlv(): switch to vmemdup_user()
new primitive: vmemdup_user()
memdup_user(): switch to GFP_USER
eventfd: fold eventfd_ctx_get() into eventfd_ctx_fileget()
eventfd: fold eventfd_ctx_read() into eventfd_read()
eventfd: convert to use anon_inode_getfd()
nfs4file: get rid of pointless include of btrfs.h
uvc_v4l2: clean copyin/copyout up
vme_user: don't use __copy_..._user()
usx2y: don't bother with memdup_user() for 16-byte structure
...
Pull RCU updates from Ingo Molnar:
"The main RCU changes in this cycle were:
- Updates to use cond_resched() instead of cond_resched_rcu_qs()
where feasible (currently everywhere except in kernel/rcu and in
kernel/torture.c). Also a couple of fixes to avoid sending IPIs to
offline CPUs.
- Updates to simplify RCU's dyntick-idle handling.
- Updates to remove almost all uses of smp_read_barrier_depends() and
read_barrier_depends().
- Torture-test updates.
- Miscellaneous fixes"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
torture: Save a line in stutter_wait(): while -> for
torture: Eliminate torture_runnable and perf_runnable
torture: Make stutter less vulnerable to compilers and races
locking/locktorture: Fix num reader/writer corner cases
locking/locktorture: Fix rwsem reader_delay
torture: Place all torture-test modules in one MAINTAINERS group
rcutorture/kvm-build.sh: Skip build directory check
rcutorture: Simplify functions.sh include path
rcutorture: Simplify logging
rcutorture/kvm-recheck-*: Improve result directory readability check
rcutorture/kvm.sh: Support execution from any directory
rcutorture/kvm.sh: Use consistent help text for --qemu-args
rcutorture/kvm.sh: Remove unused variable, `alldone`
rcutorture: Remove unused script, config2frag.sh
rcutorture/configinit: Fix build directory error message
rcutorture: Preempt RCU-preempt readers more vigorously
torture: Reduce #ifdefs for preempt_schedule()
rcu: Remove have_rcu_nocb_mask from tree_plugin.h
rcu: Add comment giving debug strategy for double call_rcu()
tracing, rcu: Hide trace event rcu_nocb_wake when not used
...
Pull STRICT_DEVMEM default from Ingo Molnar:
"Make CONFIG_STRICT_DEVMEM default-y on x86 and arm64 as well, to
follow the distro status quo"
* 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Kconfig: Make STRICT_DEVMEM default-y on x86 and arm64
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJabj6pAAoJEHm+PkMAQRiGs8cIAJQFkCWnbz86e3vG4DuWhyA8
CMGHCQdUOxxFGa/ixhIiuetbC0x+JVHAjV2FwVYbAQfaZB3pfw2iR1ncQxpAP1AI
oLU9vBEqTmwKMPc9CM5rRfnLFWpGcGwUNzgPdxD5yYqGDtcM8K840mF6NdkYe5AN
xU8rv1wlcFPF4A5pvHCH0pvVmK4VxlVFk/2H67TFdxBs4PyJOnSBnf+bcGWgsKO6
hC8XIVtcKCH2GfFxt5d0Vgc5QXJEpX1zn2mtCa1MwYRjN2plgYfD84ha0xE7J0B0
oqV/wnjKXDsmrgVpncr3txd4+zKJFNkdNRE4eLAIupHo2XHTG4HvDJ5dBY2NhGU=
=sOml
-----END PGP SIGNATURE-----
Merge tag v4.15 of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
To resolve conflicts in:
drivers/infiniband/hw/mlx5/main.c
drivers/infiniband/hw/mlx5/qp.c
From patches merged into the -rc cycle. The conflict resolution matches
what linux-next has been carrying.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Pull block updates from Jens Axboe:
"This is the main pull request for block IO related changes for the
4.16 kernel. Nothing major in this pull request, but a good amount of
improvements and fixes all over the map. This contains:
- BFQ improvements, fixes, and cleanups from Angelo, Chiara, and
Paolo.
- Support for SMR zones for deadline and mq-deadline from Damien and
Christoph.
- Set of fixes for bcache by way of Michael Lyle, including fixes
from himself, Kent, Rui, Tang, and Coly.
- Series from Matias for lightnvm with fixes from Hans Holmberg,
Javier, and Matias. Mostly centered around pblk, and the removing
rrpc 1.2 in preparation for supporting 2.0.
- A couple of NVMe pull requests from Christoph. Nothing major in
here, just fixes and cleanups, and support for command tracing from
Johannes.
- Support for blk-throttle for tracking reads and writes separately.
From Joseph Qi. A few cleanups/fixes also for blk-throttle from
Weiping.
- Series from Mike Snitzer that enables dm to register its queue more
logically, something that's alwways been problematic on dm since
it's a stacked device.
- Series from Ming cleaning up some of the bio accessor use, in
preparation for supporting multipage bvecs.
- Various fixes from Ming closing up holes around queue mapping and
quiescing.
- BSD partition fix from Richard Narron, fixing a problem where we
can't mount newer (10/11) FreeBSD partitions.
- Series from Tejun reworking blk-mq timeout handling. The previous
scheme relied on atomic bits, but it had races where we would think
a request had timed out if it to reused at the wrong time.
- null_blk now supports faking timeouts, to enable us to better
exercise and test that functionality separately. From me.
- Kill the separate atomic poll bit in the request struct. After
this, we don't use the atomic bits on blk-mq anymore at all. From
me.
- sgl_alloc/free helpers from Bart.
- Heavily contended tag case scalability improvement from me.
- Various little fixes and cleanups from Arnd, Bart, Corentin,
Douglas, Eryu, Goldwyn, and myself"
* 'for-4.16/block' of git://git.kernel.dk/linux-block: (186 commits)
block: remove smart1,2.h
nvme: add tracepoint for nvme_complete_rq
nvme: add tracepoint for nvme_setup_cmd
nvme-pci: introduce RECONNECTING state to mark initializing procedure
nvme-rdma: remove redundant boolean for inline_data
nvme: don't free uuid pointer before printing it
nvme-pci: Suspend queues after deleting them
bsg: use pr_debug instead of hand crafted macros
blk-mq-debugfs: don't allow write on attributes with seq_operations set
nvme-pci: Fix queue double allocations
block: Set BIO_TRACE_COMPLETION on new bio during split
blk-throttle: use queue_is_rq_based
block: Remove kblockd_schedule_delayed_work{,_on}()
blk-mq: Avoid that blk_mq_delay_run_hw_queue() introduces unintended delays
blk-mq: Rename blk_mq_request_direct_issue() into blk_mq_request_issue_directly()
lib/scatterlist: Fix chaining support in sgl_alloc_order()
blk-throttle: track read and write request individually
block: add bdev_read_only() checks to common helpers
block: fail op_is_write() requests to read-only partitions
blk-throttle: export io_serviced_recursive, io_service_bytes_recursive
...
Update selftests to relfect recent changes and add various new
test cases.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
b24413180f ("License cleanup: add SPDX GPL-2.0 license identifier to
files with no license") added SPDX GPL-2.0 to several PCI files that
previously contained no license information.
Add SPDX GPL-2.0 to all other PCI files that did not contain any license
information and hence were under the default GPL version 2 license of the
kernel.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When trace_printk() was introduced, it was discussed that making it be as
low overhead as possible, that the processing of the format string should be
delayed until it is read. That is, a "trace_printk()" should not convert
the %d into numbers and so on, but instead, save the fmt string and all the
args in the buffer at the time of recording. When the trace_printk() data is
read, it would then parse the format string and do the conversions of the
saved arguments in the tracing buffer.
The code to perform this was added to vsprintf where vbin_printf() would
save the arguments of a specified format string in a buffer, then
bstr_printf() could be used to convert the buffer with the same format
string into the final output, as if vsprintf() was called in one go.
The issue arises when dereferenced pointers are used. The problem is that
something like %*pbl which reads a bitmask, will save the pointer to the
bitmask in the buffer. Then the reading of the buffer via bstr_printf() will
then look at the pointer to process the final output. Obviously the value of
that pointer could have changed since the time it was recorded to the time
the buffer is read. Worse yet, the bitmask could be unmapped, and the
reading of the trace buffer could actually cause a kernel oops.
Another problem is that user space tools such as perf and trace-cmd do not
have access to the contents of these pointers, and they become useless when
the tracing buffer is extracted.
Instead of having vbin_printf() simply save the pointer in the buffer for
later processing, have it perform the formatting at the time bin_printf() is
called. This will fix the issue of dereferencing pointers at a later time,
and has the extra benefit of having user space tools understand these
values.
Since perf and trace-cmd already can handle %p[sSfF] via saving kallsyms,
their pointers are saved and not processed during vbin_printf(). If they
were converted, it would break perf and trace-cmd, as they would not know
how to deal with the conversion.
Link: http://lkml.kernel.org/r/20171228204025.14a71d8f@gandalf.local.home
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Make it possible to call these two functions from a kernel module.
Note: despite their name, these two functions can be used meaningfully
independent of kobjects. A later patch will add calls to these
functions from the SRP driver because this patch series modifies the
SRP driver such that it can hold a reference to a namespace that can
last longer than the lifetime of the process through which the
namespace reference was obtained.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Fixes the following sparse warnings:
lib/test_firmware.c:99:20: warning:
symbol 'test_fw_config' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a couple of test cases for interpreter and JIT that are
related to an issue we faced some time ago in Cilium [1],
which is fixed in LLVM with commit e53750e1e086 ("bpf: fix
bug on silently truncating 64-bit immediate").
Test cases were run-time checking kernel to behave as intended
which should also provide some guidance for current or new
JITs in case they should trip over this. Added for cBPF and
eBPF.
[1] https://github.com/cilium/cilium/pull/2162
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
All these symbols are only used by arch dma_ops implementations or
xen-swiotlb. None of which can be modular.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
Factor out a new swiotlb_alloc_buffer helper that allocates DMA coherent
memory from the swiotlb bounce buffer.
This allows to simplify the swiotlb_alloc implemenation that uses
dma_direct_alloc to try to allocate a reachable buffer first.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
Factor out a new swiotlb_free_buffer helper that checks if an address
is allocated from the swiotlb bounce buffer, and if yes frees it.
This allows to simplify the swiotlb_free implemenation that uses
dma_direct_free to free the non-bounce buffer allocations.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
To properly reject too small DMA masks based on the addressability of the
bounce buffer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
Currently all architectures that want to use swiotlb have to implement
their own dma_map_ops instances. Provide a generic one based on the
x86 implementation which first calls into dma_direct to try a full blown
direct mapping implementation (including e.g. CMA) before falling back
allocating from the swiotlb buffer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
TTM tries to allocate coherent memory in chunks of 2MB first to improve
TLB efficiency and falls back to allocating 4K pages if that fails.
Suppress the warning when the 2MB allocations fails since there is a
valid fall back path.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reported-by: Mike Galbraith <efault@gmx.de>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=104082
CC: stable@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
So that they don't need to indirect through the operation vector.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
If an attempt to allocate memory succeeded, but isn't inside the
supported DMA mask, retry the allocation with GFP_DMA set as a
last resort.
Based on the x86 code, but an off by one error in what is now
dma_coherent_ok has been fixed vs the x86 code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
This allows to dip into zones for lower memory if they are available.
If one of the zones is not available the corresponding GFP_* flag
will evaluate to 0 so they won't change anything. We provide an
arch tunable for those architectures that do not use GFP_DMA for
the lowest 24-bits, given that there are a few.
Roughly based on the x86 code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
This means it uses whatever linear remapping scheme that the architecture
provides is used in the generic dma_direct ops.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
The trivial direct mapping implementation already does a virtual to
physical translation which isn't strictly a noop, and will soon learn
to do non-direct but linear physical to dma translations through the
device offset and a few small tricks. Rename it to a better fitting
name.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Support in-kernel fault-injection framework via debugfs.
This allows you to inject a conditional error to specified
function using debugfs interfaces.
Here is the result of test script described in
Documentation/fault-injection/fault-injection.txt
===========
# ./test_fail_function.sh
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0227404 s, 46.1 MB/s
btrfs-progs v4.4
See http://btrfs.wiki.kernel.org for more information.
Label: (null)
UUID: bfa96010-12e9-4360-aed0-42eec7af5798
Node size: 16384
Sector size: 4096
Filesystem size: 1001.00MiB
Block group profiles:
Data: single 8.00MiB
Metadata: DUP 58.00MiB
System: DUP 12.00MiB
SSD detected: no
Incompat features: extref, skinny-metadata
Number of devices: 1
Devices:
ID SIZE PATH
1 1001.00MiB /dev/loop2
mount: mount /dev/loop2 on /opt/tmpmnt failed: Cannot allocate memory
SUCCESS!
===========
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add injectable error types for each error-injectable function.
One motivation of error injection test is to find software flaws,
mistakes or mis-handlings of expectable errors. If we find such
flaws by the test, that is a program bug, so we need to fix it.
But if the tester miss input the error (e.g. just return success
code without processing anything), it causes unexpected behavior
even if the caller is correctly programmed to handle any errors.
That is not what we want to test by error injection.
To clarify what type of errors the caller must expect for each
injectable function, this introduces injectable error types:
- EI_ETYPE_NULL : means the function will return NULL if it
fails. No ERR_PTR, just a NULL.
- EI_ETYPE_ERRNO : means the function will return -ERRNO
if it fails.
- EI_ETYPE_ERRNO_NULL : means the function will return -ERRNO
(ERR_PTR) or NULL.
ALLOW_ERROR_INJECTION() macro is expanded to get one of
NULL, ERRNO, ERRNO_NULL to record the error type for
each function. e.g.
ALLOW_ERROR_INJECTION(open_ctree, ERRNO)
This error types are shown in debugfs as below.
====
/ # cat /sys/kernel/debug/error_injection/list
open_ctree [btrfs] ERRNO
io_ctl_init [btrfs] ERRNO
====
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Since error-injection framework is not limited to be used
by kprobes, nor bpf. Other kernel subsystems can use it
freely for checking safeness of error-injection, e.g.
livepatch, ftrace etc.
So this separate error-injection framework from kprobes.
Some differences has been made:
- "kprobe" word is removed from any APIs/structures.
- BPF_ALLOW_ERROR_INJECTION() is renamed to
ALLOW_ERROR_INJECTION() since it is not limited for BPF too.
- CONFIG_FUNCTION_ERROR_INJECTION is the config item of this
feature. It is automatically enabled if the arch supports
error injection feature for kprobe or ftrace etc.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
For chacha20_block(), use the existing 32-bit left-rotate function
instead of defining one ourselves.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
BPF alignment tests got a conflict because the registers
are output as Rn_w instead of just Rn in net-next, and
in net a fixup for a testcase prohibits logical operations
on pointers before using them.
Also, we should attempt to patch BPF call args if JIT always on is
enabled. Instead, if we fail to JIT the subprogs we should pass
an error back up and fail immediately.
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf 2018-01-09
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Prevent out-of-bounds speculation in BPF maps by masking the
index after bounds checks in order to fix spectre v1, and
add an option BPF_JIT_ALWAYS_ON into Kconfig that allows for
removing the BPF interpreter from the kernel in favor of
JIT-only mode to make spectre v2 harder, from Alexei.
2) Remove false sharing of map refcount with max_entries which
was used in spectre v1, from Daniel.
3) Add a missing NULL psock check in sockmap in order to fix
a race, from John.
4) Fix test_align BPF selftest case since a recent change in
verifier rejects the bit-wise arithmetic on pointers
earlier but test_align update was missing, from Alexei.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
phys_to_dma, dma_to_phys and dma_capable are helpers published by
architecture code for use of swiotlb and xen-swiotlb only. Drivers are
not supposed to use these directly, but use the DMA API instead.
Move these to a new asm/dma-direct.h helper, included by a
linux/dma-direct.h wrapper that provides the default linear mapping
unless the architecture wants to override it.
In the MIPS case the existing dma-coherent.h is reused for now as
untangling it will take a bit of work.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Robin Murphy <robin.murphy@arm.com>
The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715.
A quote from goolge project zero blog:
"At this point, it would normally be necessary to locate gadgets in
the host kernel code that can be used to actually leak data by reading
from an attacker-controlled location, shifting and masking the result
appropriately and then using the result of that as offset to an
attacker-controlled address for a load. But piecing gadgets together
and figuring out which ones work in a speculation context seems annoying.
So instead, we decided to use the eBPF interpreter, which is built into
the host kernel - while there is no legitimate way to invoke it from inside
a VM, the presence of the code in the host kernel's text section is sufficient
to make it usable for the attack, just like with ordinary ROP gadgets."
To make attacker job harder introduce BPF_JIT_ALWAYS_ON config
option that removes interpreter from the kernel in favor of JIT-only mode.
So far eBPF JIT is supported by:
x64, arm64, arm32, sparc64, s390, powerpc64, mips64
The start of JITed program is randomized and code page is marked as read-only.
In addition "constant blinding" can be turned on with net.core.bpf_jit_harden
v2->v3:
- move __bpf_prog_ret0 under ifdef (Daniel)
v1->v2:
- fix init order, test_bpf and cBPF (Daniel's feedback)
- fix offloaded bpf (Jakub's feedback)
- add 'return 0' dummy in case something can invoke prog->bpf_func
- retarget bpf tree. For bpf-next the patch would need one extra hunk.
It will be sent when the trees are merged back to net-next
Considered doing:
int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
but it seems better to land the patch as-is and in bpf-next remove
bpf_jit_enable global variable from all JITs, consolidate in one place
and remove this jit_init() function.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
dereference_symbol_descriptor() invokes appropriate ARCH specific
function descriptor dereference callbacks:
- dereference_kernel_function_descriptor() if the pointer is a
kernel symbol;
- dereference_module_function_descriptor() if the pointer is a
module symbol.
This is the last step needed to make '%pS/%ps' smart enough to
handle function descriptor dereference on affected ARCHs and
to retire '%pF/%pf'.
To refresh it:
Some architectures (ia64, ppc64, parisc64) use an indirect pointer
for C function pointers - the function pointer points to a function
descriptor and we need to dereference it to get the actual function
pointer.
Function descriptors live in .opd elf section and all affected
ARCHs (ia64, ppc64, parisc64) handle it properly for kernel and
modules. So we, technically, can decide if the dereference is
needed by simply looking at the pointer: if it belongs to .opd
section then we need to dereference it.
The kernel and modules have their own .opd sections, obviously,
that's why we need to split dereference_function_descriptor()
and use separate kernel and module dereference arch callbacks.
Link: http://lkml.kernel.org/r/20171206043649.GB15885@jagdpanzerIV
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: James Bottomley <jejb@parisc-linux.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-ia64@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Tony Luck <tony.luck@intel.com> #ia64
Tested-by: Santosh Sivaraj <santosh@fossix.org> #powerpc
Tested-by: Helge Deller <deller@gmx.de> #parisc64
Signed-off-by: Petr Mladek <pmladek@suse.com>
In support of a soon to be published MFD driver using serdev to talk to
a supervisory processor that uses the CCITT-FALSE CRC16 variant in it's
protocol, this patch was tested successfully on an i.MX6 ARM platform.
Link: http://lkml.kernel.org/r/20170413142932.27287-1-andrew.smirnov@gmail.com
Signed-off-by: Andrey Vostrikov <andrey.vostrikov@cogentembedded.com>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Tested-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Many kernel drivers contain code that allocates and frees both a
scatterlist and the pages that populate that scatterlist.
Introduce functions in lib/scatterlist.c that perform these tasks
instead of duplicating this functionality in multiple drivers.
Only include these functions in the build if CONFIG_SGL_ALLOC=y
to avoid that the kernel size increases if this functionality is
not used.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull crypto fixes from Herbert Xu:
"This fixes the following issues:
- racy use of ctx->rcvused in af_alg
- algif_aead crash in chacha20poly1305
- freeing bogus pointer in pcrypt
- build error on MIPS in mpi
- memory leak in inside-secure
- memory overwrite in inside-secure
- NULL pointer dereference in inside-secure
- state corruption in inside-secure
- build error without CRYPTO_GF128MUL in chelsio
- use after free in n2"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: inside-secure - do not use areq->result for partial results
crypto: inside-secure - fix request allocations in invalidation path
crypto: inside-secure - free requests even if their handling failed
crypto: inside-secure - per request invalidation
lib/mpi: Fix umul_ppmm() for MIPS64r6
crypto: pcrypt - fix freeing pcrypt instances
crypto: n2 - cure use after free
crypto: af_alg - Fix race around ctx->rcvused by making it atomic_t
crypto: chacha20poly1305 - validate the digest size
crypto: chelsio - select CRYPTO_GF128MUL
print_symbol() is a very old API that has been obsoleted by %pS format
specifier in a normal printk() call.
Replace print_symbol() with a direct printk("%pS") call.
Link: http://lkml.kernel.org/r/20171211125025.2270-13-sergey.senozhatsky@gmail.com
To: Andrew Morton <akpm@linux-foundation.org>
To: Russell King <linux@armlinux.org.uk>
To: Catalin Marinas <catalin.marinas@arm.com>
To: Mark Salter <msalter@redhat.com>
To: Tony Luck <tony.luck@intel.com>
To: David Howells <dhowells@redhat.com>
To: Yoshinori Sato <ysato@users.sourceforge.jp>
To: Guan Xuetao <gxt@mprc.pku.edu.cn>
To: Borislav Petkov <bp@alien8.de>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Thomas Gleixner <tglx@linutronix.de>
To: Peter Zijlstra <peterz@infradead.org>
To: Vineet Gupta <vgupta@synopsys.com>
To: Fengguang Wu <fengguang.wu@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-am33-list@redhat.com
Cc: linux-sh@vger.kernel.org
Cc: linux-edac@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-snps-arc@lists.infradead.org
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
[pmladek@suse.com: updated commit message]
Signed-off-by: Petr Mladek <pmladek@suse.com>
Pull RCU updates from Paul E. McKenney:
- Updates to use cond_resched() instead of cond_resched_rcu_qs()
where feasible (currently everywhere except in kernel/rcu and
in kernel/torture.c). Also a couple of fixes to avoid sending
IPIs to offline CPUs.
- Updates to simplify RCU's dyntick-idle handling.
- Updates to remove almost all uses of smp_read_barrier_depends()
and read_barrier_depends().
- Miscellaneous fixes.
- Torture-test updates.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- Move errseq.rst into core-api
- Add errseq to the core-api index
- Promote the header to a more prominent header type, otherwise we get three
entries in the table of contents.
- Reformat the table to look nicer and be a little more proportional in
terms of horizontal width per bit (the SF bit is still disproportionately
large, but there's no way to fix that).
- Include errseq kernel-doc in the errseq.rst
- Neaten some kernel-doc markup
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Pull timer fixes from Thomas Gleixner:
"A pile of fixes for long standing issues with the timer wheel and the
NOHZ code:
- Prevent timer base confusion accross the nohz switch, which can
cause unlocked access and data corruption
- Reinitialize the stale base clock on cpu hotplug to prevent subtle
side effects including rollovers on 32bit
- Prevent an interrupt storm when the timer softirq is already
pending caused by tick_nohz_stop_sched_tick()
- Move the timer start tracepoint to a place where it actually makes
sense
- Add documentation to timerqueue functions as they caused confusion
several times now"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timerqueue: Document return values of timerqueue_add/del()
timers: Invoke timer_start_debug() where it makes sense
nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
timers: Reinitialize per cpu bases on hotplug
timers: Use deferrable base independent of base::nohz_active
Here are 2 driver core fixes for 4.15-rc6, resolving some reported
issues.
The first is a cacheinfo fix for DT based systems to resolve a reported
issue that has been around for a while, and the other is to resolve a
regression in the kobject uevent code that showed up in 4.15-rc1.
Both have been in linux-next for a while with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWkjLCA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yk6GACbBZ/EqSrd5w6//4okaHJLPAi8ligAoNfcBqrH
TPQycbostUcOBHUvvoUB
=MaZG
-----END PGP SIGNATURE-----
Merge tag 'driver-core-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are two driver core fixes for 4.15-rc6, resolving some reported
issues.
The first is a cacheinfo fix for DT based systems to resolve a
reported issue that has been around for a while, and the other is to
resolve a regression in the kobject uevent code that showed up in
4.15-rc1.
Both have been in linux-next for a while with no reported issues"
* tag 'driver-core-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
kobject: fix suppressing modalias in uevents delivered over netlink
drivers: base: cacheinfo: fix cache type for non-architected system cache
The return values of timerqueue_add/del() are not documented in the kernel doc
comment. Add proper documentation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: rt@linutronix.de
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Link: https://lkml.kernel.org/r/20171222145337.872681338@linutronix.de
Even with a number of waitqueues, we can get into a situation where we
are heavily contended on the waitqueue lock. I got a report on spc1
where we're spending seconds doing this. Arguably the use case is nasty,
I reproduce it with one device and 1000 threads banging on the device.
But that doesn't mean we shouldn't be handling it better.
What ends up happening is that a thread will fail to get a tag, add
itself to the waitqueue, and subsequently get woken up when a tag is
freed - only to find itself going back to sleep on the waitqueue.
Instead of waking all threads, use an exclusive wait and wake up our
sbitmap batch count instead. This seems to work well for me (massive
improvement for this use case), and it survives basic testing. But I
haven't fully verified it yet.
An additional improvement is running the queue and checking for a new
tag BEFORE needing to add ourselves to the waitqueue.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Lots of overlapping changes. Also on the net-next side
the XDP state management is handled more in the generic
layers so undo the 'net' nfp fix which isn't applicable
in net-next.
Include a necessary change by Jakub Kicinski, with log message:
====================
cls_bpf no longer takes care of offload tracking. Make sure
netdevsim performs necessary checks. This fixes a warning
caused by TC trying to remove a filter it has not added.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Current MIPS64r6 toolchains aren't able to generate efficient
DMULU/DMUHU based code for the C implementation of umul_ppmm(), which
performs an unsigned 64 x 64 bit multiply and returns the upper and
lower 64-bit halves of the 128-bit result. Instead it widens the 64-bit
inputs to 128-bits and emits a __multi3 intrinsic call to perform a 128
x 128 multiply. This is both inefficient, and it results in a link error
since we don't include __multi3 in MIPS linux.
For example commit 90a53e4432 ("cfg80211: implement regdb signature
checking") merged in v4.15-rc1 recently broke the 64r6_defconfig and
64r6el_defconfig builds by indirectly selecting MPILIB. The same build
errors can be reproduced on older kernels by enabling e.g. CRYPTO_RSA:
lib/mpi/generic_mpih-mul1.o: In function `mpihelp_mul_1':
lib/mpi/generic_mpih-mul1.c:50: undefined reference to `__multi3'
lib/mpi/generic_mpih-mul2.o: In function `mpihelp_addmul_1':
lib/mpi/generic_mpih-mul2.c:49: undefined reference to `__multi3'
lib/mpi/generic_mpih-mul3.o: In function `mpihelp_submul_1':
lib/mpi/generic_mpih-mul3.c:49: undefined reference to `__multi3'
lib/mpi/mpih-div.o In function `mpihelp_divrem':
lib/mpi/mpih-div.c:205: undefined reference to `__multi3'
lib/mpi/mpih-div.c:142: undefined reference to `__multi3'
Therefore add an efficient MIPS64r6 implementation of umul_ppmm() using
inline assembly and the DMULU/DMUHU instructions, to prevent __multi3
calls being emitted.
Fixes: 7fd08ca58a ("MIPS: Add build support for the MIPS R6 ISA")
Signed-off-by: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-mips@linux-mips.org
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Documentation/printk-formats.txt is a candidate for conversion to
ReStructuredText format. Some effort has already been made to do this
conversion even thought the suffix is currently .txt
Changes required to complete conversion
- Move printk-formats.txt to core-api/printk-formats.rst
- Add entry to Documentation/core-api/index.rst
- Remove entry from Documentation/00-INDEX
- Fix minor grammatical errors.
- Order heading adornments as suggested by rst docs.
- Use 'Passed by reference' uniformly.
- Update pointer documentation around %px specifier.
- Fix erroneous double backticks (to commas).
- Remove extraneous double backticks (suggested by Jonathan Corbet).
- Simplify documentation for kobject.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
[jc: downcased "kernel"]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
The commit 4a336a23d6 ("kobject: copy env blob in one go") optimized
constructing uevent data for delivery over netlink by using the raw
environment buffer, instead of reconstructing it from individual
environment pointers. Unfortunately in doing so it broke suppressing
MODALIAS attribute for KOBJ_UNBIND events, as the code that suppressed this
attribute only adjusted the environment pointers, but left the buffer
itself alone. Let's fix it by making sure the offending attribute is
obliterated form the buffer as well.
Reported-by: Tariq Toukan <tariqt@mellanox.com>
Reported-by: Casey Leedom <leedom@chelsio.com>
Fixes: 4a336a23d6 ("kobject: copy env blob in one go")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Borkmann says:
====================
pull-request: bpf 2017-12-17
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix a corner case in generic XDP where we have non-linear skbs
but enough tailroom in the skb to not miss to linearizing there,
from Song.
2) Fix BPF JIT bugs in s390x and ppc64 to not recache skb data when
BPF context is not skb, from Daniel.
3) Fix a BPF JIT bug in sparc64 where recaching skb data after helper
call would use the wrong register for the skb, from Daniel.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Three sets of overlapping changes, two in the packet scheduler
and one in the meson-gxl PHY driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull locking fixes from Ingo Molnar:
"Misc fixes:
- Fix a S390 boot hang that was caused by the lock-break logic.
Remove lock-break to begin with, as review suggested it was
unreasonably fragile and our confidence in its continued good
health is lower than our confidence in its removal.
- Remove the lockdep cross-release checking code for now, because of
unresolved false positive warnings. This should make lockdep work
well everywhere again.
- Get rid of the final (and single) ACCESS_ONCE() straggler and
remove the API from v4.15.
- Fix a liblockdep build warning"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools/lib/lockdep: Add missing declaration of 'pr_cont()'
checkpatch: Remove ACCESS_ONCE() warning
compiler.h: Remove ACCESS_ONCE()
tools/include: Remove ACCESS_ONCE()
tools/perf: Convert ACCESS_ONCE() to READ_ONCE()
locking/lockdep: Remove the cross-release locking checks
locking/core: Remove break_lock field when CONFIG_GENERIC_LOCKBREAK=y
locking/core: Fix deadlock during boot on systems with GENERIC_LOCKBREAK
Add a test that i) uses LD_ABS, ii) zeroing R6 before call, iii) calls
a helper that triggers reload of cached skb data, iv) uses LD_ABS again.
It's added for test_bpf in order to do runtime testing after JITing as
well as test_verifier to test that the sequence is allowed.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a variant of rbtree_replace_node() that maintains the leftmost cache
of struct rbtree_root_cached when replacing nodes within the rbtree.
As drm_mm is the only rb_replace_node() being used on an interval tree,
the mistake looks fairly self-contained. Furthermore the only user of
drm_mm_replace_node() is its testsuite...
Testcase: igt/drm_mm/replace
Link: http://lkml.kernel.org/r/20171122100729.3742-1-chris@chris-wilson.co.uk
Link: https://patchwork.freedesktop.org/patch/msgid/20171109212435.9265-1-chris@chris-wilson.co.uk
Fixes: f808c13fd3 ("lib/interval_tree: fast overlap detection")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y),
while it found a number of old bugs initially, was also causing too many
false positives that caused people to disable lockdep - which is arguably
a worse overall outcome.
If we disable cross-release by default but keep the code upstream then
in practice the most likely outcome is that we'll allow the situation
to degrade gradually, by allowing entropy to introduce more and more
false positives, until it overwhelms maintenance capacity.
Another bad side effect was that people were trying to work around
the false positives by uglifying/complicating unrelated code. There's
a marked difference between annotating locking operations and
uglifying good code just due to bad lock debugging code ...
This gradual decrease in quality happened to a number of debugging
facilities in the kernel, and lockdep is pretty complex already,
so we cannot risk this outcome.
Either cross-release checking can be done right with no false positives,
or it should not be included in the upstream kernel.
( Note that it might make sense to maintain it out of tree and go through
the false positives every now and then and see whether new bugs were
introduced. )
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Update kernel-doc notation in lib/uuid.c and then add UUID/GUID
function interfaces to kernel-api.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
[jc: tweaked the uuid_is_valid() kerneldoc]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Distros have been shipping with CONFIG_STRICT_DEVMEM=y for years now. It
is probably time to flip this default for x86 and arm64.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Laura Abbott <labbott@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kernel-hardening@lists.openwall.com
Link: http://lkml.kernel.org/r/20171201201000.GA44539@beast
Signed-off-by: Ingo Molnar <mingo@kernel.org>
To allocate the array of bucket locks for the hash table we now
call library function alloc_bucket_spinlocks. This function is
based on the old alloc_bucket_locks in rhashtable and should
produce the same effect.
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add two new library functions: alloc_bucket_spinlocks and
free_bucket_spinlocks. These are used to allocate and free an array
of spinlocks that are useful as locks for hash buckets. The interface
specifies the maximum number of spinlocks in the array as well
as a CPU multiplier to derive the number of spinlocks to allocate.
The number allocated is rounded up to a power of two to make the
array amenable to hash lookup.
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function is like rhashtable_walk_next except that it only returns
the current element in the inter and does not advance the iter.
This patch also creates __rhashtable_walk_find_next. It finds the next
element in the table when the entry cached in iter is NULL or at the end
of a slot. __rhashtable_walk_find_next is called from
rhashtable_walk_next and rhastable_walk_peek.
end_of_table is an added field to the iter structure. This indicates
that the end of table was reached (walker.tbl being NULL is not a
sufficient condition for end of table).
Signed-off-by: Tom Herbert <tom@quantonium.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most callers of rhashtable_walk_start don't care about a resize event
which is indicated by a return value of -EAGAIN. So calls to
rhashtable_walk_start are wrapped wih code to ignore -EAGAIN. Something
like this is common:
ret = rhashtable_walk_start(rhiter);
if (ret && ret != -EAGAIN)
goto out;
Since zero and -EAGAIN are the only possible return values from the
function this check is pointless. The condition never evaluates to true.
This patch changes rhashtable_walk_start to return void. This simplifies
code for the callers that ignore -EAGAIN. For the few cases where the
caller cares about the resize event, particularly where the table can be
walked in mulitple parts for netlink or seq file dump, the function
rhashtable_walk_start_check has been added that returns -EAGAIN on a
resize event.
Signed-off-by: Tom Herbert <tom@quantonium.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Destination is a kernel pointer and source - a userland one
in _copy_from_user(); _copy_to_user() is the other way round.
Fixes: d597580d37 ("generic ...copy_..._user primitives")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Callers of sprint_oid() do not check its return value before printing
the result. In the case where the OID is zero-length, -EBADMSG was
being returned without anything being written to the buffer, resulting
in uninitialized stack memory being printed. Fix this by writing
"(bad)" to the buffer in the cases where -EBADMSG is returned.
Fixes: 4f73175d03 ("X.509: Add utility functions to render OIDs as strings")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
In sprint_oid(), if the input buffer were to be more than 1 byte too
small for the first snprintf(), 'bufsize' would underflow, causing a
buffer overflow when printing the remainder of the OID.
Fortunately this cannot actually happen currently, because no users pass
in a buffer that can be too small for the first snprintf().
Regardless, fix it by checking the snprintf() return value correctly.
For consistency also tweak the second snprintf() check to look the same.
Fixes: 4f73175d03 ("X.509: Add utility functions to render OIDs as strings")
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
asn1_ber_decoder() was ignoring errors from actions associated with the
opcodes ASN1_OP_END_SEQ_ACT, ASN1_OP_END_SET_ACT,
ASN1_OP_END_SEQ_OF_ACT, and ASN1_OP_END_SET_OF_ACT. In practice, this
meant the pkcs7_note_signed_info() action (since that was the only user
of those opcodes). Fix it by checking for the error, just like the
decoder does for actions associated with the other opcodes.
This bug allowed users to leak slab memory by repeatedly trying to add a
specially crafted "pkcs7_test" key (requires CONFIG_PKCS7_TEST_KEY).
In theory, this bug could also be used to bypass module signature
verification, by providing a PKCS#7 message that is misparsed such that
a signature's ->authattrs do not contain its ->msgdigest. But it
doesn't seem practical in normal cases, due to restrictions on the
format of the ->authattrs.
Fixes: 42d5ec27f8 ("X.509: Add an ASN.1 decoder")
Cc: <stable@vger.kernel.org> # v3.7+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
In asn1_ber_decoder(), indefinitely-sized ASN.1 items were being passed
to the action functions before their lengths had been computed, using
the bogus length of 0x80 (ASN1_INDEFINITE_LENGTH). This resulted in
reading data past the end of the input buffer, when given a specially
crafted message.
Fix it by rearranging the code so that the indefinite length is resolved
before the action is called.
This bug was originally found by fuzzing the X.509 parser in userspace
using libFuzzer from the LLVM project.
KASAN report (cleaned up slightly):
BUG: KASAN: slab-out-of-bounds in memcpy ./include/linux/string.h:341 [inline]
BUG: KASAN: slab-out-of-bounds in x509_fabricate_name.constprop.1+0x1a4/0x940 crypto/asymmetric_keys/x509_cert_parser.c:366
Read of size 128 at addr ffff880035dd9eaf by task keyctl/195
CPU: 1 PID: 195 Comm: keyctl Not tainted 4.14.0-09238-g1d3b78bbc6e9 #26
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0xd1/0x175 lib/dump_stack.c:53
print_address_description+0x78/0x260 mm/kasan/report.c:252
kasan_report_error mm/kasan/report.c:351 [inline]
kasan_report+0x23f/0x350 mm/kasan/report.c:409
memcpy+0x1f/0x50 mm/kasan/kasan.c:302
memcpy ./include/linux/string.h:341 [inline]
x509_fabricate_name.constprop.1+0x1a4/0x940 crypto/asymmetric_keys/x509_cert_parser.c:366
asn1_ber_decoder+0xb4a/0x1fd0 lib/asn1_decoder.c:447
x509_cert_parse+0x1c7/0x620 crypto/asymmetric_keys/x509_cert_parser.c:89
x509_key_preparse+0x61/0x750 crypto/asymmetric_keys/x509_public_key.c:174
asymmetric_key_preparse+0xa4/0x150 crypto/asymmetric_keys/asymmetric_type.c:388
key_create_or_update+0x4d4/0x10a0 security/keys/key.c:850
SYSC_add_key security/keys/keyctl.c:122 [inline]
SyS_add_key+0xe8/0x290 security/keys/keyctl.c:62
entry_SYSCALL_64_fastpath+0x1f/0x96
Allocated by task 195:
__do_kmalloc_node mm/slab.c:3675 [inline]
__kmalloc_node+0x47/0x60 mm/slab.c:3682
kvmalloc ./include/linux/mm.h:540 [inline]
SYSC_add_key security/keys/keyctl.c:104 [inline]
SyS_add_key+0x19e/0x290 security/keys/keyctl.c:62
entry_SYSCALL_64_fastpath+0x1f/0x96
Fixes: 42d5ec27f8 ("X.509: Add an ASN.1 decoder")
Reported-by: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org> # v3.7+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Commit 28033ae4e0 ("net: netlink: Update attr validation to require
exact length for some types") requires attributes using types NLA_U* and
NLA_S* to have an exact length. This change is exposing bugs in various
userspace commands that are sending attributes with an invalid length
(e.g., attribute has type NLA_U8 and userspace sends NLA_U32). While
the commands are clearly broken and need to be fixed, users are arguing
that the sudden change in enforcement is breaking older commands on
newer kernels for use cases that otherwise "worked".
Relax the validation to print a warning mesage similar to what is done
for messages containing extra bytes after parsing.
Fixes: 28033ae4e0 ("net: netlink: Update attr validation to require exact length for some types")
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the SPDX tag is in all kobject files, that identifies the
license in a specific and legally-defined manner. So the extra GPL text
wording can be removed as it is no longer needed at all.
This is done on a quest to remove the 700+ different ways that files in
the kernel describe the GPL license text. And there's unneeded stuff
like the address (sometimes incorrect) for the FSF which is never
needed.
No copyright headers or other non-license-description text was removed.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's good to have SPDX identifiers in all files to make it easier to
audit the kernel tree for correct licenses.
Update the kobject files files with the correct SPDX license identifier
based on the license text in the file itself. The SPDX identifier is a
legally binding shorthand, which can be used instead of the full boiler
plate text.
This work is based on a script and data from Thomas Gleixner, Philippe
Ombredanne, and Kate Stewart.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that smp_read_barrier_depends() is implied by READ_ONCE(), the several
smp_read_barrier_depends() calls may be removed from lib/assoc_array.c.
This commit makes this change and marks the READ_ONCE() calls that head
address dependencies.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Because READ_ONCE() now implies smp_read_barrier_depends(), this commit
removes the now-redundant smp_read_barrier_depends() following the
READ_ONCE() in __ref_is_percpu().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
This tag contains a handful of small cleanups that are a result of
feedback that didn't make it into our original patch set, either because
the feedback hadn't been given yet, I missed the original emails, or
we weren't ready to submit the changes yet.
I've been maintaining the various cleanup patch sets I have as their own
branches, which I then merged together and signed. Each merge commit
has a short summary of the changes, and each branch is based on your
latest tag (4.15-rc1, in this case). If this isn't the right way to do
this then feel free to suggest something else, but it seems sane to me.
Here's a short summary of the changes, roughly in order of how
interesting they are.
* libgcc.h has been moved from include/lib, where it's the only member,
to include/linux. This is meant to avoid tab completion conflicts.
* VDSO entries for clock_get/gettimeofday/getcpu have been added. These
are simple syscalls now, but we want to let glibc use them from the
start so we can make them faster later.
* A VDSO entry for instruction cache flushing has been added so
userspace can flush the instruction cache.
* The VDSO symbol versions for __vdso_cmpxchg{32,64} have been removed,
as those VDSO entries don't actually exist.
* __io_writes has been corrected to respect the given type.
* A new READ_ONCE in arch_spin_is_locked().
* __test_and_op_bit_ord() is now actually ordered.
* Various small fixes throughout the tree to enable allmodconfig to
build cleanly.
* Removal of some dead code in our atomic support headers.
* Improvements to various comments in our atomic support headers.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAlohyvMTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRDvTKFQLMurQWkhD/wO/F8vrwsNMOWR8zxvHdB30KD+FHmr
X1+X9OqnH8AMd4Woj6pS0ap7g0GCKuLiI/bOTrQVVdTJpmKFaJ9rrwRCJzHq43yt
feRjKyPAFYlvf6YaIEJ3YHU0t3LO1eK27YyFMg6F8y+bZim6oK2GdyfYF0Xiik3B
L3NkDPSH4oplTJjUI+tzDZdMsuZKhxpXPnbNQA7YZLepz04jOPGWqFrA1C3gAaVQ
dj1OkOGTSyQFwia7LrIm2g0J5/mqpjAF0KjdiTsvH6G9x3V0HZYU5Br3kHgauWKc
YrNEbbDl8EakT5QocPf5F4Z8qpO9Hvxjwe2/z27usPtV9FQOPuDDPOygSPykwNNJ
bDfv9nIE3W7lN26BaRcV2ivY3r9ZpCEcq+qXIiTm3P/uTVqjMq54NkvHnj4ON1ih
DJZEgkM9L+rm7c9XDn627FBkmkeEndPJcQ3P/nopb5zGTYTb2HGrUt2nM+KR2vuE
FdYtA9+ll3OzyFO3OVVjiAlxr8Qnwf2wIWXJXxWpcmmchGJ5NeTSZtiD14pAP5eC
EDpoWwefvhqRMGdOlgq/fkx4Mrhz27euWXine3ZccprABAf7Hxkb/N5ojIJKT7qW
mN3HL3PC9P0t/HxQEu0q0NLLsP+X/1yZ5HmDl44Y7N8aeCrIUXaB61gsTt6Oi6Ha
PMJi5PI6VDDQbA==
=CCe+
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-4.15-rc2_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux
Pull RISC-V cleanups and ABI fixes from Palmer Dabbelt:
"This contains a handful of small cleanups that are a result of
feedback that didn't make it into our original patch set, either
because the feedback hadn't been given yet, I missed the original
emails, or we weren't ready to submit the changes yet.
I've been maintaining the various cleanup patch sets I have as their
own branches, which I then merged together and signed. Each merge
commit has a short summary of the changes, and each branch is based on
your latest tag (4.15-rc1, in this case). If this isn't the right way
to do this then feel free to suggest something else, but it seems sane
to me.
Here's a short summary of the changes, roughly in order of how
interesting they are.
- libgcc.h has been moved from include/lib, where it's the only
member, to include/linux. This is meant to avoid tab completion
conflicts.
- VDSO entries for clock_get/gettimeofday/getcpu have been added.
These are simple syscalls now, but we want to let glibc use them
from the start so we can make them faster later.
- A VDSO entry for instruction cache flushing has been added so
userspace can flush the instruction cache.
- The VDSO symbol versions for __vdso_cmpxchg{32,64} have been
removed, as those VDSO entries don't actually exist.
- __io_writes has been corrected to respect the given type.
- A new READ_ONCE in arch_spin_is_locked().
- __test_and_op_bit_ord() is now actually ordered.
- Various small fixes throughout the tree to enable allmodconfig to
build cleanly.
- Removal of some dead code in our atomic support headers.
- Improvements to various comments in our atomic support headers"
* tag 'riscv-for-linus-4.15-rc2_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux: (23 commits)
RISC-V: __io_writes should respect the length argument
move libgcc.h to include/linux
RISC-V: Clean up an unused include
RISC-V: Allow userspace to flush the instruction cache
RISC-V: Flush I$ when making a dirty page executable
RISC-V: Add missing include
RISC-V: Use define for get_cycles like other architectures
RISC-V: Provide stub of setup_profiling_timer()
RISC-V: Export some expected symbols for modules
RISC-V: move empty_zero_page definition to C and export it
RISC-V: io.h: type fixes for warnings
RISC-V: use RISCV_{INT,SHORT} instead of {INT,SHORT} for asm macros
RISC-V: use generic serial.h
RISC-V: remove spin_unlock_wait()
RISC-V: `sfence.vma` orderes the instruction cache
RISC-V: Add READ_ONCE in arch_spin_is_locked()
RISC-V: __test_and_op_bit_ord should be strongly ordered
RISC-V: Remove smb_mb__{before,after}_spinlock()
RISC-V: Remove __smp_bp__{before,after}_atomic
RISC-V: Comment on why {,cmp}xchg is ordered how it is
...
Introducing a new include/lib directory just for this file totally
messes up tab completion for include/linux, which is highly annoying.
Move it to include/linux where we have headers for all kinds of other
lib/ code as well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Instead, just fall back on the new '%p' behavior which hashes the
pointer.
Otherwise, '%pK' - that was intended to mark a pointer as restricted -
just ends up leaking pointers that a normal '%p' wouldn't leak. Which
just make the whole thing pointless.
I suspect we should actually get rid of '%pK' entirely, and make it just
work as '%p' regardless, but this is the minimal obvious fix. People
who actually use 'kptr_restrict' should weigh in on which behavior they
want.
Cc: Tobin Harding <me@tobin.cc>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When chacha20_block() outputs the keystream block, it uses 'u32' stores
directly. However, the callers (crypto/chacha20_generic.c and
drivers/char/random.c) declare the keystream buffer as a 'u8' array,
which is not guaranteed to have the needed alignment.
Fix it by having both callers declare the keystream as a 'u32' array.
For now this is preferable to switching over to the unaligned access
macros because chacha20_block() is only being used in cases where we can
easily control the alignment (stack buffers).
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
printk specifier %p now hashes all addresses before printing. Sometimes
we need to see the actual unmodified address. This can be achieved using
%lx but then we face the risk that if in future we want to change the
way the Kernel handles printing of pointers we will have to grep through
the already existent 50 000 %lx call sites. Let's add specifier %px as a
clear, opt-in, way to print a pointer and maintain some level of
isolation from all the other hex integer output within the Kernel.
Add printk specifier %px to print the actual unmodified address.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently there exist approximately 14 000 places in the kernel where
addresses are being printed using an unadorned %p. This potentially
leaks sensitive information regarding the Kernel layout in memory. Many
of these calls are stale, instead of fixing every call lets hash the
address by default before printing. This will of course break some
users, forcing code printing needed addresses to be updated.
Code that _really_ needs the address will soon be able to use the new
printk specifier %px to print the address.
For what it's worth, usage of unadorned %p can be broken down as
follows (thanks to Joe Perches).
$ git grep -E '%p[^A-Za-z0-9]' | cut -f1 -d"/" | sort | uniq -c
1084 arch
20 block
10 crypto
32 Documentation
8121 drivers
1221 fs
143 include
101 kernel
69 lib
100 mm
1510 net
40 samples
7 scripts
11 security
166 sound
152 tools
2 virt
Add function ptr_to_id() to map an address to a 32 bit unique
identifier. Hash any unadorned usage of specifier %p and any malformed
specifiers.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently code to handle %pK is all within the switch statement in
pointer(). This is the wrong level of abstraction. Each of the other switch
clauses call a helper function, pK should do the same.
Refactor code out of pointer() to new function restricted_pointer().
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Pull timer updates from Thomas Gleixner:
- The final conversion of timer wheel timers to timer_setup().
A few manual conversions and a large coccinelle assisted sweep and
the removal of the old initialization mechanisms and the related
code.
- Remove the now unused VSYSCALL update code
- Fix permissions of /proc/timer_list. I still need to get rid of that
file completely
- Rename a misnomed clocksource function and remove a stale declaration
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
m68k/macboing: Fix missed timer callback assignment
treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts
timer: Remove redundant __setup_timer*() macros
timer: Pass function down to initialization routines
timer: Remove unused data arguments from macros
timer: Switch callback prototype to take struct timer_list * argument
timer: Pass timer_list pointer to callbacks unconditionally
Coccinelle: Remove setup_timer.cocci
timer: Remove setup_*timer() interface
timer: Remove init_timer() interface
treewide: setup_timer() -> timer_setup() (2 field)
treewide: setup_timer() -> timer_setup()
treewide: init_timer() -> setup_timer()
treewide: Switch DEFINE_TIMER callbacks to struct timer_list *
s390: cmm: Convert timers to use timer_setup()
lightnvm: Convert timers to use timer_setup()
drivers/net: cris: Convert timers to use timer_setup()
drm/vc4: Convert timers to use timer_setup()
block/laptop_mode: Convert timers to use timer_setup()
net/atm/mpc: Avoid open-coded assignment of timer callback function
...
General changes:
* Unconfuse get_unmapped_area and point/unpoint driver methods
* New partition parser: sharpslpart
* Kill GENERIC_IO
* Various fixes
NAND changes:
* Add a flag to mark NANDs that require 3 address cycles to encode a
page address
* Set a default ECC/free layout when NAND_ECC_NONE is requested
* Fix a bug in panic_nand_write()
* Another batch of cleanups for the denali driver
* Fix PM support in the atmel driver
* Remove support for platform data in the omap driver
* Fix subpage write in the omap driver
* Fix irq handling in the mtk driver
* Change link order of mtk_ecc and mtk_nand drivers to speed up boot
time
* Change log level of ECC error messages in the mxc driver
* Patch the pxa3xx driver to support Armada 8k platforms
* Add BAM DMA support to the qcom driver
* Convert gpio-nand to the GPIO desc API
* Fix ECC handling in the mt29f driver
SPI-NOR changes:
* Introduce system power management support
* New mechanism to select the proper .quad_enable() hook by JEDEC ID,
when needed, instead of only by manufacturer ID
* Add support to new memory parts from Gigadevice, Winbond, Macronix and
Everspin
* Maintainance for Cadence, Intel, Mediatek and STM32 drivers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJaEzkZAAoJEGb5WYXrGLvBiUMP/25eEatNd5pGo9rtXqX463kp
Q8zXGwtGp7Y2ThtC2TMbSSZZFdhGXIv3AUGpW+Y1yFMzGbiwWh8T28rdgDKDINhl
jQteoWGQnZnnLhsMEbApJUqqtlxKFkY6COv/fUItmN8a4E5SyYF6ARKdnxH36Quu
j/i3Kyd1FjDzJE2jsAE6TuomlNRuj/4S0OiZBTlgMhQvbo282Rush6RmF5zAvsdN
B+S45Q752Pypg3U+1IYkqFSOtSYS3NM1ynZW7YXdWDwcKxDnKvasebSi+wCqPVc8
n6hkcnXKIMOB6/bGhLg3FZlrzJcH7cbxy2C40NKFmMa7gw+/h1bmvjZk9hubLEc3
+EJ8/1e8Z/KNTGu+Iyy2BNHTLI+KFKM5n/7/mpSPHMP/0uQjYs95GUmPlhVrenuv
wprVsQKj7k92E+5Vm/h+Gys67sEG/rQK0v9UEConzl1s2T7i/hnA2lhPfIFmbMU/
9U2s0CFobDqFUh+O6FSkLg9AT7+gT2HA1t6bbDTJMgnbFW72vlDUiArniia9hWOx
dSc5pxMnaSiiqk+uCma4zLv2/3Tyi5dAEMQy+qAlK1EpmwPAsyu3SEMbyraovb9S
PW0YQcMxVlQ/+EdDZCi83ypMlMQE/fDNcuKVMQD9enbko9yKGEgSZsTm9XwIvAv6
g0P5jYMind1aNNSfg/QM
=wVm7
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20171120' of git://git.infradead.org/linux-mtd
Pull MTD updates from Richard Weinberger:
"General changes:
- Unconfuse get_unmapped_area and point/unpoint driver methods
- New partition parser: sharpslpart
- Kill GENERIC_IO
- Various fixes
NAND changes:
- Add a flag to mark NANDs that require 3 address cycles to encode a
page address
- Set a default ECC/free layout when NAND_ECC_NONE is requested
- Fix a bug in panic_nand_write()
- Another batch of cleanups for the denali driver
- Fix PM support in the atmel driver
- Remove support for platform data in the omap driver
- Fix subpage write in the omap driver
- Fix irq handling in the mtk driver
- Change link order of mtk_ecc and mtk_nand drivers to speed up boot
time
- Change log level of ECC error messages in the mxc driver
- Patch the pxa3xx driver to support Armada 8k platforms
- Add BAM DMA support to the qcom driver
- Convert gpio-nand to the GPIO desc API
- Fix ECC handling in the mt29f driver
SPI-NOR changes:
- Introduce system power management support
- New mechanism to select the proper .quad_enable() hook by JEDEC
ID, when needed, instead of only by manufacturer ID
- Add support to new memory parts from Gigadevice, Winbond, Macronix
and Everspin
- Maintainance for Cadence, Intel, Mediatek and STM32 drivers"
* tag 'for-linus-20171120' of git://git.infradead.org/linux-mtd: (85 commits)
mtd: Avoid probe failures when mtd->dbg.dfs_dir is invalid
mtd: sharpslpart: Add sharpslpart partition parser
mtd: Add sanity checks in mtd_write/read_oob()
mtd: remove the get_unmapped_area method
mtd: implement mtd_get_unmapped_area() using the point method
mtd: chips/map_rom.c: implement point and unpoint methods
mtd: chips/map_ram.c: implement point and unpoint methods
mtd: mtdram: properly handle the phys argument in the point method
mtd: mtdswap: fix spelling mistake: 'TRESHOLD' -> 'THRESHOLD'
mtd: slram: use memremap() instead of ioremap()
kconfig: kill off GENERIC_IO option
mtd: Fix C++ comment in include/linux/mtd/mtd.h
mtd: constify mtd_partition
mtd: plat-ram: Replace manual resource management by devm
mtd: nand: Fix writing mtdoops to nand flash.
mtd: intel-spi: Add Intel Lewisburg PCH SPI super SKU PCI ID
mtd: nand: mtk: fix infinite ECC decode IRQ issue
mtd: spi-nor: Add support for mr25h128
mtd: nand: mtk: change the compile sequence of mtk_nand.o and mtk_ecc.o
mtd: spi-nor: enable 4B opcodes for mx66l51235l
...
This changes all DEFINE_TIMER() callbacks to use a struct timer_list
pointer instead of unsigned long. Since the data argument has already been
removed, none of these callbacks are using their argument currently, so
this renames the argument to "unused".
Done using the following semantic patch:
@match_define_timer@
declarer name DEFINE_TIMER;
identifier _timer, _callback;
@@
DEFINE_TIMER(_timer, _callback);
@change_callback depends on match_define_timer@
identifier match_define_timer._callback;
type _origtype;
identifier _origarg;
@@
void
-_callback(_origtype _origarg)
+_callback(struct timer_list *unused)
{ ... }
Signed-off-by: Kees Cook <keescook@chromium.org>
The flag enables Clang instrumentation of comparison operations
(currently not supported by GCC). This instrumentation is needed by the
new KCOV device to collect comparison operands.
Link: http://lkml.kernel.org/r/20171011095459.70721-2-glider@google.com
Signed-off-by: Victor Chibotaru <tchibo@google.com>
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: <syzkaller@googlegroups.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
find_bit functions are widely used in the kernel, including hot paths.
This module tests performance of those functions in 2 typical scenarios:
randomly filled bitmap with relatively equal distribution of set and
cleared bits, and sparse bitmap which has 1 set bit for 500 cleared
bits.
On ThunderX machine:
Start testing find_bit() with random-filled bitmap
find_next_bit: 240043 cycles, 164062 iterations
find_next_zero_bit: 312848 cycles, 163619 iterations
find_last_bit: 193748 cycles, 164062 iterations
find_first_bit: 177720874 cycles, 164062 iterations
Start testing find_bit() with sparse bitmap
find_next_bit: 3633 cycles, 656 iterations
find_next_zero_bit: 620399 cycles, 327025 iterations
find_last_bit: 3038 cycles, 656 iterations
find_first_bit: 691407 cycles, 656 iterations
[arnd@arndb.de: use correct format string for find-bit tests]
Link: http://lkml.kernel.org/r/20171113135605.3166307-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/20171109140714.13168-1-ynorov@caviumnetworks.com
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Clement Courbet <courbet@google.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fengguang reported soft lockups while running the rbtree and interval
tree test modules. The logic for these tests all occur in init phase,
and we currently are pounding with the default values for number of
nodes and number of iterations of each test. Reduce the latter by two
orders of magnitude. This does not influence the value of the tests in
that one thousand times by default is enough to get the picture.
Link: http://lkml.kernel.org/r/20171109161715.xai2dtwqw2frhkcm@linux-n805
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the amount of resources allocated to a gen_pool exceeds 2^32 then the
avail atomic overflows and this causes problems when clients try and
borrow resources from the pool. This is only expected to be an issue on
64 bit systems.
Add the <linux/atomic.h> header to pull in atomic_long* operations. So
that 32 bit systems continue to use atomic32_t but 64 bit systems can
use atomic64_t.
Link: http://lkml.kernel.org/r/1509033843-25667-1-git-send-email-sbates@raithlin.com
Signed-off-by: Stephen Bates <sbates@raithlin.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Daniel Mentz <danielmentz@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Our current int_sqrt() is not rough nor any approximation; it calculates
the exact value of: floor(sqrt()). Document this.
Link: http://lkml.kernel.org/r/20171020164645.001652117@infradead.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Anshul Garg <aksgarg1989@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Michael Davidson <md@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The initial value (@m) compute is:
m = 1UL << (BITS_PER_LONG - 2);
while (m > x)
m >>= 2;
Which is a linear search for the highest even bit smaller or equal to @x
We can implement this using a binary search using __fls() (or better when
its hardware implemented).
m = 1UL << (__fls(x) & ~1UL);
Especially for small values of @x; which are the more common arguments
when doing a CDF on idle times; the linear search is near to worst case,
while the binary search of __fls() is a constant 6 (or 5 on 32bit)
branches.
cycles: branches: branch-misses:
PRE:
hot: 43.633557 +- 0.034373 45.333132 +- 0.002277 0.023529 +- 0.000681
cold: 207.438411 +- 0.125840 45.333132 +- 0.002277 6.976486 +- 0.004219
SOFTWARE FLS:
hot: 29.576176 +- 0.028850 26.666730 +- 0.004511 0.019463 +- 0.000663
cold: 165.947136 +- 0.188406 26.666746 +- 0.004511 6.133897 +- 0.004386
HARDWARE FLS:
hot: 24.720922 +- 0.025161 20.666784 +- 0.004509 0.020836 +- 0.000677
cold: 132.777197 +- 0.127471 20.666776 +- 0.004509 5.080285 +- 0.003874
Averages computed over all values <128k using a LFSR to generate order.
Cold numbers have a LFSR based branch trace buffer 'confuser' ran between
each int_sqrt() invocation.
Link: http://lkml.kernel.org/r/20171020164644.936577234@infradead.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Joe Perches <joe@perches.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Anshul Garg <aksgarg1989@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Michael Davidson <md@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The current int_sqrt() computation is sub-optimal for the case of small
@x. Which is the interesting case when we're going to do cumulative
distribution functions on idle times, which we assume to be a random
variable, where the target residency of the deepest idle state gives an
upper bound on the variable (5e6ns on recent Intel chips).
In the case of small @x, the compute loop:
while (m != 0) {
b = y + m;
y >>= 1;
if (x >= b) {
x -= b;
y += m;
}
m >>= 2;
}
can be reduced to:
while (m > x)
m >>= 2;
Because y==0, b==m and until x>=m y will remain 0.
And while this is computationally equivalent, it runs much faster
because there's less code, in particular less branches.
cycles: branches: branch-misses:
OLD:
hot: 45.109444 +- 0.044117 44.333392 +- 0.002254 0.018723 +- 0.000593
cold: 187.737379 +- 0.156678 44.333407 +- 0.002254 6.272844 +- 0.004305
PRE:
hot: 67.937492 +- 0.064124 66.999535 +- 0.000488 0.066720 +- 0.001113
cold: 232.004379 +- 0.332811 66.999527 +- 0.000488 6.914634 +- 0.006568
POST:
hot: 43.633557 +- 0.034373 45.333132 +- 0.002277 0.023529 +- 0.000681
cold: 207.438411 +- 0.125840 45.333132 +- 0.002277 6.976486 +- 0.004219
Averages computed over all values <128k using a LFSR to generate order.
Cold numbers have a LFSR based branch trace buffer 'confuser' ran between
each int_sqrt() invocation.
Link: http://lkml.kernel.org/r/20171020164644.876503355@infradead.org
Fixes: 30493cc9dd ("lib/int_sqrt.c: optimize square root algorithm")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Anshul Garg <aksgarg1989@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Joe Perches <joe@perches.com>
Cc: David Miller <davem@davemloft.net>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Davidson <md@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Omit extra messages for a memory allocation failure in these functions.
This issue was detected by using the Coccinelle software.
Link: http://lkml.kernel.org/r/410a4c5a-4ee0-6fcc-969c-103d8e496b78@users.sourceforge.net
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Extract the string test code into its own source file, to allow
compiling it either to a loadable module, or built into the kernel.
Fixes: 03270c13c5 ("lib/string.c: add testcases for memset16/32/64")
Link: http://lkml.kernel.org/r/1505397744-3387-1-git-send-email-geert@linux-m68k.org
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
line-range is supposed to treat "1-" as "1-endoffile", so
handle the special case by setting last_lineno to UINT_MAX.
Fixes this error:
dynamic_debug:ddebug_parse_query: last-line:0 < 1st-line:1
dynamic_debug:ddebug_exec_query: query parse failed
Link: http://lkml.kernel.org/r/10a6a101-e2be-209f-1f41-54637824788e@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The "cut here" string is used in a few paths. Define it in a single
place.
Link: http://lkml.kernel.org/r/1510100869-73751-3-git-send-email-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some architectures store the WARN_ONCE state in the flags field of the
bug_entry. Clear that one too when resetting once state through
/sys/kernel/debug/clear_warn_once
Pointed out by Michael Ellerman
Improves the earlier patch that add clear_warn_once.
[ak@linux.intel.com: add a missing ifdef CONFIG_MODULES]
Link: http://lkml.kernel.org/r/20171020170633.9593-1-andi@firstfloor.org
[akpm@linux-foundation.org: fix unused var warning]
[akpm@linux-foundation.org: Use 0200 for clear_warn_once file, per mpe]
[akpm@linux-foundation.org: clear BUGFLAG_DONE in clear_once_table(), per mpe]
Link: http://lkml.kernel.org/r/20171019204642.7404-1-andi@firstfloor.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
dma-debug reports the following warning:
WARNING: CPU: 3 PID: 298 at kernel-4.4/lib/dma-debug.c:604
debug _dma_assert_idle+0x1a8/0x230()
DMA-API: cpu touching an active dma mapped cacheline [cln=0x00000882300]
CPU: 3 PID: 298 Comm: vold Tainted: G W O 4.4.22+ #1
Hardware name: MT6739 (DT)
Call trace:
debug_dma_assert_idle+0x1a8/0x230
wp_page_copy.isra.96+0x118/0x520
do_wp_page+0x4fc/0x534
handle_mm_fault+0xd4c/0x1310
do_page_fault+0x1c8/0x394
do_mem_abort+0x50/0xec
I found that debug_dma_alloc_coherent() and debug_dma_free_coherent()
assume that dma_alloc_coherent() always returns a linear address.
However it's possible that dma_alloc_coherent() returns a non-linear
address. In this case, page_to_pfn(virt_to_page(virt)) will return an
incorrect pfn. If the pfn is valid and mapped as a COW page, we will
hit the warning when doing wp_page_copy().
Fix this by calculating pfn for linear and non-linear addresses.
[miles.chen@mediatek.com: v4]
Link: http://lkml.kernel.org/r/1510872972-23919-1-git-send-email-miles.chen@mediatek.com
Link: http://lkml.kernel.org/r/1506484087-1177-1-git-send-email-miles.chen@mediatek.com
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull iov_iter updates from Al Viro:
- bio_{map,copy}_user_iov() series; those are cleanups - fixes from the
same pile went into mainline (and stable) in late September.
- fs/iomap.c iov_iter-related fixes
- new primitive - iov_iter_for_each_range(), which applies a function
to kernel-mapped segments of an iov_iter.
Usable for kvec and bvec ones, the latter does kmap()/kunmap() around
the callback. _Not_ usable for iovec- or pipe-backed iov_iter; the
latter is not hard to fix if the need ever appears, the former is by
design.
Another related primitive will have to wait for the next cycle - it
passes page + offset + size instead of pointer + size, and that one
will be usable for everything _except_ kvec. Unfortunately, that one
didn't get exposure in -next yet, so...
- a bit more lustre iov_iter work, including a use case for
iov_iter_for_each_range() (checksum calculation)
- vhost/scsi leak fix in failure exit
- misc cleanups and detritectomy...
* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (21 commits)
iomap_dio_actor(): fix iov_iter bugs
switch ksocknal_lib_recv_...() to use of iov_iter_for_each_range()
lustre: switch struct ksock_conn to iov_iter
vhost/scsi: switch to iov_iter_get_pages()
fix a page leak in vhost_scsi_iov_to_sgl() error recovery
new primitive: iov_iter_for_each_range()
lnet_return_rx_credits_locked: don't abuse list_entry
xen: don't open-code iov_iter_kvec()
orangefs: remove detritus from struct orangefs_kiocb_s
kill iov_shorten()
bio_alloc_map_data(): do bmd->iter setup right there
bio_copy_user_iov(): saner bio size calculation
bio_map_user_iov(): get rid of copying iov_iter
bio_copy_from_iter(): get rid of copying iov_iter
move more stuff down into bio_copy_user_iov()
blk_rq_map_user_iov(): move iov_iter_advance() down
bio_map_user_iov(): get rid of the iov_for_each()
bio_map_user_iov(): move alignment check into the main loop
don't rely upon subsequent bio_add_pc_page() calls failing
... and with iov_iter_get_pages_alloc() it becomes even simpler
...
Here is the set of driver core / debugfs patches for 4.15-rc1.
Not many here, mostly all are debugfs fixes to resolve some
long-reported problems with files going away with references to them in
userspace. There's also some SPDX cleanups for the debugfs code, as
well as a few other minor driver core changes for issues reported by
people.
All of these have been in linux-next for a week or more with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWg2NCA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymUNgCfYq434CFh+YtwITBNYdqkFYFf0ZAAn3qfhh2+
M3rmZzwk2MKBvNQ2npvt
=/8+Y
-----END PGP SIGNATURE-----
Merge tag 'driver-core-4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of driver core / debugfs patches for 4.15-rc1.
Not many here, mostly all are debugfs fixes to resolve some
long-reported problems with files going away with references to them
in userspace. There's also some SPDX cleanups for the debugfs code, as
well as a few other minor driver core changes for issues reported by
people.
All of these have been in linux-next for a week or more with no
reported issues"
* tag 'driver-core-4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: Fix device link deferred probe
debugfs: Remove redundant license text
debugfs: add SPDX identifiers to all debugfs files
debugfs: defer debugfs_fsdata allocation to first usage
debugfs: call debugfs_real_fops() only after debugfs_file_get()
debugfs: purge obsolete SRCU based removal protection
IB/hfi1: convert to debugfs_file_get() and -put()
debugfs: convert to debugfs_file_get() and -put()
debugfs: debugfs_real_fops(): drop __must_hold sparse annotation
debugfs: implement per-file removal protection
debugfs: add support for more elaborate ->d_fsdata
driver core: Move device_links_purge() after bus_remove_device()
arch_topology: Fix section miss match warning due to free_raw_capacity()
driver-core: pr_err() strings should end with newlines
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJaCm8RAAoJEAx081l5xIa+zX0QAJSm31kCG3vdw2CNiRx25L3q
3hcsEOgAjVJ9FQVGKFWjzb8TK35tSqtNx5kWIj0VGaIfBE5Bdg5SLLgKKUYas8rY
4LaphqICq2uxu2BNa2tpiar/sHhAnuozwQ4czpVWXzlaISnb9yYzRl7gMuyUVGkx
+Gih5VUhLmQC0HsRTLJ3vaZQoUsLAl2gAjKcWa1bx57j2S+iKOPfsLaq7VYo+y1I
Njc+iSGqMhJzRLXVkxL2lQKaslp7R38Bbh5K4Kvyjkm4Aq7zErOF6irpOXKMcrGl
mwnr89vf1G9thjikrBaXpKnuvdbWYveoN/ORMlTdCfxkFnChHLnm3bd7NJ49RXDN
Hv/Iq9YYjmZ9GTatxnx7lWtmXnZXC5he1yn1JAuz/yt7/0b/Wx+Mu/wEpBXYNFTd
1AZdD586i+AmPo3yDkqH9nBu8JC0W0AnS9VZma4LVvZOP2UfJmj5Im1CLHItbGDN
FnUCkwyD/lJUUk+WgT+w/GOMJgmFHDiFFl4tFtYVVjrUirpCFVguSKG9xuv6tT8P
8iRsoP7RrcmDN9ojN2SEHwcpsAv3HnKkDv+9+GIbWnrGsSbCPq8Qm+JDSvf4h22I
K5lwNpJrcpSKI+q10L7w2xliTBwb98sJkWGA/rssomrdBOWteGZAyqFRYAVgQ+mJ
x/nJurIqQYh2KQN9+uLG
=xVV2
-----END PGP SIGNATURE-----
Merge tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
"This is the main drm pull request for v4.15.
Core:
- Atomic object lifetime fixes
- Atomic iterator improvements
- Sparse/smatch fixes
- Legacy kms ioctls to be interruptible
- EDID override improvements
- fb/gem helper cleanups
- Simple outreachy patches
- Documentation improvements
- Fix dma-buf rcu races
- DRM mode object leasing for improving VR use cases.
- vgaarb improvements for non-x86 platforms.
New driver:
- tve200: Faraday Technology TVE200 block.
This "TV Encoder" encodes a ITU-T BT.656 stream and can be found in
the StorLink SL3516 (later Cortina Systems CS3516) as well as the
Grain Media GM8180.
New bridges:
- SiI9234 support
New panels:
- S6E63J0X03, OTM8009A, Seiko 43WVF1G, 7" rpi touch panel, Toshiba
LT089AC19000, Innolux AT043TN24
i915:
- Remove Coffeelake from alpha support
- Cannonlake workarounds
- Infoframe refactoring for DisplayPort
- VBT updates
- DisplayPort vswing/emph/buffer translation refactoring
- CCS fixes
- Restore GPU clock boost on missed vblanks
- Scatter list updates for userptr allocations
- Gen9+ transition watermarks
- Display IPC (Isochronous Priority Control)
- Private PAT management
- GVT: improved error handling and pci config sanitizing
- Execlist refactoring
- Transparent Huge Page support
- User defined priorities support
- HuC/GuC firmware refactoring
- DP MST fixes
- eDP power sequencing fixes
- Use RCU instead of stop_machine
- PSR state tracking support
- Eviction fixes
- BDW DP aux channel timeout fixes
- LSPCON fixes
- Cannonlake PLL fixes
amdgpu:
- Per VM BO support
- Powerplay cleanups
- CI powerplay support
- PASID mgr for kfd
- SR-IOV fixes
- initial GPU reset for vega10
- Prime mmap support
- TTM updates
- Clock query interface for Raven
- Fence to handle ioctl
- UVD encode ring support on Polaris
- Transparent huge page DMA support
- Compute LRU pipe tweaks
- BO flag to allow buffers to opt out of implicit sync
- CTX priority setting API
- VRAM lost infrastructure plumbing
qxl:
- fix flicker since atomic rework
amdkfd:
- Further improvements from internal AMD tree
- Usermode events
- Drop radeon support
nouveau:
- Pascal temperature sensor support
- Improved BAR2 handling
- MMU rework to support Pascal MMU
exynos:
- Improved HDMI/mixer support
- HDMI audio interface support
tegra:
- Prep work for tegra186
- Cleanup/fixes
msm:
- Preemption support for a5xx
- Display fixes for 8x96 (snapdragon 820)
- Async cursor plane fixes
- FW loading rework
- GPU debugging improvements
vc4:
- Prep for DSI panels
- fix T-format tiling scanout
- New madvise ioctl
Rockchip:
- LVDS support
omapdrm:
- omap4 HDMI CEC support
etnaviv:
- GPU performance counters groundwork
sun4i:
- refactor driver load + TCON backend
- HDMI improvements
- A31 support
- Misc fixes
udl:
- Probe/EDID read fixes.
tilcdc:
- Misc fixes.
pl111:
- Support more variants
adv7511:
- Improve EDID handling.
- HDMI CEC support
sii8620:
- Add remote control support"
* tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux: (1480 commits)
drm/rockchip: analogix_dp: Use mutex rather than spinlock
drm/mode_object: fix documentation for object lookups.
drm/i915: Reorder context-close to avoid calling i915_vma_close() under RCU
drm/i915: Move init_clock_gating() back to where it was
drm/i915: Prune the reservation shared fence array
drm/i915: Idle the GPU before shinking everything
drm/i915: Lock llist_del_first() vs llist_del_all()
drm/i915: Calculate ironlake intermediate watermarks correctly, v2.
drm/i915: Disable lazy PPGTT page table optimization for vGPU
drm/i915/execlists: Remove the priority "optimisation"
drm/i915: Filter out spurious execlists context-switch interrupts
drm/amdgpu: use irq-safe lock for kiq->ring_lock
drm/amdgpu: bypass lru touch for KIQ ring submission
drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories()
drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs()
drm/amd/powerplay: initialize a variable before using it
drm/amd/powerplay: suppress KASAN out of bounds warning in vega10_populate_all_memory_levels
drm/amd/amdgpu: fix evicted VRAM bo adjudgement condition
drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug
drm/rockchip: add CONFIG_OF dependency for lvds
...
Merge updates from Andrew Morton:
- a few misc bits
- ocfs2 updates
- almost all of MM
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (131 commits)
memory hotplug: fix comments when adding section
mm: make alloc_node_mem_map a void call if we don't have CONFIG_FLAT_NODE_MEM_MAP
mm: simplify nodemask printing
mm,oom_reaper: remove pointless kthread_run() error check
mm/page_ext.c: check if page_ext is not prepared
writeback: remove unused function parameter
mm: do not rely on preempt_count in print_vma_addr
mm, sparse: do not swamp log with huge vmemmap allocation failures
mm/hmm: remove redundant variable align_end
mm/list_lru.c: mark expected switch fall-through
mm/shmem.c: mark expected switch fall-through
mm/page_alloc.c: broken deferred calculation
mm: don't warn about allocations which stall for too long
fs: fuse: account fuse_inode slab memory as reclaimable
mm, page_alloc: fix potential false positive in __zone_watermark_ok
mm: mlock: remove lru_add_drain_all()
mm, sysctl: make NUMA stats configurable
shmem: convert shmem_init_inodecache() to void
Unify migrate_pages and move_pages access checks
mm, pagevec: rename pagevec drained field
...
During truncation, the mapping has already been checked for shmem and
dax so it's known that workingset_update_node is required.
This patch avoids the checks on mapping for each page being truncated.
In all other cases, a lookup helper is used to determine if
workingset_update_node() needs to be called. The one danger is that the
API is slightly harder to use as calling workingset_update_node directly
without checking for dax or shmem mappings could lead to surprises.
However, the API rarely needs to be used and hopefully the comment is
enough to give people the hint.
sparsetruncate (tiny)
4.14.0-rc4 4.14.0-rc4
oneirq-v1r1 pickhelper-v1r1
Min Time 141.00 ( 0.00%) 140.00 ( 0.71%)
1st-qrtle Time 142.00 ( 0.00%) 141.00 ( 0.70%)
2nd-qrtle Time 142.00 ( 0.00%) 142.00 ( 0.00%)
3rd-qrtle Time 143.00 ( 0.00%) 143.00 ( 0.00%)
Max-90% Time 144.00 ( 0.00%) 144.00 ( 0.00%)
Max-95% Time 147.00 ( 0.00%) 145.00 ( 1.36%)
Max-99% Time 195.00 ( 0.00%) 191.00 ( 2.05%)
Max Time 230.00 ( 0.00%) 205.00 ( 10.87%)
Amean Time 144.37 ( 0.00%) 143.82 ( 0.38%)
Stddev Time 10.44 ( 0.00%) 9.00 ( 13.74%)
Coeff Time 7.23 ( 0.00%) 6.26 ( 13.41%)
Best99%Amean Time 143.72 ( 0.00%) 143.34 ( 0.26%)
Best95%Amean Time 142.37 ( 0.00%) 142.00 ( 0.26%)
Best90%Amean Time 142.19 ( 0.00%) 141.85 ( 0.24%)
Best75%Amean Time 141.92 ( 0.00%) 141.58 ( 0.24%)
Best50%Amean Time 141.69 ( 0.00%) 141.31 ( 0.27%)
Best25%Amean Time 141.38 ( 0.00%) 140.97 ( 0.29%)
As you'd expect, the gain is marginal but it can be detected. The
differences in bonnie are all within the noise which is not surprising
given the impact on the microbenchmark.
radix_tree_update_node_t is a callback for some radix operations that
optionally passes in a private field. The only user of the callback is
workingset_update_node and as it no longer requires a mapping, the
private field is removed.
Link: http://lkml.kernel.org/r/20171018075952.10627-3-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking updates from David Miller:
"Highlights:
1) Maintain the TCP retransmit queue using an rbtree, with 1GB
windows at 100Gb this really has become necessary. From Eric
Dumazet.
2) Multi-program support for cgroup+bpf, from Alexei Starovoitov.
3) Perform broadcast flooding in hardware in mv88e6xxx, from Andrew
Lunn.
4) Add meter action support to openvswitch, from Andy Zhou.
5) Add a data meta pointer for BPF accessible packets, from Daniel
Borkmann.
6) Namespace-ify almost all TCP sysctl knobs, from Eric Dumazet.
7) Turn on Broadcom Tags in b53 driver, from Florian Fainelli.
8) More work to move the RTNL mutex down, from Florian Westphal.
9) Add 'bpftool' utility, to help with bpf program introspection.
From Jakub Kicinski.
10) Add new 'cpumap' type for XDP_REDIRECT action, from Jesper
Dangaard Brouer.
11) Support 'blocks' of transformations in the packet scheduler which
can span multiple network devices, from Jiri Pirko.
12) TC flower offload support in cxgb4, from Kumar Sanghvi.
13) Priority based stream scheduler for SCTP, from Marcelo Ricardo
Leitner.
14) Thunderbolt networking driver, from Amir Levy and Mika Westerberg.
15) Add RED qdisc offloadability, and use it in mlxsw driver. From
Nogah Frankel.
16) eBPF based device controller for cgroup v2, from Roman Gushchin.
17) Add some fundamental tracepoints for TCP, from Song Liu.
18) Remove garbage collection from ipv6 route layer, this is a
significant accomplishment. From Wei Wang.
19) Add multicast route offload support to mlxsw, from Yotam Gigi"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2177 commits)
tcp: highest_sack fix
geneve: fix fill_info when link down
bpf: fix lockdep splat
net: cdc_ncm: GetNtbFormat endian fix
openvswitch: meter: fix NULL pointer dereference in ovs_meter_cmd_reply_start
netem: remove unnecessary 64 bit modulus
netem: use 64 bit divide by rate
tcp: Namespace-ify sysctl_tcp_default_congestion_control
net: Protect iterations over net::fib_notifier_ops in fib_seq_sum()
ipv6: set all.accept_dad to 0 by default
uapi: fix linux/tls.h userspace compilation error
usbnet: ipheth: prevent TX queue timeouts when device not ready
vhost_net: conditionally enable tx polling
uapi: fix linux/rxrpc.h userspace compilation errors
net: stmmac: fix LPI transitioning for dwmac4
atm: horizon: Fix irq release error
net-sysfs: trigger netlink notification on ifalias change via sysfs
openvswitch: Using kfree_rcu() to simplify the code
openvswitch: Make local function ovs_nsh_key_attr_size() static
openvswitch: Fix return value check in ovs_meter_cmd_features()
...
This tag contains the core RISC-V Linux port, which has been through
nine rounds of review on various mailing lists. The port is not
complete: there's some cleanup patches moving through the review
process, a whole bunch of drivers that need some work, and a lot of
feature additions that will be needed.
The patches contained in this tag have been through nine rounds of
review on the various mailing lists. I have some outstanding cleanup
patches, but since there's been so much review on these patches I
thought it would be best to submit them as-is and then submit explicit
cleanup patches so everyone can review them. This first patch set is
big enough that it's a bit of a pain to constantly rewrite, and it's
caused a few headaches with various contributors.
The port is definately a work in progress. While what's there builds
and boots with 4.14, it's a bit hard to actually see anything happen
because there are no device drivers yet. I maintain a staging branch
that contains all the device drivers and cleanup that actually works,
but those patches won't all be ready for a while. I'd like to get what
we currently have into your tree so everyone can start working from a
single base -- of particular importance is allowing the glibc
upstreaming process to proceed so we can sort out any possibly lingering
user-visible ABI problems we might have.
Copied below is the ChangeLog that contains the history of this patch
set:
(v9) As per suggestions on our v8 patch set, I've split the core architecture code
out from our drivers and would like to submit this patch set to be included
into linux-next, with the goal being to be merged in during the next merge
window. This patch set is based on 4.14-rc2, but if it's better to have it
based on something else then I can change it around.
This patch set contains just the core arch code for RISC-V, so while it builds
an nominally boots, you can't print or take an interrupt so it's not that
useful. If you're looking to actually boot a system it would probably be
better to use the full patch set listed below.
We've collected a handful of tags from reviewers, and the remainder of the
patch set only got minimal feedback last time. Here's what changed:
* We now use the device tree to initialize the timer driver so it's less
tighly coupled with the arch port.
* I cleaned up the defconfigs -- there's actually now just one, and it's
empty. For now I think we're OK with what the kernel sets as defaults, but
I anticipate we'll begin to expand this as people start to use the port
more.
* The VDSO symbols version is sane.
* We WFI while spinning in the boot loop.
* A handful of comments have been added.
While there are still a handful of FIXMEs in this patch set, we've started to
get enough interest from various users and contributors that maintaining an out
of tree patch set is starting to become a big burden. Hopefully the patches
are good enough to merge now, which will at least get everyone working in a
more reasonable manner as we clean up the remaining issues.
This patch set is also availiable on github
https://github.com/riscv/riscv-linux/tree/riscv-for-submission-v9-arch
as is the entire patch set necessary to get a more functional RISC-V system up
and running, including a handful of patches that aren't ready for upstream yet.
https://github.com/riscv/riscv-linux/tree/riscv-for-submission-v9
Hopefully I've managed to get everyone's feedback
Here's the change highlights from the whole patch set:
(v8) I know it may not be the ideal time to submit a patch set right now, as
it's the middle of the merge window, but things have calmed down quite a bit in
the last month so I thought it would be good to get everyone on the same page.
There's been a handful of changes since the last patch set, but most of them
are fairly minor:
* We changed PAGE_OFFSET to allowing mapping more physical memory on 64-bit
systems. This is user configurable, as it triggers a different code model
that generates slightly less efficient code.
* The device tree binding documentation is back, I'd managed to lose it at some
point.
* We now pass the atomic64 test suite. The SBI timer driver has been
* refactored.
(v7) It's been a while since my last patch set, but the changes han been fairly
minimal:
* The PCI cleanup patches have been dropped, we'll do them as a separate patch
set later.
* We've the Kconfig entries from CONFIG_ISA_* to CONFIG_RISCV_ISA_*, to make
grep easier.
* There have been a handful of memory model related tweaks in I/O land,
particularly relating the PCI and the upcoming platform specification.
There are significant comments in the relevant files. This is still a WIP,
but I think we're close to getting as good as we're going to get until we
end up with some more specifications.
(v6) As it's been only a day since the v5 patch set, the changes are pretty
minimal:
* The patch set is now based on linux-next/master, which I believe is a better
base now that we're getting closer to upstream.
* EARLY_PRINTK is no longer an option. Since the SBI console is reasonable,
there's no penalty to enabling it (and thus no benefit to disabling it).
* The mmap syscalls were refactored a bit.
(v5) Things have really started to calm down, so this is fairly similar to the
v4 patch set. The most interesting changes include:
* We've moved back to a single patch set.
* SMP support has been fixed, I was accidentally running on a non-SMP
configuration. There were various mistakes all over the tree as a result of
this.
* The cmpxchg syscalls have been removed, as they were deemed a bad idea. As
a result, RISC-V Linux systems mandate the A extension. The corresponding
Kconfig entry to enable builds on non-A systems has been removed.
* A few more atomic fixes: mostly fence changes, but those resulted in a
handful of additional macros that were no longer necessary.
* riscv_early_sie has been removed.
(v4) There have only been a few changes since the v3 patch set:
* The cmpxchg64 syscall is no longer enabled on 32-bit systems. It's not
possible to provide this on SMP systems, and it's not necessary as glibc
knows not to call it.
* We provide a ELF_HWCAP so users can determine the ISA of the machine the
kernel is running on.
* The multi-line comments are in a better form.
* There were a handful of headers that could be replaced with the asm-generic
versions, and a few unnecessary definitions.
* We no longer use printk, but instead use pr_*.
* A few Kconfig and defconfig entries have been cleaned up.
(v3) A highlight of the changes since the v2 patch set includes:
* We've split out all our drivers into separate patch sets, which I've already
sent out to the relevant maintainers. I haven't included those patches in
this patch set, but some of them are necessary to build our port. A git
tree that contains all our patch sets merged together lives at
<https://github.com/riscv/riscv-linux/tree/riscv-for-submission-v3>.
* The patch set is now split up differently: rather than being split per
directory it is split per topic. Hopefully this will make it easier to
review the port on the mailing list. The split is a bit rough, so you
probably still want to look at the patch set as a whole.
* atomic.h has been completely rewritten and is hopefully now correct. I've
attempted to sanitize the various other memory model related code as well,
and I think it should all be sane now aside from a handful of FIXMEs
commented in the code.
* We've changed the cmpexchg syscall to always exist and to not be
multiplexed. There is also a VDSO entry for compare and exchange, which
allows kernels with the A extension to execute user code without the A
extension reasonably fast.
* Our user-visible register state now contains enough space for the Q
extension for 128-bit floating point, as well as a few words to allow
extensibility to future ISA extensions like the eventual V extension for
vectors.
* A handful of driver cleanups, but these have been split into separate patch
sets now so I won't duplicate them here.
(v2) A highlight of the changes since the v1 patch set includes:
* We've split out our drivers into the right places, which means now there's
a lot more patches. I'll be submitting these patches to various subsystem
maintainers and including them in any future RISC-V patch sets until
they've been merged.
* The SBI console driver has been completely rewritten to use the HVC helpers
and is now significantly smaller.
* We've begun to use weaker barriers as opposed to just the big "fence".
There's still some work to do here, specifically:
- We need fences in the relaxed MMIO functions.
- The non-relaxed MMIO functions are missing R/W bits on their fences.
- Many AMOs need the aq and rl bits set.
* We now have thread_info in task_struct. As a result, sscratch now contains
TP instead of SP. This was necessary because thread_info is no longer on
the stack.
* A few shared routines have been added that we use instead of creating
another arch copy.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAloLD8sTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRDvTKFQLMurQbCZEAC2IgWFOAhYDIv4s39jC/iuGcofuuwC
atTVgKSM8tUES5wBomoVxRH1yjDvmyb2jeq3gsp6gWPcchUpLMdfwf2MwW3NV3Mw
ESCZPwYiuFhORh1Jt5RSespjK+V9qMvCW0iU6cPE/9kAlPfMGGDv2vEttOFgOGEm
yVb1i0gHBcdzbw5H0xszBionUAQVXOFqkfO8AW8VPtFMdzZB6t9OBXRgHJLdWgmK
2Zr5pFN75uivNh4RI1KXHpUeD1kLRVICzG7Ak/aQCfKxWsJutFI1dnLFZmFOIoTf
2wgW4KsDsZakcA9rILtfo3SFH+mSD5PWzvv5G44yf9sEkGG9bSgxl29GeJYL7NzG
3Da9FVMvzjIhmxamPGHfFOFTxTud9+6GU6Lj0iBLpHzpcttjhNgE2NXzcY8r1uMD
BcSwkK3duybjeiZLpwnxOywZidCQDv6pZYyc50WBtV/oUG1fncj8DT2ZTIqGv1V8
L6D/MXSr1jt9oJeWzfDCxHlaGaHL6grrmyJ8L1tQKPjMp+DbBPFbMLfvbn/dlsat
mPqmfQZ4zydOVO53k6KiHozGQh6K+cuXMvNxrb9pCRy3etFV2wfTNxtbdeJSa7gj
xarC6vSia8KFVyXp5nydSks5woHGJFQ1kQYSLEORUWiL5zWILbtI6POzOZeYHgej
BvTzVq0AVIbxjA==
=xDIk
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-4.15-arch-v9-premerge' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux
Pull RISC-V architecture support from Palmer Dabbelt:
"This contains the core RISC-V Linux port, which has been through nine
rounds of review on various mailing lists. The port is not complete:
there's some cleanup patches moving through the review process, a
whole bunch of drivers that need some work, and a lot of feature
additions that will be needed.
The patches contained in this tag have been through nine rounds of
review on the various mailing lists. I have some outstanding cleanup
patches, but since there's been so much review on these patches I
thought it would be best to submit them as-is and then submit explicit
cleanup patches so everyone can review them. This first patch set is
big enough that it's a bit of a pain to constantly rewrite, and it's
caused a few headaches with various contributors.
The port is definately a work in progress. While what's there builds
and boots with 4.14, it's a bit hard to actually see anything happen
because there are no device drivers yet. I maintain a staging branch
that contains all the device drivers and cleanup that actually works,
but those patches won't all be ready for a while. I'd like to get what
we currently have into your tree so everyone can start working from a
single base -- of particular importance is allowing the glibc
upstreaming process to proceed so we can sort out any possibly
lingering user-visible ABI problems we might have.
Copied below is the ChangeLog that contains the history of this patch
set:
(v9) As per suggestions on our v8 patch set, I've split the core
architecture code out from our drivers and would like to submit
this patch set to be included into linux-next, with the goal
being to be merged in during the next merge window. This patch
set is based on 4.14-rc2, but if it's better to have it based on
something else then I can change it around.
This patch set contains just the core arch code for RISC-V, so
while it builds an nominally boots, you can't print or take an
interrupt so it's not that useful. If you're looking to actually
boot a system it would probably be better to use the full patch
set listed below.
We've collected a handful of tags from reviewers, and the
remainder of the patch set only got minimal feedback last time.
Here's what changed:
- We now use the device tree to initialize the timer driver so
it's less tighly coupled with the arch port.
- I cleaned up the defconfigs -- there's actually now just one,
and it's empty. For now I think we're OK with what the kernel
sets as defaults, but I anticipate we'll begin to expand this
as people start to use the port more.
- The VDSO symbols version is sane.
- We WFI while spinning in the boot loop.
- A handful of comments have been added.
While there are still a handful of FIXMEs in this patch set,
we've started to get enough interest from various users and
contributors that maintaining an out of tree patch set is
starting to become a big burden. Hopefully the patches are good
enough to merge now, which will at least get everyone working in
a more reasonable manner as we clean up the remaining issues.
(v8) I know it may not be the ideal time to submit a patch set right
now, as it's the middle of the merge window, but things have
calmed down quite a bit in the last month so I thought it would
be good to get everyone on the same page. There's been a handful
of changes since the last patch set, but most of them are fairly
minor:
- We changed PAGE_OFFSET to allowing mapping more physical
memory on 64-bit systems. This is user configurable, as it
triggers a different code model that generates slightly less
efficient code.
- The device tree binding documentation is back, I'd managed to
lose it at some point.
- We now pass the atomic64 test suite
- The SBI timer driver has been refactored.
(v7) It's been a while since my last patch set, but the changes han
been fairly minimal:
- The PCI cleanup patches have been dropped, we'll do them as a
separate patch set later.
- We've the Kconfig entries from CONFIG_ISA_* to
CONFIG_RISCV_ISA_*, to make grep easier.
- There have been a handful of memory model related tweaks in
I/O land, particularly relating the PCI and the upcoming
platform specification. There are significant comments in the
relevant files. This is still a WIP, but I think we're close
to getting as good as we're going to get until we end up with
some more specifications.
(v6) As it's been only a day since the v5 patch set, the changes are
pretty minimal:
- The patch set is now based on linux-next/master, which I
believe is a better base now that we're getting closer to
upstream.
- EARLY_PRINTK is no longer an option. Since the SBI console is
reasonable, there's no penalty to enabling it (and thus no
benefit to disabling it).
- The mmap syscalls were refactored a bit.
(v5) Things have really started to calm down, so this is fairly
similar to the v4 patch set. The most interesting changes
include:
- We've moved back to a single patch set.
- SMP support has been fixed, I was accidentally running on a
non-SMP configuration. There were various mistakes all over
the tree as a result of this.
- The cmpxchg syscalls have been removed, as they were deemed a
bad idea. As a result, RISC-V Linux systems mandate the A
extension. The corresponding Kconfig entry to enable builds
on non-A systems has been removed.
- A few more atomic fixes: mostly fence changes, but those
resulted in a handful of additional macros that were no
longer necessary.
- riscv_early_sie has been removed.
(v4) There have only been a few changes since the v3 patch set:
- The cmpxchg64 syscall is no longer enabled on 32-bit systems.
It's not possible to provide this on SMP systems, and it's
not necessary as glibc knows not to call it.
- We provide a ELF_HWCAP so users can determine the ISA of the
machine the kernel is running on.
- The multi-line comments are in a better form.
- There were a handful of headers that could be replaced with
the asm-generic versions, and a few unnecessary definitions.
- We no longer use printk, but instead use pr_*.
- A few Kconfig and defconfig entries have been cleaned up.
(v3) A highlight of the changes since the v2 patch set includes:
- We've split out all our drivers into separate patch sets,
which I've already sent out to the relevant maintainers. I
haven't included those patches in this patch set, but some of
them are necessary to build our port.
- The patch set is now split up differently: rather than being
split per directory it is split per topic. Hopefully this
will make it easier to review the port on the mailing list.
The split is a bit rough, so you probably still want to look
at the patch set as a whole.
- atomic.h has been completely rewritten and is hopefully now
correct. I've attempted to sanitize the various other memory
model related code as well, and I think it should all be sane
now aside from a handful of FIXMEs commented in the code.
- We've changed the cmpexchg syscall to always exist and to not
be multiplexed. There is also a VDSO entry for compare and
exchange, which allows kernels with the A extension to
execute user code without the A extension reasonably fast.
- Our user-visible register state now contains enough space for
the Q extension for 128-bit floating point, as well as a few
words to allow extensibility to future ISA extensions like
the eventual V extension for vectors.
- A handful of driver cleanups, but these have been split into
separate patch sets now so I won't duplicate them here.
(v2) A highlight of the changes since the v1 patch set includes:
- We've split out our drivers into the right places, which
means now there's a lot more patches. I'll be submitting
these patches to various subsystem maintainers and including
them in any future RISC-V patch sets until they've been
merged.
- The SBI console driver has been completely rewritten to use
the HVC helpers and is now significantly smaller.
- We've begun to use weaker barriers as opposed to just the big
"fence". There's still some work to do here, specifically:
- We need fences in the relaxed MMIO functions.
- The non-relaxed MMIO functions are missing R/W bits on their fences.
- Many AMOs need the aq and rl bits set.
- We now have thread_info in task_struct. As a result, sscratch
now contains TP instead of SP. This was necessary because
thread_info is no longer on the stack.
- A few shared routines have been added that we use instead of
creating another arch copy"
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
* tag 'riscv-for-linus-4.15-arch-v9-premerge' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux:
RISC-V: Build Infrastructure
RISC-V: User-facing API
RISC-V: Paging and MMU
RISC-V: Device, timer, IRQs, and the SBI
RISC-V: Task implementation
RISC-V: ELF and module implementation
RISC-V: Generic library routines and assembly
RISC-V: Atomic and Locking Code
RISC-V: Init and Halt Code
dt-bindings: RISC-V CPU Bindings
lib: Add shared copies of some GCC library routines
MAINTAINERS: Add RISC-V
Pull crypto updates from Herbert Xu:
"Here is the crypto update for 4.15:
API:
- Disambiguate EBUSY when queueing crypto request by adding ENOSPC.
This change touches code outside the crypto API.
- Reset settings when empty string is written to rng_current.
Algorithms:
- Add OSCCA SM3 secure hash.
Drivers:
- Remove old mv_cesa driver (replaced by marvell/cesa).
- Enable rfc3686/ecb/cfb/ofb AES in crypto4xx.
- Add ccm/gcm AES in crypto4xx.
- Add support for BCM7278 in iproc-rng200.
- Add hash support on Exynos in s5p-sss.
- Fix fallback-induced error in vmx.
- Fix output IV in atmel-aes.
- Fix empty GCM hash in mediatek.
Others:
- Fix DoS potential in lib/mpi.
- Fix potential out-of-order issues with padata"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (162 commits)
lib/mpi: call cond_resched() from mpi_powm() loop
crypto: stm32/hash - Fix return issue on update
crypto: dh - Remove pointless checks for NULL 'p' and 'g'
crypto: qat - Clean up error handling in qat_dh_set_secret()
crypto: dh - Don't permit 'key' or 'g' size longer than 'p'
crypto: dh - Don't permit 'p' to be 0
crypto: dh - Fix double free of ctx->p
hwrng: iproc-rng200 - Add support for BCM7278
dt-bindings: rng: Document BCM7278 RNG200 compatible
crypto: chcr - Replace _manual_ swap with swap macro
crypto: marvell - Add a NULL entry at the end of mv_cesa_plat_id_table[]
hwrng: virtio - Virtio RNG devices need to be re-registered after suspend/resume
crypto: atmel - remove empty functions
crypto: ecdh - remove empty exit()
MAINTAINERS: update maintainer for qat
crypto: caam - remove unused param of ctx_map_to_sec4_sg()
crypto: caam - remove unneeded edesc zeroization
crypto: atmel-aes - Reset the controller before each use
crypto: atmel-aes - properly set IV after {en,de}crypt
hwrng: core - Reset user selected rng by writing "" to rng_current
...
Pull timer updates from Thomas Gleixner:
"Yet another big pile of changes:
- More year 2038 work from Arnd slowly reaching the point where we
need to think about the syscalls themself.
- A new timer function which allows to conditionally (re)arm a timer
only when it's either not running or the new expiry time is sooner
than the armed expiry time. This allows to use a single timer for
multiple timeout requirements w/o caring about the first expiry
time at the call site.
- A new NMI safe accessor to clock real time for the printk timestamp
work. Can be used by tracing, perf as well if required.
- A large number of timer setup conversions from Kees which got
collected here because either maintainers requested so or they
simply got ignored. As Kees pointed out already there are a few
trivial merge conflicts and some redundant commits which was
unavoidable due to the size of this conversion effort.
- Avoid a redundant iteration in the timer wheel softirq processing.
- Provide a mechanism to treat RTC implementations depending on their
hardware properties, i.e. don't inflict the write at the 0.5
seconds boundary which originates from the PC CMOS RTC to all RTCs.
No functional change as drivers need to be updated separately.
- The usual small updates to core code clocksource drivers. Nothing
really exciting"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (111 commits)
timers: Add a function to start/reduce a timer
pstore: Use ktime_get_real_fast_ns() instead of __getnstimeofday()
timer: Prepare to change all DEFINE_TIMER() callbacks
netfilter: ipvs: Convert timers to use timer_setup()
scsi: qla2xxx: Convert timers to use timer_setup()
block/aoe: discover_timer: Convert timers to use timer_setup()
ide: Convert timers to use timer_setup()
drbd: Convert timers to use timer_setup()
mailbox: Convert timers to use timer_setup()
crypto: Convert timers to use timer_setup()
drivers/pcmcia: omap1: Fix error in automated timer conversion
ARM: footbridge: Fix typo in timer conversion
drivers/sgi-xp: Convert timers to use timer_setup()
drivers/pcmcia: Convert timers to use timer_setup()
drivers/memstick: Convert timers to use timer_setup()
drivers/macintosh: Convert timers to use timer_setup()
hwrng/xgene-rng: Convert timers to use timer_setup()
auxdisplay: Convert timers to use timer_setup()
sparc/led: Convert timers to use timer_setup()
mips: ip22/32: Convert timers to use timer_setup()
...
Pull x86 core updates from Ingo Molnar:
"Note that in this cycle most of the x86 topics interacted at a level
that caused them to be merged into tip:x86/asm - but this should be a
temporary phenomenon, hopefully we'll back to the usual patterns in
the next merge window.
The main changes in this cycle were:
Hardware enablement:
- Add support for the Intel UMIP (User Mode Instruction Prevention)
CPU feature. This is a security feature that disables certain
instructions such as SGDT, SLDT, SIDT, SMSW and STR. (Ricardo Neri)
[ Note that this is disabled by default for now, there are some
smaller enhancements in the pipeline that I'll follow up with in
the next 1-2 days, which allows this to be enabled by default.]
- Add support for the AMD SEV (Secure Encrypted Virtualization) CPU
feature, on top of SME (Secure Memory Encryption) support that was
added in v4.14. (Tom Lendacky, Brijesh Singh)
- Enable new SSE/AVX/AVX512 CPU features: AVX512_VBMI2, GFNI, VAES,
VPCLMULQDQ, AVX512_VNNI, AVX512_BITALG. (Gayatri Kammela)
Other changes:
- A big series of entry code simplifications and enhancements (Andy
Lutomirski)
- Make the ORC unwinder default on x86 and various objtool
enhancements. (Josh Poimboeuf)
- 5-level paging enhancements (Kirill A. Shutemov)
- Micro-optimize the entry code a bit (Borislav Petkov)
- Improve the handling of interdependent CPU features in the early
FPU init code (Andi Kleen)
- Build system enhancements (Changbin Du, Masahiro Yamada)
- ... plus misc enhancements, fixes and cleanups"
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (118 commits)
x86/build: Make the boot image generation less verbose
selftests/x86: Add tests for the STR and SLDT instructions
selftests/x86: Add tests for User-Mode Instruction Prevention
x86/traps: Fix up general protection faults caused by UMIP
x86/umip: Enable User-Mode Instruction Prevention at runtime
x86/umip: Force a page fault when unable to copy emulated result to user
x86/umip: Add emulation code for UMIP instructions
x86/cpufeature: Add User-Mode Instruction Prevention definitions
x86/insn-eval: Add support to resolve 16-bit address encodings
x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode
x86/insn-eval: Add wrapper function for 32 and 64-bit addresses
x86/insn-eval: Add support to resolve 32-bit address encodings
x86/insn-eval: Compute linear address in several utility functions
resource: Fix resource_size.cocci warnings
X86/KVM: Clear encryption attribute when SEV is active
X86/KVM: Decrypt shared per-cpu variables when SEV is active
percpu: Introduce DEFINE_PER_CPU_DECRYPTED
x86: Add support for changing memory encryption attribute in early boot
x86/io: Unroll string I/O when SEV is active
x86/boot: Add early boot support when running with SEV active
...
The GENERIC_IO option is set for every architecture except tile and score
as those define NO_IOMEM. The option only controls visibility of
CONFIG_MTD which doesn't appear to be necessary for any reason, so let's
just remove GENERIC_IO.
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Brian Norris <computersforpeace@gmail.com>
Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: Marek Vasut <marek.vasut@gmail.com>
Cc: Cyrille Pitchen <cyrille.pitchen@wedev4u.fr>
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: user-mode-linux-user@lists.sourceforge.net
Cc: linux-mtd@lists.infradead.org
Acked-by: Richard Weinberger <richard@nod.at>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Pull core locking updates from Ingo Molnar:
"The main changes in this cycle are:
- Another attempt at enabling cross-release lockdep dependency
tracking (automatically part of CONFIG_PROVE_LOCKING=y), this time
with better performance and fewer false positives. (Byungchul Park)
- Introduce lockdep_assert_irqs_enabled()/disabled() and convert
open-coded equivalents to lockdep variants. (Frederic Weisbecker)
- Add down_read_killable() and use it in the VFS's iterate_dir()
method. (Kirill Tkhai)
- Convert remaining uses of ACCESS_ONCE() to
READ_ONCE()/WRITE_ONCE(). Most of the conversion was Coccinelle
driven. (Mark Rutland, Paul E. McKenney)
- Get rid of lockless_dereference(), by strengthening Alpha atomics,
strengthening READ_ONCE() with smp_read_barrier_depends() and thus
being able to convert users of lockless_dereference() to
READ_ONCE(). (Will Deacon)
- Various micro-optimizations:
- better PV qspinlocks (Waiman Long),
- better x86 barriers (Michael S. Tsirkin)
- better x86 refcounts (Kees Cook)
- ... plus other fixes and enhancements. (Borislav Petkov, Juergen
Gross, Miguel Bernal Marin)"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE
rcu: Use lockdep to assert IRQs are disabled/enabled
netpoll: Use lockdep to assert IRQs are disabled/enabled
timers/posix-cpu-timers: Use lockdep to assert IRQs are disabled/enabled
sched/clock, sched/cputime: Use lockdep to assert IRQs are disabled/enabled
irq_work: Use lockdep to assert IRQs are disabled/enabled
irq/timings: Use lockdep to assert IRQs are disabled/enabled
perf/core: Use lockdep to assert IRQs are disabled/enabled
x86: Use lockdep to assert IRQs are disabled/enabled
smp/core: Use lockdep to assert IRQs are disabled/enabled
timers/hrtimer: Use lockdep to assert IRQs are disabled/enabled
timers/nohz: Use lockdep to assert IRQs are disabled/enabled
workqueue: Use lockdep to assert IRQs are disabled/enabled
irq/softirqs: Use lockdep to assert IRQs are disabled/enabled
locking/lockdep: Add IRQs disabled/enabled assertion APIs: lockdep_assert_irqs_enabled()/disabled()
locking/pvqspinlock: Implement hybrid PV queued/unfair locks
locking/rwlocks: Fix comments
x86/paravirt: Set up the virt_spin_lock_key after static keys get initialized
block, locking/lockdep: Assign a lock_class per gendisk used for wait_for_completion()
workqueue: Remove now redundant lock acquisitions wrt. workqueue flushes
...
- The old driver statement has been added to the kernel docs.
- We have a couple of new helper scripts. find-unused-docs.sh from Sayli
Karnic will point out kerneldoc comments that are not actually used in
the documentation. Jani Nikula's documentation-file-ref-check finds
references to non-existing files.
- A new ftrace document from Steve Rostedt.
- Vinod Koul converted the dmaengine docs to RST
Beyond that, it's mostly simple fixes.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJaCK37AAoJEI3ONVYwIuV63nwQALeqzVwGqqTwiyRyMqgEwMQM
je/6IurEwTHtyfwtW/mztCfNid1CLTiYZg7RET3/zlHjcUI/9VlV2dbBksGFgoQo
muHGqhwTJjXYREwjK3FkzrGckRsVZKJgdzmZYgukCCY6Ir7IffwJKYaLOCZN1S/l
4nBHQpt2nITo0WhdmZjaNRKOQxMA8nN5yGpOIl0neGE6ywIUMgauCCCHhxnOPVWg
ant1HliS8WR8Tizqt9wQgLCvs5lvklsBFibZPO9LBTPG2Zy3HIO9kb+npUAh2MTl
j0Wg39zzOFvVVErqErqUIwIuQ9IrfltHrEHYYoruTvDBXBiMKIcwApF+DS+H3WSp
TnDu3Qif4llM5SZsZGvcjawXNnbck+7SYOe9cyqpylV3SWMWrEX1tbUv6zVuVk+7
fencYBvEZgkJmWbjDeO/Z4S50STxRTzIxFwZgLft7g/RiHo9HvlubjjwQTqBFjxA
fVkolN7h69MGkrD8TF19eapyujqSXaNYH0pFYo87JNOjLgYmezUHyvHd8YeZJL31
Ll0h10HqSNVzJsjFolBMgrC3CcVjsEXdBufu0yVk45sAg9ZiMYOCpwa6Rtp+tfxa
uIBf1LKzfWSa0ocKx7+sMJt0B/CXwU3AMtsbYGyDhFhR2r3cp1NWBHf5nisz9etD
2Md9RDFAMLELZurewB9Q
=H6ud
-----END PGP SIGNATURE-----
Merge tag 'docs-4.15' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"A relatively calm cycle for the docs tree again.
- The old driver statement has been added to the kernel docs.
- We have a couple of new helper scripts. find-unused-docs.sh from
Sayli Karnic will point out kerneldoc comments that are not actually
used in the documentation. Jani Nikula's
documentation-file-ref-check finds references to non-existing files.
- A new ftrace document from Steve Rostedt.
- Vinod Koul converted the dmaengine docs to RST
Beyond that, it's mostly simple fixes.
This set reaches outside of Documentation/ a bit more than most. In
all cases, the changes are to comment docs, mostly from Randy, in
places where there didn't seem to be anybody better to take them"
* tag 'docs-4.15' of git://git.lwn.net/linux: (52 commits)
documentation: fb: update list of available compiled-in fonts
MAINTAINERS: update DMAengine documentation location
dmaengine: doc: ReSTize pxa_dma doc
dmaengine: doc: ReSTize dmatest doc
dmaengine: doc: ReSTize client API doc
dmaengine: doc: ReSTize provider doc
dmaengine: doc: Add ReST style dmaengine document
ftrace/docs: Add documentation on how to use ftrace from within the kernel
bug-hunting.rst: Fix an example and a typo in a Sphinx tag
scripts: Add a script to find unused documentation
samples: Convert timers to use timer_setup()
documentation: kernel-api: add more info on bitmap functions
Documentation: fix selftests related file refs
Documentation: fix ref to power basic-pm-debugging
Documentation: fix ref to trace stm content
Documentation: fix ref to coccinelle content
Documentation: fix ref to workqueue content
Documentation: fix ref to sphinx/kerneldoc.py
Documentation: fix locking rt-mutex doc refs
docs: dev-tools: correct Coccinelle version number
...
Attributes using NLA_U* and NLA_S* (where * is 8, 16,32 and 64) are
expected to be an exact length. Split these data types from
nla_attr_minlen into nla_attr_len and update validate_nla to require
the attribute to have exact length for them.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the
largest permitted inputs (16384 bits), the kernel spends 10+ seconds
doing modular exponentiation in mpi_powm() without rescheduling. If all
threads do it, it locks up the system. Moreover, it can cause
rcu_sched-stall warnings.
Notwithstanding the insanity of doing this calculation in kernel mode
rather than in userspace, fix it by calling cond_resched() as each bit
from the exponent is processed. It's still noninterruptible, but at
least it's preemptible now.
Do the cond_resched() once per bit rather than once per MPI limb because
each limb might still easily take 100+ milliseconds on slow CPUs.
Cc: <stable@vger.kernel.org> # v4.12+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Simple cases of overlapping changes in the packet scheduler.
Must easier to resolve this time.
Which probably means that I screwed it up somehow.
Signed-off-by: David S. Miller <davem@davemloft.net>
Purge the SRCU based file removal race protection in favour of the new,
refcount based debugfs_file_get()/debugfs_file_put() API.
Fixes: 49d200deaa ("debugfs: prevent access to removed files' private data")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
DMA access to encrypted memory cannot be performed when SEV is active.
In order for DMA to properly work when SEV is active, the SWIOTLB bounce
buffers must be used.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>C
Tested-by: Borislav Petkov <bp@suse.de>
Cc: kvm@vger.kernel.org
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20171020143059.3291-12-brijesh.singh@amd.com
Files removed in 'net-next' had their license header updated
in 'net'. We take the remove from 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWfswbQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykvEwCfXU1MuYFQGgMdDmAZXEc+xFXZvqgAoKEcHDNA
6dVh26uchcEQLN/XqUDt
=x306
-----END PGP SIGNATURE-----
Merge tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull initial SPDX identifiers from Greg KH:
"License cleanup: add SPDX license identifiers to some files
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the
'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally
binding shorthand, which can be used instead of the full boiler plate
text.
This patch is based on work done by Thomas Gleixner and Kate Stewart
and Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset
of the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to
license had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied
to a file was done in a spreadsheet of side by side results from of
the output of two independent scanners (ScanCode & Windriver)
producing SPDX tag:value files created by Philippe Ombredanne.
Philippe prepared the base worksheet, and did an initial spot review
of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537
files assessed. Kate Stewart did a file by file comparison of the
scanner results in the spreadsheet to determine which SPDX license
identifier(s) to be applied to the file. She confirmed any
determination that was not immediately clear with lawyers working with
the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained
>5 lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that
was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that
became the concluded license(s).
- when there was disagreement between the two scanners (one detected
a license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply
(and which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases,
confirmation by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.
The Windriver scanner is based on an older version of FOSSology in
part, so they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot
checks in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect
the correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial
patch version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch
license was not GPL-2.0 WITH Linux-syscall-note to ensure that the
applied SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
License cleanup: add SPDX license identifier to uapi header files with a license
License cleanup: add SPDX license identifier to uapi header files with no license
License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
syzkaller with KASAN reported an out-of-bounds read in
asn1_ber_decoder(). It can be reproduced by the following command,
assuming CONFIG_X509_CERTIFICATE_PARSER=y and CONFIG_KASAN=y:
keyctl add asymmetric desc $'\x30\x30' @s
The bug is that the length of an ASN.1 data value isn't validated in the
case where it is encoded using the short form, causing the decoder to
read past the end of the input buffer. Fix it by validating the length.
The bug report was:
BUG: KASAN: slab-out-of-bounds in asn1_ber_decoder+0x10cb/0x1730 lib/asn1_decoder.c:233
Read of size 1 at addr ffff88003cccfa02 by task syz-executor0/6818
CPU: 1 PID: 6818 Comm: syz-executor0 Not tainted 4.14.0-rc7-00008-g5f479447d983 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0xb3/0x10b lib/dump_stack.c:52
print_address_description+0x79/0x2a0 mm/kasan/report.c:252
kasan_report_error mm/kasan/report.c:351 [inline]
kasan_report+0x236/0x340 mm/kasan/report.c:409
__asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:427
asn1_ber_decoder+0x10cb/0x1730 lib/asn1_decoder.c:233
x509_cert_parse+0x1db/0x650 crypto/asymmetric_keys/x509_cert_parser.c:89
x509_key_preparse+0x64/0x7a0 crypto/asymmetric_keys/x509_public_key.c:174
asymmetric_key_preparse+0xcb/0x1a0 crypto/asymmetric_keys/asymmetric_type.c:388
key_create_or_update+0x347/0xb20 security/keys/key.c:855
SYSC_add_key security/keys/keyctl.c:122 [inline]
SyS_add_key+0x1cd/0x340 security/keys/keyctl.c:62
entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x447c89
RSP: 002b:00007fca7a5d3bd8 EFLAGS: 00000246 ORIG_RAX: 00000000000000f8
RAX: ffffffffffffffda RBX: 00007fca7a5d46cc RCX: 0000000000447c89
RDX: 0000000020006f4a RSI: 0000000020006000 RDI: 0000000020001ff5
RBP: 0000000000000046 R08: fffffffffffffffd R09: 0000000000000000
R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007fca7a5d49c0 R15: 00007fca7a5d4700
Fixes: 42d5ec27f8 ("X.509: Add an ASN.1 decoder")
Cc: <stable@vger.kernel.org> # v3.7+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Pick up some of the MPX commits that modify the syscall entry code,
to have a common base and to reduce conflicts.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Smooth Cong Wang's bug fix into 'net-next'. Basically put
the bulk of the tcf_block_put() logic from 'net' into
tcf_block_put_ext(), but after the offload unbind.
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJZ9kEFAAoJEHm+PkMAQRiGw6wH/0j197qyGd0hkVFMJO6LAgN3
KQWS4nZ5BkVDocwv0RVnUJTtXqU1eozFgdVEtSoaFXpzlHGuptR2Tau9efDCJ7w3
/utZxqvhGebZd2T+j+/o/LE8BRQxhADBNJq2D/o0WNt8ecxuG0GIkhkEYt/o3z1v
/sxlwVwzXB7Dc/h1WcgGJG7cS6L9KzzAzGAS/iNvdFrPOygHBv8c0MxVZIiBIeeK
1nZdyvbyM8uenSyG+prGt9ENrqXZxxfwUxIchi2V7A9m1WmD5zijNkf1JCWji/O+
UsA1auxna7MwoxjxqZuGm4MlKOwZ+8xutk4JGgc+aP/ulndJbJYu+4op/3vaFBM=
=Mhx+
-----END PGP SIGNATURE-----
Backmerge tag 'v4.14-rc7' into drm-next
Linux 4.14-rc7
Requested by Ben Skeggs for nouveau to avoid major conflicts,
and things were getting a bit conflicty already, esp around amdgpu
reverts.
It turns out that some drivers seem to think it's ok to remap page
ranges from within interrupts and even NMI's. That is definitely not
the case, since the page table build-up is simply not interrupt-safe.
This showed up in the zero-day robot that reported it for the ACPI APEI
GHES ("Generic Hardware Error Source") driver. Normally it had been
hidden by the fact that no page table operations had been needed because
the vmalloc area had been set up by other things.
Apparently due to a recent change to the GHEI driver: commit
77b246b32b ("acpi: apei: check for pending errors when probing GHES
entries") 0day actually caught a case during bootup whenthe ioremap
called down to page allocation. But that recent change only showed the
symptom, it wasn't the root cause of the problem.
Hopefully it is limited to just that one driver.
If you need to access random physical memory, you either need to ioremap
in process context, or you need to use the FIXMAP facility to set one
particular fixmap entry to the required mapping - that can be done safely.
Cc: Borislav Petkov <bp@suse.de>
Cc: Len Brown <lenb@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Tyler Baicar <tbaicar@codeaurora.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Several conflicts here.
NFP driver bug fix adding nfp_netdev_is_nfp_repr() check to
nfp_fl_output() needed some adjustments because the code block is in
an else block now.
Parallel additions to net/pkt_cls.h and net/sch_generic.h
A bug fix in __tcp_retransmit_skb() conflicted with some of
the rbtree changes in net-next.
The tc action RCU callback fixes in 'net' had some overlap with some
of the recent tcf_block reworking.
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes CVE-2017-12193.
Fix a case in the assoc_array implementation in which a new leaf is
added that needs to go into a node that happens to be full, where the
existing leaves in that node cluster together at that level to the
exclusion of new leaf.
What needs to happen is that the existing leaves get moved out to a new
node, N1, at level + 1 and the existing node needs replacing with one,
N0, that has pointers to the new leaf and to N1.
The code that tries to do this gets this wrong in two ways:
(1) The pointer that should've pointed from N0 to N1 is set to point
recursively to N0 instead.
(2) The backpointer from N0 needs to be set correctly in the case N0 is
either the root node or reached through a shortcut.
Fix this by removing this path and using the split_node path instead,
which achieves the same end, but in a more general way (thanks to Eric
Biggers for spotting the redundancy).
The problem manifests itself as:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: assoc_array_apply_edit+0x59/0xe5
Fixes: 3cb989501c ("Add a generic associative array implementation.")
Reported-and-tested-by: WU Fan <u3536072@connect.hku.hk>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: stable@vger.kernel.org [v3.13-rc1+]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Please do not apply this to mainline directly, instead please re-run the
coccinelle script shown below and apply its output.
For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't harmful, and changing them results in
churn.
However, for some features, the read/write distinction is critical to
correct operation. To distinguish these cases, separate read/write
accessors must be used. This patch migrates (most) remaining
ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
coccinelle script:
----
// Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
// WRITE_ONCE()
// $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch
virtual patch
@ depends on patch @
expression E1, E2;
@@
- ACCESS_ONCE(E1) = E2
+ WRITE_ONCE(E1, E2)
@ depends on patch @
expression E;
@@
- ACCESS_ONCE(E)
+ READ_ONCE(E)
----
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davem@davemloft.net
Cc: linux-arch@vger.kernel.org
Cc: mpe@ellerman.id.au
Cc: shuah@kernel.org
Cc: snitzer@redhat.com
Cc: thor.thayer@linux.intel.com
Cc: tj@kernel.org
Cc: viro@zeniv.linux.org.uk
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There were quite a few overlapping sets of changes here.
Daniel's bug fix for off-by-ones in the new BPF branch instructions,
along with the added allowances for "data_end > ptr + x" forms
collided with the metadata additions.
Along with those three changes came veritifer test cases, which in
their final form I tried to group together properly. If I had just
trimmed GIT's conflict tags as-is, this would have split up the
meta tests unnecessarily.
In the socketmap code, a set of preemption disabling changes
overlapped with the rename of bpf_compute_data_end() to
bpf_compute_data_pointers().
Changes were made to the mv88e6060.c driver set addr method
which got removed in net-next.
The hyperv transport socket layer had a locking change in 'net'
which overlapped with a change of socket state macro usage
in 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
"A little more than usual this time around. Been travelling, so that is
part of it.
Anyways, here are the highlights:
1) Deal with memcontrol races wrt. listener dismantle, from Eric
Dumazet.
2) Handle page allocation failures properly in nfp driver, from Jaku
Kicinski.
3) Fix memory leaks in macsec, from Sabrina Dubroca.
4) Fix crashes in pppol2tp_session_ioctl(), from Guillaume Nault.
5) Several fixes in bnxt_en driver, including preventing potential
NVRAM parameter corruption from Michael Chan.
6) Fix for KRACK attacks in wireless, from Johannes Berg.
7) rtnetlink event generation fixes from Xin Long.
8) Deadlock in mlxsw driver, from Ido Schimmel.
9) Disallow arithmetic operations on context pointers in bpf, from
Jakub Kicinski.
10) Missing sock_owned_by_user() check in sctp_icmp_redirect(), from
Xin Long.
11) Only TCP is supported for sockmap, make that explicit with a
check, from John Fastabend.
12) Fix IP options state races in DCCP and TCP, from Eric Dumazet.
13) Fix panic in packet_getsockopt(), also from Eric Dumazet.
14) Add missing locked in hv_sock layer, from Dexuan Cui.
15) Various aquantia bug fixes, including several statistics handling
cures. From Igor Russkikh et al.
16) Fix arithmetic overflow in devmap code, from John Fastabend.
17) Fix busted socket memory accounting when we get a fault in the tcp
zero copy paths. From Willem de Bruijn.
18) Don't leave opt->tot_len uninitialized in ipv6, from Eric Dumazet"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
stmmac: Don't access tx_q->dirty_tx before netif_tx_lock
ipv6: flowlabel: do not leave opt->tot_len with garbage
of_mdio: Fix broken PHY IRQ in case of probe deferral
textsearch: fix typos in library helpers
rxrpc: Don't release call mutex on error pointer
net: stmmac: Prevent infinite loop in get_rx_timestamp_status()
net: stmmac: Fix stmmac_get_rx_hwtstamp()
net: stmmac: Add missing call to dev_kfree_skb()
mlxsw: spectrum_router: Configure TIGCR on init
mlxsw: reg: Add Tunneling IPinIP General Configuration Register
net: ethtool: remove error check for legacy setting transceiver type
soreuseport: fix initialization race
net: bridge: fix returning of vlan range op errors
sock: correct sk_wmem_queued accounting on efault in tcp zerocopy
bpf: add test cases to bpf selftests to cover all access tests
bpf: fix pattern matches for direct packet access
bpf: fix off by one for range markings with L{T, E} patterns
bpf: devmap fix arithmetic overflow in bitmap_size calculation
net: aquantia: Bad udp rate on default interrupt coalescing
net: aquantia: Enable coalescing management via ethtool interface
...
Fix spellos (typos) in textsearch library helpers.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are some good comments about bitmap operations in lib/bitmap.c
and include/linux/bitmap.h, so format them for document generation and
pull them into core-api/kernel-api.rst.
I converted the "tables" of functions from using tabs to using spaces
so that they are more readable in the source file and in the generated
output.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
dql_init always returned 0, and the only place that uses it
in network core code didn't care about the return value anyway.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Acked-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull locking fixes from Ingo Molnar:
"Two lockdep fixes for bugs introduced by the cross-release dependency
tracking feature - plus a commit that disables it because performance
regressed in an absymal fashion on some systems"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/lockdep: Disable cross-release features for now
locking/selftest: Avoid false BUG report
locking/lockdep: Fix stacktrace mess
Johan Hovold reported a big lockdep slowdown on his system, caused by lockdep:
> I had noticed that the BeagleBone Black boot time appeared to have
> increased significantly with 4.14 and yesterday I finally had time to
> investigate it.
>
> Boot time (from "Linux version" to login prompt) had in fact doubled
> since 4.13 where it took 17 seconds (with my current config) compared to
> the 35 seconds I now see with 4.14-rc4.
>
> I quick bisect pointed to lockdep and specifically the following commit:
>
> 28a903f63e ("locking/lockdep: Handle non(or multi)-acquisition of a crosslock")
Because the final v4.14 release is close, disable the cross-release lockdep
features for now.
Bisected-by: Johan Hovold <johan@kernel.org>
Debugged-by: Johan Hovold <johan@kernel.org>
Reported-by: Johan Hovold <johan@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Lindgren <tony@atomide.com>
Cc: kernel-team@lge.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mm@kvack.org
Cc: linux-omap@vger.kernel.org
Link: http://lkml.kernel.org/r/20171014072659.f2yr6mhm5ha3eou7@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Rename the unwinder config options from:
CONFIG_ORC_UNWINDER
CONFIG_FRAME_POINTER_UNWINDER
CONFIG_GUESS_UNWINDER
to:
CONFIG_UNWINDER_ORC
CONFIG_UNWINDER_FRAME_POINTER
CONFIG_UNWINDER_GUESS
... in order to give them a more logical config namespace.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/73972fc7e2762e91912c6b9584582703d6f1b8cc.1507924831.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Expand the "Runtime testing" menu by including more entries inside it
instead of after it. This is just Kconfig symbol movement.
This causes the (arch-independent) Runtime tests to be presented
(listed) all in one place instead of in multiple places.
Link: http://lkml.kernel.org/r/c194e5c4-2042-bf94-a2d8-7aa13756e257@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
digsig_verify() requests a user key, then accesses its payload.
However, a revoked key has a NULL payload, and we failed to check for
this. request_key() *does* skip revoked keys, but there is still a
window where the key can be revoked before we acquire its semaphore.
Fix it by checking for a NULL payload, treating it like a key which was
already revoked at the time it was requested.
Fixes: 051dbb918c ("crypto: digital signature verification support")
Reviewed-by: James Morris <james.l.morris@oracle.com>
Cc: <stable@vger.kernel.org> [v3.3+]
Cc: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
It's good style. I was also told that GCC 7 is more strict and might
give a warning when such comments are missing.
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Suggested-by: Andrei Borzenkov <arvidjaar@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
For kvec and bvec: feeds segments to given callback as long as it
returns 0. For iovec and pipe: fails.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The work-around for the expected failure is providing another failure :/
Only when CONFIG_PROVE_LOCKING=y do we increment unexpected_testcase_failures,
so only then do we need to decrement, otherwise we'll end up with a negative
number and that will again trigger a BUG (printout, not crash).
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: d82fed7529 ("locking/lockdep/selftests: Fix mixed read-write ABBA tests")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Switch the DO_ONCE() macro from the deprecated jump label API to the new
one. The new one is more readable, and for DO_ONCE() it also makes the
generated code more icache-friendly: now the one-time initialization
code is placed out-of-line at the jump target, rather than at the inline
fallthrough case.
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add kernel-doc notation for the gcd() function (so that it can be
added to the kernel-api documentation).
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
idr_replace() returns the old value on success, not 0.
Link: http://lkml.kernel.org/r/20170918162642.37511-1-ebiggers3@gmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Don't populate the read-only arrays dec32table and dec64table on the
stack, instead make them both static const. Makes the object code
smaller by over 10K bytes:
Before:
text data bss dec hex filename
31500 0 0 31500 7b0c lib/lz4/lz4_decompress.o
After:
text data bss dec hex filename
20237 176 0 20413 4fbd lib/lz4/lz4_decompress.o
(gcc version 7.2.0 x86_64)
Link: http://lkml.kernel.org/r/20170921221939.20820-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here are a few small fixes for 4.14-rc4.
The removal of DRIVER_ATTR() was almost completed by 4.14-rc1, but one
straggler made it in through some other tree (odds are, one of mine...)
So there's a simple removal of the last user, and then finally the macro
is removed from the tree.
There's a fix for old crazy udev instances that insist on reloading a
module when it is removed from the kernel due to the new uevents for
bind/unbind. This fixes the reported regression, hopefully some year in
the future we can drop the workaround, once users update to the latest
version, but I'm not holding my breath.
And then there's a build fix for a linker warning, and a buffer overflow
fix to match the PCI fixes you took through the PCI tree in the same
area.
All of these have been in linux-next for a few weeks while I've been
traveling, sorry for the delay.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWdN8qA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymLEgCfUSSBhxW04teEcPua4QygLv2omK0An2SRkpnY
28nn+D+AfeOByQImY8v+
=RQY+
-----END PGP SIGNATURE-----
Merge tag 'driver-core-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are a few small fixes for 4.14-rc4.
The removal of DRIVER_ATTR() was almost completed by 4.14-rc1, but one
straggler made it in through some other tree (odds are, one of
mine...) So there's a simple removal of the last user, and then
finally the macro is removed from the tree.
There's a fix for old crazy udev instances that insist on reloading a
module when it is removed from the kernel due to the new uevents for
bind/unbind. This fixes the reported regression, hopefully some year
in the future we can drop the workaround, once users update to the
latest version, but I'm not holding my breath.
And then there's a build fix for a linker warning, and a buffer
overflow fix to match the PCI fixes you took through the PCI tree in
the same area.
All of these have been in linux-next for a few weeks while I've been
traveling, sorry for the delay"
* tag 'driver-core-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: remove DRIVER_ATTR
fpga: altera-cvp: remove DRIVER_ATTR() usage
driver core: platform: Don't read past the end of "driver_override" buffer
base: arch_topology: fix section mismatch build warnings
driver core: suppress sending MODALIAS in UNBIND uevents
Add the rest of the CRC library functions to kernel-api.
- try to clarify crc32() by adding '@' to a function parameter
- reorder kernel-api CRC functions to be less random
- add more CRC functions to kernel-api
- correct the function parameter names in several places
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Many ports (m32r, microblaze, mips, parisc, score, and sparc) use
functionally identical copies of various GCC library routine files,
which came up as we were submitting the RISC-V port (which also uses
some of these).
This patch adds a new copy of these library routine files, which are
functionally identical to the various other copies. These are
availiable via Kconfig as CONFIG_GENERIC_$ROUTINE, which currently isn't
used anywhere.
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Pull parisc fixes from Helge Deller:
- Unbreak parisc bootloader by avoiding a gcc-7 optimization to convert
multiple byte-accesses into one word-access.
- Add missing HWPOISON page fault handler code. I completely missed
that when I added HWPOISON support during this merge window and it
only showed up now with the madvise07 LTP test case.
- Fix backtrace unwinding to stop when stack start has been reached.
- Issue warning if initrd has been loaded into memory regions with
broken RAM modules.
- Fix HPMC handler (parisc hardware fault handler) to comply with
architecture specification.
- Avoid compiler warnings about too large frame sizes.
- Minor init-section fixes.
* 'parisc-4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Unbreak bootloader due to gcc-7 optimizations
parisc: Reintroduce option to gzip-compress the kernel
parisc: Add HWPOISON page fault handler code
parisc: Move init_per_cpu() into init section
parisc: Check if initrd was loaded into broken RAM
parisc: Add PDCE_CHECK instruction to HPMC handler
parisc: Add wrapper for pdc_instr() firmware function
parisc: Move start_parisc() into init section
parisc: Stop unwinding at start of stack
parisc: Fix too large frame size warnings
Pull networking fixes from David Miller:
1) Fix NAPI poll list corruption in enic driver, from Christian
Lamparter.
2) Fix route use after free, from Eric Dumazet.
3) Fix regression in reuseaddr handling, from Josef Bacik.
4) Assert the size of control messages in compat handling since we copy
it in from userspace twice. From Meng Xu.
5) SMC layer bug fixes (missing RCU locking, bad refcounting, etc.)
from Ursula Braun.
6) Fix races in AF_PACKET fanout handling, from Willem de Bruijn.
7) Don't use ARRAY_SIZE on spinlock array which might have zero
entries, from Geert Uytterhoeven.
8) Fix miscomputation of checksum in ipv6 udp code, from Subash Abhinov
Kasiviswanathan.
9) Push the ipv6 header properly in ipv6 GRE tunnel driver, from Xin
Long.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
inet: fix improper empty comparison
net: use inet6_rcv_saddr to compare sockets
net: set tb->fast_sk_family
net: orphan frags on stand-alone ptype in dev_queue_xmit_nit
MAINTAINERS: update git tree locations for ieee802154 subsystem
net: prevent dst uses after free
net: phy: Fix truncation of large IRQ numbers in phy_attached_print()
net/smc: no close wait in case of process shut down
net/smc: introduce a delay
net/smc: terminate link group if out-of-sync is received
net/smc: longer delay for client link group removal
net/smc: adapt send request completion notification
net/smc: adjust net_device refcount
net/smc: take RCU read lock for routing cache lookup
net/smc: add receive timeout check
net/smc: add missing dev_put
net: stmmac: Cocci spatch "of_table"
lan78xx: Use default values loaded from EEPROM/OTP after reset
lan78xx: Allow EEPROM write for less than MAX_EEPROM_SIZE
lan78xx: Fix for eeprom read/write when device auto suspend
...
The parisc architecture has larger stack frames than most other
architectures on 32-bit kernels.
Increase the maximum allowed stack frame to 1280 bytes for parisc to
avoid warnings in the do_sys_poll() and pat_memconfig() functions.
Signed-off-by: Helge Deller <deller@gmx.de>
kbuild test robot reported a section mismatch warning w. gcc 4.x:
WARNING: lib/test_rhashtable.o(.text+0x139e):
Section mismatch in reference from the function rhltable_insert.clone.3() to the variable .init.data:rhlt
so remove this annotation.
Fixes: cdd4de372e ("test_rhashtable: add test case for rhl_table interface")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Issue is that if the data crosses a page boundary inside a compound
page, this check will incorrectly trigger a WARN_ON.
To fix this, compute the order using the head of the compound page and
adjust the offset to be relative to that head.
Fixes: 72e809ed81 ("iov_iter: sanity checks for copy to/from page
primitives")
Signed-off-by: Petar Penkov <ppenkov@google.com>
CC: Al Viro <viro@zeniv.linux.org.uk>
CC: Eric Dumazet <edumazet@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
We can build one skb and let it be cloned in netlink.
This is much faster, and use less memory (all clones will
share the same skb->head)
Tested:
time perf record (for f in `seq 1 3000` ; do ip netns add tast$f; done)
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 4.110 MB perf.data (~179584 samples) ]
real 0m24.227s # instead of 0m52.554s
user 0m0.329s
sys 0m23.753s # instead of 0m51.375s
14.77% ip [kernel.kallsyms] [k] __ip6addrlbl_add
14.56% ip [kernel.kallsyms] [k] netlink_broadcast_filtered
11.65% ip [kernel.kallsyms] [k] netlink_has_listeners
6.19% ip [kernel.kallsyms] [k] _raw_spin_lock_irqsave
5.66% ip [kernel.kallsyms] [k] kobject_uevent_env
4.97% ip [kernel.kallsyms] [k] memset_erms
4.67% ip [kernel.kallsyms] [k] refcount_sub_and_test
4.41% ip [kernel.kallsyms] [k] _raw_read_lock
3.59% ip [kernel.kallsyms] [k] refcount_inc_not_zero
3.13% ip [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
1.55% ip [kernel.kallsyms] [k] __wake_up
1.20% ip [kernel.kallsyms] [k] strlen
1.03% ip [kernel.kallsyms] [k] __wake_up_common
0.93% ip [kernel.kallsyms] [k] consume_skb
0.92% ip [kernel.kallsyms] [k] netlink_trim
0.87% ip [kernel.kallsyms] [k] insert_header
0.63% ip [kernel.kallsyms] [k] unmap_page_range
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to iterate over strings, just copy in one efficient memcpy() call.
Tested:
time perf record "(for f in `seq 1 3000` ; do ip netns add tast$f; done)"
[ perf record: Woken up 10 times to write data ]
[ perf record: Captured and wrote 8.224 MB perf.data (~359301 samples) ]
real 0m52.554s # instead of 1m7.492s
user 0m0.309s
sys 0m51.375s # instead of 1m6.875s
9.88% ip [kernel.kallsyms] [k] netlink_broadcast_filtered
8.86% ip [kernel.kallsyms] [k] string
7.37% ip [kernel.kallsyms] [k] __ip6addrlbl_add
5.68% ip [kernel.kallsyms] [k] netlink_has_listeners
5.52% ip [kernel.kallsyms] [k] memcpy_erms
4.76% ip [kernel.kallsyms] [k] __alloc_skb
4.54% ip [kernel.kallsyms] [k] vsnprintf
3.94% ip [kernel.kallsyms] [k] format_decode
3.80% ip [kernel.kallsyms] [k] kmem_cache_alloc_node_trace
3.71% ip [kernel.kallsyms] [k] kmem_cache_alloc_node
3.66% ip [kernel.kallsyms] [k] kobject_uevent_env
3.38% ip [kernel.kallsyms] [k] strlen
2.65% ip [kernel.kallsyms] [k] _raw_spin_lock_irqsave
2.20% ip [kernel.kallsyms] [k] kfree
2.09% ip [kernel.kallsyms] [k] memset_erms
2.07% ip [kernel.kallsyms] [k] ___cache_free
1.95% ip [kernel.kallsyms] [k] kmem_cache_free
1.91% ip [kernel.kallsyms] [k] _raw_read_lock
1.45% ip [kernel.kallsyms] [k] ksize
1.25% ip [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
1.00% ip [kernel.kallsyms] [k] widen_string
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This removes some #ifdef pollution and will ease follow up patches.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
also test rhltable. rhltable remove operations are slow as
deletions require a list walk, thus test with 1/16th of the given
entry count number to get a run duration similar to rhashtable one.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
add a test that tries to insert more than max_size elements.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
pass the entries to test as an argument instead.
Followup patch will add an rhlist test case; rhlist delete opererations
are slow so we need to use a smaller number to test it.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clarify that rhashtable_walk_{stop,start} will not reset the iterator to
the beginning of the hash table. Confusion between rhashtable_walk_enter
and rhashtable_walk_start has already lead to a bug.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current udev rules cause modules to be loaded on all device events save
for "remove". With the introduction of KOBJ_BIND/KOBJ_UNBIND this causes
issues, as driver modules that have devices bound to their drivers get
immediately reloaded, and it appears to the user that module unloading doe
snot work.
The standard udev matching rule is foillowing:
ENV{MODALIAS}=="?*", RUN{builtin}+="kmod load $env{MODALIAS}"
Given that MODALIAS data is not terribly useful for UNBIND event, let's zap
it from the generated uevent environment until we get userspace updated
with the correct udev rule that only loads modules on "add" event.
Reported-by: Jakub Kicinski <kubakici@wp.pl>
Tested-by: Jakub Kicinski <kubakici@wp.pl>
Fixes: 1455cf8dbf ("driver core: emit uevents when device is bound ...")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull zstd support from Chris Mason:
"Nick Terrell's patch series to add zstd support to the kernel has been
floating around for a while. After talking with Dave Sterba, Herbert
and Phillip, we decided to send the whole thing in as one pull
request.
zstd is a big win in speed over zlib and in compression ratio over
lzo, and the compression team here at FB has gotten great results
using it in production. Nick will continue to update the kernel side
with new improvements from the open source zstd userland code.
Nick has a number of benchmarks for the main zstd code in his lib/zstd
commit:
I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB
of RAM. The VM is running on a MacBook Pro with a 3.1 GHz Intel
Core i7 processor, 16 GB of RAM, and a SSD. I benchmarked using
`silesia.tar` [3], which is 211,988,480 B large. Run the following
commands for the benchmark:
sudo modprobe zstd_compress_test
sudo mknod zstd_compress_test c 245 0
sudo cp silesia.tar zstd_compress_test
The time is reported by the time of the userland `cp`.
The MB/s is computed with
1,536,217,008 B / time(buffer size, hash)
which includes the time to copy from userland.
The Adjusted MB/s is computed with
1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).
The memory reported is the amount of memory the compressor
requests.
| Method | Size (B) | Time (s) | Ratio | MB/s | Adj MB/s | Mem (MB) |
|----------|----------|----------|-------|---------|----------|----------|
| none | 11988480 | 0.100 | 1 | 2119.88 | - | - |
| zstd -1 | 73645762 | 1.044 | 2.878 | 203.05 | 224.56 | 1.23 |
| zstd -3 | 66988878 | 1.761 | 3.165 | 120.38 | 127.63 | 2.47 |
| zstd -5 | 65001259 | 2.563 | 3.261 | 82.71 | 86.07 | 2.86 |
| zstd -10 | 60165346 | 13.242 | 3.523 | 16.01 | 16.13 | 13.22 |
| zstd -15 | 58009756 | 47.601 | 3.654 | 4.45 | 4.46 | 21.61 |
| zstd -19 | 54014593 | 102.835 | 3.925 | 2.06 | 2.06 | 60.15 |
| zlib -1 | 77260026 | 2.895 | 2.744 | 73.23 | 75.85 | 0.27 |
| zlib -3 | 72972206 | 4.116 | 2.905 | 51.50 | 52.79 | 0.27 |
| zlib -6 | 68190360 | 9.633 | 3.109 | 22.01 | 22.24 | 0.27 |
| zlib -9 | 67613382 | 22.554 | 3.135 | 9.40 | 9.44 | 0.27 |
I benchmarked zstd decompression using the same method on the same
machine. The benchmark file is located in the upstream zstd repo
under `contrib/linux-kernel/zstd_decompress_test.c` [4]. The
memory reported is the amount of memory required to decompress
data compressed with the given compression level. If you know the
maximum size of your input, you can reduce the memory usage of
decompression irrespective of the compression level.
| Method | Time (s) | MB/s | Adjusted MB/s | Memory (MB) |
|----------|----------|---------|---------------|-------------|
| none | 0.025 | 8479.54 | - | - |
| zstd -1 | 0.358 | 592.15 | 636.60 | 0.84 |
| zstd -3 | 0.396 | 535.32 | 571.40 | 1.46 |
| zstd -5 | 0.396 | 535.32 | 571.40 | 1.46 |
| zstd -10 | 0.374 | 566.81 | 607.42 | 2.51 |
| zstd -15 | 0.379 | 559.34 | 598.84 | 4.61 |
| zstd -19 | 0.412 | 514.54 | 547.77 | 8.80 |
| zlib -1 | 0.940 | 225.52 | 231.68 | 0.04 |
| zlib -3 | 0.883 | 240.08 | 247.07 | 0.04 |
| zlib -6 | 0.844 | 251.17 | 258.84 | 0.04 |
| zlib -9 | 0.837 | 253.27 | 287.64 | 0.04 |
I ran a long series of tests and benchmarks on the btrfs side and the
gains are very similar to the core benchmarks Nick ran"
* 'zstd-minimal' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
squashfs: Add zstd support
btrfs: Add zstd support
lib: Add zstd modules
lib: Add xxhash module
GFP_TEMPORARY was introduced by commit e12ba74d8f ("Group short-lived
and reclaimable kernel allocations") along with __GFP_RECLAIMABLE. It's
primary motivation was to allow users to tell that an allocation is
short lived and so the allocator can try to place such allocations close
together and prevent long term fragmentation. As much as this sounds
like a reasonable semantic it becomes much less clear when to use the
highlevel GFP_TEMPORARY allocation flag. How long is temporary? Can the
context holding that memory sleep? Can it take locks? It seems there is
no good answer for those questions.
The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
__GFP_RECLAIMABLE which in itself is tricky because basically none of
the existing caller provide a way to reclaim the allocated memory. So
this is rather misleading and hard to evaluate for any benefits.
I have checked some random users and none of them has added the flag
with a specific justification. I suspect most of them just copied from
other existing users and others just thought it might be a good idea to
use without any measuring. This suggests that GFP_TEMPORARY just
motivates for cargo cult usage without any reasoning.
I believe that our gfp flags are quite complex already and especially
those with highlevel semantic should be clearly defined to prevent from
confusion and abuse. Therefore I propose dropping GFP_TEMPORARY and
replace all existing users to simply use GFP_KERNEL. Please note that
SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
so they will be placed properly for memory fragmentation prevention.
I can see reasons we might want some gfp flag to reflect shorterm
allocations but I propose starting from a clear semantic definition and
only then add users with proper justification.
This was been brought up before LSF this year by Matthew [1] and it
turned out that GFP_TEMPORARY really doesn't have a clear semantic. It
seems to be a heuristic without any measured advantage for most (if not
all) its current users. The follow up discussion has revealed that
opinions on what might be temporary allocation differ a lot between
developers. So rather than trying to tweak existing users into a
semantic which they haven't expected I propose to simply remove the flag
and start from scratch if we really need a semantic for short term
allocations.
[1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org
[akpm@linux-foundation.org: fix typo]
[akpm@linux-foundation.org: coding-style fixes]
[sfr@canb.auug.org.au: drm/i915: fix up]
Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With gcc 4.1.2:
lib/test_bitmap.c:189: warning: integer constant is too large for `long' type
lib/test_bitmap.c:190: warning: integer constant is too large for `long' type
lib/test_bitmap.c:194: warning: integer constant is too large for `long' type
lib/test_bitmap.c:195: warning: integer constant is too large for `long' type
Add the missing "ULL" suffix to fix this.
Link: http://lkml.kernel.org/r/1505040523-31230-1-git-send-email-geert@linux-m68k.org
Fixes: 60ef690018 ("bitmap: introduce BITMAP_FROM_U64()")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
IDR only supports non-negative IDs. There used to be a 'WARN_ON_ONCE(id <
0)' in idr_replace(), but it was intentionally removed by commit
2e1c9b2867 ("idr: remove WARN_ON_ONCE() on negative IDs").
Then it was added back by commit 0a835c4f09 ("Reimplement IDR and IDA
using the radix tree"). However it seems that adding it back was a
mistake, given that some users such as drm_gem_handle_delete()
(DRM_IOCTL_GEM_CLOSE) pass in a value from userspace to idr_replace(),
allowing the WARN_ON_ONCE to be triggered. drm_gem_handle_delete()
actually just wants idr_replace() to return an error code if the ID is
not allocated, including in the case where the ID is invalid (negative).
So once again remove the bogus WARN_ON_ONCE().
This bug was found by syzkaller, which encountered the following
warning:
WARNING: CPU: 3 PID: 3008 at lib/idr.c:157 idr_replace+0x1d8/0x240 lib/idr.c:157
Kernel panic - not syncing: panic_on_warn set ...
CPU: 3 PID: 3008 Comm: syzkaller218828 Not tainted 4.13.0-rc4-next-20170811 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190
do_trap_no_signal arch/x86/kernel/traps.c:224 [inline]
do_trap+0x260/0x390 arch/x86/kernel/traps.c:273
do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:310
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323
invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:930
RIP: 0010:idr_replace+0x1d8/0x240 lib/idr.c:157
RSP: 0018:ffff8800394bf9f8 EFLAGS: 00010297
RAX: ffff88003c6c60c0 RBX: 1ffff10007297f43 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800394bfa78
RBP: ffff8800394bfae0 R08: ffffffff82856487 R09: 0000000000000000
R10: ffff8800394bf9a8 R11: ffff88006c8bae28 R12: ffffffffffffffff
R13: ffff8800394bfab8 R14: dffffc0000000000 R15: ffff8800394bfbc8
drm_gem_handle_delete+0x33/0xa0 drivers/gpu/drm/drm_gem.c:297
drm_gem_close_ioctl+0xa1/0xe0 drivers/gpu/drm/drm_gem.c:671
drm_ioctl_kernel+0x1e7/0x2e0 drivers/gpu/drm/drm_ioctl.c:729
drm_ioctl+0x72e/0xa50 drivers/gpu/drm/drm_ioctl.c:825
vfs_ioctl fs/ioctl.c:45 [inline]
do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685
SYSC_ioctl fs/ioctl.c:700 [inline]
SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
entry_SYSCALL_64_fastpath+0x1f/0xbe
Here is a C reproducer:
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/drm.h>
int main(void)
{
int cardfd = open("/dev/dri/card0", O_RDONLY);
ioctl(cardfd, DRM_IOCTL_GEM_CLOSE,
&(struct drm_gem_close) { .handle = -1 } );
}
Link: http://lkml.kernel.org/r/20170906235306.20534-1-ebiggers3@gmail.com
Fixes: 0a835c4f09 ("Reimplement IDR and IDA using the radix tree")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: <stable@vger.kernel.org> [v4.11+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>