btrfs_calc_trans_metadata_size() does an unsigned 32-bit multiplication,
which can overflow if num_items >= 4 GB / (nodesize * BTRFS_MAX_LEVEL * 2).
For a nodesize of 16kB, this overflow happens at 16k items. Usually,
num_items is a small constant passed to btrfs_start_transaction(), but
we also use btrfs_calc_trans_metadata_size() for metadata reservations
for extent items in btrfs_delalloc_{reserve,release}_metadata().
In drop_outstanding_extents(), num_items is calculated as
inode->reserved_extents - inode->outstanding_extents. The difference
between these two counters is usually small, but if many delalloc
extents are reserved and then the outstanding extents are merged in
btrfs_merge_extent_hook(), the difference can become large enough to
overflow in btrfs_calc_trans_metadata_size().
The overflow manifests itself as a leak of a multiple of 4 GB in
delalloc_block_rsv and the metadata bytes_may_use counter. This in turn
can cause early ENOSPC errors. Additionally, these WARN_ONs in
extent-tree.c will be hit when unmounting:
WARN_ON(fs_info->delalloc_block_rsv.size > 0);
WARN_ON(fs_info->delalloc_block_rsv.reserved > 0);
WARN_ON(space_info->bytes_pinned > 0 ||
space_info->bytes_reserved > 0 ||
space_info->bytes_may_use > 0);
Fix it by casting nodesize to a u64 so that
btrfs_calc_trans_metadata_size() does a full 64-bit multiplication.
While we're here, do the same in btrfs_calc_trunc_metadata_size(); this
can't overflow with any existing uses, but it's better to be safe here
than have another hard-to-debug problem later on.
Cc: stable@vger.kernel.org
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Before this, we use 'filled' mode here, ie. if all range has been
filled with EXTENT_DEFRAG bits, get to clear it, but if the defrag
range joins the adjacent delalloc range, then we'll have EXTENT_DEFRAG
bits in extent_state until releasing this inode's pages, and that
prevents extent_data from being freed.
This clears the bit if any was found within the ordered extent.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
In verify_dir_item, it wants to printk name_len of dir_item but
printk data_len acutally.
Fix it by calling btrfs_dir_name_len instead of btrfs_dir_data_len.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEE4bay/IylYqM/npjQHv7KIOw4HPYFAlk6mLwTHG1rbEBwZW5n
dXRyb25peC5kZQAKCRAe/sog7Dgc9vx5CADTRLfMxgmmGsLcM8nO0sKn/7X/o4Hy
qprZy1Nu4J49kG/93pumIvWkcG9fzZ4hbDy+4DiEmL/fIumbvg56aaW90fyvW9Cl
HzL3S7Id8Db6Qwfr4Wd7r1IRYn08+om4ukO09GdskmMpS4wOr68+3WhutGGu1mEX
ZS2sTE/dKnozZqUw6agcwXJmAJWFbYBaVcF+x2GT07ako0jYXF29zqe/GlpmzVBJ
Wmib9DO2BeDm/QpNIot/I3Se86xS0AIH6ds++YRJX6k9dlFb4T2Liw6UysPSVCz6
QkAT3SiTu3TKoT1TXEyMQEQSpnrFtpMowdTk/s04bKibVIpTTskZm44c
=DNKW
-----END PGP SIGNATURE-----
Merge tag 'linux-can-fixes-for-4.12-20170609' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2017-06-09
this is a pull request of 6 patches for net/master.
There's a patch by Stephane Grosjean that fixes an uninitialized symbol warning
in the peak_canfd driver. A patch by Johan Hovold to fix the product-id
endianness in an error message in the the peak_usb driver. A patch by Oliver
Hartkopp to enable CAN FD for virtual CAN devices by default. Three patches by
me, one makes the helper function can_change_state() robust to be called with
cf == NULL. The next patch fixes a memory leak in the gs_usb driver. And the
last one fixes a lockdep splat by properly initialize the per-net
can_rcvlists_lock spin_lock.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The change to remove free_netdev() from ieee80211_if_free()
erroneously didn't add the necessary free_netdev() for when
ieee80211_if_free() is called directly in one place, rather
than as the priv_destructor. Add the missing call.
Fixes: cf124db566 ("net: Fix inconsistent teardown and release of private netdev state.")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPI's from the victim cpu are not handled in dev_cpu_callback.
So these pending IPI's would be sent to the remote cpu only when
NET_RX is scheduled on the victim cpu and since this trigger is
unpredictable it would result in packet latencies on the remote cpu.
This patch add support to send the pending ipi's of victim cpu.
Signed-off-by: Ashwanth Goli <ashwanth@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABAgAGBQJZOscBAAoJELDendYovxMvN8EH+wXMRtFufmdXxyh3Wi5IHbfg
B56J4mjrOFpw+NnNHZk5H0cUSDwYb14dRCEnLNIXUpzCAb0mRMhPclhe07IMLqe1
FEqz6qWAh301mugqu6PlXaPZs9af7A6t6LEnfbAxXzgthWEhfzOecOXo0D5oV9sN
e4qFfoY9/5IoSShbEuHVLf5OBs4S5rhyQ0DNCEfqnHKvCn0VlRlBQMTrYTNZG28O
jgAWdxIPKXxCy2hoVV/vovuan1F38v9ZeWyVbf03IGfAGjVBFHzIbd9dH1OJm6X0
H/RGfJW6VPvswEsZXD6z0UkMW1IXa8fKCjwtvkVf5BFrKDJi4QUB/wZuteqmxrY=
=BUAE
-----END PGP SIGNATURE-----
Merge tag 'for-linus-4.12b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
"A fix for Xen on ARM when dealing with 64kB page size of a guest"
* tag 'for-linus-4.12b-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/privcmd: Support correctly 64KB page granularity when mapping memory
The 5th generation Thinkpad X1 Carbons use Synaptics touchpads accessible
over SMBus/RMI, combined with ALPS or Elantech trackpoint devices instead
of classic IBM/Lenovo trackpoints. Unfortunately there is no way for ALPS
driver to detect whether it is dealing with touchpad + trackpoint
combination or just a trackpoint, so we end up with a "phantom" dualpoint
ALPS device in addition to real touchpad and trackpoint.
Given that we do not have any special advanced handling for ALPS or
Elantech trackpoints (unlike IBM trackpoints that have separate driver and
a host of options) we are better off keeping the trackpoints in PS/2
emulation mode. We achieve that by setting serio type to SERIO_PS_PSTHRU,
which will limit number of protocols psmouse driver will try. In addition
to getting rid of the "phantom" touchpads, this will also speed up probing
of F03 pass-through port.
Reported-by: Damjan Georgievski <gdamjan@gmail.com>
Suggested-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Been sitting on these for a couple of weeks waiting on some larger batches
to come in but it's been pretty quiet.
Just your garden variety fixes here:
- A few maintainers updates (ep93xx, Exynos, TI, Marvell)
- Some PM fixes for Atmel/at91 and Marvell
- A few DT fixes for Marvell, Versatile, TI Keystone, bcm283x
- A reset driver patch to set module license for symbol access
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJZN343AAoJEIwa5zzehBx3xJ0QAJSexz+rYI7V3aqvjtNmdEaE
2l7Rl4dNQ13u7RBx+67/m1vAgxTgefXahckuv6x4Jr5S5sQO++OkTm0XBO1+3trY
pwQVJYatOwDt5X7+HOKmTvCgFh48KyrNegXy1lvr/p77CyA+B61zQ2w9wqO0VXua
MQ05HzOt2JroKytPz70MywxtQpULWC8FGZTFbzZqUfdS30HxM4ZXp6gKxMDvRAqh
LpP2hfjCnM0H3QoeNXYsfSydI0T0J0PcavouUzGQk2XSA6k5g+MXpL1IUB+iN9EH
UdmEiVhDcNB3upWQ0lPFi84sexDXSqcu6M9VIozdC/LYDD1lGnHBEZuagoq72/xA
CEU3H81inCQ6cpYRgan7uzlA4+dqKf4HD3H1fkwrowblMQppWPeDe9e/5XAq73Xl
4+5GxXtDhK1KvPaH3USkTnFOjEQ2QELmDxdLqmiTXP8GnXdn5wJTobUj7z6HttXY
Q4jA7F/A8ObHbEbnZI9e8pmrnQeMd/cK47NCZTBkJgN2eIzPw/TJk/bQcIXAq/km
HcVn5R8GbrN9DwJMpMQN9fpH3sXCmcUxujbfldTYGdsBo8rvXChs8DHxJF94FXOV
rMO6Bb25bd7kN8oCvY3r7VeGavpSkO8WVWi3YnNW4KGF9/oGE24LGHdbChjoLyJH
rvv3uVsXtx2A9O9uGYl1
=WlSc
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Been sitting on these for a couple of weeks waiting on some larger
batches to come in but it's been pretty quiet.
Just your garden variety fixes here:
- A few maintainers updates (ep93xx, Exynos, TI, Marvell)
- Some PM fixes for Atmel/at91 and Marvell
- A few DT fixes for Marvell, Versatile, TI Keystone, bcm283x
- A reset driver patch to set module license for symbol access"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
MAINTAINERS: EP93XX: Update maintainership
MAINTAINERS: remove kernel@stlinux.com obsolete mailing list
ARM: dts: versatile: use #include "..." to include local DT
MAINTAINERS: add device-tree files to TI DaVinci entry
ARM: at91: select CONFIG_ARM_CPU_SUSPEND
ARM: dts: keystone-k2l: fix broken Ethernet due to disabled OSR
arm64: defconfig: enable some core options for 64bit Rockchip socs
arm64: marvell: dts: fix interrupts in 7k/8k crypto nodes
reset: hi6220: Set module license so that it can be loaded
MAINTAINERS: add irqchip related drivers to Marvell EBU maintainers
MAINTAINERS: sort F entries for Marvell EBU maintainers
ARM: davinci: PM: Do not free useful resources in normal path in 'davinci_pm_init'
ARM: davinci: PM: Free resources in error handling path in 'davinci_pm_init'
ARM: dts: bcm283x: Reserve first page for firmware
memory: atmel-ebi: mark PM ops as __maybe_unused
MAINTAINERS: Remove Javier Martinez Canillas as reviewer for Exynos
1.) Bugfix of function stmmac_get_tx_hwtstamp.
Corrected the tx timestamp available check (same as 4.8 and older)
Change printout from info syslevel to debug.
2.) Bugfix of function stmmac_get_rx_hwtstamp.
Corrected the rx timestamp available check (same as 4.8 and older)
Change printout from info syslevel to debug.
Fixes: ba1ffd74df ("stmmac: fix PTP support for GMAC4")
Signed-off-by: Mario Molitor <mario_molitor@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
According the CYCLON V documention only the bit 16 of snaptypesel should
set.
(more information see Table 17-20 (cv_5v4.pdf) :
Timestamp Snapshot Dependency on Register Bits)
Fixes: d2042052a0 ("stmmac: update the PTP header file")
Signed-off-by: Mario Molitor <mario_molitor@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
It looks like this:
Message from syslogd@flamingo at Apr 26 00:45:00 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 4
They seem to coincide with net namespace teardown.
The message is emitted by netdev_wait_allrefs().
Forced a kdump in netdev_run_todo, but found that the refcount on the lo
device was already 0 at the time we got to the panic.
Used bcc to check the blocking in netdev_run_todo. The only places
where we're off cpu there are in the rcu_barrier() and msleep() calls.
That behavior is expected. The msleep time coincides with the amount of
time we spend waiting for the refcount to reach zero; the rcu_barrier()
wait times are not excessive.
After looking through the list of callbacks that the netdevice notifiers
invoke in this path, it appears that the dst_dev_event is the most
interesting. The dst_ifdown path places a hold on the loopback_dev as
part of releasing the dev associated with the original dst cache entry.
Most of our notifier callbacks are straight-forward, but this one a)
looks complex, and b) places a hold on the network interface in
question.
I constructed a new bcc script that watches various events in the
liftime of a dst cache entry. Note that dst_ifdown will take a hold on
the loopback device until the invalidated dst entry gets freed.
[ __dst_free] on DST: ffff883ccabb7900 IF tap1008300eth0 invoked at 1282115677036183
__dst_free
rcu_nocb_kthread
kthread
ret_from_fork
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The inode destruction path for the 'dax' device filesystem incorrectly
assumes that the inode was initialized through 'alloc_dax()'. However,
if someone attempts to directly mount the dax filesystem with 'mount -t
dax dax mnt' that will bypass 'alloc_dax()' and the following failure
signatures may occur as a result:
kill_dax() must be called before final iput()
WARNING: CPU: 2 PID: 1188 at drivers/dax/super.c:243 dax_destroy_inode+0x48/0x50
RIP: 0010:dax_destroy_inode+0x48/0x50
Call Trace:
destroy_inode+0x3b/0x60
evict+0x139/0x1c0
iput+0x1f9/0x2d0
dentry_unlink_inode+0xc3/0x160
__dentry_kill+0xcf/0x180
? dput+0x37/0x3b0
dput+0x3a3/0x3b0
do_one_tree+0x36/0x40
shrink_dcache_for_umount+0x2d/0x90
generic_shutdown_super+0x1f/0x120
kill_anon_super+0x12/0x20
deactivate_locked_super+0x43/0x70
deactivate_super+0x4e/0x60
general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
RIP: 0010:kfree+0x6d/0x290
Call Trace:
<IRQ>
dax_i_callback+0x22/0x60
? dax_destroy_inode+0x50/0x50
rcu_process_callbacks+0x298/0x740
ida_remove called for id=0 which is not allocated.
WARNING: CPU: 0 PID: 0 at lib/idr.c:383 ida_remove+0x110/0x120
[..]
Call Trace:
<IRQ>
ida_simple_remove+0x2b/0x50
? dax_destroy_inode+0x50/0x50
dax_i_callback+0x3c/0x60
rcu_process_callbacks+0x298/0x740
Add missing initialization of the 'struct dax_device' and inode so that
the destruction path does not kfree() or ida_simple_remove()
uninitialized data.
Fixes: 7b6be8444e ("dax: refactor dax-fs into a generic provider of 'struct dax_device' instances")
Reported-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in bind() and connect()
handlers of the AF_UNIX socket. Since neither syscall enforces a minimum
size of the corresponding memory region, very short sockaddrs (zero or
one byte long) result in operating on uninitialized memory while
referencing .sa_family.
Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: 0d7e2d2166 ("IB/ipoib: add get_link_ksettings in ethtool")
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maniaxx reported a kernel boot crash in the EFI code, which I emulated
by using same invalid phys addr in code:
BUG: unable to handle kernel paging request at ffffffffff280001
IP: efi_bgrt_init+0xfb/0x153
...
Call Trace:
? bgrt_init+0xbc/0xbc
acpi_parse_bgrt+0xe/0x12
acpi_table_parse+0x89/0xb8
acpi_boot_init+0x445/0x4e2
? acpi_parse_x2apic+0x79/0x79
? dmi_ignore_irq0_timer_override+0x33/0x33
setup_arch+0xb63/0xc82
? early_idt_handler_array+0x120/0x120
start_kernel+0xb7/0x443
? early_idt_handler_array+0x120/0x120
x86_64_start_reservations+0x29/0x2b
x86_64_start_kernel+0x154/0x177
secondary_startup_64+0x9f/0x9f
There is also a similar bug filed in bugzilla.kernel.org:
https://bugzilla.kernel.org/show_bug.cgi?id=195633
The crash is caused by this commit:
7b0a911478 efi/x86: Move the EFI BGRT init code to early init code
The root cause is the firmware on those machines provides invalid BGRT
image addresses.
In a kernel before above commit BGRT initializes late and uses ioremap()
to map the image address. Ioremap validates the address, if it is not a
valid physical address ioremap() just fails and returns. However in current
kernel EFI BGRT initializes early and uses early_memremap() which does not
validate the image address, and kernel panic happens.
According to ACPI spec the BGRT image address should fall into
EFI_BOOT_SERVICES_DATA, see the section 5.2.22.4 of below document:
http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf
Fix this issue by validating the image address in efi_bgrt_init(). If the
image address does not fall into any EFI_BOOT_SERVICES_DATA areas we just
bail out with a warning message.
Reported-by: Maniaxx <tripleshiftone@gmail.com>
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 7b0a911478 ("efi/x86: Move the EFI BGRT init code to early init code")
Link: http://lkml.kernel.org/r/20170609084558.26766-2-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
CAN FD capable CAN interfaces can handle (classic) CAN 2.0 frames too.
New users usually fail at their first attempt to explore CAN FD on
virtual CAN interfaces due to the current CAN_MTU default.
Set the MTU to CANFD_MTU by default to reduce this confusion.
If someone *really* needs a 'classic CAN'-only device this can be set
with the 'ip' tool with e.g. 'ip link set vcan0 mtu 16' as before.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch adds the missing kfree() in gs_cmd_reset() to free the
memory that is not used anymore after usb_control_msg().
Cc: linux-stable <stable@vger.kernel.org>
Cc: Maximilian Schneider <max@schneidersoft.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Make sure to use the USB device product-id stored in host-byte order in
a probe error message.
Also remove a redundant reassignment of the local usb_dev variable which
had already been used to retrieve the product id.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch fixes two uninitialized symbol warnings in the new code adding
support of the PEAK-System PCAN-PCI Express FD boards, in the socket-CAN
network protocol family.
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
In OOM situations where no skb can be allocated, can_change_state() may
be called with cf == NULL. As this function updates the state and error
statistics it's not an option to skip the call to can_change_state() in
OOM situations.
This patch makes can_change_state() robust, so that it can be called
with cf == NULL.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
During an eeh call to cxl_remove can result in double free_irq of
psl,slice interrupts. This can happen if perst_reloads_same_image == 1
and call to cxl_configure_adapter() fails during slot_reset
callback. In such a case we see a kernel oops with following back-trace:
Oops: Kernel access of bad area, sig: 11 [#1]
Call Trace:
free_irq+0x88/0xd0 (unreliable)
cxl_unmap_irq+0x20/0x40 [cxl]
cxl_native_release_psl_irq+0x78/0xd8 [cxl]
pci_deconfigure_afu+0xac/0x110 [cxl]
cxl_remove+0x104/0x210 [cxl]
pci_device_remove+0x6c/0x110
device_release_driver_internal+0x204/0x2e0
pci_stop_bus_device+0xa0/0xd0
pci_stop_and_remove_bus_device+0x28/0x40
pci_hp_remove_devices+0xb0/0x150
pci_hp_remove_devices+0x68/0x150
eeh_handle_normal_event+0x140/0x580
eeh_handle_event+0x174/0x360
eeh_event_handler+0x1e8/0x1f0
This patch fixes the issue of double free_irq by checking that
variables that hold the virqs (err_hwirq, serr_hwirq, psl_virq) are
not '0' before un-mapping and resetting these variables to '0' when
they are un-mapped.
Cc: stable@vger.kernel.org
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If more than one gpio bank has the "pwm" property, only one will be
registered successfully, all the others will fail with:
mvebu-gpio: probe of f1018140.gpio failed with error -17
That's because in alloc_pwms(), the chip->base (aka "int pwm"), was not
set (thus, ==0) ; and 0 is a meaningful start value in alloc_pwm().
What was intended is mvpwm->chip->base = -1.
Like that, the numbering will be done auto-magically
Moreover, as the region might be already occupied by another pwm, we
shouldn't force:
mvpwm->chip->base = 0
nor
mvpwm->chip->base = id * MVEBU_MAX_GPIO_PER_BANK;
Tested on clearfog-pro (Marvell 88F6828)
Fixes: 757642f9a5 ("gpio: mvebu: Add limited PWM support")
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Reviewed-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
The blink counter A was always selected because 0 was forced in the
blink select counter register.
The variable 'set' was obviously there to be used as the register value,
selecting the B counter when id==1 and A counter when id==0.
Tested on clearfog-pro (Marvell 88F6828)
Fixes: 757642f9a5 ("gpio: mvebu: Add limited PWM support")
Reviewed-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Pull RCU fix from Paul E. McKenney:
" This series enables srcu_read_lock() and srcu_read_unlock() to be used from
interrupt handlers, which fixes a bug in KVM's use of SRCU in delivery
of interrupts to guest OSes. "
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If a key's refcount is dropped to zero between key_lookup() peeking at
the refcount and subsequently attempting to increment it, refcount_inc()
will see a zero refcount. Here, refcount_inc() will WARN_ONCE(), and
will *not* increment the refcount, which will remain zero.
Once key_lookup() drops key_serial_lock, it is possible for the key to
be freed behind our back.
This patch uses refcount_inc_not_zero() to perform the peek and increment
atomically.
Fixes: fff292914d ("security, keys: convert key.usage from atomic_t to refcount_t")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: David Windsor <dwindsor@gmail.com>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Hans Liljestrand <ishkamiel@gmail.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
The initial Diffie-Hellman computation made direct use of the MPI
library because the crypto module did not support DH at the time. Now
that KPP is implemented, KEYCTL_DH_COMPUTE should use it to get rid of
duplicate code and leverage possible hardware acceleration.
This fixes an issue whereby the input to the KDF computation would
include additional uninitialized memory when the result of the
Diffie-Hellman computation was shorter than the input prime number.
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Accessing a 'u8[4]' through a '__be32 *' violates alignment rules. Just
make the counter a __be32 instead.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: James Morris <james.l.morris@oracle.com>
If userspace called KEYCTL_DH_COMPUTE with kdf_params containing NULL
otherinfo but nonzero otherinfolen, the kernel would allocate a buffer
for the otherinfo, then feed it into the KDF without initializing it.
Fix this by always doing the copy from userspace (which will fail with
EFAULT in this scenario).
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Requesting "digest_null" in the keyctl_kdf_params caused an infinite
loop in kdf_ctr() because the "null" hash has a digest size of 0. Fix
it by rejecting hash algorithms with a digest size of 0.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: James Morris <james.l.morris@oracle.com>
While a 'struct key' itself normally does not contain sensitive
information, Documentation/security/keys.txt actually encourages this:
"Having a payload is not required; and the payload can, in fact,
just be a value stored in the struct key itself."
In case someone has taken this advice, or will take this advice in the
future, zero the key structure before freeing it. We might as well, and
as a bonus this could make it a bit more difficult for an adversary to
determine which keys have recently been in use.
This is safe because the key_jar cache does not use a constructor.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
As the previous patch did for encrypted-keys, zero sensitive any
potentially sensitive data related to the "trusted" key type before it
is freed. Notably, we were not zeroing the tpm_buf structures in which
the actual key is stored for TPM seal and unseal, nor were we zeroing
the trusted_key_payload in certain error paths.
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: David Safford <safford@us.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
For keys of type "encrypted", consistently zero sensitive key material
before freeing it. This was already being done for the decrypted
payloads of encrypted keys, but not for the master key and the keys
derived from the master key.
Out of an abundance of caution and because it is trivial to do so, also
zero buffers containing the key payload in encrypted form, although
depending on how the encrypted-keys feature is used such information
does not necessarily need to be kept secret.
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: David Safford <safford@us.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Zero the payloads of user and logon keys before freeing them. This
prevents sensitive key material from being kept around in the slab
caches after a key is released.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Before returning from add_key() or one of the keyctl() commands that
takes in a key payload, zero the temporary buffer that was allocated to
hold the key payload copied from userspace. This may contain sensitive
key material that should not be kept around in the slab caches.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
sys_add_key() and the KEYCTL_UPDATE operation of sys_keyctl() allowed a
NULL payload with nonzero length to be passed to the key type's
->preparse(), ->instantiate(), and/or ->update() methods. Various key
types including asymmetric, cifs.idmap, cifs.spnego, and pkcs7_test did
not handle this case, allowing an unprivileged user to trivially cause a
NULL pointer dereference (kernel oops) if one of these key types was
present. Fix it by doing the copy_from_user() when 'plen' is nonzero
rather than when '_payload' is non-NULL, causing the syscall to fail
with EFAULT as expected when an invalid buffer is specified.
Cc: stable@vger.kernel.org # 2.6.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
MACs should, in general, be compared using crypto_memneq() to prevent
timing attacks.
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
The encrypted-keys module was using a single global HMAC transform,
which could be rekeyed by multiple threads concurrently operating on
different keys, causing incorrect HMAC values to be calculated. Fix
this by allocating a new HMAC transform whenever we need to calculate a
HMAC. Also simplify things a bit by allocating the shash_desc's using
SHASH_DESC_ON_STACK() for both the HMAC and unkeyed hashes.
The following script reproduces the bug:
keyctl new_session
keyctl add user master "abcdefghijklmnop" @s
for i in $(seq 2); do
(
set -e
for j in $(seq 1000); do
keyid=$(keyctl add encrypted desc$i "new user:master 25" @s)
datablob="$(keyctl pipe $keyid)"
keyctl unlink $keyid > /dev/null
keyid=$(keyctl add encrypted desc$i "load $datablob" @s)
keyctl unlink $keyid > /dev/null
done
) &
done
Output with bug:
[ 439.691094] encrypted_key: bad hmac (-22)
add_key: Invalid argument
add_key: Invalid argument
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
With the 'encrypted' key type it was possible for userspace to provide a
data blob ending with a master key description shorter than expected,
e.g. 'keyctl add encrypted desc "new x" @s'. When validating such a
master key description, validate_master_desc() could read beyond the end
of the buffer. Fix this by using strncmp() instead of memcmp(). [Also
clean up the code to deduplicate some logic.]
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Since v4.9, the crypto API cannot (normally) be used to encrypt/decrypt
stack buffers because the stack may be virtually mapped. Fix this for
the padding buffers in encrypted-keys by using ZERO_PAGE for the
encryption padding and by allocating a temporary heap buffer for the
decryption padding.
Tested with CONFIG_DEBUG_SG=y:
keyctl new_session
keyctl add user master "abcdefghijklmnop" @s
keyid=$(keyctl add encrypted desc "new user:master 25" @s)
datablob="$(keyctl pipe $keyid)"
keyctl unlink $keyid
keyid=$(keyctl add encrypted desc "load $datablob" @s)
datablob2="$(keyctl pipe $keyid)"
[ "$datablob" = "$datablob2" ] && echo "Success!"
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
In join_session_keyring(), if install_session_keyring_to_cred() were to
fail, we would leak the keyring reference, just like in the bug fixed by
commit 23567fd052 ("KEYS: Fix keyring ref leak in
join_session_keyring()"). Fortunately this cannot happen currently, but
we really should be more careful. Do this by adding and using a new
error label at which the keyring reference is dropped.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Omit an extra message for a memory allocation failure in this function.
This issue was detected by using the Coccinelle software.
Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
We forgot to set the error code on this path so it could result in
returning NULL which leads to a NULL dereference.
Fixes: db6c43bd21 ("crypto: KEYS: convert public key and digsig asym to the akcipher api")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
With the new standardized functions, we can replace all ACCESS_ONCE()
calls across relevant security/keyrings/.
ACCESS_ONCE() does not work reliably on non-scalar types. For example
gcc 4.6 and 4.7 might remove the volatile tag for such accesses during
the SRA (scalar replacement of aggregates) step:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145
Update the new calls regardless of if it is a scalar type, this is
cleaner than having three alternatives.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>