Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. In this case, no initializers
are needed (they can be NULL initialized and callers adjusted to check
for NULL, which is more efficient than an indirect call).
Cc: Robin Holt <robinmholt@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
When trying to propagate an error result, the error return path attempts
to retain the error, but does this with an open cast across very different
types, which the upcoming structure layout randomization plugin flags as
being potentially dangerous in the face of randomization. This is a false
positive, but what this code actually wants to do is use ERR_CAST() to
retain the error value.
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
When trying to propagate an error result, the error return path attempts
to retain the error, but does this with an open cast across very different
types, which the upcoming structure layout randomization plugin flags as
being potentially dangerous in the face of randomization. This is a false
positive, but what this code actually wants to do is use ERR_CAST() to
retain the error value.
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
When the call to nfs_devname() fails, the error path attempts to retain
the error via the mnt variable, but this requires a cast across very
different types (char * to struct vfsmount *), which the upcoming
structure layout randomization plugin flags as being potentially
dangerous in the face of randomization. This is a false positive, but
what this code actually wants to do is retain the error value, so this
patch explicitly sets it, instead of using what seems to be an
unexpected cast.
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Stub functions in header files need to be static, or we can have
multiple definitions errors.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 6335e9f244 ("net: dsa: mv88e6xxx: mv88e6390X SERDES support")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
ad7152_write_raw_samp_freq() is called by ad7152_write_raw() with
chip->state_lock held. So, there is unavoidable deadlock when
ad7152_write_raw_samp_freq() locks the mutex itself.
The patch removes unneeded locking.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Fixes: 6572389bcc ("staging: iio: cdc: ad7152: Implement
IIO_CHAN_INFO_SAMP_FREQ attribute")
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Sabrina Dubroca reported an early panic:
BUG: unable to handle kernel paging request at ffffffffff240001
IP: efi_bgrt_init+0xdc/0x134
[...]
---[ end Kernel panic - not syncing: Attempted to kill the idle task!
... which was introduced by:
7b0a911478 ("efi/x86: Move the EFI BGRT init code to early init code")
The cause is that on this machine the firmware provides the EFI ACPI BGRT
table even on legacy non-EFI bootups - which table should be EFI only.
The garbage BGRT data causes the efi_bgrt_init() panic.
Add a check to skip efi_bgrt_init() in case non-EFI bootup to work around
this firmware bug.
Tested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: <stable@vger.kernel.org> # v4.11+
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 7b0a911478 ("efi/x86: Move the EFI BGRT init code to early init code")
Link: http://lkml.kernel.org/r/20170526113652.21339-6-matt@codeblueprint.co.uk
[ Rewrote the changelog to be more readable. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For EFI with the 'efi=old_map' kernel option specified, the kernel will panic
when KASLR is enabled:
BUG: unable to handle kernel paging request at 000000007febd57e
IP: 0x7febd57e
PGD 1025a067
PUD 0
Oops: 0010 [#1] SMP
Call Trace:
efi_enter_virtual_mode()
start_kernel()
x86_64_start_reservations()
x86_64_start_kernel()
start_cpu()
The root cause is that the identity mapping is not built correctly
in the 'efi=old_map' case.
On 'nokaslr' kernels, PAGE_OFFSET is 0xffff880000000000 which is PGDIR_SIZE
aligned. We can borrow the PUD table from the direct mappings safely. Given a
physical address X, we have pud_index(X) == pud_index(__va(X)).
However, on KASLR kernels, PAGE_OFFSET is PUD_SIZE aligned. For a given physical
address X, pud_index(X) != pud_index(__va(X)). We can't just copy the PGD entry
from direct mapping to build identity mapping, instead we need to copy the
PUD entries one by one from the direct mapping.
Fix it.
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Young <dyoung@redhat.com>
Cc: Frank Ramsay <frank.ramsay@hpe.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20170526113652.21339-5-matt@codeblueprint.co.uk
[ Fixed and reworded the changelog and code comments to be more readable. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Booting kexec kernel with "efi=old_map" in kernel command line hits
kernel panic as shown below.
BUG: unable to handle kernel paging request at ffff88007fe78070
IP: virt_efi_set_variable.part.7+0x63/0x1b0
PGD 7ea28067
PUD 7ea2b067
PMD 7ea2d067
PTE 0
[...]
Call Trace:
virt_efi_set_variable()
efi_delete_dummy_variable()
efi_enter_virtual_mode()
start_kernel()
x86_64_start_reservations()
x86_64_start_kernel()
start_cpu()
[ efi=old_map was never intended to work with kexec. The problem with
using efi=old_map is that the virtual addresses are assigned from the
memory region used by other kernel mappings; vmalloc() space.
Potentially there could be collisions when booting kexec if something
else is mapped at the virtual address we allocated for runtime service
regions in the initial boot - Matt Fleming ]
Since kexec was never intended to work with efi=old_map, disable
runtime services in kexec if booted with efi=old_map, so that we don't
panic.
Tested-by: Lee Chun-Yi <jlee@suse.com>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Ricardo Neri <ricardo.neri@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20170526113652.21339-4-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This shrinks the size of the structure on arm64 by 8 bytes by avoiding
padding of 4 bytes at two places.
Also add missing doc comment for freq_table
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
The frequency table shouldn't have any zero frequency entries and so
such a check isn't required. Though it would be better to make sure
'state' is within limits.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
'cpu_dev' is used by only one function, get_static_power(), and it
wouldn't be time consuming to get the cpu device structure within it.
This would help removing cpu_dev from struct cpufreq_cooling_device.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
The frequency passed to get_level() is returned by cpu_power_to_freq()
and it is guaranteed that get_level() can't fail.
Get rid of error code.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
We keep two arrays for idle time stats and allocate memory for them
separately. It would be much easier to follow if we create an array of
idle stats structure instead and allocate it once.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
The cpu_cooling driver keeps two tables:
- freq_table: table of frequencies in descending order, built from
policy->freq_table.
- power_table: table of frequencies and power in ascending order, built
from OPP table.
If the OPPs are used for the CPU device then both these tables are
actually built using the OPP core and should have the same frequency
entries. And there is no need to keep separate tables for this.
Lets merge them both.
Note that the new table is in descending order of frequencies and so the
'for' loops were required to be fixed at few places to make it work.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
'allowed_cpus' is a copy of policy->related_cpus and can be replaced by
it directly. At some places we are only concerned about online CPUs and
policy->cpus can be used there.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
The OPPs are registered for all CPUs of a cpufreq policy now and we
don't need to run the loop in build_dyn_power_table(). Just check for
the policy->cpu and we should be fine.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
The CPU cooling driver uses the cpufreq policy, to get clip_cpus, the
frequency table, etc. Most of the callers of CPU cooling driver's
registration routines have the cpufreq policy with them, but they only
pass the policy->related_cpus cpumask. The __cpufreq_cooling_register()
routine then gets the policy by itself and uses it.
It would be much better if the callers can pass the policy instead
directly. This also fixes a basic design flaw, where the policy can be
freed while the CPU cooling driver is still active.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
There is only one user of cpufreq_cooling_get_level() and that already
has pointer to the cpufreq_cdev structure. It can directly call
get_level() instead and we can get rid of cpufreq_cooling_get_level().
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Objects of "struct cpufreq_cooling_device" are named a bit
inconsistently. Lets use cpufreq_cdev everywhere. Also note that the
lists containing such devices is renamed similarly too.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
After the lock is dropped, it is possible that the cpufreq_dev gets
freed before we call get_level() and that can cause kernel to crash.
Drop the lock after we are done using the structure.
Cc: 4.2+ <stable@vger.kernel.org> # 4.2+
Fixes: 02373d7c69 ("thermal: cpu_cooling: fix lockdep problems in cpu_cooling")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Some Ethernet drivers will attach/connect to a PHY device before calling
register_netdevice() which is responsible for calling netdev_register_kobject()
which would do the network device's kobject initialization. In such a case,
sysfs_create_link() would return -ENOENT because the network device's kobject
is not ready yet, and we would fail to connect to the PHY device.
In order to keep things simple and symetrical, we just take the success path as
indicative of the ability to access the network device's kobject, and create
the second link if that's the case.
Fixes: 5568363f0c ("net: phy: Create sysfs reciprocal links for attached_dev/phydev")
Reported-by: Woojung Hung <Woojung.Huh@microchip.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mv88e6xxx_serdes_power returns an error, so no need to print an error
message inside of it. Rather print it in its caller when the error is
ignored, which is in the mv88e6xxx_port_disable void function.
Catch and return its error in the counterpart mv88e6xxx_port_enable.
Fixes: 04aca99382 ("dsa: mv88e6xxx: Enable/Disable SERDES on port enable/disable")
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladislav Yasevich says:
====================
rtnetlink: Updates to rtnetlink_event()
First is the patch to add IFLA_EVENT attribute to the netlink message. It
supports only currently white-listed events.
Like before, this is just an attribute that gets added to the rtnetlink
message only when the messaged was generated as a result of a netdev event.
In my case, this is necessary since I want to trap NETDEV_NOTIFY_PEERS
event (also possibly NETDEV_RESEND_IGMP event) and perform certain actions
in user space. This is not possible since the messages generated as
a result of netdev events do not usually contain any changed data. They
are just notifications. This patch exposes this notification type to
userspace.
Second, I remove duplicate messages that a result of a change to bonding
options. If netlink is used to configure bonding options, 2 messages
are generated, one as a result NETDEV_CHANGEINFODATA event triggered by
bonding code and one a result of device state changes triggered by
netdev_state_change (called from do_setlink).
V6: Updated names and refactored to make it less tied to netdev events.
(From David Ahern)
V5: Rebased. Added iproute2 patch to the series.
V4:
* Removed the patch the removed NETDEV_CHANGENAME from event whitelist.
It doesn't trigger duplicate messages since name changes can only be
done while device is down and netdev_state_change() doesn't report
changes while device is down.
* Added a patch to clean-up duplicate messages on bonding option changes.
V3: Rebased. Cleaned-up duplicate event.
V2: Added missed events (from David Ahern)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Whenever a user changes bonding options, a NETDEV_CHANGEINFODATA
notificatin is generated which results in a rtnelink message to
be sent. While runnig 'ip monitor', we can actually see 2 messages,
one a result of the event, and the other a result of state change
that is generated bo netdev_state_change(). However, this is not
always the case. If bonding changes were done via sysfs or ifenslave
(old ioctl interface), then only 1 message is seen.
This patch removes duplicate messages in the case of using netlink
to configure bonding. It introduceds a separte function that
triggers a netdev event and uses that function in the syfs and ioctl
cases.
This was discovered while auditing all the different envents and
continues the effort of cleaning up duplicated netlink messages.
CC: David Ahern <dsa@cumulusnetworks.com>
CC: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When netdev events happen, a rtnetlink_event() handler will send
messages for every event in it's white list. These messages contain
current information about a particular device, but they do not include
the iformation about which event just happened. So, it is impossible
to tell what just happend for these events.
This patch adds a new extension to RTM_NEWLINK message called IFLA_EVENT
that would have an encoding of event that triggered this
message. This would allow the the message consumer to easily determine
if it needs to perform certain actions.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
syszkaller fuzzer triggered a divide by zero, when set calibration
through ioctl().
To fix it, test 'bitrate' if it is negative or 0, just return -EINVAL.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Firo Yang <firogm@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The binding documentation for the mv88e6xxx switch is missing the
eeprom-length property, which has been implemented since May 2016,
commit f8cd8753de ("dsa: mv88e6xxx: Handle eeprom-length property")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>