Commit Graph

710487 Commits

Author SHA1 Message Date
Cong Wang
57767e7853 cls_matchall: use tcf_exts_get_net() before call_rcu()
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09 10:03:10 +09:00
Cong Wang
d5f984f5af cls_fw: use tcf_exts_get_net() before call_rcu()
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09 10:03:09 +09:00
Cong Wang
0dadc117ac cls_flower: use tcf_exts_get_net() before call_rcu()
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09 10:03:09 +09:00
Cong Wang
22f7cec93f cls_flow: use tcf_exts_get_net() before call_rcu()
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09 10:03:09 +09:00
Cong Wang
ed14816814 cls_cgroup: use tcf_exts_get_net() before call_rcu()
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09 10:03:09 +09:00
Cong Wang
aae2c35ec8 cls_bpf: use tcf_exts_get_net() before call_rcu()
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09 10:03:09 +09:00
Cong Wang
0b2a59894b cls_basic: use tcf_exts_get_net() before call_rcu()
Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09 10:03:09 +09:00
Cong Wang
e4b95c41df net_sched: introduce tcf_exts_get_net() and tcf_exts_put_net()
Instead of holding netns refcnt in tc actions, we can minimize
the holding time by saving it in struct tcf_exts instead. This
means we can just hold netns refcnt right before call_rcu() and
release it after tcf_exts_destroy() is done.

However, because on netns cleanup path we call tcf_proto_destroy()
too, obviously we can not hold netns for a zero refcnt, in this
case we have to do cleanup synchronously. It is fine for RCU too,
the caller cleanup_net() already waits for a grace period.

For other cases, refcnt is non-zero and we can safely grab it as
normal and release it after we are done.

This patch provides two new API for each filter to use:
tcf_exts_get_net() and tcf_exts_put_net(). And all filters now can
use the following pattern:

void __destroy_filter() {
  tcf_exts_destroy();
  tcf_exts_put_net();  // <== release netns refcnt
  kfree();
}
void some_work() {
  rtnl_lock();
  __destroy_filter();
  rtnl_unlock();
}
void some_rcu_callback() {
  tcf_queue_work(some_work);
}

if (tcf_exts_get_net())  // <== hold netns refcnt
  call_rcu(some_rcu_callback);
else
  __destroy_filter();

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09 10:03:09 +09:00
Cong Wang
c7e460ce55 Revert "net_sched: hold netns refcnt for each action"
This reverts commit ceffcc5e25.
If we hold that refcnt, the netns can never be destroyed until
all actions are destroyed by user, this breaks our netns design
which we expect all actions are destroyed when we destroy the
whole netns.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09 10:03:09 +09:00
Andrey Konovalov
8f56246291 net: usb: asix: fill null-ptr-deref in asix_suspend
When asix_suspend() is called dev->driver_priv might not have been
assigned a value, so we need to check that it's not NULL.

Similar issue is present in asix_resume(), this patch fixes it as well.

Found by syzkaller.

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [] PREEMPT SMP KASAN
Modules linked in:
CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc4-43422-geccacdd69a8c 
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
task: ffff88006bb36300 task.stack: ffff88006bba8000
RIP: 0010:asix_suspend+0x76/0xc0 drivers/net/usb/asix_devices.c:629
RSP: 0018:ffff88006bbae718 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff880061ba3b80 RCX: 1ffff1000c34d644
RDX: 0000000000000001 RSI: 0000000000000402 RDI: 0000000000000008
RBP: ffff88006bbae738 R08: 1ffff1000d775cad R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800630a8b40
R13: 0000000000000000 R14: 0000000000000402 R15: ffff880061ba3b80
FS:  0000000000000000(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff33cf89000 CR3: 0000000061c0a000 CR4: 00000000000006f0
Call Trace:
 usb_suspend_interface drivers/usb/core/driver.c:1209
 usb_suspend_both+0x27f/0x7e0 drivers/usb/core/driver.c:1314
 usb_runtime_suspend+0x41/0x120 drivers/usb/core/driver.c:1852
 __rpm_callback+0x339/0xb60 drivers/base/power/runtime.c:334
 rpm_callback+0x106/0x220 drivers/base/power/runtime.c:461
 rpm_suspend+0x465/0x1980 drivers/base/power/runtime.c:596
 __pm_runtime_suspend+0x11e/0x230 drivers/base/power/runtime.c:1009
 pm_runtime_put_sync_autosuspend ./include/linux/pm_runtime.h:251
 usb_new_device+0xa37/0x1020 drivers/usb/core/hub.c:2487
 hub_port_connect drivers/usb/core/hub.c:4903
 hub_port_connect_change drivers/usb/core/hub.c:5009
 port_event drivers/usb/core/hub.c:5115
 hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
 process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
 worker_thread+0x221/0x1850 kernel/workqueue.c:2253
 kthread+0x3a1/0x470 kernel/kthread.c:231
 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
Code: 8d 7c 24 20 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5b 48 b8 00 00
00 00 00 fc ff df 4d 8b 6c 24 20 49 8d 7d 08 48 89 fa 48 c1 ea 03 <80>
3c 02 00 75 34 4d 8b 6d 08 4d 85 ed 74 0b e8 26 2b 51 fd 4c
RIP: asix_suspend+0x76/0xc0 RSP: ffff88006bbae718
---[ end trace dfc4f5649284342c ]---

Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09 09:22:13 +09:00
David S. Miller
1a8e6b48fb Revert "net: usb: asix: fill null-ptr-deref in asix_suspend"
This reverts commit baedf68a06.

There is an updated version of this fix which covers
the problem more thoroughly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-09 09:21:44 +09:00
Gustavo A. R. Silva
1be9c3a0a0 ACPI: Mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-09 00:55:16 +01:00
Kees Cook
3653bc95bc timer: Prepare to change all DEFINE_TIMER() callbacks
Before we can globally change the function prototype of all timer callbacks,
we have to change those set up by DEFINE_TIMER(). Prepare for this by
casting the callbacks until the prototype changes globally.

Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-08 15:54:09 -08:00
Kees Cook
8ef81c6548 netfilter: ipvs: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Wensong Zhang <wensong@linux-vs.org>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: Florian Westphal <fw@strlen.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: lvs-devel@vger.kernel.org
Cc: netfilter-devel@vger.kernel.org
Cc: coreteam@netfilter.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
2017-11-08 15:53:58 -08:00
Kees Cook
8e5f4ba0cd scsi: qla2xxx: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: qla2xxx-upstream@qlogic.com
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Tested-by: Bart Van Assche <Bart.VanAssche@wdc.com>
2017-11-08 15:51:35 -08:00
Colin Ian King
71c50dbe1f ACPI / LPSS: Remove redundant initialization of clk
The pointer clk is being initialized to ERR_PTR(-ENODEV) however
this value is never read before it is set to clk_data->clk. Thus
the initialization is redundant and can be mored.

Cleans up clang warning:
Value stored to 'clk' during its initialization is never read

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-09 00:43:30 +01:00
George Cherian
85b1407bf6 ACPI / CPPC: Make CPPC ACPI driver aware of PCC subspace IDs
Based on ACPI 6.2 Section 8.4.7.1.9 If the PCC register space is used,
all PCC registers, for all processors in the same performance domain
(as defined by _PSD), must be defined to be in the same subspace.

Based on Section 14.1 of ACPI specification, it is possible to have a
maximum of 256 PCC subspace IDs. Add support of multiple PCC subspace
ID instead of using a single global pcc_data structure.

While at that, fix the time_delta check in send_pcc_cmd() so that
last_mpar_reset and mpar_count are initialized properly.

Signed-off-by: George Cherian <george.cherian@cavium.com>
Reviewed-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-09 00:39:54 +01:00
George Cherian
c4b766c2f3 mailbox: PCC: Move the MAX_PCC_SUBSPACES definition to header file
Move the MAX_PCC_SUBSPACES definition to acpi/pcc.h file in
preparation to add subspace ID support for cppc_acpi driver.

Signed-off-by: George Cherian <george.cherian@cavium.com>
Reviewed-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-09 00:39:53 +01:00
Colin Ian King
3e87ead412 ACPI / sysfs: Make function param_set_trace_method_name() static
The function param_set_trace_method_name is local to the source and does
not need to be in global scope, so make it static.

Cleans up sparse warning:
symbol 'param_set_trace_method_name' was not declared. Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-09 00:35:41 +01:00
Hans de Goede
84d3f6b764 ACPI / button: Delay acpi_lid_initialize_state() until first user space open
ACPI _LID methods may depend on OpRegions and do not always handle
handlers for those OpRegions not being present properly e.g. :

            Method (_LID, 0, NotSerialized)  // _LID: Lid Status
            {
                If ((^^I2C5.PMI1.AVBL == One) && (^^GPO2.AVBL == One))
                {
                    Return (^^GPO2.LPOL) /* \_SB_.GPO2.LPOL */
                }
            }

Note the missing Return (1) when either of the OpRegions is not available,
this causes (in this case) a report of the lid-switch being closed,
which causes userspace to do an immediate suspend at boot.

This commit delays getting the initial state and thus calling _LID for
the first time until userspace opens the /dev/input/event# node. This
ensures that all drivers will have had a chance to load and registerer
their OpRegions before the first _LID call, fixing this issue.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-09 00:30:44 +01:00
Lv Zheng
53c5eaabae ACPI / EC: Fix regression related to triggering source of EC event handling
Originally the Samsung quirks removed by commit 4c237371 can be covered
by commit e923e8e7 and ec_freeze_events=Y mode. But commit 9c40f956
changed ec_freeze_events=Y back to N, making this problem re-surface.

Actually, if commit e923e8e7 is robust enough, we can freely change
ec_freeze_events mode, so this patch fixes the issue by improving
commit e923e8e7.

Related commits listed in the merged order:

 Commit: e923e8e79e
 Subject: ACPI / EC: Fix an issue that SCI_EVT cannot be detected
          after event is enabled

 Commit: 4c237371f2
 Subject: ACPI / EC: Remove old CLEAR_ON_RESUME quirk

 Commit: 9c40f956ce
 Subject: Revert "ACPI / EC: Enable event freeze mode..." to fix
          a regression

This patch not only fixes the reported post-resume EC event triggering
source issue, but also fixes an unreported similar issue related to the
driver bind by adding EC event triggering source in ec_install_handlers().

Fixes: e923e8e79e (ACPI / EC: Fix an issue that SCI_EVT cannot be detected after event is enabled)
Fixes: 4c237371f2 (ACPI / EC: Remove old CLEAR_ON_RESUME quirk)
Fixes: 9c40f956ce (Revert "ACPI / EC: Enable event freeze mode..." to fix a regression)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=196833
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Reported-by: Alistair Hamilton <ahpatent@gmail.com>
Tested-by: Alistair Hamilton <ahpatent@gmail.com>
Cc: 4.11+ <stable@vger.kernel.org> # 4.11+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-09 00:23:53 +01:00
Sakari Ailus
5e4b1b707b device property: Add a macro for interating over graph endpoints
Add a convenience macro for iterating over graph endpoints. Iterating over
graph endpoints using fwnode_graph_get_next_endpoint() is a recurring
pattern, and this macro allows calling that function in a slightly more
convenient way. For instance,

	for (child = NULL;
	     (child = fwnode_graph_get_next_endpoint(fwnode, child)); )

becomes

	fwnode_graph_for_each_endpoint(fwnode, child)

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-09 00:17:23 +01:00
Sakari Ailus
cf89a31ca5 device property: Make fwnode_handle_get() return the fwnode
The fwnode_handle_get() function is used to obtain a reference to an
fwnode. A common usage pattern for the OF equivalent of the function is:

	mynode = of_node_get(node);

Similarly make fwnode_handle_get() return the fwnode to which the
reference was obtained.

Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-09 00:17:22 +01:00
Arnd Bergmann
c0041d40ba APEI / ERST: use 64-bit timestamps
32-bit timestamps are deprecated in the kernel, so we should not use
get_seconds(). In this case, the 'struct cper_record_header' structure
already contains a 64-bit field, so the only required change is to use
the safe ktime_get_real_seconds() interface as a replacement.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-09 00:13:39 +01:00
Rafael J. Wysocki
e029b9bf12 Merge branch 'pm-cpufreq-sched'
* pm-cpufreq-sched:
  cpufreq: schedutil: Examine the correct CPU when we update util
2017-11-09 00:07:56 +01:00
Viresh Kumar
07458f6a51 cpufreq: schedutil: Reset cached_raw_freq when not in sync with next_freq
'cached_raw_freq' is used to get the next frequency quickly but should
always be in sync with sg_policy->next_freq. There is a case where it is
not and in such cases it should be reset to avoid switching to incorrect
frequencies.

Consider this case for example:

 - policy->cur is 1.2 GHz (Max)
 - New request comes for 780 MHz and we store that in cached_raw_freq.
 - Based on 780 MHz, we calculate the effective frequency as 800 MHz.
 - We then see the CPU wasn't idle recently and choose to keep the next
   freq as 1.2 GHz.
 - Now we have cached_raw_freq is 780 MHz and sg_policy->next_freq is
   1.2 GHz.
 - Now if the utilization doesn't change in then next request, then the
   next target frequency will still be 780 MHz and it will match with
   cached_raw_freq. But we will choose 1.2 GHz instead of 800 MHz here.

Fixes: b7eaf1aab9 (cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely)
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 4.12+ <stable@vger.kernel.org> # 4.12+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-08 23:59:33 +01:00
Himanshu Jha
2dd9789c76 freezer: Fix typo in freezable_schedule_timeout() comment
Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-08 23:54:27 +01:00
Rajat Jain
95b982b451 PM / s2idle: Clear the events_check_enabled flag
Problem: This flag does not get cleared currently in the suspend or
resume path in the following cases:

 * In case some driver's suspend routine returns an error.
 * Successful s2idle case
 * etc?

Why is this a problem: What happens is that the next suspend attempt
could fail even though the user did not enable the flag by writing to
/sys/power/wakeup_count. This is 1 use case how the issue can be seen
(but similar use case with driver suspend failure can be thought of):

 1. Read /sys/power/wakeup_count
 2. echo count > /sys/power/wakeup_count
 3. echo freeze > /sys/power/wakeup_count
 4. Let the system suspend, and wakeup the system using some wake source
    that calls pm_wakeup_event() e.g. power button or something.
 5. Note that the combined wakeup count would be incremented due
    to the pm_wakeup_event() in the resume path.
 6. After resuming the events_check_enabled flag is still set.

At this point if the user attempts to freeze again (without writing to
/sys/power/wakeup_count), the suspend would fail even though there has
been no wake event since the past resume.

Address that by clearing the flag just before a resume is completed,
so that it is always cleared for the corner cases mentioned above.

Signed-off-by: Rajat Jain <rajatja@google.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-08 23:52:02 +01:00
Gautham R. Shenoy
f7bc9b209e cpufreq: stats: Handle the case when trans_table goes beyond PAGE_SIZE
On platforms with large number of Pstates, the transition table, which
is a NxN matrix, can overflow beyond the PAGE_SIZE boundary.

This can be seen on POWER9 which has 100+ Pstates.

As a result, each time the trans_table is read for any of the CPUs, we
will get the following error.

---------------------------------------------------
fill_read_buffer: show+0x0/0xa0 returned bad count
---------------------------------------------------

This patch ensures that in case of an overflow, we print a warning
once in the dmesg and return FILE TOO LARGE error for this and all
subsequent accesses of trans_table.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-08 23:41:25 +01:00
Bhumika Goyal
0011c6da99 cpufreq: arm_big_little: make cpufreq_arm_bL_ops structures const
Make these const as they are only getting passed to the functions
bL_cpufreq_{register/unregister} having the arguments as const.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-08 23:22:20 +01:00
Bhumika Goyal
cd6ce860eb cpufreq: arm_big_little: make function arguments and structure pointer const
Make the arguments of functions bL_cpufreq_{register/unregister} as
const as the ops pointer does not modify the fields of the
cpufreq_arm_bL_ops structure it points to. The pointer arm_bL_ops is
also getting initialized with ops but the pointer does not modify the
fields. So, make the function argument and the structure pointer const.
Add const to function prototypes too.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-08 23:22:19 +01:00
Gaurav Jindal
3fc74bd8a7 cpuidle: Avoid assignment in if () argument
Clean up cpuidle_enable_device() to avoid doing an assignment
in an expression evaluated as an argument of if (), which also
makes the code in question more readable.

Signed-off-by: Gaurav Jindal <gauravjindal1104@gmail.com>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-08 23:15:30 +01:00
Gaurav Jindal
e7b06a09e7 cpuidle: Clean up cpuidle_enable_device() error handling a bit
Do not fetch per CPU drv if cpuidle_curr_governor is NULL
to avoid useless per CPU processing.

Signed-off-by: Gaurav Jindal <gauravjindal1104@gmail.com>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-08 23:09:52 +01:00
Ville Syrjälä
ff1656790b ACPI / PM: Fix acpi_pm_notifier_lock vs flush_workqueue() deadlock
acpi_remove_pm_notifier() ends up calling flush_workqueue() while
holding acpi_pm_notifier_lock, and that same lock is taken by
by the work via acpi_pm_notify_handler(). This can deadlock.

To fix the problem let's split the single lock into two: one to
protect the dev->wakeup between the work vs. add/remove, and
another one to handle notifier installation vs. removal.

After commit a1d14934ea "workqueue/lockdep: 'Fix' flush_work()
annotation" I was able to kill the machine (Intel Braswell)
very easily with 'powertop --auto-tune', runtime suspending i915,
and trying to wake it up via the USB keyboard. The cases when
it didn't die are presumably explained by lockdep getting disabled
by something else (cpu hotplug locking issues usually).

Fortunately I still got a lockdep report over netconsole
(trickling in very slowly), even though the machine was
otherwise practically dead:

[  112.179806] ======================================================
[  114.670858] WARNING: possible circular locking dependency detected
[  117.155663] 4.13.0-rc6-bsw-bisect-00169-ga1d14934ea4b  Not tainted
[  119.658101] ------------------------------------------------------
[  121.310242] xhci_hcd 0000:00:14.0: xHCI host not responding to stop endpoint command.
[  121.313294] xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead
[  121.313346] xhci_hcd 0000:00:14.0: HC died; cleaning up
[  121.313485] usb 1-6: USB disconnect, device number 3
[  121.313501] usb 1-6.2: USB disconnect, device number 4
[  134.747383] kworker/0:2/47 is trying to acquire lock:
[  137.220790]  (acpi_pm_notifier_lock){+.+.}, at: [<ffffffff813cafdf>] acpi_pm_notify_handler+0x2f/0x80
[  139.721524]
[  139.721524] but task is already holding lock:
[  144.672922]  ((&dpc->work)){+.+.}, at: [<ffffffff8109ce90>] process_one_work+0x160/0x720
[  147.184450]
[  147.184450] which lock already depends on the new lock.
[  147.184450]
[  154.604711]
[  154.604711] the existing dependency chain (in reverse order) is:
[  159.447888]
[  159.447888] ->  ((&dpc->work)){+.+.}:
[  164.183486]        __lock_acquire+0x1255/0x13f0
[  166.504313]        lock_acquire+0xb5/0x210
[  168.778973]        process_one_work+0x1b9/0x720
[  171.030316]        worker_thread+0x4c/0x440
[  173.257184]        kthread+0x154/0x190
[  175.456143]        ret_from_fork+0x27/0x40
[  177.624348]
[  177.624348] ->  ("kacpi_notify"){+.+.}:
[  181.850351]        __lock_acquire+0x1255/0x13f0
[  183.941695]        lock_acquire+0xb5/0x210
[  186.046115]        flush_workqueue+0xdd/0x510
[  190.408153]        acpi_os_wait_events_complete+0x31/0x40
[  192.625303]        acpi_remove_notify_handler+0x133/0x188
[  194.820829]        acpi_remove_pm_notifier+0x56/0x90
[  196.989068]        acpi_dev_pm_detach+0x5f/0xa0
[  199.145866]        dev_pm_domain_detach+0x27/0x30
[  201.285614]        i2c_device_probe+0x100/0x210
[  203.411118]        driver_probe_device+0x23e/0x310
[  205.522425]        __driver_attach+0xa3/0xb0
[  207.634268]        bus_for_each_dev+0x69/0xa0
[  209.714797]        driver_attach+0x1e/0x20
[  211.778258]        bus_add_driver+0x1bc/0x230
[  213.837162]        driver_register+0x60/0xe0
[  215.868162]        i2c_register_driver+0x42/0x70
[  217.869551]        0xffffffffa0172017
[  219.863009]        do_one_initcall+0x45/0x170
[  221.843863]        do_init_module+0x5f/0x204
[  223.817915]        load_module+0x225b/0x29b0
[  225.757234]        SyS_finit_module+0xc6/0xd0
[  227.661851]        do_syscall_64+0x5c/0x120
[  229.536819]        return_from_SYSCALL_64+0x0/0x7a
[  231.392444]
[  231.392444] ->  (acpi_pm_notifier_lock){+.+.}:
[  235.124914]        check_prev_add+0x44e/0x8a0
[  237.024795]        __lock_acquire+0x1255/0x13f0
[  238.937351]        lock_acquire+0xb5/0x210
[  240.840799]        __mutex_lock+0x75/0x940
[  242.709517]        mutex_lock_nested+0x1c/0x20
[  244.551478]        acpi_pm_notify_handler+0x2f/0x80
[  246.382052]        acpi_ev_notify_dispatch+0x44/0x5c
[  248.194412]        acpi_os_execute_deferred+0x14/0x30
[  250.003925]        process_one_work+0x1ec/0x720
[  251.803191]        worker_thread+0x4c/0x440
[  253.605307]        kthread+0x154/0x190
[  255.387498]        ret_from_fork+0x27/0x40
[  257.153175]
[  257.153175] other info that might help us debug this:
[  257.153175]
[  262.324392] Chain exists of:
[  262.324392]   acpi_pm_notifier_lock --> "kacpi_notify" --> (&dpc->work)
[  262.324392]
[  267.391997]  Possible unsafe locking scenario:
[  267.391997]
[  270.758262]        CPU0                    CPU1
[  272.431713]        ----                    ----
[  274.060756]   lock((&dpc->work));
[  275.646532]                                lock("kacpi_notify");
[  277.260772]                                lock((&dpc->work));
[  278.839146]   lock(acpi_pm_notifier_lock);
[  280.391902]
[  280.391902]  *** DEADLOCK ***
[  280.391902]
[  284.986385] 2 locks held by kworker/0:2/47:
[  286.524895]  :  ("kacpi_notify"){+.+.}, at: [<ffffffff8109ce90>] process_one_work+0x160/0x720
[  288.112927]  :  ((&dpc->work)){+.+.}, at: [<ffffffff8109ce90>] process_one_work+0x160/0x720
[  289.727725]

Fixes: c072530f39 (ACPI / PM: Revork the handling of ACPI device wakeup notifications)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: 3.17+ <stable@vger.kernel.org> # 3.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-08 23:02:43 +01:00
Jiri Kosina
87df26175e x86/mm: Unbreak modules that rely on external PAGE_KERNEL availability
Commit 7744ccdbc1 ("x86/mm: Add Secure Memory Encryption (SME)
support") as a side-effect made PAGE_KERNEL all of a sudden unavailable
to modules which can't make use of EXPORT_SYMBOL_GPL() symbols.

This is because once SME is enabled, sme_me_mask (which is introduced as
EXPORT_SYMBOL_GPL) makes its way to PAGE_KERNEL through _PAGE_ENC,
causing imminent build failure for all the modules which make use of all
the EXPORT-SYMBOL()-exported API (such as vmap(), __vmalloc(),
remap_pfn_range(), ...).

Exporting (as EXPORT_SYMBOL()) interfaces (and having done so for ages)
that take pgprot_t argument, while making it impossible to -- all of a
sudden -- pass PAGE_KERNEL to it, feels rather incosistent.

Restore the original behavior and make it possible to pass PAGE_KERNEL
to all its EXPORT_SYMBOL() consumers.

[ This is all so not wonderful. We shouldn't need that "sme_me_mask"
  access at all in all those places that really don't care about that
  level of detail, and just want _PAGE_KERNEL or whatever.

  We have some similar issues with _PAGE_CACHE_WP and _PAGE_NOCACHE,
  both of which hide a "cachemode2protval()" call, and which also ends
  up using another EXPORT_SYMBOL(), but at least that only triggers for
  the much more rare cases.

  Maybe we could move these dynamic page table bits to be generated much
  deeper down in the VM layer, instead of hiding them in the macros that
  everybody uses.

  So this all would merit some cleanup. But not today.   - Linus ]

Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Despised-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-08 13:52:36 -08:00
Heiko Carstens
ead7a22e9b s390: avoid undefined behaviour
At a couple of places smatch emits warnings like this:

    arch/s390/mm/vmem.c:409 vmem_map_init() warn:
        right shifting more than type allows

In fact shifting a signed type right is undefined. Avoid this and add
an unsigned long cast. The shifted values are always positive.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2017-11-08 22:11:20 +01:00
Heiko Carstens
8bc1e4ec79 s390/disassembler: generate opcode tables from text file
The current way of adding new instructions to the opcode tables is
painful and error prone. Therefore add, similar to binutils, a text
file which contains all opcodes and the corresponding mnemonics and
instruction formats.

A small gen_opcode_table tool then generates a header file with the
required enums and opcode table initializers at the prepare step of
the kernel build.

This way only a simple text file has to be maintained, which can be
rather easily extended.

Unlike before where there were plenty of opcode tables and a large
switch statement to find the correct opcode table, there is now only
one opcode table left which contains all instructions. A second opcode
offset table now contains offsets within the opcode table to find
instructions which have the same opcode prefix. In order to save space
all 1-byte opcode instructions are grouped together at the end of the
opcode table. This is also quite similar to like it was before.

In addition also move and change code and definitions within the
disassembler. As a side effect this reduces the size required for the
code and opcode tables by ~1.5k.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2017-11-08 22:11:02 +01:00
Heiko Carstens
dac6dc267d s390/disassembler: remove insn_to_mnemonic()
insn_to_mnemonic() was introduced ages ago for KVM debugging, but is
unused in the meantime. Therefore remove it.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2017-11-08 22:10:49 +01:00
Arnd Bergmann
399c5acd58 s390/dasd: avoid calling do_gettimeofday()
do_gettimeofday() is deprecated because it's not y2038-safe on
32-bit architectures. Since it is basically a wrapper around
ktime_get_real_ts64(), we can just call that function directly
instead.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[sth@linux.vnet.ibm.com: fix build]
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2017-11-08 22:10:29 +01:00
Linus Torvalds
c0f3ea1589 stop using '%pK' for /proc/kallsyms pointer values
Not only is it annoying to have one single flag for all pointers, as if
that was a global choice and all kernel pointers are the same, but %pK
can't get the 'access' vs 'open' time check right anyway.

So make the /proc/kallsyms pointer value code use logic specific to that
particular file.  We do continue to honor kptr_restrict, but the default
(which is unrestricted) is changed to instead take expected users into
account, and restrict access by default.

Right now the only actual expected user is kernel profiling, which has a
separate sysctl flag for kernel profile access.  There may be others.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-08 12:51:04 -08:00
Thiago Jung Bauermann
e5729f86a2 ima: Remove redundant conditional operator
A non-zero value is converted to 1 when assigned to a bool variable, so the
conditional operator in is_ima_appraise_enabled is redundant.

The value of a comparison operator is either 1 or 0 so the conditional
operator in ima_inode_setxattr is redundant as well.

Confirmed that the patch is correct by comparing the object file from
before and after the patch. They are identical.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2017-11-08 15:16:36 -05:00
Thomas Meyer
39adb92598 ima: Fix bool initialization/comparison
Bool initializations should use true and false. Bool tests don't need
comparisons.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2017-11-08 15:16:36 -05:00
Bruno E. O. Meneguele
7c9bc0983f ima: check signature enforcement against cmdline param instead of CONFIG
When the user requests MODULE_CHECK policy and its kernel is compiled
with CONFIG_MODULE_SIG_FORCE not set, all modules would not load, just
those loaded in initram time. One option the user would have would be
set a kernel cmdline param (module.sig_enforce) to true, but the IMA
module check code doesn't rely on this value, it checks just
CONFIG_MODULE_SIG_FORCE.

This patch solves this problem checking for the exported value of
module.sig_enforce cmdline param intead of CONFIG_MODULE_SIG_FORCE,
which holds the effective value (CONFIG || param).

Signed-off-by: Bruno E. O. Meneguele <brdeoliv@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2017-11-08 15:16:36 -05:00
Bruno E. O. Meneguele
fda784e50a module: export module signature enforcement status
A static variable sig_enforce is used as status var to indicate the real
value of CONFIG_MODULE_SIG_FORCE, once this one is set the var will hold
true, but if the CONFIG is not set the status var will hold whatever
value is present in the module.sig_enforce kernel cmdline param: true
when =1 and false when =0 or not present.

Considering this cmdline param take place over the CONFIG value when
it's not set, other places in the kernel could misbehave since they
would have only the CONFIG_MODULE_SIG_FORCE value to rely on. Exporting
this status var allows the kernel to rely in the effective value of
module signature enforcement, being it from CONFIG value or cmdline
param.

Signed-off-by: Bruno E. O. Meneguele <brdeoliv@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2017-11-08 15:16:36 -05:00
Boshi Wang
ebe7c0a7be ima: fix hash algorithm initialization
The hash_setup function always sets the hash_setup_done flag, even
when the hash algorithm is invalid.  This prevents the default hash
algorithm defined as CONFIG_IMA_DEFAULT_HASH from being used.

This patch sets hash_setup_done flag only for valid hash algorithms.

Fixes: e7a2ad7eb6 "ima: enable support for larger default filedata hash
	algorithms"
Signed-off-by: Boshi Wang <wangboshi@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2017-11-08 15:16:36 -05:00
Matthew Garrett
0485d066d8 EVM: Only complain about a missing HMAC key once
A system can validate EVM digital signatures without requiring an HMAC
key, but every EVM validation will generate a kernel error. Change this
so we only generate an error once.

Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2017-11-08 15:16:36 -05:00
Matthew Garrett
f00d797507 EVM: Allow userspace to signal an RSA key has been loaded
EVM will only perform validation once a key has been loaded. This key
may either be a symmetric trusted key (for HMAC validation and creation)
or the public half of an asymmetric key (for digital signature
validation). The /sys/kernel/security/evm interface allows userland to
signal that a symmetric key has been loaded, but does not allow userland
to signal that an asymmetric public key has been loaded.

This patch extends the interface to permit userspace to pass a bitmask
of loaded key types. It also allows userspace to block loading of a
symmetric key in order to avoid a compromised system from being able to
load an additional key type later.

Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2017-11-08 15:16:36 -05:00
Matthew Garrett
096b854648 EVM: Include security.apparmor in EVM measurements
Apparmor will be gaining support for security.apparmor labels, and it
would be helpful to include these in EVM validation now so appropriate
signatures can be generated even before full support is merged.

Signed-off-by: Matthew Garrett <mjg59@google.com>
Acked-by: John Johansen <John.johansen@canonical.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2017-11-08 15:16:36 -05:00
Mimi Zohar
bb02b186d0 ima: call ima_file_free() prior to calling fasync
The file hash is calculated and written out as an xattr after
calling fasync().  In order for the file data and metadata to be
written out to disk at the same time, this patch calculates the
file hash and stores it as an xattr before calling fasync.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2017-11-08 15:16:36 -05:00
Christoph Hellwig
a7d3d0392a integrity: use kernel_read_file_from_path() to read x509 certs
The CONFIG_IMA_LOAD_X509 and CONFIG_EVM_LOAD_X509 options permit
loading x509 signed certificates onto the trusted keyrings without
verifying the x509 certificate file's signature.

This patch replaces the call to the integrity_read_file() specific
function with the common kernel_read_file_from_path() function.
To avoid verifying the file signature, this patch defines
READING_X509_CERTFICATE.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2017-11-08 15:16:36 -05:00