linux

Author	SHA1	Message	Date
Benjamin Romer	30de72dbb5	staging: unisys: remove MEMSET define Remove the redundant MEMSET define in commontypes.h and fix everyplace that uses it. Signed-off-by: Benjamin Romer <benjamin.romer@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-08-01 14:38:45 -07:00
Benjamin Romer	90f3b509bf	staging: unisys: remove UINTN type This patch removes UINTN from commontypes.h, using u64 in the one spot this type was used. Signed-off-by: Benjamin Romer <benjamin.romer@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-08-01 14:38:45 -07:00
Benjamin Romer	87b7307ae1	staging: unisys: remove unused defines from commontypes.h Delete #defines that aren't used anywhere. Signed-off-by: Benjamin Romer <benjamin.romer@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-08-01 14:38:45 -07:00
Benjamin Romer	f6f7005abe	staging: unisys: remove non-kernel code from commontypes.h This patch deletes everything in common types that was in the else section of a #ifdef __KERNEL__ block. Signed-off-by: Benjamin Romer <benjamin.romer@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-08-01 14:38:45 -07:00
Benjamin Romer	ec03a7db32	staging: unisys: remove S64 type This patch switches all use of the S64 typedef to use the kernel's s64 type instead. Signed-off-by: Benjamin Romer <benjamin.romer@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-08-01 14:38:45 -07:00
Benjamin Romer	c14f13ba30	staging: unisys: remove S32 type Delete the S32 type from commontypes.h since it wasn't being used. Signed-off-by: Benjamin Romer <benjamin.romer@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-08-01 14:38:45 -07:00
Benjamin Romer	15c564ecf2	staging: unisys: remove S16 type This patch switches all use of the S16 typedef to use the kernel's s16 type instead. Signed-off-by: Benjamin Romer <benjamin.romer@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-08-01 14:38:44 -07:00
Benjamin Romer	d717723439	staging: unisys: remove S8 type Delete the S8 type from commontypes.h since it wasn't being used. Signed-off-by: Benjamin Romer <benjamin.romer@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-08-01 14:38:44 -07:00
Benjamin Romer	5fc0229ae5	staging: unisys: remove U64 type This patch switches all use of the U64 typedef to use the kernel's u64 type instead. Signed-off-by: Benjamin Romer <benjamin.romer@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-08-01 14:38:44 -07:00
Benjamin Romer	b3c55b13a1	staging: unisys: remove U32 type This patch switches all use of the U32 typedef to use the kernel's u32 type instead. Signed-off-by: Benjamin Romer <benjamin.romer@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-08-01 14:38:44 -07:00
Benjamin Romer	b06bdf7d5b	staging: unisys: remove U16 type This patch switches all use of the U16 typedef to use the kernel's u16 type instead. Signed-off-by: Benjamin Romer <benjamin.romer@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-08-01 14:38:44 -07:00
Linus Torvalds	818be5894e	Fix dm bufio shrinker to properly zero-fill all fields. Fix race in dm cache that caused improper reporting of the number of dirty blocks in the cache. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJT28K2AAoJEMUj8QotnQNau30IAOAiW683+TnSpq4MXRywUNKl gaLtC65Q04KesZ2FtZYaYUOdfF8nsmuyBiSlHl8RLzGWCYJcQQAMdXVEgCZ+qXKX OhD85zhySXUamvKfq2wX452kAK6O2eR//Azc3d57uWhGboXTZrqTDc4QLRDJQoAF 8b4g4r/NCV6fAtHtEfB9JtLyrk1kOUGpHdF2rSFy20IUKs1RPZRZzNYEh5KB9ZWI DeoZY6GrqR1bLlOAL3Usd43fYGdgv+Mn1HaR+Xgn7LVl2HpRIts4M8Y54F0xko4F T26rJMZlMppolgZBXElrDm8ly6bDPglfU7ymJMleiPEdCLhpLX/jqoPEtgsqGI8= =pQe5 -----END PGP SIGNATURE----- Merge tag 'dm-3.16-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: "Fix dm bufio shrinker to properly zero-fill all fields. Fix race in dm cache that caused improper reporting of the number of dirty blocks in the cache" * tag 'dm-3.16-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm cache: fix race affecting dirty block count dm bufio: fully initialize shrinker	2014-08-01 12:50:05 -07:00
Linus Torvalds	9642a1041e	ARM: Straggler SoC fix for 3.16 A DT bugfix for Nomadik that had an ambiguous double-inversion of a gpio line, and one MAINTAINER URL update that might as well go in now. We could hold off until the merge window, but then we'll just have to mark the DT fix for stable and it just seems like in total causing more work. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQIcBAABAgAGBQJT2dyEAAoJEIwa5zzehBx3rZMQAKwoGghFh2t/4USXOWEDT1ns B0cPMvg3wjlOxnJsC4Nvi9BclIBBlUImHoHYG8Un7/aJXkuRpOC36Zg2wWDJUkVC NZOwHAj/wBwPTKQnHKSQ2Nyw/h1t1FidlWt07GcFxmYvl00pfE707MHKiRTd/pUY T4/VELaUXmcJ2/9WqscNLi4TzuQBd5eBb1n48GRzVsTfjJo7jXExCsuKWNOpbPar eDqZjw7KAU73T8d0hJmZtJuI8iZ0I3mMYFrA7Sp1srXEpw/ZKfvn6+SBH3CzZdxz oDXZYYBl4wYuqqW7I42esZkcsyGmCE58KS0tYWZgvQUJzrHeyV6myuEn2qt2ROzQ Ii5huDBSbxNOk2D+v5JevBG8SuDFMS6Op1Gj9fUB7F/+P+EDzccwSkejPZYAnHDd JHWiks6CU4xRzlSrPb4AtQ3UkX0GO1QPlhhT/TQD0wOAn9XMkC9hfYBBn+pnusXm l8F+DSGx2UUziwF2f4tTGcpKeiTN71g8xizHFs4LHhctnVeb/G68a5s1Cxsfq7Bd tW3Y/b1sy5q/TllKTL6qTSqTS+GPldV/w4IwBm4tJs3Q5lMoJErvFQqTEPpIwcRP xDVBT0BpEStvxMNYVhIEE4C50k02zLhqS2CH9pqzHVBFYhhNlp2g23QDsbU2yWgG sPg0G+Ts8yxeS1DZotnq =sz9W -----END PGP SIGNATURE----- Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM straggler SoC fix from Olof Johansson: "A DT bugfix for Nomadik that had an ambiguous double-inversion of a gpio line, and one MAINTAINER URL update that might as well go in now. We could hold off until the merge window, but then we'll just have to mark the DT fix for stable and it just seems like in total causing more work" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: MAINTAINERS: Update Tegra Git URL ARM: nomadik: fix up double inversion in DT	2014-08-01 12:49:02 -07:00
Russell King	c70fbb01b1	Two different fixes for the same problem making some ARM nommu configurations not boot since 3.6-rc1. The problem is that user_addr_max returned the biggest available RAM address which makes some copy_from_user variants fail to read from XIP memory. Even in the presence of one of the two fixes the other still makes sense, so both patches are included here. This problem was the last one preventing efm32 boot to a prompt with mainline. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iEYEABECAAYFAlOyfhsACgkQ6suMTIUe0hZWggCePaoe/S+aDki9B2ASCY0zVkRq XE8AoM5G4yRgnL3zitI2ftvvlp4xx1mS =4Vjn -----END PGP SIGNATURE----- Merge tag 'nommu-for-rmk' of git://git.pengutronix.de/git/ukl/linux into devel-stable Two different fixes for the same problem making some ARM nommu configurations not boot since 3.6-rc1. The problem is that user_addr_max returned the biggest available RAM address which makes some copy_from_user variants fail to read from XIP memory. Even in the presence of one of the two fixes the other still makes sense, so both patches are included here. This problem was the last one preventing efm32 boot to a prompt with mainline.	2014-08-01 19:54:26 +01:00
Dan Carpenter	6ba19bf066	clk: checking wrong variable in __set_clk_parents() There is a cut and paste bug so we check "pclk" instead of "clk". Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mike Turquette <mturquette@linaro.org>	2014-08-01 10:47:26 -07:00
Anssi Hannula	44fa816bb7	dm cache: fix race affecting dirty block count nr_dirty is updated without locking, causing it to drift so that it is non-zero (either a small positive integer, or a very large one when an underflow occurs) even when there are no actual dirty blocks. This was due to a race between the workqueue and map function accessing nr_dirty in parallel without proper protection. People were seeing under runs due to a race on increment/decrement of nr_dirty, see: https://lkml.org/lkml/2014/6/3/648 Fix this by using an atomic_t for nr_dirty. Reported-by: roma1390@gmail.com Signed-off-by: Anssi Hannula <anssi.hannula@iki.fi> Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org	2014-08-01 12:25:22 -04:00
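The race described in the entry above can be illustrated outside the kernel. The following is a minimal userspace sketch, assuming C11 `<stdatomic.h>` as a stand-in for the kernel's `atomic_t`; the `cache_stats` structure and the helper names are hypothetical and are not taken from the dm cache source.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the cache's dirty-block counter. */
struct cache_stats {
	atomic_long nr_dirty;
};

void stats_init(struct cache_stats *s)
{
	atomic_init(&s->nr_dirty, 0);
}

/* A block became dirty: one indivisible read-modify-write. */
void mark_dirty(struct cache_stats *s)
{
	atomic_fetch_add(&s->nr_dirty, 1);
}

/* A block was cleaned: atomic_fetch_sub returns the previous value. */
void mark_clean(struct cache_stats *s)
{
	long prev = atomic_fetch_sub(&s->nr_dirty, 1);

	/* Catch the underflow that a lost update would otherwise hide. */
	assert(prev > 0);
}

long dirty_count(struct cache_stats *s)
{
	return atomic_load(&s->nr_dirty);
}
```

With a plain `long` and `nr_dirty++` / `nr_dirty--`, two threads can interleave their load and store steps and lose an update, which is how the counter can drift away from zero. The atomic read-modify-write operations rule that interleaving out.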
Greg Thelen	d8c712ea47	dm bufio: fully initialize shrinker `1d3d4437ea` ("vmscan: per-node deferred work") added a flags field to struct shrinker assuming that all shrinkers were zero filled. The dm bufio shrinker is not zero filled, which leaves arbitrary kmalloc() data in flags. So far the only defined flags bit is SHRINKER_NUMA_AWARE. But there are proposed patches which add other bits to shrinker.flags (e.g. memcg awareness). Rather than simply initializing the shrinker, this patch uses kzalloc() when allocating the dm_bufio_client to ensure that the embedded shrinker and any other similar structures are zeroed. This fixes theoretical over aggressive shrinking of dm bufio objects. If the uninitialized dm_bufio_client.shrinker.flags contains SHRINKER_NUMA_AWARE then shrink_slab() would call the dm shrinker for each numa node rather than just once. This has been broken since 3.12. Signed-off-by: Greg Thelen <gthelen@google.com> Acked-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org # v3.12+	2014-08-01 12:07:21 -04:00
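The zero-fill point in the entry above can be sketched in userspace, with `calloc()` standing in for `kzalloc()`. All names below (`demo_client`, `demo_shrinker`, `DEMO_NUMA_AWARE`) are hypothetical illustrations rather than the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_NUMA_AWARE 0x1UL	/* hypothetical flag bit */

struct demo_shrinker {
	unsigned long flags;	/* field added later; must default to 0 */
};

struct demo_client {
	int nr_buffers;
	struct demo_shrinker shrinker;	/* embedded, like the dm bufio shrinker */
};

/*
 * calloc() zero-fills the whole object, so embedded structures start
 * out with every field cleared, including fields added after this
 * code was written. malloc() would leave flags holding arbitrary data.
 */
struct demo_client *demo_client_create(void)
{
	return calloc(1, sizeof(struct demo_client));
}

int demo_is_numa_aware(const struct demo_client *c)
{
	assert(c != NULL);
	return (c->shrinker.flags & DEMO_NUMA_AWARE) != 0;
}
```

Zeroing the containing allocation, rather than assigning `flags = 0` by hand, also covers any other embedded structure that later gains a field. That is the reasoning the commit message gives for choosing `kzalloc()`.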
Thierry Reding	c4121c650e	ata: libahci: Silence compiler warning on 64-bit Commit `725c7b570f` (ata: libahci_platform: move port_map parameters into the AHCI structure) moves flags into the struct ahci_host_priv's .flags field, which causes compiler warnings on 64-bit builds when that value is cast to a void * pointer. Cast to an unsigned long so that the subsequent cast to a pointer doesn't produce a warning. Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2014-08-01 11:18:53 -04:00
Stephan Mueller	ce5481d01f	crypto: drbg - fix failure of generating multiple of 2**16 bytes The function drbg_generate_long slices the request into 2**16 byte or smaller chunks. However, the loop invokes the random number generation function with zero bytes when the request size is a multiple of 2**16 bytes. The fix prevents zero-byte requests. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:36:14 +08:00
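The slicing behaviour described in the entry above can be sketched as a plain C loop. This is an illustration under stated assumptions, not the drbg implementation: `generate_long`, `gen_fn`, `slice_record` and `record_slice` are hypothetical names, and the callback only records what it was asked to produce.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SLICE ((size_t)1 << 16)	/* 2**16 bytes per request */

/* Hypothetical per-slice generator callback. */
typedef int (*gen_fn)(unsigned char *buf, size_t len, void *ctx);

/* Split a long request into slices of at most MAX_SLICE bytes. */
int generate_long(unsigned char *buf, size_t buflen, gen_fn gen, void *ctx)
{
	size_t done = 0;

	/* Testing "done < buflen" first means no zero-length slice is issued. */
	while (done < buflen) {
		size_t chunk = buflen - done;
		int err;

		if (chunk > MAX_SLICE)
			chunk = MAX_SLICE;
		assert(chunk > 0);

		err = gen(buf + done, chunk, ctx);
		if (err)
			return err;
		done += chunk;
	}
	return 0;
}

/* Bookkeeping used by the recording callback below. */
struct slice_record {
	size_t calls;
	size_t bytes;
};

/* Fills the slice with a marker byte and records the request size. */
int record_slice(unsigned char *buf, size_t len, void *ctx)
{
	struct slice_record *rec = ctx;

	rec->calls++;
	rec->bytes += len;
	memset(buf, 0xAB, len);
	return 0;
}
```

For a request of exactly 2 * 2**16 bytes, this loop makes two calls and stops. A loop that checks its exit condition only after generating would make a third call with a length of zero, which is the failure the commit message describes.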
Tom Lendacky	6391723293	crypto: ccp - Do not sign extend input data to CCP The CCP hardware interprets all numbers as unsigned numbers, therefore sign extending input data is not valid. Modify the function calls for RSA and ECC to not perform sign extending. This patch is based on the cryptodev-2.6 kernel tree. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:36:13 +08:00
Jarod Wilson	2fc0d258bc	crypto: testmgr - add missing spaces to drbg error strings There are a few missing spaces in the error text strings for drbg_cavs_test, trivial fix. CC: "David S. Miller" <davem@davemloft.net> CC: linux-crypto@vger.kernel.org Signed-off-by: Jarod Wilson <jarod@redhat.com> Acked-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:36:13 +08:00
Pramod Gurav	c659d07f11	crypto: atmel-tdes - Switch to managed version of kzalloc This patch switches data allocation from kzalloc to devm_kzalloc. It also removes some kfree() on data that was earlier allocated using devm_kzalloc() from probe as well as remove functions. CC: Herbert Xu <herbert@gondor.apana.org.au> CC: "David S. Miller" <davem@davemloft.net> CC: Grant Likely <grant.likely@linaro.org> CC: Rob Herring <robh+dt@kernel.org> Signed-off-by: Pramod Gurav <pramod.gurav@smartplayin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:36:12 +08:00
Pramod Gurav	593901aa04	crypto: atmel-sha - Switch to managed version of kzalloc This patch switches data allocation from kzalloc to devm_kzalloc. It also removed some kfree() on data that was earlier allocated using devm_kzalloc(). CC: Herbert Xu <herbert@gondor.apana.org.au> CC: "David S. Miller" <davem@davemloft.net> CC: Grant Likely <grant.likely@linaro.org> CC: Rob Herring <robh+dt@kernel.org> CC: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Pramod Gurav <pramod.gurav@smartplayin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:36:12 +08:00
Ard Biesheuvel	3b9b8fe0ad	crypto: testmgr - use chunks smaller than algo block size in chunk tests This patch updates many of the chunked tcrypt test cases so that not all of the chunks are an exact multiple of the block size. This should help uncover cases where the residue passed to blkcipher_walk_done() is incorrect. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:36:11 +08:00
Tadeusz Struk	4f74c3989b	crypto: qat - Fixed SKU1 dev issue Fix for issue with SKU1 device. SKU1 device has 8 micro engines as opposed to 12 in other SKUs so it was not possible to start the non-existing micro engines. Signed-off-by: Bo Cui <bo.cui@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:36:10 +08:00
Tadeusz Struk	d9a44abf3a	crypto: qat - Use hweight for bit counting Use predefined hweight32 function instead of writing a new one. Signed-off-by: Pingchao Yang <pingchao.yang@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:36:08 +08:00
Tadeusz Struk	689917211c	crypto: qat - Updated print outputs Updated pr_err output to make it more consistent. Signed-off-by: Pingchao Yang <pingchao.yang@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:36:08 +08:00
Tadeusz Struk	9a147cb323	crypto: qat - change ae_num to ae_id Change the logic of how acceleration engines are indexed to make it easier to read. Also update some return code values to better reflect what failed. Signed-off-by: Pingchao Yang <pingchao.yang@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:36:06 +08:00
Tadeusz Struk	8c1f8e3bbf	crypto: qat - change slice->regions to slice->region Change ptr name slice->regions to slice->region to reflect the same in the page struct. Signed-off-by: Pingchao Yang <pingchao.yang@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:36:06 +08:00
Tadeusz Struk	df0088f507	crypto: qat - use min_t macro prefer min_t() macro over two open-coded logical tests Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:36:04 +08:00
Tadeusz Struk	45cff26080	crypto: qat - remove unnecessary parentheses Resolve new strict checkpatch hits CHECK:UNNECESSARY_PARENTHESES: Unnecessary parentheses around ... Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:36:03 +08:00
Tadeusz Struk	a7d217617b	crypto: qat - remove unneeded header Remove include of a no longer necessary header file. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:36:02 +08:00
Tadeusz Struk	53275baa03	crypto: qat - checkpatch blank lines Fix new checkpatch hits: CHECK:LINE_SPACING: Please use a blank line after function/struct/union/enum declarations Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:36:01 +08:00
Tadeusz Struk	341b2a3541	crypto: qat - remove unnecessary return codes Remove unnecessary return code variables and change function types accordingly. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:36:00 +08:00
Mark Rustad	3e3dc25fe7	crypto: Resolve shadow warnings Change formal parameters to not clash with global names to eliminate many W=2 warnings. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-08-01 22:35:55 +08:00
Mark Rutland	ea1719672f	arm64: add newline to I-cache policy string Due to a missing newline in the I-cache policy detection log output, it's possible to get some rather unfortunate output at boot time: CPU1: Booted secondary processor Detected VIPT I-cache on CPU1CPU2: Booted secondary processor Detected VIPT I-cache on CPU2CPU3: Booted secondary processor Detected VIPT I-cache on CPU3CPU4: Booted secondary processor Detected PIPT I-cache on CPU4CPU5: Booted secondary processor Detected PIPT I-cache on CPU5Brought up 6 CPUs SMP: Total of 6 processors activated. This patch adds the missing newline to the format string, cleaning up the output. Fixes: `59ccc0d41b` ("arm64: cachetype: report weakest cache policy") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2014-08-01 14:00:06 +01:00
Jan Kara	504d58745c	timer: Fix lock inversion between hrtimer_bases.lock and scheduler locks clockevents_increase_min_delta() calls printk() from under hrtimer_bases.lock. That causes lock inversion on scheduler locks because printk() can call into the scheduler. Lockdep puts it as: ====================================================== [ INFO: possible circular locking dependency detected ] 3.15.0-rc8-06195-g939f04b #2 Not tainted ------------------------------------------------------- trinity-main/74 is trying to acquire lock: (&port_lock_key){-.....}, at: [<811c60be>] serial8250_console_write+0x8c/0x10c but task is already holding lock: (hrtimer_bases.lock){-.-...}, at: [<8103caeb>] hrtimer_try_to_cancel+0x13/0x66 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #5 (hrtimer_bases.lock){-.-...}: [<8104a942>] lock_acquire+0x92/0x101 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<8103c918>] __hrtimer_start_range_ns+0x1c/0x197 [<8107ec20>] perf_swevent_start_hrtimer.part.41+0x7a/0x85 [<81080792>] task_clock_event_start+0x3a/0x3f [<810807a4>] task_clock_event_add+0xd/0x14 [<8108259a>] event_sched_in+0xb6/0x17a [<810826a2>] group_sched_in+0x44/0x122 [<81082885>] ctx_sched_in.isra.67+0x105/0x11f [<810828e6>] perf_event_sched_in.isra.70+0x47/0x4b [<81082bf6>] __perf_install_in_context+0x8b/0xa3 [<8107eb8e>] remote_function+0x12/0x2a [<8105f5af>] smp_call_function_single+0x2d/0x53 [<8107e17d>] task_function_call+0x30/0x36 [<8107fb82>] perf_install_in_context+0x87/0xbb [<810852c9>] SYSC_perf_event_open+0x5c6/0x701 [<810856f9>] SyS_perf_event_open+0x17/0x19 [<8142f8ee>] syscall_call+0x7/0xb -> #4 (&ctx->lock){......}: [<8104a942>] lock_acquire+0x92/0x101 [<8142f04c>] _raw_spin_lock+0x21/0x30 [<81081df3>] __perf_event_task_sched_out+0x1dc/0x34f [<8142cacc>] __schedule+0x4c6/0x4cb [<8142cae0>] schedule+0xf/0x11 [<8142f9a6>] work_resched+0x5/0x30 -> #3 (&rq->lock){-.-.-.}: [<8104a942>] lock_acquire+0x92/0x101 [<8142f04c>] 
_raw_spin_lock+0x21/0x30 [<81040873>] __task_rq_lock+0x33/0x3a [<8104184c>] wake_up_new_task+0x25/0xc2 [<8102474b>] do_fork+0x15c/0x2a0 [<810248a9>] kernel_thread+0x1a/0x1f [<814232a2>] rest_init+0x1a/0x10e [<817af949>] start_kernel+0x303/0x308 [<817af2ab>] i386_start_kernel+0x79/0x7d -> #2 (&p->pi_lock){-.-...}: [<8104a942>] lock_acquire+0x92/0x101 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<810413dd>] try_to_wake_up+0x1d/0xd6 [<810414cd>] default_wake_function+0xb/0xd [<810461f3>] __wake_up_common+0x39/0x59 [<81046346>] __wake_up+0x29/0x3b [<811b8733>] tty_wakeup+0x49/0x51 [<811c3568>] uart_write_wakeup+0x17/0x19 [<811c5dc1>] serial8250_tx_chars+0xbc/0xfb [<811c5f28>] serial8250_handle_irq+0x54/0x6a [<811c5f57>] serial8250_default_handle_irq+0x19/0x1c [<811c56d8>] serial8250_interrupt+0x38/0x9e [<810510e7>] handle_irq_event_percpu+0x5f/0x1e2 [<81051296>] handle_irq_event+0x2c/0x43 [<81052cee>] handle_level_irq+0x57/0x80 [<81002a72>] handle_irq+0x46/0x5c [<810027df>] do_IRQ+0x32/0x89 [<8143036e>] common_interrupt+0x2e/0x33 [<8142f23c>] _raw_spin_unlock_irqrestore+0x3f/0x49 [<811c25a4>] uart_start+0x2d/0x32 [<811c2c04>] uart_write+0xc7/0xd6 [<811bc6f6>] n_tty_write+0xb8/0x35e [<811b9beb>] tty_write+0x163/0x1e4 [<811b9cd9>] redirected_tty_write+0x6d/0x75 [<810b6ed6>] vfs_write+0x75/0xb0 [<810b7265>] SyS_write+0x44/0x77 [<8142f8ee>] syscall_call+0x7/0xb -> #1 (&tty->write_wait){-.....}: [<8104a942>] lock_acquire+0x92/0x101 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<81046332>] __wake_up+0x15/0x3b [<811b8733>] tty_wakeup+0x49/0x51 [<811c3568>] uart_write_wakeup+0x17/0x19 [<811c5dc1>] serial8250_tx_chars+0xbc/0xfb [<811c5f28>] serial8250_handle_irq+0x54/0x6a [<811c5f57>] serial8250_default_handle_irq+0x19/0x1c [<811c56d8>] serial8250_interrupt+0x38/0x9e [<810510e7>] handle_irq_event_percpu+0x5f/0x1e2 [<81051296>] handle_irq_event+0x2c/0x43 [<81052cee>] handle_level_irq+0x57/0x80 [<81002a72>] handle_irq+0x46/0x5c [<810027df>] do_IRQ+0x32/0x89 [<8143036e>] 
common_interrupt+0x2e/0x33 [<8142f23c>] _raw_spin_unlock_irqrestore+0x3f/0x49 [<811c25a4>] uart_start+0x2d/0x32 [<811c2c04>] uart_write+0xc7/0xd6 [<811bc6f6>] n_tty_write+0xb8/0x35e [<811b9beb>] tty_write+0x163/0x1e4 [<811b9cd9>] redirected_tty_write+0x6d/0x75 [<810b6ed6>] vfs_write+0x75/0xb0 [<810b7265>] SyS_write+0x44/0x77 [<8142f8ee>] syscall_call+0x7/0xb -> #0 (&port_lock_key){-.....}: [<8104a62d>] __lock_acquire+0x9ea/0xc6d [<8104a942>] lock_acquire+0x92/0x101 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<811c60be>] serial8250_console_write+0x8c/0x10c [<8104e402>] call_console_drivers.constprop.31+0x87/0x118 [<8104f5d5>] console_unlock+0x1d7/0x398 [<8104fb70>] vprintk_emit+0x3da/0x3e4 [<81425f76>] printk+0x17/0x19 [<8105bfa0>] clockevents_program_min_delta+0x104/0x116 [<8105c548>] clockevents_program_event+0xe7/0xf3 [<8105cc1c>] tick_program_event+0x1e/0x23 [<8103c43c>] hrtimer_force_reprogram+0x88/0x8f [<8103c49e>] __remove_hrtimer+0x5b/0x79 [<8103cb21>] hrtimer_try_to_cancel+0x49/0x66 [<8103cb4b>] hrtimer_cancel+0xd/0x18 [<8107f102>] perf_swevent_cancel_hrtimer.part.60+0x2b/0x30 [<81080705>] task_clock_event_stop+0x20/0x64 [<81080756>] task_clock_event_del+0xd/0xf [<81081350>] event_sched_out+0xab/0x11e [<810813e0>] group_sched_out+0x1d/0x66 [<81081682>] ctx_sched_out+0xaf/0xbf [<81081e04>] __perf_event_task_sched_out+0x1ed/0x34f [<8142cacc>] __schedule+0x4c6/0x4cb [<8142cae0>] schedule+0xf/0x11 [<8142f9a6>] work_resched+0x5/0x30 other info that might help us debug this: Chain exists of: &port_lock_key --> &ctx->lock --> hrtimer_bases.lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(hrtimer_bases.lock); lock(&ctx->lock); lock(hrtimer_bases.lock); lock(&port_lock_key); * DEADLOCK * 4 locks held by trinity-main/74: #0: (&rq->lock){-.-.-.}, at: [<8142c6f3>] __schedule+0xed/0x4cb #1: (&ctx->lock){......}, at: [<81081df3>] __perf_event_task_sched_out+0x1dc/0x34f #2: (hrtimer_bases.lock){-.-...}, at: [<8103caeb>] hrtimer_try_to_cancel+0x13/0x66 
#3: (console_lock){+.+...}, at: [<8104fb5d>] vprintk_emit+0x3c7/0x3e4 stack backtrace: CPU: 0 PID: 74 Comm: trinity-main Not tainted 3.15.0-rc8-06195-g939f04b #2 00000000 81c3a310 8b995c14 81426f69 8b995c44 81425a99 8161f671 8161f570 8161f538 8161f559 8161f538 8b995c78 8b142bb0 00000004 8b142fdc 8b142bb0 8b995ca8 8104a62d 8b142fac 000016f2 81c3a310 00000001 00000001 00000003 Call Trace: [<81426f69>] dump_stack+0x16/0x18 [<81425a99>] print_circular_bug+0x18f/0x19c [<8104a62d>] __lock_acquire+0x9ea/0xc6d [<8104a942>] lock_acquire+0x92/0x101 [<811c60be>] ? serial8250_console_write+0x8c/0x10c [<811c6032>] ? wait_for_xmitr+0x76/0x76 [<8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e [<811c60be>] ? serial8250_console_write+0x8c/0x10c [<811c60be>] serial8250_console_write+0x8c/0x10c [<8104af87>] ? lock_release+0x191/0x223 [<811c6032>] ? wait_for_xmitr+0x76/0x76 [<8104e402>] call_console_drivers.constprop.31+0x87/0x118 [<8104f5d5>] console_unlock+0x1d7/0x398 [<8104fb70>] vprintk_emit+0x3da/0x3e4 [<81425f76>] printk+0x17/0x19 [<8105bfa0>] clockevents_program_min_delta+0x104/0x116 [<8105cc1c>] tick_program_event+0x1e/0x23 [<8103c43c>] hrtimer_force_reprogram+0x88/0x8f [<8103c49e>] __remove_hrtimer+0x5b/0x79 [<8103cb21>] hrtimer_try_to_cancel+0x49/0x66 [<8103cb4b>] hrtimer_cancel+0xd/0x18 [<8107f102>] perf_swevent_cancel_hrtimer.part.60+0x2b/0x30 [<81080705>] task_clock_event_stop+0x20/0x64 [<81080756>] task_clock_event_del+0xd/0xf [<81081350>] event_sched_out+0xab/0x11e [<810813e0>] group_sched_out+0x1d/0x66 [<81081682>] ctx_sched_out+0xaf/0xbf [<81081e04>] __perf_event_task_sched_out+0x1ed/0x34f [<8104416d>] ? __dequeue_entity+0x23/0x27 [<81044505>] ? pick_next_task_fair+0xb1/0x120 [<8142cacc>] __schedule+0x4c6/0x4cb [<81047574>] ? trace_hardirqs_off_caller+0xd7/0x108 [<810475b0>] ? trace_hardirqs_off+0xb/0xd [<81056346>] ? rcu_irq_exit+0x64/0x77 Fix the problem by using printk_deferred() which does not call into the scheduler. 
Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-08-01 12:54:41 +02:00
Eric Biggers	6d2b6170c8	vfs: fix check for fallocate on active swapfile Fix the broken check for calling sys_fallocate() on an active swapfile, introduced by commit `0790b31b69` ("fs: disallow all fallocate operation on active swapfile"). Signed-off-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-01 02:36:04 -04:00
Christoph Hellwig	af43647277	direct-io: fix AIO regression The direct-io.c rewrite to use the iov_iter infrastructure stopped updating the size field in struct dio_submit, and thus rendered the check for allowing asynchronous completions to always return false. Fix this by comparing it to the count of bytes in the iov_iter instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Tim Chen <tim.c.chen@linux.intel.com> Tested-by: Tim Chen <tim.c.chen@linux.intel.com>	2014-08-01 02:35:51 -04:00
Linus Torvalds	6f0928036b	ACPI fix for 3.16-rc8 One commit that fixes a problem causing PNP devices to be associated with wrong ACPI device objects sometimes during device enumeration due to an incorrect check in a matching function. That problem was uncovered by the ACPI device enumeration rework in 3.14. / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJT2rg/AAoJEILEb/54YlRxY34QAIhrjBbGFr1xzNwQHMqH9mzd djC1E4GpxLIMcL4Sy12sCXX0qDjoFQ63w5kC41Xb7WHidHpN+8UMoCiVtA4mgunI sBoaqMoRjOnCs/MvbrBYgZHfp+JcUV2AjpjeK3Z+WDpNkV+ZMt/5RMLMZMhyhdXN rlebpmOwB4AMflDxkiL/7fEgyd1JZg1j27hJ0hhV0UAJri4M9Gjdts8MHe5tfQAr ByC/aikpkW22h1AmF8j9+hTX1N5BddYQtPDmOGeNCq+78oOUafEdQZZdkI98yMTl uxJBn3Pn5hksWmhU6gLjXepWnFyaoELFLL37sgiyh4ZsivsBUjZyk5Ix1RFDJr82 EGZRS/qghB6OqUNDcNVlOOzF1Zv39Szl1TsIguZaecvqYC7SNu8imFeyJ5c8S512 abJey6YH3WN1/cQ9iE1gGqzkJpvrjoau9Jf4skUpgZraEvosbnOQ0nd9RxeVzADV EgRP7OLlEEb25cUQhPnhNTy8RpugmFswK9qAEJOGCrBTEg0yeRsfUWB07nPvNvn3 LxHPTfkPqu48TG4Y/HEVD49NucJqoPw8RIrIKq5eG2ga6rhP8k3ekPnwezZ8rBx0 T5ya4DRcyqimjwZWCCH3Q8hS6O1vJADPAzhiF3zYJ6dTFqIhj3phq9+ld0cxCCL/ IHpKXn6s2UczPDWPxiyq =a1tD -----END PGP SIGNATURE----- Merge tag 'pm+acpi-3.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "One commit that fixes a problem causing PNP devices to be associated with wrong ACPI device objects sometimes during device enumeration due to an incorrect check in a matching function. That problem was uncovered by the ACPI device enumeration rework in 3.14" * tag 'pm+acpi-3.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / PNP: Fix acpi_pnp_match()	2014-07-31 16:42:10 -07:00
Linus Torvalds	7c909b0937	A single patch to re-enable audio which is broken on all DRA7 SoC-based platforms. Missed this one from the last set of fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQIcBAABAgAGBQJT2meOAAoJEDqPOy9afJhJAOQP+QHbmLd7zB1qyjqpvBhsSihr ATqoRAqvx0uOc3hSRY1Zecl2hMmBI1pWDTEToMV+mVDGceV6RHdGiWLgF5+V279s DR765F1Zxr9MPlx9GWuC6q4cV+o3pRdkgsuktrfizgzI2SikxJfHDnoPu+9ebvW7 Bqn4mvu6XKXDKkPPhVEqxzzPFonM6VGRSyq+XVkbWRdEz4nR03IXHggmEWkAlZj/ bja3CDwHFROtvnPmqDVf25Sz3rB+PyktjKpz2hHuLsneQ7ZreuEth/4cv3NInwso UJPCeiJwyCL8p4yKo+CIE1MMva81PHchaUl84PEBuxxUZPdAGpf439rYv2BTsZqa JAG7v6JBVdStJ3BUWGu/6xInoaHA02X+8yrS6RYXXPeR0CEb4tND5kuVl++6bdgo WQMR1/COm1hz/Z6pqM0R9/BMMb45pJrNtcurQ4OVQy9I9Pso1ylo2WYx+s828eR9 hDyInfait+W0V3BwXdd+YBk/Dv8nNckMf3ct1yjNV1Rt5Qs07deZEGCj0jfOVKYy TWyo1wQon85e/5DKexClpIgDjRCAXH0JpG3uJXARPoc4yaejRIrEkg+LaP8PpdKb Xz927L/VZ14m5K6FjE7CqnxTxq88jusaGqwBcHozfRBSoONibL1uQXS3qngmWvbc dwQJs0klGbAryTtdEO5c =8Z+Z -----END PGP SIGNATURE----- Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux Pull clock driver fix from Mike Turquette: "A single patch to re-enable audio which is broken on all DRA7 SoC-based platforms. Missed this one from the last set of fixes" * tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux: clk: ti: clk-7xx: Correct ABE DPLL configuration	2014-07-31 10:02:15 -07:00
Linus Torvalds	5196626dae	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "This adds missing SELinux labeling to AF_ALG sockets which apparently causes SELinux (or at least the SELinux people) to misbehave :)" * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: af_alg - properly label AF_ALG socket	2014-07-31 10:01:34 -07:00
Linus Torvalds	48418bb6e3	SCSI fixes on 20140731 This is a potential data corruption fix: If we get an error sending down a barrier, we simply ignore it meaning the barrier semantics get violated without anyone being any the wiser. If the system crashes at this point, the filesystem potentially becomes corrupt. Fix is to report errors on failed barriers. Signed-off-by: James Bottomley <JBottomley@Parallels.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJT2kOfAAoJEDeqqVYsXL0Mx/0IAK61PvW/DdujlbcPLzBGCXHZ 5uXmYDgCN6cyq5QB5jfr/9QpXQVDyuugP5JER/j1ltP1TMsMkYdoZqJFu6wJYG/N OxRgqr+CZ8dn80I2K/yd4RRA8jsEoIf2KqnyzblXzRlP3r9rpvSiwZGS5UN3TESn taDF6OVbS6WZrNjcCC03jPrRfdSSOREvc3mEIOm5T8Ah2PwCGBFt/sU9F1jSA7eY yVyX4FP4anIX3BQA3NvOze3TM7GrIm+U6kJByvuzq37lRNa5W/zuVbIctG+w7dzH f7DGfjGkZpusXSSKGv8Fzy1uzmIyPmyYNEODlCYp2XoBpCUpzEx2B91eWn/H+J4= =j3oI -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI barrier fix from James Bottomley: "This is a potential data corruption fix: If we get an error sending down a barrier, we simply ignore it meaning the barrier semantics get violated without anyone being any the wiser. If the system crashes at this point, the filesystem potentially becomes corrupt. Fix is to report errors on failed barriers" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: handle flush errors properly	2014-07-31 10:00:42 -07:00
Mike Turquette	d7d3d26fa5	Samsung clock patches for 3.17 1) non-critical fixes (without need to push to stable): `d5e136a` clk: samsung: Register clk provider only after registering its all clocks `305cfab` clk: samsung: Make of_device_id array const `e9d5295` clk: samsung: exynos5420: Setup clocks before system suspend `f65d518` clk: samsung: trivial: Correct typo in author's name 2) Exynos CLKOUT driver: `800c979` clk: samsung: exynos4: Add missing CPU/DMC clock hierarchy `01f7ec2` clk: samsung: exynos4: Add CLKOUT clock hierarchy `1e832e5` clk: samsung: Add driver to control CLKOUT line on Exynos SoCs `d19bb39` ARM: dts: exynos: Update PMU node with CLKOUT related data 3) Clock hierarchy extensions: `17d3f1d` clk: exynos4: Add PPMU IP block source clocks. `ca5b402` clk: samsung: register exynos5420 apll/kpll configuration data 4) ARM CLKDOWN functionality enablement for Exynos4 and 3250: `42773b2` clk: samsung: exynos4: Enable ARMCLK down feature `45c5b0a` clk: samsung: exynos3250: Enable ARMCLK down feature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJT2kn8AAoJEIv3Hb8G/Xru1OMP/j6lh7aMyNm6E0mHFJd0Pfjy foe5N9RdQeRfoLySkItqJbBmgujkjjxstSpENCMp8VINlxxHfxi4Nl0bk34efy5f xYMVrmyZoB5dO4W/QmamGIiysD6aRhJ+kbwN+fai05/y+XUt8nUSTH7VdBabq9d3 2O1kRjOMhcdnGQgs/V2XK3SvX2+iUycNAi3JKv1ai1OtB8JiykCeN4FOJr2xPFkv CS5kZj+ofor3SZ6NnmJq52Uuto+ck9NLpp6ohCNlvf6PC52oGa/l1KU693r534Rf nbbNeCOEeByqXPMuL/SxAQzZOqbMRADef42X6tVTK6qpTx58Iep9LRybolYCrMor s4p92u9gsLsURQ85f02mYecnqLMEoeZb5p7sOmjZ0QuHWXm5PVkhIzOYweXEzFtE MeEcVIEiaSpqrm94s7iPYNXleTfLHvoi7jSRjfJayqffNuUeBMfKG6gkmYogU/Ou 9RrGsB+m8dyz/vqvqtRkZznOBaFblwqhSdeY2F+x9Onk/Bin3wzh/9NIge8HJk2P H62R1EUQePLCS9cZJS95jBmSAWRXPD6yaq4xIj2LTuN0uFhCO2FRxWW2eFu9OqE4 DfDIDA1S77aqMzIYjSUUis1yhf4RnSnqy2il5iFMsbiA0/19rYgLiyNrpym8AS+T +ErdKOkHjkwEUZbzobH6 =wCSD -----END PGP SIGNATURE----- Merge tag 'for_3.17/samsung-clk' of git://git.kernel.org/pub/scm/linux/kernel/git/tfiga/samsung-clk into clk-next-samsung Samsung clock patches for 3.17 
1) non-critical fixes (without need to push to stable): `d5e136a` clk: samsung: Register clk provider only after registering its all clocks `305cfab` clk: samsung: Make of_device_id array const `e9d5295` clk: samsung: exynos5420: Setup clocks before system suspend `f65d518` clk: samsung: trivial: Correct typo in author's name 2) Exynos CLKOUT driver: `800c979` clk: samsung: exynos4: Add missing CPU/DMC clock hierarchy `01f7ec2` clk: samsung: exynos4: Add CLKOUT clock hierarchy `1e832e5` clk: samsung: Add driver to control CLKOUT line on Exynos SoCs `d19bb39` ARM: dts: exynos: Update PMU node with CLKOUT related data 3) Clock hierarchy extensions: `17d3f1d` clk: exynos4: Add PPMU IP block source clocks. `ca5b402` clk: samsung: register exynos5420 apll/kpll configuration data 4) ARM CLKDOWN functionality enablement for Exynos4 and 3250: `42773b2` clk: samsung: exynos4: Enable ARMCLK down feature `45c5b0a` clk: samsung: exynos3250: Enable ARMCLK down feature	2014-07-31 09:32:18 -07:00
Dave Hansen	a5102476a2	x86/mm: Set TLB flush tunable to sane value (33)
This has been run through Intel's LKP tests across a wide range of modern systems and workloads and it wasn't shown to make a measurable performance difference positive or negative.
Now that we have some shiny new tracepoints, we can actually figure out what the heck is going on.
During a kernel compile, 60% of the flush_tlb_mm_range() calls are for a single page. It breaks down like this:
   size percent percent<=
      V       V       V
GLOBAL:   2.20%   2.20% avg cycles:  2283
     1:  56.92%  59.12% avg cycles:  1276
     2:  13.78%  72.90% avg cycles:  1505
     3:   8.26%  81.16% avg cycles:  1880
     4:   7.41%  88.58% avg cycles:  2447
     5:   1.73%  90.31% avg cycles:  2358
     6:   1.32%  91.63% avg cycles:  2563
     7:   1.14%  92.77% avg cycles:  2862
     8:   0.62%  93.39% avg cycles:  3542
     9:   0.08%  93.47% avg cycles:  3289
    10:   0.43%  93.90% avg cycles:  3570
    11:   0.20%  94.10% avg cycles:  3767
    12:   0.08%  94.18% avg cycles:  3996
    13:   0.03%  94.20% avg cycles:  4077
    14:   0.02%  94.23% avg cycles:  4836
    15:   0.04%  94.26% avg cycles:  5699
    16:   0.06%  94.32% avg cycles:  5041
    17:   0.57%  94.89% avg cycles:  5473
    18:   0.02%  94.91% avg cycles:  5396
    19:   0.03%  94.95% avg cycles:  5296
    20:   0.02%  94.96% avg cycles:  6749
    21:   0.18%  95.14% avg cycles:  6225
    22:   0.01%  95.15% avg cycles:  6393
    23:   0.01%  95.16% avg cycles:  6861
    24:   0.12%  95.28% avg cycles:  6912
    25:   0.05%  95.32% avg cycles:  7190
    26:   0.01%  95.33% avg cycles:  7793
    27:   0.01%  95.34% avg cycles:  7833
    28:   0.01%  95.35% avg cycles:  8253
    29:   0.08%  95.42% avg cycles:  8024
    30:   0.03%  95.45% avg cycles:  9670
    31:   0.01%  95.46% avg cycles:  8949
    32:   0.01%  95.46% avg cycles:  9350
    33:   3.11%  98.57% avg cycles:  8534
    34:   0.02%  98.60% avg cycles: 10977
    35:   0.02%  98.62% avg cycles: 11400
We get into diminishing returns pretty quickly. On pre-IvyBridge CPUs, we used to set the limit at 8 pages, and it was set at 128 on IvyBridge. That 128 number looks pretty silly considering that less than 0.5% of the flushes are that large.
The previous code tried to size this number based on the size of the TLB. Good idea, but it's error-prone, needs maintenance (which it didn't get up to now), and probably would not matter in practice much.
Setting it to 33 means that we cover the mallopt M_TRIM_THRESHOLD, which is the most universally common size to do flushes.
That's the short version. Here's the long one for why I chose 33:
1. These numbers have a constant bias in the timestamps from the tracing. Probably counts for a couple hundred cycles in each of these tests, but it should be fairly _even_ across all of them. The smallest delta between the tracepoints I have ever seen is 335 cycles. This is one reason the cycles/page cost goes down in general as the flushes get larger. The true cost is nearer to 100 cycles.
2. A full flush is more expensive than a single invlpg, but not by much (single percentages).
3. A dtlb miss is 17.1ns (~45 cycles) and an itlb miss is 13.0ns (~34 cycles). At those rates, refilling the 512-entry dTLB takes 22,000 cycles.
4. 22,000 cycles is approximately the equivalent of doing 85 invlpg operations. But, the odds are that the TLB can actually be filled up faster than that because TLB misses that are close in time also tend to leverage the same caches.
6. ~98% of flushes are <=33 pages. There are a lot of flushes of 33 pages, probably because libc's M_TRIM_THRESHOLD is set to 128k (32 pages)
7. I've found no consistent data to support changing the IvyBridge vs. SandyBridge tunable by a factor of 16
I used the performance counters on this hardware (IvyBridge i5-3320M) to figure out the tlb miss costs:
ocperf.py stat -e dtlb_load_misses.walk_duration,dtlb_load_misses.walk_completed,dtlb_store_misses.walk_duration,dtlb_store_misses.walk_completed,itlb_misses.walk_duration,itlb_misses.walk_completed,itlb.itlb_flush
  7,720,030,970 dtlb_load_misses_walk_duration   [57.13%]
    169,856,353 dtlb_load_misses_walk_completed  [57.15%]
    708,832,859 dtlb_store_misses_walk_duration  [57.17%]
     19,346,823 dtlb_store_misses_walk_completed [57.17%]
  2,779,687,402 itlb_misses_walk_duration        [57.15%]
     82,241,148 itlb_misses_walk_completed       [57.13%]
        770,717 itlb_itlb_flush                  [57.11%]
These show that a dtlb miss is 17.1ns (~45 cycles) and an itlb miss is 13.0ns (~34 cycles). At those rates, refilling the 512-entry dTLB takes 22,000 cycles. On a SandyBridge system with more cores and larger caches, those are dtlb=13.4ns and itlb=9.5ns.
cat perf.stat.txt | perl -pe 's/,//g' | awk '/itlb_misses_walk_duration/ { icyc+=$1 } /itlb_misses_walk_completed/ { imiss+=$1 } /dtlb_.*_walk_duration/ { dcyc+=$1 } /dtlb_.*.*completed/ { dmiss+=$1 } END {print "itlb cyc/miss: ", icyc/imiss, " dtlb cyc/miss: ", dcyc/dmiss, " ----- ", icyc,imiss, dcyc,dmiss }'
On Westmere CPUs, the counters to use are: itlb_flush,itlb_misses.walk_cycles,itlb_misses.any,dtlb_misses.walk_cycles,dtlb_misses.any
The assumptions that this code went in under: https://lkml.org/lkml/2012/6/12/119 say that a flush and a refill are about 100ns. Being generous, that is over by a factor of 6 on the refill side, although it is fairly close on the cost of an invlpg. An increase of a single invlpg operation seems to lengthen the flush range operation by about 200 cycles.
Here is one example of the data collected for flushing 10 and 11 pages (full data are below):
   10:  0.43% 93.90% avg cycles:  3570 cycles/page:  357 samples:   4714
   11:  0.20% 94.10% avg cycles:  3767 cycles/page:  342 samples:   2145
How to generate this table:
echo 10000 > /sys/kernel/debug/tracing/buffer_size_kb
echo x86-tsc > /sys/kernel/debug/tracing/trace_clock
echo 'reason != 0' > /sys/kernel/debug/tracing/events/tlb/tlb_flush/filter
echo 1 > /sys/kernel/debug/tracing/events/tlb/tlb_flush/enable
Pipe the trace output in to this script: http://sr71.net/~dave/intel/201402-tlb/trace-time-diff-process.pl.txt
Note that these data were gathered with the invlpg threshold set to 150 pages. Only data points with >=50 samples were printed:
Flush    % of    %<=
   in   flush   this
pages      es   size
------------------------------------------------------------------------------
   -1:  2.20%  2.20% avg cycles:  2283 cycles/page: xxxx samples:  23960
    1: 56.92% 59.12% avg cycles:  1276 cycles/page: 1276 samples: 620895
    2: 13.78% 72.90% avg cycles:  1505 cycles/page:  752 samples: 150335
    3:  8.26% 81.16% avg cycles:  1880 cycles/page:  626 samples:  90131
    4:  7.41% 88.58% avg cycles:  2447 cycles/page:  611 samples:  80877
    5:  1.73% 90.31% avg cycles:  2358 cycles/page:  471 samples:  18885
    6:  1.32% 91.63% avg cycles:  2563 cycles/page:  427 samples:  14397
    7:  1.14% 92.77% avg cycles:  2862 cycles/page:  408 samples:  12441
    8:  0.62% 93.39% avg cycles:  3542 cycles/page:  442 samples:   6721
    9:  0.08% 93.47% avg cycles:  3289 cycles/page:  365 samples:    917
   10:  0.43% 93.90% avg cycles:  3570 cycles/page:  357 samples:   4714
   11:  0.20% 94.10% avg cycles:  3767 cycles/page:  342 samples:   2145
   12:  0.08% 94.18% avg cycles:  3996 cycles/page:  333 samples:    864
   13:  0.03% 94.20% avg cycles:  4077 cycles/page:  313 samples:    289
   14:  0.02% 94.23% avg cycles:  4836 cycles/page:  345 samples:    236
   15:  0.04% 94.26% avg cycles:  5699 cycles/page:  379 samples:    390
   16:  0.06% 94.32% avg cycles:  5041 cycles/page:  315 samples:    643
   17:  0.57% 94.89% avg cycles:  5473 cycles/page:  321 samples:   6229
   18:  0.02% 94.91% avg cycles:  5396 cycles/page:  299 samples:    224
   19:  0.03% 94.95% avg cycles:  5296 cycles/page:  278 samples:    367
   20:  0.02% 94.96% avg cycles:  6749 cycles/page:  337 samples:    185
   21:  0.18% 95.14% avg cycles:  6225 cycles/page:  296 samples:   1964
   22:  0.01% 95.15% avg cycles:  6393 cycles/page:  290 samples:     83
   23:  0.01% 95.16% avg cycles:  6861 cycles/page:  298 samples:     61
   24:  0.12% 95.28% avg cycles:  6912 cycles/page:  288 samples:   1307
   25:  0.05% 95.32% avg cycles:  7190 cycles/page:  287 samples:    533
   26:  0.01% 95.33% avg cycles:  7793 cycles/page:  299 samples:     94
   27:  0.01% 95.34% avg cycles:  7833 cycles/page:  290 samples:     66
   28:  0.01% 95.35% avg cycles:  8253 cycles/page:  294 samples:     73
   29:  0.08% 95.42% avg cycles:  8024 cycles/page:  276 samples:    846
   30:  0.03% 95.45% avg cycles:  9670 cycles/page:  322 samples:    296
   31:  0.01% 95.46% avg cycles:  8949 cycles/page:  288 samples:     79
   32:  0.01% 95.46% avg cycles:  9350 cycles/page:  292 samples:     60
   33:  3.11% 98.57% avg cycles:  8534 cycles/page:  258 samples:  33936
   34:  0.02% 98.60% avg cycles: 10977 cycles/page:  322 samples:    268
   35:  0.02% 98.62% avg cycles: 11400 cycles/page:  325 samples:    177
   36:  0.01% 98.63% avg cycles: 11504 cycles/page:  319 samples:    161
   37:  0.02% 98.65% avg cycles: 11596 cycles/page:  313 samples:    182
   38:  0.02% 98.66% avg cycles: 11850 cycles/page:  311 samples:    195
   39:  0.01% 98.68% avg cycles: 12158 cycles/page:  311 samples:    128
   40:  0.01% 98.68% avg cycles: 11626 cycles/page:  290 samples:     78
   41:  0.04% 98.73% avg cycles: 11435 cycles/page:  278 samples:    477
   42:  0.01% 98.73% avg cycles: 12571 cycles/page:  299 samples:     74
   43:  0.01% 98.74% avg cycles: 12562 cycles/page:  292 samples:     78
   44:  0.01% 98.75% avg cycles: 12991 cycles/page:  295 samples:    108
   45:  0.01% 98.76% avg cycles: 13169 cycles/page:  292 samples:     78
   46:  0.02% 98.78% avg cycles: 12891 cycles/page:  280 samples:    261
   47:  0.01% 98.79% avg cycles: 13099 cycles/page:  278 samples:     67
   48:  0.01% 98.80% avg cycles: 13851 cycles/page:  288 samples:     77
   49:  0.01% 98.80% avg cycles: 13749 cycles/page:  280 samples:     66
   50:  0.01% 98.81% avg cycles: 13949 cycles/page:  278 samples:     73
   52:  0.00% 98.82% avg cycles: 14243 cycles/page:  273 samples:     52
   54:  0.01% 98.83% avg cycles: 15312 cycles/page:  283 samples:     87
   55:  0.01% 98.84% avg cycles: 15197 cycles/page:  276 samples:    109
   56:  0.02% 98.86% avg cycles: 15234 cycles/page:  272 samples:    208
   57:  0.00% 98.86% avg cycles: 14888 cycles/page:  261 samples:     53
   58:  0.01% 98.87% avg cycles: 15037 cycles/page:  259 samples:     59
   59:  0.01% 98.87% avg cycles: 15752 cycles/page:  266 samples:     63
   62:  0.00% 98.89% avg cycles: 16222 cycles/page:  261 samples:     54
   64:  0.02% 98.91% avg cycles: 17179 cycles/page:  268 samples:    248
   65:  0.12% 99.03% avg cycles: 18762 cycles/page:  288 samples:   1324
   85:  0.00% 99.10% avg cycles: 21649 cycles/page:  254 samples:     50
  127:  0.01% 99.18% avg cycles: 32397 cycles/page:  255 samples:     75
  128:  0.13% 99.31% avg cycles: 31711 cycles/page:  247 samples:   1466
  129:  0.18% 99.49% avg cycles: 33017 cycles/page:  255 samples:   1927
  181:  0.33% 99.84% avg cycles:  2489 cycles/page:   13 samples:   3547
  256:  0.05% 99.91% avg cycles:  2305 cycles/page:    9 samples:    550
  512:  0.03% 99.95% avg cycles:  2133 cycles/page:    4 samples:    304
 1512:  0.01% 99.99% avg cycles:  3038 cycles/page:    2 samples:     65
Here are the tlb counters during a 10-second slice of a kernel compile for a SandyBridge system. It's better than IvyBridge, but probably due to the larger caches since this was one of the 'X' extreme parts.
 10,873,007,282 dtlb_load_misses_walk_duration
    250,711,333 dtlb_load_misses_walk_completed
  1,212,395,865 dtlb_store_misses_walk_duration
     31,615,772 dtlb_store_misses_walk_completed
  5,091,010,274 itlb_misses_walk_duration
    163,193,511 itlb_misses_walk_completed
      1,321,980 itlb_itlb_flush
   10.008045158 seconds time elapsed
# cat perf.stat.1392743721.txt | perl -pe 's/,//g' | awk '/itlb_misses_walk_duration/ { icyc+=$1 } /itlb_misses_walk_completed/ { imiss+=$1 } /dtlb_.*_walk_duration/ { dcyc+=$1 } /dtlb_.*.*completed/ { dmiss+=$1 } END {print "itlb cyc/miss: ", icyc/imiss/3.3, " dtlb cyc/miss: ", dcyc/dmiss/3.3, " ----- ", icyc,imiss, dcyc,dmiss }'
itlb ns/miss: 9.45338 dtlb ns/miss: 12.9716
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154103.10C1115E@viggo.jf.intel.com
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-07-31 08:48:51 -07:00
Dave Hansen	2d040a1ce9	x86/mm: New tunable for single vs full TLB flush Most of the logic here is in the documentation file. Please take a look at it. I know we've come full-circle here back to a tunable, but this new one is WAY simpler. I challenge anyone to describe in one sentence how the old one worked. Here's the way the new one works: If we are flushing more pages than the ceiling, we use the full flush, otherwise we use per-page flushes. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: http://lkml.kernel.org/r/20140731154101.12B52CAF@viggo.jf.intel.com Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-07-31 08:48:51 -07:00
Dave Hansen	d17d8f9ded	x86/mm: Add tracepoints for TLB flushes We don't have any good way to figure out what kinds of flushes are being attempted. Right now, we can try to use the vm counters, but those only tell us what we actually did with the hardware (one-by-one vs full) and don't tell us what was actually _requested_. This allows us to select out "interesting" TLB flushes that we might want to optimize (like the ranged ones) and ignore the ones that we have very little control over (the ones at context switch). Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: http://lkml.kernel.org/r/20140731154059.4C96CBA5@viggo.jf.intel.com Acked-by: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-07-31 08:48:51 -07:00
Dave Hansen	a23421f111	x86/mm: Unify remote INVLPG code There are currently three paths through the remote flush code: 1. full invalidation 2. single page invalidation using invlpg 3. ranged invalidation using invlpg This takes 2 and 3 and combines them in to a single path by making the single-page one just be the start and end be start plus a single page. This makes placement of our tracepoint easier. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: http://lkml.kernel.org/r/20140731154058.E0F90408@viggo.jf.intel.com Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-07-31 08:48:51 -07:00
Dave Hansen	9dfa6dee53	x86/mm: Fix missed global TLB flush stat If we take the if (end == TLB_FLUSH_ALL || vmflag & VM_HUGETLB) { local_flush_tlb(); goto out; } path out of flush_tlb_mm_range(), we will have flushed the tlb, but not incremented NR_TLB_LOCAL_FLUSH_ALL. This unifies the way out of the function so that we always take a single path when doing a full tlb flush. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: http://lkml.kernel.org/r/20140731154056.FF763B76@viggo.jf.intel.com Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-07-31 08:48:50 -07:00
Dave Hansen	e9f4e0a9fe	x86/mm: Rip out complicated, out-of-date, buggy TLB flushing
I think the flush_tlb_mm_range() code that tries to tune the flush sizes based on the CPU needs to get ripped out for several reasons:
1. It is obviously buggy. It uses mm->total_vm to judge the task's footprint in the TLB. It should certainly be using some measure of RSS, NOT ->total_vm since only resident memory can populate the TLB.
2. Haswell, and several other CPUs are missing from the intel_tlb_flushall_shift_set() function. Thus, it has been demonstrated to bitrot quickly in practice.
3. It is plain wrong in my vm:
[ 0.037444] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
[ 0.037444] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0
[ 0.037444] tlb_flushall_shift: 6
Which leads it to never use invlpg.
4. The assumptions about TLB refill costs are wrong: http://lkml.kernel.org/r/1337782555-8088-3-git-send-email-alex.shi@intel.com (more on this in later patches)
5. I can not reproduce the original data: https://lkml.org/lkml/2012/5/17/59 I believe the sample times were too short. Running the benchmark in a loop yields times that vary quite a bit.
Note that this leaves us with a static ceiling of 1 page. This is a conservative, dumb setting, and will be revised in a later patch.
This also removes the code which attempts to predict whether we are flushing data or instructions. We expect instruction flushes to be relatively rare and not worth tuning for explicitly.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20140731154055.ABC88E89@viggo.jf.intel.com
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-07-31 08:48:50 -07:00
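
The ceiling rule described in commit 2d040a1ce9 ("If we are flushing more pages than the ceiling, we use the full flush, otherwise we use per-page flushes") can be sketched in plain C. This is a user-space illustration only, not the kernel code: the names flush_ceiling, SKETCH_PAGE_SHIFT, pages_in_range() and use_full_flush() are hypothetical, 4 KiB pages and a page-aligned range are assumed, and the value 33 is the one chosen in commit a5102476a2.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical tunable; 33 is the value chosen in commit a5102476a2. */
static unsigned long flush_ceiling = 33;

/* Number of whole pages in [start, end); assumes a page-aligned range. */
static unsigned long pages_in_range(unsigned long start, unsigned long end)
{
	assert(end >= start);
	return (end - start) >> SKETCH_PAGE_SHIFT;
}

/*
 * Returns true when the range is larger than the ceiling, meaning one
 * full TLB flush is preferred over a series of per-page invalidations.
 */
static bool use_full_flush(unsigned long start, unsigned long end)
{
	return pages_in_range(start, end) > flush_ceiling;
}
```

With the ceiling at 33, a 33-page range would still be flushed page by page, while a 34-page range would fall over to the full flush.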

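The TLB miss costs quoted in commit a5102476a2 follow from its IvyBridge counters by dividing walk cycles by completed walks. The sketch below redoes that arithmetic in C; the counter values are copied from the commit message, while the helper names and the 2.6 GHz clock are assumptions (the clock is inferred from the commit's own "~45 cycles = 17.1ns" figure).

```c
#include <stdio.h>

/* Counter values copied from the IvyBridge run in commit a5102476a2. */
static const double DTLB_WALK_CYCLES = 7720030970.0 + 708832859.0; /* load + store */
static const double DTLB_WALKS = 169856353.0 + 19346823.0;         /* load + store */
static const double ITLB_WALK_CYCLES = 2779687402.0;
static const double ITLB_WALKS = 82241148.0;

/* Assumption: 2.6 GHz clock, inferred from ~45 cycles = 17.1ns. */
static const double ASSUMED_GHZ = 2.6;

/* Average cycles spent per completed page walk. */
static double cycles_per_miss(double walk_cycles, double walks)
{
	return walk_cycles / walks;
}

/* Convert a cycle count to nanoseconds at the given clock rate. */
static double cycles_to_ns(double cycles, double ghz)
{
	return cycles / ghz;
}

/* Hypothetical helper: print the derived per-miss and refill costs. */
static void print_tlb_miss_costs(void)
{
	double d = cycles_per_miss(DTLB_WALK_CYCLES, DTLB_WALKS);
	double i = cycles_per_miss(ITLB_WALK_CYCLES, ITLB_WALKS);

	printf("dtlb: %.1f cycles/miss (%.1f ns)\n", d, cycles_to_ns(d, ASSUMED_GHZ));
	printf("itlb: %.1f cycles/miss (%.1f ns)\n", i, cycles_to_ns(i, ASSUMED_GHZ));
	printf("512-entry dTLB refill: %.0f cycles\n", 512.0 * d);
}
```

Calling print_tlb_miss_costs() from a main() should give values close to the commit's ~45 and ~34 cycles per miss; the refill estimate lands somewhat above the rounded 22,000 cycles quoted in the message.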

460346 Commits