Commit Graph

753920 Commits

Author SHA1 Message Date
Song Liu
9511bce9fe perf/core: Fix bad use of igrab()
As Miklos reported and suggested:

 "This pattern repeats two times in trace_uprobe.c and in
  kernel/events/core.c as well:

      ret = kern_path(filename, LOOKUP_FOLLOW, &path);
      if (ret)
          goto fail_address_parse;

      inode = igrab(d_inode(path.dentry));
      path_put(&path);

  And it's wrong.  You can only hold a reference to the inode if you
  have an active ref to the superblock as well (which is normally
  through path.mnt) or holding s_umount.

  This way unmounting the containing filesystem while the tracepoint is
  active will give you the "VFS: Busy inodes after unmount..." message
  and a crash when the inode is finally put.

  Solution: store path instead of inode."

This patch fixes the issue in kernel/event/core.c.

Reviewed-and-tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reported-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <kernel-team@fb.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 375637bc52 ("perf/core: Introduce address range filtering")
Link: http://lkml.kernel.org/r/20180418062907.3210386-2-songliubraving@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-25 08:11:10 +02:00
Song Liu
a1150c2022 perf/core: Fix group scheduling with mixed hw and sw events
When hw and sw events are mixed in the same group, they are all attached
to the hw perf_event_context. This sometimes requires moving group of
perf_event to a different context.

We found a bug in how the kernel handles this, for example if we do:

   perf stat -e '{faults,ref-cycles,faults}'  -I 1000

     1.005591180              1,297      faults
     1.005591180        457,476,576      ref-cycles
     1.005591180    <not supported>      faults

First, sw event "faults" is attached to the sw context, and becomes the
group leader. Then, hw event "ref-cycles" is attached, so both events
are moved to the hw context. Last, another sw "faults" tries to attach,
but it fails because of mismatch between the new target ctx (from sw
pmu) and the group_leader's ctx (hw context, same as ref-cycles).

The broken condition is:
   group_leader is sw event;
   group_leader is on hw context;
   add a sw event to the group.

Fix this scenario by checking group_leader's context (instead of just
event type). If group_leader is on hw context, use the ->pmu of this
context to look up context for the new event.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <kernel-team@fb.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: b04243ef70 ("perf: Complete software pmu grouping")
Link: http://lkml.kernel.org/r/20180503194716.162815-1-songliubraving@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-25 08:11:10 +02:00
Ingo Molnar
bd9c67ad96 Merge branch 'linus' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-25 08:02:43 +02:00
Linus Torvalds
b50694381c Merge branch 'stable/for-linus-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
Pull swiotlb fix from Konrad Rzeszutek Wilk:
 "One single fix in here: under Xen the DMA32 heap (in the hypervisor)
  would end up looking like swiss cheese.

  The reason being that for every coherent DMA allocation we didn't do
  the proper hypercall to tell Xen to return the page back to the DMA32
  heap. End result was (eventually) no DMA32 space if you (for example)
  continously unloaded and loaded modules"

* 'stable/for-linus-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
  xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent
2018-05-24 14:42:43 -07:00
Linus Torvalds
34b48b8789 Merge candidates for 4.17-rc
- Remove bouncing addresses from the MAINTAINERS file
 - Kernel oops and bad error handling fixes for hfi, i40iw, cxgb4, and hns drivers
 - Various small LOC behavioral/operational bugs in mlx5, hns, qedr and i40iw drivers
 - Two fixes for patches already sent during the merge window
 - A long standing bug related to not decreasing the pinned pages count in the right
   MM was found and fixed
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCgAGBQJbByPQAAoJEDht9xV+IJsa164P/AihB/vbn9MBdK3pe1OSUGTm
 tKZJ/Y6nY/Q/XTJSeM2wNECk8fOrZbKuLBz2XlPRsB2djp4ugC5WWfK9YbwWMGXG
 I5B/lB8VTorQr8E5i9lqqMDQc8aF8VcGJtdqVE3nD4JsVTrQSGiSnw45/BARDUm3
 OycJJMDOWhDj2wnNSa+JfjPemIMDM1jse7DnsJfDsGfTMS/G+6nyzjKIlEnnFZ8/
 PBxhq0q7C5viNDwwn2GsAVUrATTlW48SY0WYhkgMdSl20d2th9wMZqNMqtniz8NP
 lg87SrhzsAPOTlbSWlYYkAnzE7nEhfJyIfYUp2piNJeYuOohYPtO6w99Tqjl/GmU
 uLIYIXtZCxAK1Zb/znc49HkRVL5YFDsQGXdtYy7tvRZPwwR32kowUtpKIWaZFz8O
 BA/x+Zgqu9AlwqSWwQwxmMbUX42RRwhNJDVyTYlXQSSzhfgFaLIZARqb4K6HxeNN
 vZN0BK+x6pX6FI7hpdsqNRtH1oo4SNUBxiuUsrZ7cy7GqYNdUJ6piygDgmERaJxU
 svIUJof/+OoU1QyErQ0JgUEK/3jOHbjxSPb/rjQeqxAnCqhaGOuNGMtdfsGqgvBU
 x/u3eDcbfi/LBErXR46gYtxnOQ8I2BB+m8erUc/GVvCzWrX+R7ELZYpBrP5Pcu/6
 mr2D7hDqgZHbeU8aB8+D
 =uFZh
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "This is pretty much just the usual array of smallish driver bugs.

   - remove bouncing addresses from the MAINTAINERS file

   - kernel oops and bad error handling fixes for hfi, i40iw, cxgb4, and
     hns drivers

   - various small LOC behavioral/operational bugs in mlx5, hns, qedr
     and i40iw drivers

   - two fixes for patches already sent during the merge window

   - a long-standing bug related to not decreasing the pinned pages
     count in the right MM was found and fixed"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (28 commits)
  RDMA/hns: Move the location for initializing tmp_len
  RDMA/hns: Bugfix for cq record db for kernel
  IB/uverbs: Fix uverbs_attr_get_obj
  RDMA/qedr: Fix doorbell bar mapping for dpi > 1
  IB/umem: Use the correct mm during ib_umem_release
  iw_cxgb4: Fix an error handling path in 'c4iw_get_dma_mr()'
  RDMA/i40iw: Avoid panic when reading back the IRQ affinity hint
  RDMA/i40iw: Avoid reference leaks when processing the AEQ
  RDMA/i40iw: Avoid panic when objects are being created and destroyed
  RDMA/hns: Fix the bug with NULL pointer
  RDMA/hns: Set NULL for __internal_mr
  RDMA/hns: Enable inner_pa_vld filed of mpt
  RDMA/hns: Set desc_dma_addr for zero when free cmq desc
  RDMA/hns: Fix the bug with rq sge
  RDMA/hns: Not support qp transition from reset to reset for hip06
  RDMA/hns: Add return operation when configured global param fail
  RDMA/hns: Update convert function of endian format
  RDMA/hns: Load the RoCE dirver automatically
  RDMA/hns: Bugfix for rq record db for kernel
  RDMA/hns: Add rq inline flags judgement
  ...
2018-05-24 14:12:05 -07:00
Linus Torvalds
d7b66b4ab0 for-4.17-rc6-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlsG/ecACgkQxWXV+ddt
 WDvr3w/8D12pwR9sPcEwxD4pvoLv7LP1VRQy2u+ivSifdBD7MueKh3y0igUMyARR
 LERsK0zUsTQGkkC6c7ZYd4cT9PikPpXtO1P9iATFAKqR/YMDIV/haSqT8DwbI/qb
 7F+ZMeTy1LzL01YlYBrGVDxP8AWVO2Dml6JolYxzplILSLvdPH6G8xOSjei/p9sm
 RK5ERHJENEI0l/cThpiLoAEWjzciPtR39T5Hq45onHyCs3bjJCcx51/QE8sBsl8x
 +BKvCmL40UKd30YKudJZYDM6NgMgWENhfTtIZQIInv99sMNCxIgTEUdX8ExdyjRZ
 24rst/BuQz4d8r/8zqE/hdFsHRGWwnEiYmGWylanPY5KdQ41ULfXC06xuoNOLoW8
 KQwD8SWv+W5vEJW0UQz5cb3vUgv5RnUzPvcmMfSztLeo2K4zj6zCK5L6XJwIJNbM
 1AJR7R4TRkQdf5QEeziFl738Yv1AgsPQuKSiiFa9YwXMLU8dYXlx14ioUzBL8MLe
 1wZPJ03x/N7eKJ0g6OIAAVfUTFFejv4Z2B2IDoObuLLsPwTdK6tS+9tJ5mos7ngG
 Vf1ZVmhmeJdw1qwK8ROzAJHkK807KgGO7LWmA7tIVLwWuZX14F7xLQIg3Ux3MhIh
 NhoBTFy2AGmdE0hFYv/4FA5dnUOU4VTVYVw3QUV4DMc0XIodZrE=
 =iYyx
 -----END PGP SIGNATURE-----

Merge tag 'for-4.17-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fix from David Sterba:
 "A one-liner that prevents leaking an internal error value 1 out of the
  ftruncate syscall.

  This has been observed in practice. The steps to reproduce make a
  common pattern (open/write/fync/ftruncate) but also need the
  application to not check only for negative values and happens only for
  compressed inlined files.

  The conditions are narrow but as this could break userspace I think
  it's better to merge it now and not wait for the merge window"

* tag 'for-4.17-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Btrfs: fix error handling in btrfs_truncate()
2018-05-24 11:47:43 -07:00
Joonsoo Kim
d883c6cf3b Revert "mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE"
This reverts the following commits that change CMA design in MM.

 3d2054ad8c ("ARM: CMA: avoid double mapping to the CMA area if CONFIG_HIGHMEM=y")

 1d47a3ec09 ("mm/cma: remove ALLOC_CMA")

 bad8c6c0b1 ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")

Ville reported a following error on i386.

  Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
  microcode: microcode updated early to revision 0x4, date = 2013-06-28
  Initializing CPU#0
  Initializing HighMem for node 0 (000377fe:00118000)
  Initializing Movable for node 0 (00000001:00118000)
  BUG: Bad page state in process swapper  pfn:377fe
  page:f53effc0 count:0 mapcount:-127 mapping:00000000 index:0x0
  flags: 0x80000000()
  raw: 80000000 00000000 00000000 ffffff80 00000000 00000100 00000200 00000001
  page dumped because: nonzero mapcount
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0-rc5-elk+ #145
  Hardware name: Dell Inc. Latitude E5410/03VXMC, BIOS A15 07/11/2013
  Call Trace:
   dump_stack+0x60/0x96
   bad_page+0x9a/0x100
   free_pages_check_bad+0x3f/0x60
   free_pcppages_bulk+0x29d/0x5b0
   free_unref_page_commit+0x84/0xb0
   free_unref_page+0x3e/0x70
   __free_pages+0x1d/0x20
   free_highmem_page+0x19/0x40
   add_highpages_with_active_regions+0xab/0xeb
   set_highmem_pages_init+0x66/0x73
   mem_init+0x1b/0x1d7
   start_kernel+0x17a/0x363
   i386_start_kernel+0x95/0x99
   startup_32_smp+0x164/0x168

The reason for this error is that the span of MOVABLE_ZONE is extended
to whole node span for future CMA initialization, and, normal memory is
wrongly freed here.  I submitted the fix and it seems to work, but,
another problem happened.

It's so late time to fix the later problem so I decide to reverting the
series.

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Laura Abbott <labbott@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-24 10:07:50 -07:00
Linus Torvalds
577e75e0c9 Merge branch 'for-4.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
 "Nothing too interesting.  Four patches to update the blacklist and
  add a controller ID"

* 'for-4.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  ahci: Add PCI ID for Cannon Lake PCH-LP AHCI
  libata: blacklist Micron 500IT SSD with MU01 firmware
  libata: Apply NOLPM quirk for SAMSUNG PM830 CXM13D1Q.
  libata: Blacklist some Sandisk SSDs for NCQ
2018-05-24 09:36:16 -07:00
Linus Torvalds
b68ea0ee03 for-linus-20180524
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJbBramAAoJEPfTWPspceCm2gsP/R5p/Vo8Ml303cdgCOrw+Q+T
 W/Wp6hGmNUZ7HjMlfF85rF1bqpgGoj226qIibHvKi7eYwUScDNgJwcrrWnNazbGx
 Rnl/+NQ/H/38pBvvDJKjth9n0LY/O1geemhnWzYZmqmZJe3jFNTDpw5pGy4RkTpJ
 4DCuXg2n81x+kDt7Nslb0dYONkrc1yUjelHS/sJKlUoPs1MvsICGsWNS+Lw0WMIj
 Ls282QNs1Z6Y1gcM18UXsTNJSXQFTBmsbH3CIBukHckP2wjZrMEvgM+uxIORqwID
 0DJWBSVNJMxrsyZtkMMkkz2wMjPjSvx3HjD/ULpglJD6nGSwRAiQr6XIxARSEEDL
 3prQyhEeVGSioOjt5IR2Llt9QDYVhb1GJqqHYmDZux0SMCVg1Pv7mclrudpXGpzq
 v2mSJqnwLofOuTFrSh6afgxClD/yh1UNKByf4Ni78PagAgNS4SjEIz8VDG4m1iNu
 UgWMbM+vqpZhgIlSDsmnqFEVyGwCWwUOHGCeshQXIy9xaWbCtmaGQpNh4Tho273O
 wH6F/jwCnw1BcUeiwOQi+qcwEf1z6nRTGGb3hZt7u2Tj6r8HZzNHtkRaFXsIrnvA
 Vc3Y9+YzRWBVV6wzqPC5K+alvGs5ZXLj7BIxQorEkoMkNCYWiPsiBS/rtc0OYiST
 b2rC+5eYzquEZ42aqIIn
 =QBaF
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20180524' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Two fixes that should go into this release:

   - a loop writeback error clearing fix from Jeff

   - the sr sense fix from myself"

* tag 'for-linus-20180524' of git://git.kernel.dk/linux-block:
  loop: clear wb_err in bd_inode when detaching backing file
  sr: pass down correctly sized SCSI sense buffer
2018-05-24 08:53:20 -07:00
Linus Torvalds
9ca5a2ae42 Power management fix for 4.17-rc7
Fix a regression from the 4.15 cycle that caused the system suspend
 and resume overhead to increase on many systems and triggered more
 serious problems on some of them (Rafael Wysocki).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJbBoz0AAoJEILEb/54YlRxO1MP/2y26kAQLcKzaLiLVbTJW60V
 p+1WEEWYG4S0gIEd5fgy5PzLoX50s6+JQ+JGxV5D/hMofyDSgeK1Ldi/jPT2+3rT
 ag3XW1xYdnGz8SgCNhlDOonxZwRZiFL3SvlRRvP6jCLDFtRSpqCQpzVFD48DHgF/
 y/MuJW95yRwzhvRWIlRViowHLyuGR81ymCNOZ1d9rNrv7KnxFQZdSR76hbMI9bD8
 Ufm/bqc7T+TxIgft0L7584uqgfo0iptZWH1l8RpvDYBPIrR40NWibb5QEjJ+Jn9a
 7nvB6Uw2ICxNTVUuEHRJ08MmhBdchykJnFEru+MH7Mse53MXB2u872kFaz9NbfXv
 xBvCbQJN9eFkN8QfG1+UnX2UIVn/Fl7LxiXPwBOfWz2MxWOqFm5FCKH6ZKitnXnE
 lLbAp7bSxWmHmjURALGv31vlG5rm8GyM8cKJInq9Ms34Y5OeU+/t1MQMS5X50His
 eihn67CBUZmAf2Bo6KENOjuH2fHkCYe4gu2ewnThwflZhVo4cCZJjk07lC56FZai
 TC9Zu1W0mr0CAauyRnT2+GAyGQCoHxBnSMpJ6ZhlGAdUPwerRDsUmpqJYl8iGOIO
 WH3apiV0UkvEJy0Wp6P9/YakJ4PRjK32SwJwAGK1hA0vqRbHpPyYCoZKA16z8zOC
 Rdle9Gq3QIQuRE7RWuVx
 =BS+S
 -----END PGP SIGNATURE-----

Merge tag 'pm-4.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "Fix a regression from the 4.15 cycle that caused the system suspend
  and resume overhead to increase on many systems and triggered more
  serious problems on some of them (Rafael Wysocki)"

* tag 'pm-4.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / core: Fix direct_complete handling for devices with no callbacks
2018-05-24 08:49:56 -07:00
Mika Westerberg
4544e403eb ahci: Add PCI ID for Cannon Lake PCH-LP AHCI
This one should be using the default LPM policy for mobile chipsets so
add the PCI ID to the driver list of supported revices.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
2018-05-24 07:03:32 -07:00
Omar Sandoval
d50147381a Btrfs: fix error handling in btrfs_truncate()
Jun Wu at Facebook reported that an internal service was seeing a return
value of 1 from ftruncate() on Btrfs in some cases. This is coming from
the NEED_TRUNCATE_BLOCK return value from btrfs_truncate_inode_items().

btrfs_truncate() uses two variables for error handling, ret and err.
When btrfs_truncate_inode_items() returns non-zero, we set err to the
return value. However, NEED_TRUNCATE_BLOCK is not an error. Make sure we
only set err if ret is an error (i.e., negative).

To reproduce the issue: mount a filesystem with -o compress-force=zstd
and the following program will encounter return value of 1 from
ftruncate:

int main(void) {
        char buf[256] = { 0 };
        int ret;
        int fd;

        fd = open("test", O_CREAT | O_WRONLY | O_TRUNC, 0666);
        if (fd == -1) {
                perror("open");
                return EXIT_FAILURE;
        }

        if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
                perror("write");
                close(fd);
                return EXIT_FAILURE;
        }

        if (fsync(fd) == -1) {
                perror("fsync");
                close(fd);
                return EXIT_FAILURE;
        }

        ret = ftruncate(fd, 128);
        if (ret) {
                printf("ftruncate() returned %d\n", ret);
                close(fd);
                return EXIT_FAILURE;
        }

        close(fd);
        return EXIT_SUCCESS;
}

Fixes: ddfae63cc8 ("btrfs: move btrfs_truncate_block out of trans handle")
CC: stable@vger.kernel.org # 4.15+
Reported-by: Jun Wu <quark@fb.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-05-24 11:56:57 +02:00
Ingo Molnar
861410270a perf/core improvements:
. Create extra kernel maps to help in decoding samples in x86 PTI entry
   trampolines (Adrian Hunter)
 
 . Copy x86 PTI entry trampoline sections in the kcore copy used for
   annotation and intel_pt CPU traces decoding (Adrian Hunter)
 
 - Support 'perf annotate --group' for non-explicit recorded event
   "groups", showing multiple columns, one for each event, just like
   when dealing with explicit event groups (those enclosed with {}) (Jin Yao)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEELb9bqkb7Te0zijNb1lAW81NSqkAFAlsFxrQACgkQ1lAW81NS
 qkBI1Q/+MrZcMY1qZAIXW2UPX2gzxUzN9ocpdUX2UnduZE/93oQvrAryWlm7qoz+
 bIVuHyXFYpVtjvXErDlUHNCKbJpSLE1sHDtWeHe9KumNO2UYNNxCvu+QI5ix+wiM
 Gv001Xn5Mth936qS6t+Q6yzxF3HqjdAwtQndJ53wu/BYguBaTlyTcl9LT9z5YI85
 IyVmFztHUIWBlXgW+wz1XerVLSPejqs4a0SlbiR2OX5qZdZ4wCl+JZXwvEMurgss
 nxbI0CvuPFPsA7mt65f4Kox1vvB/L+5LPOIhcsTUBJ6MmXtil1FOQX/+FlcyzKsx
 Ucrcry3b2KdFHvxUoz5FE38QMXtQwsmPSb1gTd3RBkQT1UYQkn0OKxJyUQRbuAIM
 ioE40OwFOqhYXuFjCh3m/Stf6iuUPYumBxEOVOAGRW+5dVESyM4qhpphsJY/nEoo
 p+tMkggoe2ULx9JplHfTX7iVx3zeSmmdFPwtJi3cYiYRqckNpDHECL6XSLur97jZ
 TVCOBmr2cnh0tyoG/mKxX6PlggisUGC5z4DOJCI7bCyn5TushmGMEVKNRKWA1JST
 vFoptBkbpTGOawq3KKARUvQ7GZ8sMim7rwY3bxEWa45+zE3ySidfms74eLYdL9f+
 +LdYMmgiUWKjgiFBHm6ZCMcPI780cdWQcdAl83/3JOBj0lC7b40=
 =9PtG
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-4.18-20180523' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements from Arnaldo Carvalho de Melo:

- Create extra kernel maps to help in decoding samples in x86 PTI entry
  trampolines (Adrian Hunter)

- Copy x86 PTI entry trampoline sections in the kcore copy used for
  annotation and intel_pt CPU traces decoding (Adrian Hunter)

- Support 'perf annotate --group' for non-explicit recorded event
  "groups", showing multiple columns, one for each event, just like
  when dealing with explicit event groups (those enclosed with {}) (Jin Yao)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-24 07:30:25 +02:00
oulijun
55ba49cbce RDMA/hns: Move the location for initializing tmp_len
When posted work request, it need to compute the length of
all sges of every wr and fill it into the msg_len field of
send wqe. Thus, While posting multiple wr,
tmp_len should be reinitialized to zero.

Fixes: 8b9b8d143b ("RDMA/hns: Fix the endian problem for hns")
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-23 15:45:44 -06:00
oulijun
05d6a4ddb6 RDMA/hns: Bugfix for cq record db for kernel
When use cq record db for kernel, it needs to set the hr_cq->db_en
to 1 and configure the dma address of record cq db of qp context.

Fixes: 86188a8810 ("RDMA/hns: Support cq record doorbell for kernel space")
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-23 15:45:44 -06:00
Jason Gunthorpe
f4602cbb0a IB/uverbs: Fix uverbs_attr_get_obj
The err pointer comes from uverbs_attr_get, not from the uobject member,
which does not store an ERR_PTR.

Fixes: be934cca9e ("IB/uverbs: Add device memory registration ioctl support")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
2018-05-23 15:25:53 -06:00
Kalderon, Michal
30bf066cd9 RDMA/qedr: Fix doorbell bar mapping for dpi > 1
Each user_context receives a separate dpi value and thus a different
address on the doorbell bar. The qedr_mmap function needs to validate
the address and map the doorbell bar accordingly.
The current implementation always checked against dpi=0 doorbell range
leading to a wrong mapping for doorbell bar. (It entered an else case
that mapped the address differently). qedr_mmap should only be used
for doorbells, so the else was actually wrong in the first place.
This only has an affect on arm architecture and not an issue on a
x86 based architecture.
This lead to doorbells not occurring on arm based systems and left
applications that use more than one dpi (or several applications
run simultaneously ) to hang.

Fixes: ac1b36e55a ("qedr: Add support for user context verbs")
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-23 15:12:14 -06:00
Linus Torvalds
bee797529d Single fix correcting the handling for long-running commands; cros_ec_spi
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEdrbJNaO+IJqU8IdIUa+KL4f8d2EFAlsFBrQACgkQUa+KL4f8
 d2HoJA//eG5ACgx8Lmop2y+b37PQ1M4sgQj0sf0POq1aFKxLIotKeK2V9SQE73HQ
 0kcxNx7rXZT5dHBEdycf3W5LbmoK1X/Iz+GxMFR2RQ5plkAyT4h4uUQhRf9/5HVp
 xrXyaJw2twOZInQ0dr1bE8IbDlL8xiH3kkhtzMmeS+xaz2aGsf5gfNq+fzhlyz2K
 L8MTCBujYSR4KY3U8O7ifN3RobHtsYqAJhRej6JP7jlvxHcd3ZaOPQUHLTlVxZ8q
 rUpv8Q9Ay0i/1+Yq/mr9Dj8YdszrtR9HQi0hH/nEO74POkOnrhADZsw8EoSCfRJN
 pP+LNM/OrzCZjnzRwPjdv652DztBWxf9qJABz/F+EFVB5mLJkHNzQgcwMRXCd7VK
 6lftkMzBWm8+uy9nLR9vBMSxifzGuEFARuIomFLM5XuaZoNcpK1xmntuzzPBzINU
 Q8yBU3z9aCyfIBxVQ+dog1gg514aLLdoCK2iWmvAEpwW1T3wwI5972kjdg9pp3t6
 2F0M+v/ClJEkxv9oeblJtiAZMYRXDahJ4uZ+IzbstQ1ouENrXBkr+hcIiMZFJdJB
 8eTPworlU83BlxLvmgKsGXE6VLOpJjFVHcQHhBR1UX0eFnU7/Z3cXzYvSml1EJC6
 kJtK83D28MzcZj1rGZiSwtMAKhyqhaKz1uJHgvpBs2QxE+VP60s=
 =2BAn
 -----END PGP SIGNATURE-----

Merge tag 'mfd-fixes-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull MFD fix from Lee Jones:
 "A single cros_ec_spi fix correcting the handling for long-running
  commands"

* tag 'mfd-fixes-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
  mfd: cros_ec: Retry commands when EC is known to be busy
2018-05-23 08:20:49 -07:00
Linus Torvalds
9ce8654323 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha
Pull alpha fixes from Matt Turner:
 "A few small changes for alpha"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
  alpha: io: reorder barriers to guarantee writeX() and iowriteX() ordering #2
  alpha: simplify get_arch_dma_ops
  alpha: use dma_direct_ops for jensen
2018-05-23 08:18:33 -07:00
Adrian Hunter
22916fdb9c perf kcore_copy: Amend the offset of sections that remap kernel text
x86 PTI entry trampolines all map to the same physical page. If that is
reflected in the program headers of /proc/kcore, then do the same for the
copy of kcore.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-18-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:44 -03:00
Adrian Hunter
a1a3a0624e perf kcore_copy: Copy x86 PTI entry trampoline sections
Identify and copy any sections for x86 PTI entry trampolines.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-17-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:43 -03:00
Adrian Hunter
b4503cdb67 perf kcore_copy: Get rid of kernel_map
In preparation to add more program headers, get rid of kernel_map and
modules_map by moving ->kernel_map and ->modules_map to newly allocated
entries in the ->phdrs list.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-16-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:43 -03:00
Adrian Hunter
d2c959803c perf kcore_copy: Iterate phdrs
In preparation to add more program headers, iterate phdrs instead of
assuming there is only one for the kernel text and one for the modules.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-15-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:42 -03:00
Adrian Hunter
15acef6c37 perf kcore_copy: Layout sections
In preparation to add more program headers, layout the relative offset
of each section.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-14-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:42 -03:00
Adrian Hunter
c9dd1d8949 perf kcore_copy: Calculate offset from phnum
In preparation to add more program headers, calculate offset from the
number of phdrs.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-13-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:41 -03:00
Adrian Hunter
6e97957d3d perf kcore_copy: Keep a count of phdrs
In preparation to add more program headers, keep a count of phdrs.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-12-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:41 -03:00
Adrian Hunter
f683820948 perf kcore_copy: Keep phdr data in a list
Currently, kcore_copy makes 2 program headers, one for the kernel text
(namely kernel_map) and one for the modules (namely modules_map). Now
more program headers are needed, but treating each program header as a
special case results in much more code.

Instead, in preparation to add more program headers, change to keep
program header data (phdr_data) in a list.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-11-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:40 -03:00
Jin Yao
787e4da9f9 perf annotate: Show group event string for stdio
When we enable the group, for tui/stdio2, the output first line includes
the group event string. While for stdio, it will show only one event.

For example,

perf record -e cycles,branches ./div
perf annotate --group --stdio

    Percent |      Source code & Disassembly of div for cycles (44407 samples)
    ......

The first line doesn't include the event 'branches'.

With this patch, it will show the correct group even string.

perf annotate --group --stdio

    Percent |      Source code & Disassembly of div for cycles, branches (44407 samples)
    ......

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1526989115-14435-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:40 -03:00
Adrian Hunter
a8ce99b0ee perf machine: Synthesize and process mmap events for x86 PTI entry trampolines
Like the kernel text, the location of x86 PTI entry trampolines must be
recorded in the perf.data file. Like the kernel, synthesize a mmap event
for that, and add processing for it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-10-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:39 -03:00
Adrian Hunter
1c5aae7710 perf machine: Create maps for x86 PTI entry trampolines
Create maps for x86 PTI entry trampolines, based on symbols found in
kallsyms. It is also necessary to keep track of whether the trampolines
have been mapped particularly when the kernel dso is kcore.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-9-git-send-email-adrian.hunter@intel.com
[ Fix extra_kernel_map_info.cnt designed struct initializer on gcc 4.4.7 (centos:6, etc) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:24:08 -03:00
Brian Norris
11799564fc mfd: cros_ec: Retry commands when EC is known to be busy
Commit 001dde9400 ("mfd: cros ec: spi: Fix "in progress" error
signaling") pointed out some bad code, but its analysis and conclusion
was not 100% correct.

It *is* correct that we should not propagate result==EC_RES_IN_PROGRESS
for transport errors, because this has a special meaning -- that we
should follow up with EC_CMD_GET_COMMS_STATUS until the EC is no longer
busy. This is definitely the wrong thing for many commands, because
among other problems, EC_CMD_GET_COMMS_STATUS doesn't actually retrieve
any RX data from the EC, so commands that expected some data back will
instead start processing junk.

For such commands, the right answer is to either propagate the error
(and return that error to the caller) or resend the original command
(*not* EC_CMD_GET_COMMS_STATUS).

Unfortunately, commit 001dde9400 forgets a crucial point: that for
some long-running operations, the EC physically cannot respond to
commands any more. For example, with EC_CMD_FLASH_ERASE, the EC may be
re-flashing its own code regions, so it can't respond to SPI interrupts.
Instead, the EC prepares us ahead of time for being busy for a "long"
time, and fills its hardware buffer with EC_SPI_PAST_END. Thus, we
expect to see several "transport" errors (or, messages filled with
EC_SPI_PAST_END). So we should really translate that to a retryable
error (-EAGAIN) and continue sending EC_CMD_GET_COMMS_STATUS until we
get a ready status.

IOW, it is actually important to treat some of these "junk" values as
retryable errors.

Together with commit 001dde9400, this resolves bugs like the
following:

1. EC_CMD_FLASH_ERASE now works again (with commit 001dde9400, we
   would abort the first time we saw EC_SPI_PAST_END)
2. Before commit 001dde9400, transport errors (e.g.,
   EC_SPI_RX_BAD_DATA) seen in other commands (e.g.,
   EC_CMD_RTC_GET_VALUE) used to yield junk data in the RX buffer; they
   will now yield -EAGAIN return values, and tools like 'hwclock' will
   simply fail instead of retrieving and re-programming undefined time
   values

Fixes: 001dde9400 ("mfd: cros ec: spi: Fix "in progress" error signaling")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2018-05-23 06:59:00 +01:00
Sinan Kaya
92d7223a74 alpha: io: reorder barriers to guarantee writeX() and iowriteX() ordering #2
memory-barriers.txt has been updated with the following requirement.

"When using writel(), a prior wmb() is not needed to guarantee that the
cache coherent memory writes have completed before writing to the MMIO
region."

Current writeX() and iowriteX() implementations on alpha are not
satisfying this requirement as the barrier is after the register write.

Move mb() in writeX() and iowriteX() functions to guarantee that HW
observes memory changes before performing register operations.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2018-05-22 18:10:36 -07:00
Christoph Hellwig
f5e82fa260 alpha: simplify get_arch_dma_ops
Remove the dma_ops indirection.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2018-05-22 18:10:36 -07:00
Christoph Hellwig
6db615431a alpha: use dma_direct_ops for jensen
The generic dma_direct implementation does the same thing as the alpha
pci-noop implementation, just with more bells and whistles.  And unlike
the current code it at least has a theoretical chance to actually compile.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2018-05-22 18:10:36 -07:00
Adrian Hunter
5759a6820a perf machine: Allow for extra kernel maps
Identify extra kernel maps by name so that they can be distinguished
from the kernel map and module maps.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-8-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22 10:59:22 -03:00
Adrian Hunter
4d004365e2 perf machine: Fix map_groups__split_kallsyms() for entry trampoline symbols
When kernel symbols are derived from /proc/kallsyms only (not using
vmlinux or /proc/kcore) map_groups__split_kallsyms() is used. However
that function makes assumptions that are not true with entry trampoline
symbols. For now, remove the entry trampoline symbols at that point, as
they are no longer needed at that point.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22 10:55:59 -03:00
Adrian Hunter
4d99e41365 perf machine: Workaround missing maps for x86 PTI entry trampolines
On x86_64 the PTI entry trampolines are not in the kernel map created by
perf tools. That results in the addresses having no symbols and prevents
annotation.  It also causes Intel PT to have decoding errors at the
trampoline addresses.

Workaround that by creating maps for the trampolines.

At present the kernel does not export information revealing where the
trampolines are.  Until that happens, the addresses are hardcoded.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22 10:54:22 -03:00
Adrian Hunter
9cecca325e perf machine: Add nr_cpus_avail()
Add a function to return the number of the machine's available CPUs.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22 10:52:49 -03:00
Rafael J. Wysocki
c62ec4610c PM / core: Fix direct_complete handling for devices with no callbacks
Commit 08810a4119 (PM / core: Add NEVER_SKIP and SMART_PREPARE
driver flags) inadvertently prevented the power.direct_complete flag
from being set for devices without PM callbacks and with disabled
runtime PM which also prevents power.direct_complete from being set
for their parents.  That led to problems including a resume crash on
HP ZBook 14u.

Restore the previous behavior by causing power.direct_complete to be
set for those devices again, but do that in a more direct way to
avoid overlooking that case in the future.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=199693
Fixes: 08810a4119 (PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags)
Reported-by: Thomas Martitz <kugel@rockbox.org>
Tested-by: Thomas Martitz <kugel@rockbox.org>
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Johan Hovold <johan@kernel.org>
2018-05-22 14:50:11 +02:00
Nicholas Piggin
a048a07d7f powerpc/64s: Add support for a store forwarding barrier at kernel entry/exit
On some CPUs we can prevent a vulnerability related to store-to-load
forwarding by preventing store forwarding between privilege domains,
by inserting a barrier in kernel entry and exit paths.

This is known to be the case on at least Power7, Power8 and Power9
powerpc CPUs.

Barriers must be inserted generally before the first load after moving
to a higher privilege, and after the last store before moving to a
lower privilege, HV and PR privilege transitions must be protected.

Barriers are added as patch sections, with all kernel/hypervisor entry
points patched, and the exit points to lower privilge levels patched
similarly to the RFI flush patching.

Firmware advertisement is not implemented yet, so CPU flush types
are hard coded.

Thanks to Michal Suchánek for bug fixes and review.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michal Suchánek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-21 20:45:31 -07:00
Linus Torvalds
c85061e6e0 SCSI fixes on 20180521
Two driver fixes (zfcp and target core), one information leak in sg
 and one build clean up.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCWwM/JyYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishRhZAP4jgQq/
 80r0k5TwaEXxr3zHy+K5ebEQf390FwMxVPGzkQD/RQVOUwzXjDjnd4eIIMMywsj2
 g8TXUIyJeUBUM06XcBc=
 =R7lu
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Two driver fixes (zfcp and target core), one information leak in sg
  and one build clean up"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: sg: allocate with __GFP_ZERO in sg_build_indirect()
  scsi: core: clean up generated file scsi_devinfo_tbl.c
  scsi: target: tcmu: fix error resetting qfull_time_out to default
  scsi: zfcp: fix infinite iteration on ERP ready list
2018-05-21 17:39:32 -07:00
Linus Torvalds
5997aab0a1 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "Assorted fixes all over the place"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  aio: fix io_destroy(2) vs. lookup_ioctx() race
  ext2: fix a block leak
  nfsd: vfs_mkdir() might succeed leaving dentry negative unhashed
  cachefiles: vfs_mkdir() might succeed leaving dentry negative unhashed
  unfuck sysfs_mount()
  kernfs: deal with kernfs_fill_super() failures
  cramfs: Fix IS_ENABLED typo
  befs_lookup(): use d_splice_alias()
  affs_lookup: switch to d_splice_alias()
  affs_lookup(): close a race with affs_remove_link()
  fix breakage caused by d_find_alias() semantics change
  fs: don't scan the inode cache before SB_BORN is set
  do d_instantiate/unlock_new_inode combinations safely
  iov_iter: fix memory leak in pipe_get_pages_alloc()
  iov_iter: fix return type of __pipe_get_pages()
2018-05-21 11:54:57 -07:00
Jeff Layton
eedffa28c9 loop: clear wb_err in bd_inode when detaching backing file
When a loop block device encounters a writeback error, that error will
get propagated to the bd_inode's wb_err field. If we then detach the
backing file from it, attach another and fsync it, we'll get back the
writeback error that we had from the previous backing file.

This is a bit of a grey area as POSIX doesn't cover loop devices, but it
is somewhat counterintuitive.

If we detach a backing file from the loopdev while there are still
unreported errors, take it as a sign that we're no longer interested in
the previous file, and clear out the wb_err in the loop blockdev.

Reported-and-Tested-by: Theodore Y. Ts'o <tytso@mit.edu>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-21 12:36:03 -06:00
Al Viro
baf10564fb aio: fix io_destroy(2) vs. lookup_ioctx() race
kill_ioctx() used to have an explicit RCU delay between removing the
reference from ->ioctx_table and percpu_ref_kill() dropping the refcount.
At some point that delay had been removed, on the theory that
percpu_ref_kill() itself contained an RCU delay.  Unfortunately, that was
the wrong kind of RCU delay and it didn't care about rcu_read_lock() used
by lookup_ioctx().  As the result, we could get ctx freed right under
lookup_ioctx().  Tejun has fixed that in a6d7cff472 ("fs/aio: Add explicit
RCU grace period when freeing kioctx"); however, that fix is not enough.

Suppose io_destroy() from one thread races with e.g. io_setup() from another;
CPU1 removes the reference from current->mm->ioctx_table[...] just as CPU2
has picked it (under rcu_read_lock()).  Then CPU1 proceeds to drop the
refcount, getting it to 0 and triggering a call of free_ioctx_users(),
which proceeds to drop the secondary refcount and once that reaches zero
calls free_ioctx_reqs().  That does
        INIT_RCU_WORK(&ctx->free_rwork, free_ioctx);
        queue_rcu_work(system_wq, &ctx->free_rwork);
and schedules freeing the whole thing after RCU delay.

In the meanwhile CPU2 has gotten around to percpu_ref_get(), bumping the
refcount from 0 to 1 and returned the reference to io_setup().

Tejun's fix (that queue_rcu_work() in there) guarantees that ctx won't get
freed until after percpu_ref_get().  Sure, we'd increment the counter before
ctx can be freed.  Now we are out of rcu_read_lock() and there's nothing to
stop freeing of the whole thing.  Unfortunately, CPU2 assumes that since it
has grabbed the reference, ctx is *NOT* going away until it gets around to
dropping that reference.

The fix is obvious - use percpu_ref_tryget_live() and treat failure as miss.
It's not costlier than what we currently do in normal case, it's safe to
call since freeing *is* delayed and it closes the race window - either
lookup_ioctx() comes before percpu_ref_kill() (in which case ctx->users
won't reach 0 until the caller of lookup_ioctx() drops it) or lookup_ioctx()
fails, ctx->users is unaffected and caller of lookup_ioctx() doesn't see
the object in question at all.

Cc: stable@kernel.org
Fixes: a6d7cff472 "fs/aio: Add explicit RCU grace period when freeing kioctx"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21 14:30:11 -04:00
Al Viro
5aa1437d2d ext2: fix a block leak
open file, unlink it, then use ioctl(2) to make it immutable or
append only.  Now close it and watch the blocks *not* freed...

Immutable/append-only checks belong in ->setattr().
Note: the bug is old and backport to anything prior to 737f2e93b9
("ext2: convert to use the new truncate convention") will need
these checks lifted into ext2_setattr().

Cc: stable@kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21 14:30:11 -04:00
Al Viro
3819bb0d79 nfsd: vfs_mkdir() might succeed leaving dentry negative unhashed
That can (and does, on some filesystems) happen - ->mkdir() (and thus
vfs_mkdir()) can legitimately leave its argument negative and just
unhash it, counting upon the lookup to pick the object we'd created
next time we try to look at that name.

Some vfs_mkdir() callers forget about that possibility...

Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21 14:30:10 -04:00
Al Viro
9c3e9025a3 cachefiles: vfs_mkdir() might succeed leaving dentry negative unhashed
That can (and does, on some filesystems) happen - ->mkdir() (and thus
vfs_mkdir()) can legitimately leave its argument negative and just
unhash it, counting upon the lookup to pick the object we'd created
next time we try to look at that name.

Some vfs_mkdir() callers forget about that possibility...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21 14:30:10 -04:00
Al Viro
7b745a4e40 unfuck sysfs_mount()
new_sb is left uninitialized in case of early failures in kernfs_mount_ns(),
and while IS_ERR(root) is true in all such cases, using IS_ERR(root) || !new_sb
is not a solution - IS_ERR(root) is true in some cases when new_sb is true.

Make sure new_sb is initialized (and matches the reality) in all cases and
fix the condition for dropping kobj reference - we want it done precisely
in those situations where the reference has not been transferred into a new
super_block instance.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21 14:30:09 -04:00
Al Viro
82382acec0 kernfs: deal with kernfs_fill_super() failures
make sure that info->node is initialized early, so that kernfs_kill_sb()
can list_del() it safely.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21 14:30:08 -04:00
Joe Perches
08a8f30868 cramfs: Fix IS_ENABLED typo
There's an extra C here...

Fixes: 99c18ce580 ("cramfs: direct memory access support")
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-21 14:30:08 -04:00