Commit Graph

30384 Commits

Author SHA1 Message Date
Peter Zijlstra
4abff6d48d objtool: Fix code relocs vs weak symbols
Occasionally objtool driven code patching (think .static_call_sites
.retpoline_sites etc..) goes sideways and it tries to patch an
instruction that doesn't match.

Much head-scatching and cursing later the problem is as outlined below
and affects every section that objtool generates for us, very much
including the ORC data. The below uses .static_call_sites because it's
convenient for demonstration purposes, but as mentioned the ORC
sections, .retpoline_sites and __mount_loc are all similarly affected.

Consider:

foo-weak.c:

  extern void __SCT__foo(void);

  __attribute__((weak)) void foo(void)
  {
	  return __SCT__foo();
  }

foo.c:

  extern void __SCT__foo(void);
  extern void my_foo(void);

  void foo(void)
  {
	  my_foo();
	  return __SCT__foo();
  }

These generate the obvious code
(gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -c foo*.c):

foo-weak.o:
0000000000000000 <foo>:
   0:   e9 00 00 00 00          jmpq   5 <foo+0x5>      1: R_X86_64_PLT32       __SCT__foo-0x4

foo.o:
0000000000000000 <foo>:
   0:   48 83 ec 08             sub    $0x8,%rsp
   4:   e8 00 00 00 00          callq  9 <foo+0x9>      5: R_X86_64_PLT32       my_foo-0x4
   9:   48 83 c4 08             add    $0x8,%rsp
   d:   e9 00 00 00 00          jmpq   12 <foo+0x12>    e: R_X86_64_PLT32       __SCT__foo-0x4

Now, when we link these two files together, you get something like
(ld -r -o foos.o foo-weak.o foo.o):

foos.o:
0000000000000000 <foo-0x10>:
   0:   e9 00 00 00 00          jmpq   5 <foo-0xb>      1: R_X86_64_PLT32       __SCT__foo-0x4
   5:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)
   f:   90                      nop

0000000000000010 <foo>:
  10:   48 83 ec 08             sub    $0x8,%rsp
  14:   e8 00 00 00 00          callq  19 <foo+0x9>     15: R_X86_64_PLT32      my_foo-0x4
  19:   48 83 c4 08             add    $0x8,%rsp
  1d:   e9 00 00 00 00          jmpq   22 <foo+0x12>    1e: R_X86_64_PLT32      __SCT__foo-0x4

Noting that ld preserves the weak function text, but strips the symbol
off of it (hence objdump doing that funny negative offset thing). This
does lead to 'interesting' unused code issues with objtool when ran on
linked objects, but that seems to be working (fingers crossed).

So far so good.. Now lets consider the objtool static_call output
section (readelf output, old binutils):

foo-weak.o:

Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foo.o:

Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + d
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foos.o:

Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 1d
000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

So we have two patch sites, one in the dead code of the weak foo and one
in the real foo. All is well.

*HOWEVER*, when the toolchain strips unused section symbols it
generates things like this (using new enough binutils):

foo-weak.o:

Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foo.o:

Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + d
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foos.o:

Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 foo + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 foo + d
000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

And now we can see how that foos.o .static_call_sites goes side-ways, we
now have _two_ patch sites in foo. One for the weak symbol at foo+0
(which is no longer a static_call site!) and one at foo+d which is in
fact the right location.

This seems to happen when objtool cannot find a section symbol, in which
case it falls back to any other symbol to key off of, however in this
case that goes terribly wrong!

As such, teach objtool to create a section symbol when there isn't
one.

Fixes: 44f6a7c075 ("objtool: Fix seg fault with Clang non-section symbols")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20220419203807.655552918@infradead.org
2022-04-22 12:13:55 +02:00
Peter Zijlstra
c087c6e7b5 objtool: Fix type of reloc::addend
Elf{32,64}_Rela::r_addend is of type: Elf{32,64}_Sword, that means
that our reloc::addend needs to be long or face tuncation issues when
we do elf_rebuild_reloc_section():

  - 107:  48 b8 00 00 00 00 00 00 00 00   movabs $0x0,%rax        109: R_X86_64_64        level4_kernel_pgt+0x80000067
  + 107:  48 b8 00 00 00 00 00 00 00 00   movabs $0x0,%rax        109: R_X86_64_64        level4_kernel_pgt-0x7fffff99

Fixes: 627fce1480 ("objtool: Add ORC unwind table generation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20220419203807.596871927@infradead.org
2022-04-22 12:13:55 +02:00
Paolo Abeni
f70925bf99 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ethernet/microchip/lan966x/lan966x_main.c
  d08ed85256 ("net: lan966x: Make sure to release ptp interrupt")
  c834963932 ("net: lan966x: Add FDMA functionality")

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-22 09:56:00 +02:00
Takashi Iwai
bc67cac103 selftests: firmware: Add ZSTD compressed file tests
It's similar like XZ compressed files.  For the simplicity, both XZ
and ZSTD tests are done in a single function.  The format is specified
via $COMPRESS_FORMAT and the compression function is pre-defined.

Link: https://lore.kernel.org/r/20210127154939.13288-5-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20220421152908.4718-6-tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 08:51:17 +02:00
Takashi Iwai
f18b45ff9a selftests: firmware: Simplify test patterns
The test patterns are almost same in three sequential tests.
Make the unified helper function for improving the readability.

Link: https://lore.kernel.org/all/20210127154939.13288-1-tiwai@suse.de/
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20220421152908.4718-5-tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 08:51:17 +02:00
Takashi Iwai
04c826d072 selftests: firmware: Fix the request_firmware_into_buf() test for XZ format
The test uses a different firmware name, and we forgot to adapt for
the XZ compressed file tests.

https://lore.kernel.org/all/20210127154939.13288-1-tiwai@suse.de/

Fixes: 1798045900 ("selftests: firmware: Add request_firmware_into_buf tests")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20220421152908.4718-4-tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 08:51:17 +02:00
Takashi Iwai
b3625b1324 selftests: firmware: Use smaller dictionary for XZ compression
The xz -9 option leads to an unnecessarily too large dictionary that
isn't really suitable for the kernel firmware loader.  Pass the
dictionary size explicitly, instead.

While we're at it, make the xz command call defined in $RUN_XZ for
simplicity.

Fixes: 108ae07c50 ("selftests: firmware: Add compressed firmware tests")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20220421152908.4718-3-tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 08:51:16 +02:00
Sidhartha Kumar
80df2fb95d selftest/vm: add skip support to mremap_test
Allow the mremap test to be skipped due to errors such as failing to
parse the mmap_min_addr sysctl.

Link: https://lkml.kernel.org/r/20220420215721.4868-4-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21 20:01:10 -07:00
Sidhartha Kumar
e5508fc52c selftest/vm: support xfail in mremap_test
Use ksft_test_result_xfail for the tests which are expected to fail.

Link: https://lkml.kernel.org/r/20220420215721.4868-3-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21 20:01:10 -07:00
Sidhartha Kumar
18d609daa5 selftest/vm: verify remap destination address in mremap_test
Because mremap does not have a MAP_FIXED_NOREPLACE flag, it can destroy
existing mappings.  This causes a segfault when regions such as text are
remapped and the permissions are changed.

Verify the requested mremap destination address does not overlap any
existing mappings by using mmap's MAP_FIXED_NOREPLACE flag.  Keep
incrementing the destination address until a valid mapping is found or
fail the current test once the max address is reached.

Link: https://lkml.kernel.org/r/20220420215721.4868-2-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21 20:01:10 -07:00
Sidhartha Kumar
9c85a9bae2 selftest/vm: verify mmap addr in mremap_test
Avoid calling mmap with requested addresses that are less than the
system's mmap_min_addr.  When run as root, mmap returns EACCES when
trying to map addresses < mmap_min_addr.  This is not one of the error
codes for the condition to retry the mmap in the test.

Rather than arbitrarily retrying on EACCES, don't attempt an mmap until
addr > vm.mmap_min_addr.

Add a munmap call after an alignment check as the mappings are retained
after the retry and can reach the vm.max_map_count sysctl.

Link: https://lkml.kernel.org/r/20220420215721.4868-1-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21 20:01:09 -07:00
Paolo Bonzini
e852be8b14 kvm: selftests: introduce and use more page size-related constants
Clean up code that was hardcoding masks for various fields,
now that the masks are included in processor.h.

For more cleanup, define PAGE_SIZE and PAGE_MASK just like in Linux.
PAGE_SIZE in particular was defined by several tests.

Suggested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-21 15:41:01 -04:00
Paolo Bonzini
f18b4aebe1 kvm: selftests: do not use bitfields larger than 32-bits for PTEs
Red Hat's QE team reported test failure on access_tracking_perf_test:

Testing guest mode: PA-bits:ANY, VA-bits:48,  4K pages
guest physical test memory offset: 0x3fffbffff000

Populating memory             : 0.684014577s
Writing to populated memory   : 0.006230175s
Reading from populated memory : 0.004557805s
==== Test Assertion Failure ====
  lib/kvm_util.c:1411: false
  pid=125806 tid=125809 errno=4 - Interrupted system call
     1  0x0000000000402f7c: addr_gpa2hva at kvm_util.c:1411
     2   (inlined by) addr_gpa2hva at kvm_util.c:1405
     3  0x0000000000401f52: lookup_pfn at access_tracking_perf_test.c:98
     4   (inlined by) mark_vcpu_memory_idle at access_tracking_perf_test.c:152
     5   (inlined by) vcpu_thread_main at access_tracking_perf_test.c:232
     6  0x00007fefe9ff81ce: ?? ??:0
     7  0x00007fefe9c64d82: ?? ??:0
  No vm physical memory at 0xffbffff000

I can easily reproduce it with a Intel(R) Xeon(R) CPU E5-2630 with 46 bits
PA.

It turns out that the address translation for clearing idle page tracking
returned a wrong result; addr_gva2gpa()'s last step, which is based on
"pte[index[0]].pfn", did the calculation with 40 bits length and the
high 12 bits got truncated.  In above case the GPA address to be returned
should be 0x3fffbffff000 for GVA 0xc0000000, but it got truncated into
0xffbffff000 and the subsequent gpa2hva lookup failed.

The width of operations on bit fields greater than 32-bit is
implementation defined, and differs between GCC (which uses the bitfield
precision) and clang (which uses 64-bit arithmetic), so this is a
potential minefield.  Remove the bit fields and using manual masking
instead.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2075036
Reported-by: Nana Liu <nanliu@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-21 15:41:01 -04:00
Linus Torvalds
59f0c2447e Merge tag 'net-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from xfrm and can.

  Current release - regressions:

   - rxrpc: restore removed timer deletion

  Current release - new code bugs:

   - gre: fix device lookup for l3mdev use-case

   - xfrm: fix egress device lookup for l3mdev use-case

  Previous releases - regressions:

   - sched: cls_u32: fix netns refcount changes in u32_change()

   - smc: fix sock leak when release after smc_shutdown()

   - xfrm: limit skb_page_frag_refill use to a single page

   - eth: atlantic: invert deep par in pm functions, preventing null
     derefs

   - eth: stmmac: use readl_poll_timeout_atomic() in atomic state

  Previous releases - always broken:

   - gre: fix skb_under_panic on xmit

   - openvswitch: fix OOB access in reserve_sfa_size()

   - dsa: hellcreek: calculate checksums in tagger

   - eth: ice: fix crash in switchdev mode

   - eth: igc:
      - fix infinite loop in release_swfw_sync
      - fix scheduling while atomic"

* tag 'net-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits)
  drivers: net: hippi: Fix deadlock in rr_close()
  selftests: mlxsw: vxlan_flooding_ipv6: Prevent flooding of unwanted packets
  selftests: mlxsw: vxlan_flooding: Prevent flooding of unwanted packets
  nfc: MAINTAINERS: add Bug entry
  net: stmmac: Use readl_poll_timeout_atomic() in atomic state
  doc/ip-sysctl: add bc_forwarding
  netlink: reset network and mac headers in netlink_dump()
  net: mscc: ocelot: fix broken IP multicast flooding
  net: dsa: hellcreek: Calculate checksums in tagger
  net: atlantic: invert deep par in pm functions, preventing null derefs
  can: isotp: stop timeout monitoring when no first frame was sent
  bonding: do not discard lowest hash bit for non layer3+4 hashing
  net: lan966x: Make sure to release ptp interrupt
  ipv6: make ip6_rt_gc_expire an atomic_t
  net: Handle l3mdev in ip_tunnel_init_flow
  l3mdev: l3mdev_master_upper_ifindex_by_index_rcu should be using netdev_master_upper_dev_get_rcu
  net/sched: cls_u32: fix possible leak in u32_init_knode()
  net/sched: cls_u32: fix netns refcount changes in u32_change()
  powerpc: Update MAINTAINERS for ibmvnic and VAS
  net: restore alpha order to Ethernet devices in config
  ...
2022-04-21 12:29:08 -07:00
Thomas Huth
266a19a0bc KVM: selftests: Silence compiler warning in the kvm_page_table_test
When compiling kvm_page_table_test.c, I get this compiler warning
with gcc 11.2:

kvm_page_table_test.c: In function 'pre_init_before_test':
../../../../tools/include/linux/kernel.h:44:24: warning: comparison of
 distinct pointer types lacks a cast
   44 |         (void) (&_max1 == &_max2);              \
      |                        ^~
kvm_page_table_test.c:281:21: note: in expansion of macro 'max'
  281 |         alignment = max(0x100000, alignment);
      |                     ^~~

Fix it by adjusting the type of the absolute value.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20220414103031.565037-1-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-21 13:16:14 -04:00
Gaosheng Cui
b71a2ebf74 libbpf: Remove redundant non-null checks on obj_elf
Obj_elf is already non-null checked at the function entry, so remove
redundant non-null checks on obj_elf.

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220421031803.2283974-1-cuigaosheng1@huawei.com
2022-04-21 09:56:26 -07:00
Artem Savkov
c14766a8a8 selftests/bpf: Fix map tests errno checks
Switching to libbpf 1.0 API broke test_lpm_map and test_lru_map as error
reporting changed. Instead of setting errno and returning -1 bpf calls
now return -Exxx directly.
Drop errno checks and look at return code directly.

Fixes: b858ba8c52 ("selftests/bpf: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/bpf/20220421094320.1563570-1-asavkov@redhat.com
2022-04-21 09:51:57 -07:00
Artem Savkov
6a12b8e20d selftests/bpf: Fix prog_tests uprobe_autoattach compilation error
I am getting the following compilation error for prog_tests/uprobe_autoattach.c:

  tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c: In function ‘test_uprobe_autoattach’:
  ./test_progs.h:209:26: error: pointer ‘mem’ may be used after ‘free’ [-Werror=use-after-free]

The value of mem is now used in one of the asserts, which is why it may be
confusing compilers. However, it is not dereferenced. Silence this by moving
free(mem) after the assert block.

Fixes: 1717e24801 ("selftests/bpf: Uprobe tests should verify param/return values")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220421132317.1583867-1-asavkov@redhat.com
2022-04-21 18:48:04 +02:00
Artem Savkov
920fd5e177 selftests/bpf: Fix attach tests retcode checks
Switching to libbpf 1.0 API broke test_sock and test_sysctl as they
check for return of bpf_prog_attach to be exactly -1. Switch the check
to '< 0' instead.

Fixes: b858ba8c52 ("selftests/bpf: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/bpf/20220421130104.1582053-1-asavkov@redhat.com
2022-04-21 16:34:55 +02:00
Grant Seltzer
a66ab9a9e6 libbpf: Add documentation to API functions
This adds documentation for the following API functions:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()
- bpf_program__set_attach_target()
- bpf_program__attach()
- bpf_program__pin()
- bpf_program__unpin()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-3-grantseltzer@gmail.com
2022-04-21 16:31:07 +02:00
Grant Seltzer
df28671632 libbpf: Update API functions usage to check error
This updates usage of the following API functions within
libbpf so their newly added error return is checked:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-2-grantseltzer@gmail.com
2022-04-21 16:28:25 +02:00
Grant Seltzer
93442f132b libbpf: Add error returns to two API functions
This adds an error return to the following API functions:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()

In both cases, the error occurs when the BPF object has
already been loaded when the function is called. In this
case -EBUSY is returned.

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-1-grantseltzer@gmail.com
2022-04-21 16:28:11 +02:00
Ammar Faizi
11dbdaeff4 tools/nolibc/string: Implement strdup() and strndup()
These functions are currently only available on architectures that have
my_syscall6() macro implemented. Since these functions use malloc(),
malloc() uses mmap(), mmap() depends on my_syscall6() macro.

On architectures that don't support my_syscall6(), these function will
always return NULL with errno set to ENOSYS.

Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Ammar Faizi
b26823c19a tools/nolibc/string: Implement strnlen()
size_t strnlen(const char *str, size_t maxlen);

The strnlen() function returns the number of bytes in the string
pointed to by sstr, excluding the terminating null byte ('\0'), but at
most maxlen. In doing this, strnlen() looks only at the first maxlen
characters in the string pointed to by str and never beyond str[maxlen-1].

The first use case of this function is for determining the memory
allocation size in the strndup() function.

Link: https://lore.kernel.org/lkml/CAOG64qMpEMh+EkOfjNdAoueC+uQyT2Uv3689_sOr37-JxdJf4g@mail.gmail.com
Suggested-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Ammar Faizi
0e0ff63840 tools/nolibc/stdlib: Implement malloc(), calloc(), realloc() and free()
Implement basic dynamic allocator functions. These functions are
currently only available on architectures that have nolibc mmap()
syscall implemented. These are not a super-fast memory allocator,
but at least they can satisfy basic needs for having heap without
libc.

Cc: David Laight <David.Laight@ACULAB.COM>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Ammar Faizi
5a18d07ce3 tools/nolibc/types: Implement offsetof() and container_of() macro
Implement `offsetof()` and `container_of()` macro. The first use case
of these macros is for `malloc()`, `realloc()` and `free()`.

Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Ammar Faizi
544fa1a2d3 tools/nolibc/sys: Implement mmap() and munmap()
Implement mmap() and munmap(). Currently, they are only available for
architecures that have my_syscall6 macro. For architectures that don't
have, this function will return -1 with errno set to ENOSYS (Function
not implemented).

This has been tested on x86 and i386.

Notes for i386:
 1) The common mmap() syscall implementation uses __NR_mmap2 instead
    of __NR_mmap.

 2) The offset must be shifted-right by 12-bit.

Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Ammar Faizi
f4738ff74c tools/nolibc: i386: Implement syscall with 6 arguments
On i386, the 6th argument of syscall goes in %ebp. However, both Clang
and GCC cannot use %ebp in the clobber list and in the "r" constraint
without using -fomit-frame-pointer. To make it always available for
any kind of compilation, the below workaround is implemented.

  1) Push the 6-th argument.
  2) Push %ebp.
  3) Load the 6-th argument from 4(%esp) to %ebp.
  4) Do the syscall (int $0x80).
  5) Pop %ebp (restore the old value of %ebp).
  6) Add %esp by 4 (undo the stack pointer).

Cc: x86@kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/lkml/2e335ac54db44f1d8496583d97f9dab0@AcuMS.aculab.com
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Ammar Faizi
1590c59836 tools/nolibc: Remove .global _start from the entry point code
Building with clang yields the following error:
```
  <inline asm>:3:1: error: _start changed binding to STB_GLOBAL
  .global _start
  ^
  1 error generated.
```
Make sure only specify one between `.global _start` and `.weak _start`.
Remove `.global _start`.

Cc: llvm@lists.linux.dev
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Ammar Faizi
37d62758e7 tools/nolibc: Replace asm with __asm__
Replace `asm` with `__asm__` to support compilation with -std flag.
Using `asm` with -std flag makes GCC think `asm()` is a function call
instead of an inline assembly.

GCC doc says:

  For the C language, the `asm` keyword is a GNU extension. When
  writing C code that can be compiled with `-ansi` and the `-std`
  options that select C dialects without GNU extensions, use
  `__asm__` instead of `asm`.

Link: https://gcc.gnu.org/onlinedocs/gcc/Basic-Asm.html
Reported-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Ammar Faizi
5312aaa5d5 tools/nolibc: x86-64: Update System V ABI document link
The old link no longer works, update it.

Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Willy Tarreau
2475d37ac3 tools/nolibc/stdlib: only reference the external environ when inlined
When building with gcc at -O0 we're seeing link errors due to the
"environ" variable being referenced by getenv(). The problem is that
at -O0 gcc will not inline getenv() and will not drop the external
reference. One solution would be to locally declare the variable as
weak, but then it would appear in all programs even those not using
it, and would be confusing to users of getenv() who would forget to
set environ to envp.

An alternate approach used in this patch consists in always inlining
the outer part of getenv() that references this extern so that it's
always dropped when not used. The biggest part of the function was
now moved to a new function called _getenv() that's still not inlined
by default.

Reported-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Tested-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Willy Tarreau
96980b833a tools/nolibc/string: do not use __builtin_strlen() at -O0
clang wants to use strlen() for __builtin_strlen() at -O0. We don't
really care about -O0 but it at least ought to build, so let's make
sure we don't choke on this, by dropping the optimizationn for
constant strings in this case.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Willy Tarreau
0b37dff10b tools/nolibc: add the nolibc subdir to the common Makefile
The Makefile in tools/ is used to forward options to the makefiles
in the various subdirs. Let's add nolibc there so that it becomes
possible to make tools/nolibc_headers_standalone from the main tree
to simply create a completely usable sysroot.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Willy Tarreau
2432616468 tools/nolibc: add a makefile to install headers
This provides a target "headers_standalone" which installs the nolibc's
arch-specific headers with "arch.h" taken from the current arch (or a
concatenation of both i386 and x86_64 for arch=x86), then installs
kernel headers. This creates a convenient sysroot which is directly
usable by a bare-metal compiler to create any executable.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Willy Tarreau
96d2a1313f tools/nolibc/types: add poll() and waitpid() flag definitions
- POLLIN etc were missing, so poll() could only be used with timeouts.
- WNOHANG was not defined and is convenient to check if a child is still
  running

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
54abe3590f tools/nolibc/sys: add syscall definition for getppid()
This is essentially for completeness as it's not the most often used
in regtests.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
0e7b492943 tools/nolibc/string: add strcmp() and strncmp()
We need these functions all the time, including when checking environment
variables and parsing command-line arguments. These implementations were
optimized to show optimal code size on a wide range of compilers (22 bytes
return included for strcmp(), 33 for strncmp()).

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
bd845a193a tools/nolibc/stdio: add support for '%p' to vfprintf()
%p remains quite useful in test code, and the code path can easily be
merged with the existing "%x" thus only adds ~50 bytes, thus let's
add it.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
077d0a3924 tools/nolibc/stdlib: add a simple getenv() implementation
This implementation relies on an extern definition of the environ
variable, that the caller must declare and initialize from envp.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
170b230d22 tools/nolibc/stdio: make printf(%s) accept NULL
It's often convenient to support this, especially in test programs where
a NULL may correspond to an allocation error or a non-existing value.
Let's make printf("%s") support being passed a NULL. In this case it
prints "(null)" like glibc's printf().

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
f0f04f28d5 tools/nolibc/stdlib: implement abort()
libgcc uses it for certain divide functions, so it must be exported. Like
for memset() we do that in its own section so that the linker can strip
it when not needed.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
c4486e9728 tools/nolibc: also mention how to build by just setting the include path
Now that a few basic include files are provided, some simple portable
programs may build, which will save them from having to surround their
includes with #ifndef NOLIBC. This patch mentions how to proceed, and
enumerates the list of files that are covered.

A comprehensive list of required include files is available here:

  https://en.cppreference.com/w/c/header

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
cec1505321 tools/nolibc/time: create time.h with time()
The time() syscall is used by a few simple applications, and is trivial
to implement based on gettimeofday() that we already have. Let's create
the file to ease porting and provide the function. It never returns any
error, though it may segfault in case of invalid pointer, like other
implementations relying on gettimeofday().

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
99cb50ab94 tools/nolibc/signal: move raise() to signal.h
This function is normally found in signal.h, and providing the file
eases porting of existing programs. Let's move it there.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
180a9797b0 tools/nolibc/unistd: add usleep()
This call is trivial to implement based on select() to complete sleep()
and msleep(), let's add it.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
4619de3446 tools/nolibc/unistd: extract msleep(), sleep(), tcsetpgrp() to unistd.h
These functions are normally provided by unistd.h. For ease of porting,
let's create the file and move them there.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
45a794bf7c tools/nolibc/errno: extract errno.h from sys.h
This allows us to provide a minimal errno.h to ease porting applications
that use it.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
8d304a3740 tools/nolibc/string: export memset() and memmove()
"clang -Os" and "gcc -Ofast" without -ffreestanding may ignore memset()
and memmove(), hoping to provide their builtin equivalents, and finally
not find them. Thus we must export these functions for these rare cases.
Note that as they're set in their own sections, they will be eliminated
by the linker if not used. In addition, they do not prevent gcc from
identifying them and replacing them with the shorter "rep movsb" or
"rep stosb" when relevant.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
023033fe34 tools/nolibc/types: define PATH_MAX and MAXPATHLEN
These ones are often used and commonly set by applications to fallback
values. Let's fix them both to agree on PATH_MAX=4096 by default, as is
already present in linux/limits.h.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00