std.crypto has quite a few instances of breaking naming conventions.
This is the beginning of an effort to address that.
Deprecates `std.crypto.utils`.
* Upgrade from u8 to usize element types.
- WebAssembly assumes u64. It should probably try to be target-aware
instead.
* Move the covered PC bits to after the header so it goes on the same
page with the other rapidly changing memory (the header stats).
depends on the semantics of accepted proposal #19755closes#20994
The previous mecanism for linux distributions to delivers debug info into `/usr/lib/debug` no longer seems in use.
the current mecanism often is using `debuginfod` (https://sourceware.org/elfutils/Debuginfod.html)
This commit only tries to load already available debuginfo but does not try to make any download requests.
the user can manually run `debuginfod-find debuginfo PATH` to populate the cache.
- look up the debuglink file in the directory of the executable file (instead of the cwd)
- fix parsing of debuglink section (the 4-byte alignement is within the file, unrelated to the in-memory address)
The signature is documented as:
int link(const char *, const char *);
(see https://man7.org/linux/man-pages/man2/link.2.html or https://man.netbsd.org/link.2)
And its not some Linux extension, the [syscall
implementation](21b136cc63/fs/namei.c (L4794-L4797))
only expects two arguments too.
It probably *should* have a flags parameter, but its too late now.
I am a bit surprised that linking glibc or musl against code that invokes
a 'link' with three parameters doesn't fail (at least, I couldn't get any
local test cases to trigger a compile or link error).
The test case in std/posix/test.zig is currently disabled, but if I
manually enable it, it works with this change.
Due to the `std.crypto.ecdsa.KeyPair.create` taking and optional of seed, even if the seed is generated, cross-compiling to the environments without standard random source (eg. wasm) (`std.crypto.random.bytes`) will fail to compile.
This commit changes the API of the problematic function and moves the random seed generation to a new utility function.
this fix bypasses the slice bounds, reading garbage data for up to the
last 7 bits (which are technically supposed to be ignored). that's going
to need to be fixed, let's fix that along with switching from byte elems
to usize elems.
* libfuzzer: track unique runs instead of deduplicated runs
- easier for consumers to notice when to recheck the covered bits.
* move common definitions to `std.Build.Fuzz.abi`.
build runner sends all the information needed to fuzzer web interface
client needed in order to display inline coverage information along with
source code.
* libfuzzer: close file after mmap
* fuzzer/main.js: connect with EventSource and debug dump the messages.
currently this prints how many fuzzer runs have been attempted to
console.log.
* extract some `std.debug.Info` logic into `std.debug.Coverage`.
Prepares for consolidation across multiple different executables which
share source files, and makes it possible to send all the
PC/SourceLocation mapping data with 4 memcpy'd arrays.
* std.Build.Fuzz:
- spawn a thread to watch the message queue and signal event
subscribers.
- track coverage map data
- respond to /events URL with EventSource messages on a timer
* std.debug.Dwarf: add `sortCompileUnits` along with a field to track
the state for the purpose of assertions and correct API usage.
This makes batch lookups faster.
- in the future, findCompileUnit should be enhanced to rely on sorted
compile units as well.
* implement `std.debug.Dwarf.resolveSourceLocations` as well as
`std.debug.Info.resolveSourceLocations`. It's still pretty slow, since
it calls getLineNumberInfo for each array element, repeating a lot of
work unnecessarily.
* integrate these APIs with `std.Progress` to understand what is taking
so long.
The output I'm seeing from this tool shows a lot of missing source
locations. In particular, the main area of interest is missing for my
tokenizer fuzzing example.
with debug info resolved.
begin efforts of providing `std.debug.Info`, a cross-platform
abstraction for loading debug information into an in-memory format that
supports queries such as "what is the source location of this virtual
memory address?"
Unlike `std.debug.SelfInfo`, this API does not assume the debug
information in question happens to match the host CPU architecture, OS,
or other target properties.
* new .zig-cache subdirectory: 'v'
- stores coverage information with filename of hash of PCs that want
coverage. This hash is a hex encoding of the 64-bit coverage ID.
* build runner
* fixed bug in file system inputs when a compile step has an
overridden zig_lib_dir field set.
* set some std lib options optimized for the build runner
- no side channel mitigations
- no Transport Layer Security
- no crypto fork safety
* add a --port CLI arg for choosing the port the fuzzing web interface
listens on. it defaults to choosing a random open port.
* introduce a web server, and serve a basic single page application
- shares wasm code with autodocs
- assets are created live on request, for convenient development
experience. main.wasm is properly cached if nothing changes.
- sources.tar comes from file system inputs (introduced with the
`--watch` feature)
* receives coverage ID from test runner and sends it on a thread-safe
queue to the WebServer.
* test runner
- takes a zig cache directory argument now, for where to put coverage
information.
- sends coverage ID to parent process
* fuzzer
- puts its logs (in debug mode) in .zig-cache/tmp/libfuzzer.log
- computes coverage_id and makes it available with
`fuzzer_coverage_id` exported function.
- the memory-mapped coverage file is now namespaced by the coverage id
in hex encoding, in `.zig-cache/v`
* tokenizer
- add a fuzz test to check that several properties are upheld
When a unique run is encountered, track it in a bit set memory-mapped
into the fuzz directory so it can be observed by other processes, even
while the fuzzer is running.
This is arbitrary since spirv (as opposed to spirv32/spirv64) refers to the
version with logical memory layout, i.e. no 'real' pointers. This change at
least matches what clang does.
It is impossible to even build projects like glibc when targeting a generic
SPARC v8 CPU; LEON3 is effectively considered the baseline for `sparc-linux-gnu`
now, particularly due to it supporting a CASA instruction similar to the one in
SPARC v9. However, it's slightly incompatible with SPARC v9 due to having a
different ASI tag, so resulting binaries would not be portable to regular SPARC
CPUs. So, as the least bad option, make v9 the baseline for sparc32.
`__xl_a` is just a global variable containing a null function pointer. There's
nothing magical about it or its name at all.
The section names used on `__xl_a` and `__xl_b` (`.CRT$XLA` and `.CRT$XLZ`) are
the real magic here. The compiler emits TLS variables into `.CRT$XL<x>`
sections, where `x` is an uppercase letter between A and Z (exclusive). The
linker then sorts those sections alphabetically (due to the `$`), and the result
is a neat array of TLS initialization callbacks between `__xl_a` and `__xl_z`.
That array is null-terminated, though! Normally, `__xl_z` serves as the null
terminator; however, by pointing `AddressesOfCallBacks` to `__xl_a`, which just
contains a null function pointer, we've effectively made it so that the PE
loader will just immediately stop invoking TLS callbacks. Fix that by pointing
to the first actual TLS callback instead (or `__xl_z` if there are none).
If there is a VDSO, it will have clock_gettime(). The main thing we're concerned
about is architectures that don't have a VDSO at all, of which there are a few.
There are two concepts here: one for whether dwarf supports unwinding on
that target, and another for whether the Zig standard library
implements it yet.
...which have a ucontext_t but not a PC register. The current stack
unwinding implementation does not yet support this architecture.
Also fix name of `std.debug.SelfInfo.openSelf` to remove redundancy.
Also removed this hook into root providing an "openSelfDebugInfo"
function. Sorry, this debugging code is not of sufficient quality to
offer a plugin API right now.
This target triple was weird on multiple levels:
* The `ilp32` ABI is the soft float ABI. This is not the main ABI we want to
support on RISC-V; rather, we want `ilp32d`.
* `gnuilp32` is a bespoke tag that was introduced in Zig. The rest of the world
just uses `gnu` for RISC-V target triples.
* `gnu_ilp32` is already the name of an ILP32 ABI used on AArch64. `gnuilp32` is
too easy to confuse with this.
* We don't use this convention for `riscv64-linux-gnu`.
* Supporting all RISC-V ABIs with this convention will result in combinatorial
explosion; see #20690.
After this commit:
`std.debug.SelfInfo` is a cross-platform abstraction for the current
executable's own debug information, with a goal of minimal code bloat
and compilation speed penalty.
`std.debug.Dwarf` does not assume the current executable is itself the
thing being debugged, however, it does assume the debug info has the
same CPU architecture and OS as the current executable. It is planned to
remove this limitation.
This code has the hard-coded goal of supporting the executable's own
debug information and makes design choices along that goal, such as
memory-mapping the inputs, using dl_iterate_phdr, and doing conditional
compilation on the host target.
A more general-purpose implementation of debug information may be able
to share code with this, but there are some fundamental
incompatibilities. For example, the "SelfInfo" implementation wants to
avoid bloating the binary with PDB on POSIX systems, and likewise DWARF
on Windows systems, while a general-purpose implementation needs to
support both PDB and DWARF from the same binary. It might, for example,
inspect the debug information from a cross-compiled binary.
`SourceLocation` now lives at `std.debug.SourceLocation` and is
documented.
Deprecate `std.debug.runtime_safety` because it returns the optimization
mode of the standard library, when the caller probably wants to use the
optimization mode of their own module.
`std.pdb.Pdb` is moved to `std.debug.Pdb`, mirroring the recent
extraction of `std.debug.Dwarf` from `std.dwarf`.
I have no idea why we have both Module (with a Windows-specific
definition) and WindowsModule. I left some passive aggressive doc
comments to express my frustration.
std.debug.Dwarf is the parsing/decoding logic. std.dwarf remains the
unopinionated types and bits alone.
If you look at this diff you can see a lot less redundancy in
namespaces.
* Rename isPPC() -> isPowerPC32().
* Rename isPPC64() -> isPowerPC64().
* Add new isPowerPC() function which covers both.
There was confusion even in the standard library about what isPPC() meant. This
change makes these functions work how I think most people actually expect them
to work, and makes them consistent with isMIPS(), isSPARC(), etc.
I chose to rename from PPC to PowerPC because 1) it's more consistent with the
other functions, and 2) it'll cause loud rather than silent breakage for anyone
who might have been depending on isPPC() while misunderstanding it.
I pointed a fuzzer at the tokenizer and it crashed immediately. Upon
inspection, I was dissatisfied with the implementation. This commit
removes several mechanisms:
* Removes the "invalid byte" compile error note.
* Dramatically simplifies tokenizer recovery by making recovery always
occur at newlines, and never otherwise.
* Removes UTF-8 validation.
* Moves some character validation logic to `std.zig.parseCharLiteral`.
Removing UTF-8 validation is a regression of #663, however, the existing
implementation was already buggy. When adding this functionality back,
it must be fuzz-tested while checking the property that it matches an
independent Unicode validation implementation on the same file. While
we're at it, fuzzing should check the other properties of that proposal,
such as no ASCII control characters existing inside the source code.
Other changes included in this commit:
* Deprecate `std.unicode.utf8Decode` and its WTF-8 counterpart. This
function has an awkward API that is too easy to misuse.
* Make `utf8Decode2` and friends use arrays as parameters, eliminating a
runtime assertion in favor of using the type system.
After this commit, the crash found by fuzzing, which was
"\x07\xd5\x80\xc3=o\xda|a\xfc{\x9a\xec\x91\xdf\x0f\\\x1a^\xbe;\x8c\xbf\xee\xea"
no longer causes a crash. However, I did not feel the need to add this
test case because the simplified logic eradicates most crashes of this
nature.
This is a fairly small hobby OS that has not seen development in 2 years. Our
current policy is that hobby OSs should use the `other` tag.
https://github.com/zhmu/ananas
What is `sparcel`, you might ask? Good question!
If you take a peek in the SPARC v8 manual, §2.2, it is quite explicit that SPARC
v8 is a big-endian architecture. No little-endian or mixed-endian support to be
found here.
On the other hand, the SPARC v9 manual, in §3.2.1.2, states that it has support
for mixed-endian operation, with big-endian mode being the default.
Ok, so `sparcel` must just be referring to SPARC v9 running in little-endian
mode, surely?
Nope:
* 40b4fd7a3e/llvm/lib/Target/Sparc/SparcTargetMachine.cpp (L226)
* 40b4fd7a3e/llvm/lib/Target/Sparc/SparcTargetMachine.cpp (L104)
So, `sparcel` in LLVM is referring to some sort of fantastical little-endian
SPARC v8 architecture. I've scoured the internet and I can find absolutely no
evidence that such a thing exists or has ever existed. In fact, I can find no
evidence that a little-endian implementation of SPARC v9 ever existed, either.
Or any SPARC version, actually!
The support was added here: https://reviews.llvm.org/D8741
Notably, there is no mention whatsoever of what CPU this might be referring to,
and no justification given for the "but some are little" comment added in the
patch.
My best guess is that this might have been some private exercise in creating a
little-endian version of SPARC that never saw the light of day. Given that SPARC
v8 explicitly doesn't support little-endian operation (let alone little-endian
instruction encoding!), and no CPU is known to be implemented as such, I think
it's very reasonable for us to just remove this support.
* Elaborate on the sub-variants of Variant I.
* Clarify the use of the TCB term.
* Rename a bunch of stuff to be more accurate/descriptive.
* Follow Zig's style around namespacing more.
* Use a structure for the ABI TCB.
No functional change intended.
The code would cause LLVM to emit a jump table for the switch in the loop over
the dynamic tags. That jump table was far enough away that the compiler decided
to go through the GOT, which would of course break at this early stage as we
haven't applied MIPS's local GOT relocations yet, nor can we until we've walked
through the _DYNAMIC array.
The first attempt at rewriting this used code like this:
var sorted_dynv = [_]elf.Addr{0} ** elf.DT_NUM;
But this is also problematic as it results in a memcpy() call. Instead, we
explicitly initialize it to undefined and use a loop of volatile stores to
clear it.
In https://github.com/ziglang/zig/pull/19641, this binding changed from `[*]u16` to `LPWSTR` which made it a sentinel-terminated pointer. This introduced a compiler error in the `std.os.windows.GetModuleFileNameW` wrapper since it takes a `[*]u16` pointer. This commit changes the binding back to what it was before instead of introducing a breaking change to `std.os.windows.GetModuleFileNameW`
Related: https://github.com/ziglang/zig/issues/20858
This does not completely ignore static asserts - they are validated by aro
during parsing; any failures will render an error and non-zero exit code.
Emit a warning comment that _Static_asserts are not translated - this
matches the behavior of the existing clang-based translate-c.
Aro currently does not store source locations for _Static_assert
declarations so I've hard-coded token index 0 for now.
Accesses to this global variable can require relocations on some platforms (e.g.
MIPS). If we do it before PIE relocations have been applied, we'll crash.
It's actually important for the ABI that r25 (t9) contains the address of the
called function, so that this standard prologue sequence works:
lui $2, %hi(_gp_disp)
addiu $2, $2, %lo(_gp_disp)
addu $gp, $2, $t9
(This is a bit similar to the ToC situation on powerpc that was fixed in
7bc78967b400322a0fc5651f37a1b0428c37fb9d.)
statx() is strictly superior to stat() and friends. We can do this because the
standard library declares Linux 4.19 to be the minimum version supported in
std.Target. This is also necessary on riscv32 where there is only statx().
While here, I improved std.fs.File.metadata() to gather as much information as
possible when calling statx() since that is the expectation from this particular
API.
This is kind of a hack because the timespec in UAPI headers is actually still
32-bit while __kernel_timespec is 64-bit. But, importantly, all the syscalls
take __kernel_timespec from the get-go (because riscv32 support is so recent).
Defining our timespec this way will allow all the syscall wrappers in
std.os.linux to do the right thing for riscv32. For other 32-bit architectures,
we have to use the 64-bit time syscalls explicitly to solve the Y2038 problem.
loongarch64 syscalls not updated because it seems like that kernel port hasn't
been working for a year or so:
In file included from arch/loongarch/include/uapi/asm/unistd.h:5:
include/uapi/asm-generic/unistd.h:2:10: fatal error: 'asm/bitsperlong.h' file not found
That file is just missing from the tree. 🤷
Deprecates std.fs.atomicSymLink and removes the allocator requirement
from the new std.fs.Dir.atomicSymLink. Replaces the two usages of this
within std.
I did not include the TODOs from the original code that were based
off of `switch (err) { ..., else => return err }` not having correct
inference that cases handled in `...` are impossible in the error
union return type because these are not specified in many places but
I can add them back if wanted.
Thank you @squeek502 for help with fixing buffer overflows!
The core functionalities are now in two general functions
`extremeInSubtreeOnDirection()` and `nextOnDirection()` so all the other
traversing functions (`getMin()`, `getMax()`, and `InorderIterator`) are
all just trivial calls to these core functions.
The added two functions `Node.next()` and `Node.prev()` are also just
trivial calls to these.
* std.Treap traversal direction: use u1 instead of usize.
* Treap: fix getMin() and getMax(), and add tests for them.
This is a misfeature that we inherited from LLVM:
* https://reviews.llvm.org/D61259
* https://reviews.llvm.org/D61939
(`aarch64_32` and `arm64_32` are equivalent.)
I truly have no idea why this triple passed review in LLVM. It is, to date, the
*only* tag in the architecture component that is not, in fact, an architecture.
In reality, it is just an ILP32 ABI for AArch64 (*not* AArch32).
The triples that use `aarch64_32` look like `aarch64_32-apple-watchos`. Yes,
that triple is exactly what you think; it has no ABI component. They really,
seriously did this.
Since only Apple could come up with silliness like this, it should come as no
surprise that no one else uses `aarch64_32`. Later on, a GNU ILP32 ABI for
AArch64 was developed, and support was added to LLVM:
* https://reviews.llvm.org/D94143
* https://reviews.llvm.org/D104931
Here, sanity seems to have prevailed, and a triple using this ABI looks like
`aarch64-linux-gnu_ilp32` as you would expect.
As can be seen from the diffs in this commit, there was plenty of confusion
throughout the Zig codebase about what exactly `aarch64_32` was. So let's just
remove it. In its place, we'll use `aarch64-watchos-ilp32`,
`aarch64-linux-gnuilp32`, and so on. We'll then translate these appropriately
when talking to LLVM. Hence, this commit adds the `ilp32` ABI tag (we already
have `gnuilp32`).
Contributes to #15607
Although the case is not handled in `openatWasi` (as I could not get a
working wasi environment to test the change) I have added a FIXME
addressing it and linking to the issue.
with this rewrite we can call functions inside of
inline assembly, enabling us to use the default start.zig logic
all that's left is to implement lr/sc loops for atomically manipulating
1 and 2 byte values, after which we can use the segfault handler logic.
the risc-v backend doesn't have `@cmpxchg*` implemented and so it can't use the hint that the current page-allocator uses.
this work-around branch can be removed when I implement the atomic built-in.
The flag makes compiler_rt and libfuzzer be in debug mode.
Also:
* fuzzer: override debug logs and disable debug logs for frequently
called functions
* std.Build.Fuzz: fix bug of rerunning the old unit test binary
* report errors from rebuilding the unit tests better
* link.Elf: additionally add tsan lib and fuzzer lib to the hash
This flag makes the build runner rebuild unit tests after the pipeline
finishes, if it finds any unit tests.
I did not make this integrate with file system watching yet.
The test runner is updated to detect which tests are fuzz tests.
Run step is updated to track which test indexes are fuzz tests.
Switches from using r1 as a temporary to r2. That way, we don't have to set the
`noat` assembler option. (r1 is the scratch register used by the assembler's
pseudoinstructions; the assembler warns when code uses that register explicitly
without `noat` set.)
PR [19271](https://github.com/ziglang/zig/pull/19271) added some static function implementations from kernel32, but some parts of the library still used the dynamically loaded versions.
* Add -f(no-)sanitize-coverage-trace-pc-guard CLI flag which defaults to
off. This value lowers to TracePCGuard = true (LLVM backend) and -Xclang
-fsanitize-coverage-trace-pc-guard. These settings are not
automatically included with -ffuzz.
* Add `Build.Step.Compile` flag for sanitize_coverage_trace_pc_guard
with appropriate documentation.
* Add `zig cc` integration for the respective flags.
* Avoid crashing in ELF linker code when -ffuzz -femit-llvm-ir used
together.
PR https://github.com/ziglang/zig/pull/20679 ("std.c reorganization")
switched feature-detection code to use "T != void" checks in place of
"@hasDecl". However, the std.posix.system struct is empty, so
compile-time feature detection against symbols in there (specifically
`std.posix.system.ucontext_t` in this case), fail at compile time on
freestanding targets.
This PR adds a void ucontext_t into the std.posix.system default.
This PR also adds pseudo-"freestanding" variation of the StackIterator
"unwind" test. It is sort of hacky (its freestanding, but assumes it can
invoke a Linux exit syscall), but it does detect this problem.
Fixes#20710
This prevents it from trying to access thread local storage before it
has set up thread local storage, particularly when code coverage
instrumentation is enabled.
* Add the `-ffuzz` and `-fno-fuzz` CLI arguments.
* Detect fuzz testing flags from zig cc.
* Set the correct clang flags when fuzz testing is requested. It can be
combined with TSAN and UBSAN.
* Compilation: build fuzzer library when needed which is currently an
empty zig file.
* Add optforfuzzing to every function in the llvm backend for modules
that have requested fuzzing.
* In ZigLLVMTargetMachineEmitToFile, add the optimization passes for
sanitizer coverage.
* std.mem.eql uses a naive implementation optimized for fuzzing when
builtin.fuzz is true.
Tracked by #20702
In the case that the allocator is unavailable (OOM, etc.), we can
possibly still output the panic message - so now we stack allocate the
message and copy it to the exit data for passing to boot services.