The core functionalities are now in two general functions
`extremeInSubtreeOnDirection()` and `nextOnDirection()` so all the other
traversing functions (`getMin()`, `getMax()`, and `InorderIterator`) are
all just trivial calls to these core functions.
The added two functions `Node.next()` and `Node.prev()` are also just
trivial calls to these.
* std.Treap traversal direction: use u1 instead of usize.
* Treap: fix getMin() and getMax(), and add tests for them.
This is a misfeature that we inherited from LLVM:
* https://reviews.llvm.org/D61259
* https://reviews.llvm.org/D61939
(`aarch64_32` and `arm64_32` are equivalent.)
I truly have no idea why this triple passed review in LLVM. It is, to date, the
*only* tag in the architecture component that is not, in fact, an architecture.
In reality, it is just an ILP32 ABI for AArch64 (*not* AArch32).
The triples that use `aarch64_32` look like `aarch64_32-apple-watchos`. Yes,
that triple is exactly what you think; it has no ABI component. They really,
seriously did this.
Since only Apple could come up with silliness like this, it should come as no
surprise that no one else uses `aarch64_32`. Later on, a GNU ILP32 ABI for
AArch64 was developed, and support was added to LLVM:
* https://reviews.llvm.org/D94143
* https://reviews.llvm.org/D104931
Here, sanity seems to have prevailed, and a triple using this ABI looks like
`aarch64-linux-gnu_ilp32` as you would expect.
As can be seen from the diffs in this commit, there was plenty of confusion
throughout the Zig codebase about what exactly `aarch64_32` was. So let's just
remove it. In its place, we'll use `aarch64-watchos-ilp32`,
`aarch64-linux-gnuilp32`, and so on. We'll then translate these appropriately
when talking to LLVM. Hence, this commit adds the `ilp32` ABI tag (we already
have `gnuilp32`).
Contributes to #15607
Although the case is not handled in `openatWasi` (as I could not get a
working wasi environment to test the change) I have added a FIXME
addressing it and linking to the issue.
with this rewrite we can call functions inside of
inline assembly, enabling us to use the default start.zig logic
all that's left is to implement lr/sc loops for atomically manipulating
1 and 2 byte values, after which we can use the segfault handler logic.
the risc-v backend doesn't have `@cmpxchg*` implemented and so it can't use the hint that the current page-allocator uses.
this work-around branch can be removed when I implement the atomic built-in.
The flag makes compiler_rt and libfuzzer be in debug mode.
Also:
* fuzzer: override debug logs and disable debug logs for frequently
called functions
* std.Build.Fuzz: fix bug of rerunning the old unit test binary
* report errors from rebuilding the unit tests better
* link.Elf: additionally add tsan lib and fuzzer lib to the hash
This flag makes the build runner rebuild unit tests after the pipeline
finishes, if it finds any unit tests.
I did not make this integrate with file system watching yet.
The test runner is updated to detect which tests are fuzz tests.
Run step is updated to track which test indexes are fuzz tests.
Switches from using r1 as a temporary to r2. That way, we don't have to set the
`noat` assembler option. (r1 is the scratch register used by the assembler's
pseudoinstructions; the assembler warns when code uses that register explicitly
without `noat` set.)
PR [19271](https://github.com/ziglang/zig/pull/19271) added some static function implementations from kernel32, but some parts of the library still used the dynamically loaded versions.
* Add -f(no-)sanitize-coverage-trace-pc-guard CLI flag which defaults to
off. This value lowers to TracePCGuard = true (LLVM backend) and -Xclang
-fsanitize-coverage-trace-pc-guard. These settings are not
automatically included with -ffuzz.
* Add `Build.Step.Compile` flag for sanitize_coverage_trace_pc_guard
with appropriate documentation.
* Add `zig cc` integration for the respective flags.
* Avoid crashing in ELF linker code when -ffuzz -femit-llvm-ir used
together.
PR https://github.com/ziglang/zig/pull/20679 ("std.c reorganization")
switched feature-detection code to use "T != void" checks in place of
"@hasDecl". However, the std.posix.system struct is empty, so
compile-time feature detection against symbols in there (specifically
`std.posix.system.ucontext_t` in this case), fail at compile time on
freestanding targets.
This PR adds a void ucontext_t into the std.posix.system default.
This PR also adds pseudo-"freestanding" variation of the StackIterator
"unwind" test. It is sort of hacky (its freestanding, but assumes it can
invoke a Linux exit syscall), but it does detect this problem.
Fixes#20710
This prevents it from trying to access thread local storage before it
has set up thread local storage, particularly when code coverage
instrumentation is enabled.
* Add the `-ffuzz` and `-fno-fuzz` CLI arguments.
* Detect fuzz testing flags from zig cc.
* Set the correct clang flags when fuzz testing is requested. It can be
combined with TSAN and UBSAN.
* Compilation: build fuzzer library when needed which is currently an
empty zig file.
* Add optforfuzzing to every function in the llvm backend for modules
that have requested fuzzing.
* In ZigLLVMTargetMachineEmitToFile, add the optimization passes for
sanitizer coverage.
* std.mem.eql uses a naive implementation optimized for fuzzing when
builtin.fuzz is true.
Tracked by #20702
In the case that the allocator is unavailable (OOM, etc.), we can
possibly still output the panic message - so now we stack allocate the
message and copy it to the exit data for passing to boot services.
The previous version of this function referenced the argc_argv_ptr global
variable as an inline asm operand. This caused LLVM to generate prologue code to
initialize the ToC so that the global variable can actually be accessed.
Ordinarily, there's nothing wrong with that. But _start() is a naked function!
This makes it actually super surprising that LLVM did this. It also means that
the old version only really worked by accident.
Once the reference to the global variable was removed, no ToC was set up, thus
violating the calling convention once we got to posixCallMainAndExit(). This
then caused any attempt to access global variables here to crash - namely when
setting std.os.linux.elf_aux_maybe.
The fix is to just initialize the ToC manually in _start().
This was added as an architecture to LLVM's target triple parser and the Clang
driver in 2015. No backend ever materialized as far as I can see (same for GCC).
In 2016, other code referring to it started using "Myriad" instead. Ultimately,
all code related to it that isn't in the target triple parser was removed. It
seems to be a real product, just... literally no one seems to know anything
about the ISA. I figure after almost a decade with no public ISA documentation
to speak of, and no LLVM backend to reference, it's probably safe to assume that
we're not going to learn much about this ISA, making it useless for Zig.
See: 1b5767f72b
See: 84a7564b28
See: 8cfe9d8f2a
This is problematic for PIE. There's nothing but luck preventing the accesses to
this global variable from requiring relocations. I've observed this being an
issue on MIPS and PowerPC personally, but others may be affected.
Besides, we're really just passing the initial stack pointer value to
posixCallMainAndExit(), so... just do that.
The set of signals that cannot have their action changed is documented in POSIX,
and any additional, non-standard signals are documented by the specific OS. I
see no valid reason why EINVAL should be considered an unpredictable error here.
`PKG_CONFIG` environment variable is used to override path to
pkg-config executable, for example when it's name is prepended by
target triple for cross-compilation purposes:
```
PKG_CONFIG=/usr/bin/aarch64-unknown-linux-gnu-pkgconf zig build
```
Signed-off-by: Eric Joldasov <bratishkaerik@landless-city.net>
On decryption tls client should remove zero byte padding after the
content type field. This padding is rarely used, the only site (from the
list of top domains) that I found using it is `tutanota.com`.
From [RFC](https://datatracker.ietf.org/doc/html/rfc8446#section-5.4):
> All encrypted TLS records can be padded.
> Padding is a string of zero-valued bytes appended to the ContentType
field before encryption.
> the receiving implementation scans the field from the end toward the
beginning until it finds a non-zero octet. This non-zero octet is the
content type of the message.
Currently we can't connect to that site:
```
$ zig run main.zig -- tutanota.com
error: TlsInitializationFailed
/usr/local/zig/zig-linux-x86_64-0.14.0-dev.208+854e86c56/lib/std/crypto/tls/Client.zig:476:45: 0x121fbed in init__anon_10331 (http_get_std)
if (inner_ct != .handshake) return error.TlsUnexpectedMessage;
^
/usr/local/zig/zig-linux-x86_64-0.14.0-dev.208+854e86c56/lib/std/http/Client.zig:1357:99: 0x1161f0b in connectTcp (http_get_std)
conn.data.tls_client.* = std.crypto.tls.Client.init(stream, client.ca_bundle, host) catch return error.TlsInitializationFailed;
^
/usr/local/zig/zig-linux-x86_64-0.14.0-dev.208+854e86c56/lib/std/http/Client.zig:1492:14: 0x11271e1 in connect (http_get_std)
} orelse return client.connectTcp(host, port, protocol);
^
/usr/local/zig/zig-linux-x86_64-0.14.0-dev.208+854e86c56/lib/std/http/Client.zig:1640:9: 0x111a24e in open (http_get_std)
try client.connect(valid_uri.host.?.raw, uriPort(valid_uri, protocol), protocol);
^
/home/ianic/Code/tls.zig/example/http_get_std.zig:28:19: 0x1118f8c in main (http_get_std)
var req = try client.open(.GET, uri, .{ .server_header_buffer = &server_header_buffer });
^
```
using this example:
```zig
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
const allocator = gpa.allocator();
const args = try std.process.argsAlloc(allocator);
defer std.process.argsFree(allocator, args);
if (args.len > 1) {
const domain = args[1];
var client: std.http.Client = .{ .allocator = allocator };
defer client.deinit();
// Add https:// prefix if needed
const url = brk: {
const scheme = "https://";
if (domain.len >= scheme.len and std.mem.eql(u8, domain[0..scheme.len], scheme))
break :brk domain;
var url_buf: [128]u8 = undefined;
break :brk try std.fmt.bufPrint(&url_buf, "https://{s}", .{domain});
};
const uri = try std.Uri.parse(url);
var server_header_buffer: [16 * 1024]u8 = undefined;
var req = try client.open(.GET, uri, .{ .server_header_buffer = &server_header_buffer });
defer req.deinit();
try req.send();
try req.wait();
}
}
```
This was used for LoongArch64, where:
* `gnuf64` -> `ilp32d` / `lp64d` (full hard float)
* `gnuf32` -> `ilp32f` / `lp64f` (hard float for `f32` only)
* `gnusf` -> `ilp32` / `lp64` (soft float)
But Loongson eventually settled on just `gnu` for the first case since that's
what most people will actually be targeting outside embedded scenarios. The
`gnuf32` and `gnusf` specifiers remain in use.
* common symbols are now public from std.c even if they live in
std.posix
* LOCK is now one of the common symbols since it is the same on 100% of
operating systems.
* flock is now void value on wasi and windows
* std.fs.Dir now uses flock being void as feature detection, avoiding
trying to call it on wasi and windows
It is now composed of these main sections:
* Declarations that are shared among all operating systems.
* Declarations that have the same name, but different type signatures
depending on the operating system. Often multiple operating systems
share the same type signatures however.
* Declarations that are specific to a single operating system.
- These are imported one per line so you can see where they come from,
protected by a comptime block to prevent accessing the wrong one.
Closes#19352 by changing the convention to making types `void` and
functions `{}`, so that it becomes possible to update `@hasDecl` sites
to use `@TypeOf(f) != void` or `T != void`. Happily, this ended up
removing some duplicate logic and update some bitrotted feature
detection checks.
A handful of types have been modified to gain namespacing and type
safety. This is a breaking change.
Oh, and the last usage of `usingnamespace` site is eliminated.
* format: fix default character when no alignment
When no alignment is specified, the character that should be used is the
fill character that is otherwise provided, not space.
This is closer to the default that C programmers (and other languages)
use: "04x" fills with zeroes (in zig as of today x:04 fills with spaces)
Test:
const std = @import("std");
const expectFmt = std.testing.expectFmt;
test "fmt.defaultchar.no-alignment" {
// as of today the following test passes:
try expectFmt("0x00ff", "0x{x:0>4}", .{255});
// as of today the following test fails (returns "0x ff" instead)
try expectFmt("0x00ff", "0x{x:04}", .{255});
}
* non breaking improvement of string formatting
* improved comment
* simplify the code a little
* small improvement around how characters identified as valid are consumed
To facilitate #1840, this commit slims `std.windows.kernel32` to only
have the functions needed by the standard library. Since this will break
projects that relied on these, I offer two solutions:
- Make an argument as to why certain functions should be added back in.
Note that they may just be wrappers around `ntdll` APIs, which would
go against #1840.
If necessary I'll add them back in *and* make wrappers in
`std.windows` for it.
- Maintain your own list of APIs. This is the option taken by bun[1],
where they wrap functions with tracing.
- Use `zigwin32`.
I've also added TODO comments that specify which functions can be
reimplemented using `ntdll` APIs in the future.
Other changes:
- Group functions into groups (I/O, process management etc.).
- Synchronize definitions against Microsoft documentation to use the
proper parameter types/names.
- Break all functions with parameters over multiple lines.
They are implementation-defined and can have values other than
hard-coded here. Also, standard permits other values not mentioned
there:
> Additional macro definitions, beginning with the characters LC_
> and an uppercase letter, may also be specified by the implementation.
Signed-off-by: Eric Joldasov <bratishkaerik@landless-city.net>
According to https://en.cppreference.com/mwiki/index.php?title=c/locale/setlocale&oldid=171500 ,
`setlocale` "returns null value on failure":
> Return value
> pointer to a narrow null-terminated string identifying the C locale
> after applying the changes, if any, or null pointer on failure.
Example program:
```zig
const std = @import("std");
pub fn main() void {
const ptr = std.c.setlocale(.ALL, "non_existent");
std.debug.print("ptr = {d}\n", .{@intFromPtr(ptr)});
}
```
Output:
```console
ptr = 0
```
Signed-off-by: Eric Joldasov <bratishkaerik@landless-city.net>
Instead of calling the dynamically loaded kernel32.GetLastError, we can extract it from the TEB.
As shown by [Wine](34b1606019/include/winternl.h (L439)), the last error lives at offset 0x34 of the TEB in 32-bit Windows and at offset 0x68 in 64-bit Windows.
* the file's doc-comment was misleading and did not focus on the correct aspect of SIMD
* added cpu flag awareness to `suggestVectorLengthForCpu` in order to provide a more accurate vector length
Compilation errors now report a failure on rebuilds triggered by file
system watches.
Compiler crashes now report failure correctly on rebuilds triggered by
file system watches.
The compiler subprocess is restarted if a broken pipe is encountered on
a rebuild.
Remove --debug-incremental
This flag is also added to the build system. Importantly, this tells
Compile step whether or not to keep the compiler running between
rebuilds. It defaults off because it is currently crashing
zirUpdateRefs.
Changes the `make` function signature to take an options struct, which
additionally includes `watch: bool`. I intentionally am not exposing
this information to configure phase logic.
Also adds global zig cache to the compiler cache prefixes.
Closes#20600
Now that we use the PEB to get the precise length of the command line string, there's no need for a multi-item pointer/sliceTo call. This provides a minor speedup:
Benchmark 1 (153 runs): benchargv-before.exe
measurement mean ± σ min … max outliers delta
wall_time 32.7ms ± 429us 32.1ms … 36.9ms 1 ( 1%) 0%
peak_rss 6.49MB ± 5.62KB 6.46MB … 6.49MB 14 ( 9%) 0%
Benchmark 2 (157 runs): benchargv-after.exe
measurement mean ± σ min … max outliers delta
wall_time 31.9ms ± 236us 31.4ms … 32.7ms 4 ( 3%) ⚡- 2.4% ± 0.2%
peak_rss 6.49MB ± 4.77KB 6.46MB … 6.49MB 14 ( 9%) + 0.0% ± 0.0%
Previously, to ensure args were encoded as well-formed WTF-8 (i.e. no encoded surrogate pairs), the code unit would be encoded and then the last 6 emitted bytes would be checked to see if they were a surrogate pair, and this was done for any emitted code unit (although this was not necessary, it should have only been done when emitting a low surrogate).
After this commit, we still want to ensure well-formed WTF-8, but, to do so, the last emitted code point is stored, meaning we can just directly check that the last code unit is a high surrogate and the current code unit is a low surrogate to determine if we have a surrogate pair.
This provides some performance benefit over and above a "use the same strategy as before but only check when we're emitting a low surrogate" implementation:
Benchmark 1 (111 runs): benchargv-master.exe
measurement mean ± σ min … max outliers delta
wall_time 45.2ms ± 532us 44.5ms … 49.4ms 2 ( 2%) 0%
peak_rss 6.49MB ± 3.94KB 6.46MB … 6.49MB 10 ( 9%) 0%
Benchmark 2 (154 runs): benchargv-storelast.exe
measurement mean ± σ min … max outliers delta
wall_time 32.6ms ± 293us 32.2ms … 34.2ms 8 ( 5%) ⚡- 27.8% ± 0.2%
peak_rss 6.49MB ± 5.15KB 6.46MB … 6.49MB 15 (10%) - 0.0% ± 0.0%
Benchmark 3 (131 runs): benchargv-onlylow.exe
measurement mean ± σ min … max outliers delta
wall_time 38.4ms ± 257us 37.9ms … 39.6ms 5 ( 4%) ⚡- 15.1% ± 0.2%
peak_rss 6.49MB ± 5.70KB 6.46MB … 6.49MB 9 ( 7%) - 0.0% ± 0.0%
Before this commit, the WTF-16 command line string would be converted to WTF-8 in `init`, and then a second buffer of the WTF-8 size + 1 would be allocated to store the parsed arguments. The converted WTF-8 command line would then be parsed and the relevant bytes would be copied into the argument buffer before being returned.
After this commit, only the WTF-8 size of the WTF-16 string is calculated (without conversion) which is then used to allocate the buffer for the parsed arguments. Parsing is then done on the WTF-16 slice directly, with the arguments being converted to WTF-8 on-the-fly.
This has a few (minor) benefits:
- Cuts the amount of memory allocated by ArgIteratorWindows in half (or better)
- Makes the total amount of memory allocated by ArgIteratorWindows predictable, since, before, the upfront `wtf16LeToWtf8Alloc` call could end up allocating more-memory-than-necessary temporarily due to its internal use of an ArrayList. Now, the amount of memory allocated is always exactly `calcWtf8Len(cmd_line) + 1`.
This was causing zig2.exe to crash during bootstrap, because there was an atomic
load of read-only memory, and the attempt to write to it as part of the (idempotent)
atomic exchange was invalid.
Aligned reads (of u32 / u64) are atomic on x86 / x64, so this is replaced with an
optimization-proof load (`__iso_volatile_load8*`) and a reordering barrier.
Makes the build runner compile successfully for non-linux targets;
printing an error if you ask for --watch rather than making build
scripts fail to compile.
Updates the build runner to unconditionally require a zig lib directory
parameter. This parameter is needed in order to correctly understand
file system inputs from zig compiler subprocesses, since they will refer
to "the zig lib directory", and the build runner needs to place file
system watches on directories in there.
The build runner's fanotify file watching implementation now accounts
for when two or more Cache.Path instances compare unequal but ultimately
refer to the same directory in the file system.
Breaking change: std.Build no longer has a zig_lib_dir field. Instead,
there is the Graph zig_lib_directory field, and individual Compile steps
can still have their zig lib directories overridden. I think this is
unlikely to break anyone's build in practice.
The compiler now sends a "file_system_inputs" message to the build
runner which shares the full set of files that were added to the cache
system with the build system, so that the build runner can watch
properly and redo the Compile step. This is implemented for whole cache
mode but not yet for incremental cache mode.
and add file system watching integration.
`addDirectoryWatchInput` now returns a `bool` which helps remind the
caller to
1. call addDirectoryWatchInputFromPath on any derived paths
2. but only if the dependency is not already captured by a step
dependency edge.
The make function now recursively walks all directories and adds the
found files to the cache hash rather than incorrectly only adding the
directory name to the cache hash.
closes#20571
and deprecate `addFile`. Part of an effort to move towards using
`std.Build.Cache.Path` abstraction in more places, which makes it easier
to avoid absolute paths and path resolution.
And use it to implement InstallDir Step watch integration.
I'm not seeing any events triggered when I run `mkdir` in the watched
directory, however, and I have not yet figured out why.
The goal is to move towards using `std.Build.Cache.Path` instead of
absolute path names.
This was helpful for implementing file watching integration to
the InstallDir Step
This has been planned for quite some time; this commit finally does it.
Also implements file system watching integration in the make()
implementation for UpdateSourceFiles and fixes the reporting of step
caching for both.
WriteFile does not yet have file system watching integration.
So far, only implemented for InstallFile steps.
Default debounce interval bumped to 50ms. I think it should be
configurable.
Next I have an idea to simplify the fanotify implementation, but other
OS implementations might want to refer back to this commit before I make
those changes.
I'm still learning how the fanotify API works but I think after playing
with it in this commit, I finally know how to implement it, at least on
Linux. This commit does not accomplish the goal but I want to take the
code in a different direction and still be able to reference this point
in time by viewing a source control diff.
I think the move is going to be saving the file_handle for the parent
directory, which combined with the dirent names is how we can correlate
the events back to the Step instances that have registered file system
inputs. I predict this to be similar to implementations on other
operating systems.
When calculating how much ciphertext from the stream can fit into
user and internal buffers we should also take into account ciphertext
data which are already in internal buffer.
Fixes: 15226
Tested with
[this](https://github.com/ziglang/zig/issues/15226#issuecomment-2218809140).
Using client with different read buffers until I, hopefully, understood
what is happening.
Not relevant to this fix, but this
[part](95d9292a7a/lib/std/crypto/tls/Client.zig (L988-L991))
is still mystery to me. Why we don't use free_size in buf_cap
calculation. Seems like rudiment from previous implementation without iovec.
* Update `__chkstk_ms` to have weak linkage
`__chkstk_ms` was causing conflicts during linking in some circumstances (specifically with linking object files from Rust sources). This PR switches `__chkstk_ms` to have weak linkage.
#15107
* Update stack_probe.zig to weak linkage for all symbols
Note that the original `cgroup_storage` MapType has been deprecated,
so renamed to `cgroup_storage_deprecated`.
Signed-off-by: Tw <tw19881113@gmail.com>
This makes comparing host name with dns name from certificate case
insensitive.
I found a few domains (from the
[cloudflare](https://radar.cloudflare.com/domains) list of top domains)
for which tls.Client fails to connect. Error is:
```zig
error: TlsInitializationFailed
Code/zig/lib/std/crypto/Certificate.zig:336:9: 0x1177b1f in verifyHostName (http_get_std)
return error.CertificateHostMismatch;
Code/zig/lib/std/crypto/tls23/handshake_client.zig:461:25: 0x11752bd in parseServerCertificate (http_get_std)
try subject.verifyHostName(opt.host);
```
In its certificate this domains have host names which are not strictly
lower case. This is what checkHostName is comparing:
|host_name | dns_name |
|------------------------------------------------|
|ey.com | EY.COM |
|truist.com | Truist.com |
|wscampanhas.bradesco | WSCAMPANHAS.BRADESCO |
|dell.com | Dell.com |
From
[RFC2818](https://datatracker.ietf.org/doc/html/rfc2818#section-2.4):
> Matching is performed using the matching rules specified by
[RFC2459].
From [RFC2459](https://datatracker.ietf.org/doc/html/rfc2459#section-4.2.1.7):
> When comparing URIs, conforming implementations
> MUST compare the scheme and host without regard to case, but assume
> the remainder of the scheme-specific-part is case sensitive.
Testing with:
```
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
const allocator = gpa.allocator();
const args = try std.process.argsAlloc(allocator);
defer std.process.argsFree(allocator, args);
if (args.len > 1) {
const domain = args[1];
var client: std.http.Client = .{ .allocator = allocator };
defer client.deinit();
// Add https:// prefix if needed
const url = brk: {
const scheme = "https://";
if (domain.len >= scheme.len and std.mem.eql(u8, domain[0..scheme.len], scheme))
break :brk domain;
var url_buf: [128]u8 = undefined;
break :brk try std.fmt.bufPrint(&url_buf, "https://{s}", .{domain});
};
const uri = try std.Uri.parse(url);
var server_header_buffer: [16 * 1024]u8 = undefined;
var req = try client.open(.GET, uri, .{ .server_header_buffer = &server_header_buffer });
defer req.deinit();
try req.send();
try req.wait();
}
}
```
`$ zig run example/main.zig -- truist.com `
The old heuristic of checking only for the number of fields has the
downside of classifying all opaque types, such as `std.c.FILE`, as
"namespaces" rather than "types".