Commit Graph

20483 Commits

Author SHA1 Message Date
Peter Ammon
ff87e2cf0a ast: remove parent pointers
This removes parent back-pointers from ast nodes. Nodes no longer store
references to their parents, resulting in memory size reductions for the ast
(__fish_complete_gpg.fish goes from 508 KB to 198 KB in memory usage).
2025-04-20 17:53:43 -07:00
Peter Ammon
a528567d5c IndentVisitor to stop using parent pointers
We can just store these ourselves; don't have to be baked into the AST.
2025-04-20 17:53:43 -07:00
Peter Ammon
374b504eeb parse_util_detect_errors_in_ast to stop using ast parent pointers
Continue to move away from parent pointers.
2025-04-20 17:53:43 -07:00
Peter Ammon
119716fbf2 extract_tokens::is_command to stop using ast parent pointers
Simple change to remove a user of parent back-references.
2025-04-20 17:53:43 -07:00
Peter Ammon
68a357be3d fish_indent to stop using ast parent pointers
Reimplement fish_indent to discover parents through traversals, not ast
parent pointers.
2025-04-20 17:53:43 -07:00
Peter Ammon
996b34b5cb ast: pretty-print to stop using ast parent pointers
These depths are used in calculating indents for ast pretty-printing.

This moves away from parent back-pointers, so that we can ultimately remove
them.
2025-04-20 17:53:43 -07:00
Peter Ammon
8f46b617db ast: make Traversal more powerful
Prior to this commit, Traversal was a convenient way to walk the nodes of
an ast in pre-order. This worked simply: it kept a stack of Nodes, and then
when a Node was visited, it was popped off and its children added.

Enhance Traversal to track whether each node on the Stack is `NeedsVisit`
or `Visited`. The idea is that, when a Node is yielded from next(), we can
reconstruct its parents as those Visited nodes on the Traversal stack.

This will allow clients to get the parents of Nodes as they are traversed
without each Node needing to explicitly save its parent.
2025-04-20 17:53:43 -07:00
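
The stack-based scheme described in 8f46b617db can be sketched roughly as follows. This is a minimal illustration, not fish's actual code: the `Node` struct, the `VisitState` enum and the `parents()` helper are hypothetical stand-ins, and only the names `Traversal`, `NeedsVisit` and `Visited` come from the commit message.

```rust
// Hypothetical minimal tree; fish's real AST nodes are richer than this.
pub struct Node {
    pub name: &'static str,
    pub children: Vec<Node>,
}

#[derive(Clone, Copy, PartialEq, Eq)]
enum VisitState {
    NeedsVisit,
    Visited,
}

/// Pre-order traversal that leaves already-yielded ancestors on the stack.
pub struct Traversal<'a> {
    stack: Vec<(&'a Node, VisitState)>,
}

impl<'a> Traversal<'a> {
    pub fn new(root: &'a Node) -> Self {
        Traversal { stack: vec![(root, VisitState::NeedsVisit)] }
    }

    /// Ancestors of the most recently yielded node, nearest first.
    /// The topmost Visited entry is the node itself, so it is skipped.
    pub fn parents(&self) -> Vec<&'a Node> {
        self.stack
            .iter()
            .rev()
            .filter(|entry| entry.1 == VisitState::Visited)
            .skip(1)
            .map(|entry| entry.0)
            .collect()
    }
}

impl<'a> Iterator for Traversal<'a> {
    type Item = &'a Node;

    fn next(&mut self) -> Option<&'a Node> {
        // Entries that are already Visited and back on top have finished subtrees.
        while let Some(&(_, VisitState::Visited)) = self.stack.last() {
            self.stack.pop();
        }
        // Mark the next node as Visited but keep it on the stack for its descendants.
        let top = self.stack.last_mut()?;
        top.1 = VisitState::Visited;
        let node: &'a Node = top.0;
        // Push children in reverse so the first child is yielded first.
        for child in node.children.iter().rev() {
            self.stack.push((child, VisitState::NeedsVisit));
        }
        Some(node)
    }
}
```

Calling `parents()` right after `next()` gives the ancestor chain of the node just yielded, so no node has to store a back-pointer.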
Peter Ammon
f7543dd447 ast: remove certain as_node() function calls
We can remove all of them once MSRV becomes 1.86, which adds support for trait
upcasting coercion; we can remove a few today.
2025-04-20 17:53:43 -07:00
Peter Ammon
7d98cc8850 ast: stop using Node::ptr_eq
This function hid a bug! It converted two Nodes to `*const ()` which discards
the vtable pointer, using only the data pointer; but two nodes can and do have
the same data pointer (e.g. if one node is the first item in another).

Add the (statically dispatched) is_same_node function with a warning comment,
and adopt that instead.
2025-04-20 17:53:43 -07:00
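
The pitfall described in 7d98cc8850 can be reproduced in a few lines. The sketch below is illustrative only: `Kind`, `Keyword`, `Statement` and `data_ptr_eq` are hypothetical names, and this `is_same_node` uses a dynamically dispatched `kind()` check for brevity, whereas the commit describes a statically dispatched function.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Statement,
    Keyword,
}

pub trait Node {
    fn kind(&self) -> Kind;
}

pub struct Keyword {
    pub offset: u32,
}

// repr(C) pins `keyword` to offset 0, so it shares its address with the Statement.
#[repr(C)]
pub struct Statement {
    pub keyword: Keyword,
    pub length: u32,
}

impl Node for Keyword {
    fn kind(&self) -> Kind {
        Kind::Keyword
    }
}

impl Node for Statement {
    fn kind(&self) -> Kind {
        Kind::Statement
    }
}

/// The problematic comparison: casting to `*const ()` discards the vtable pointer
/// and compares only the data address.
pub fn data_ptr_eq(lhs: &dyn Node, rhs: &dyn Node) -> bool {
    std::ptr::eq(
        lhs as *const dyn Node as *const (),
        rhs as *const dyn Node as *const (),
    )
}

/// One possible fix (an assumption, not fish's code): also require matching kinds.
pub fn is_same_node(lhs: &dyn Node, rhs: &dyn Node) -> bool {
    lhs.kind() == rhs.kind() && data_ptr_eq(lhs, rhs)
}
```

With a `Statement` value `stmt`, `data_ptr_eq(&stmt, &stmt.keyword)` reports equality even though the two are different nodes, while `is_same_node` tells them apart.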
Peter Ammon
b33795533c ast: remove leaf_as_node()
Plain old as_node() is fine.
2025-04-20 17:53:42 -07:00
Peter Ammon
3f4839e5f2 Optimize ParseKeyword::from(&wstr)
This is a hot function and is easy to optimize.
2025-04-20 17:53:42 -07:00
Peter Ammon
cef7c3c5c4 Switch ParseKeyword to Rust naming conventions 2025-04-20 17:53:42 -07:00
Peter Ammon
a874237bff ast: Bravely stop allocating so much in Boxes
Now that we have more confidence in our pointers, we can allocate directly
more often, instead of always through Box. This recovers the performance
lost from the previous commit.
2025-04-20 17:53:42 -07:00
Peter Ammon
0b4883d07f ast: Bravely stop boxing items in lists
Prior to this commit, lists of items (e.g. an argument list to a command) would
each be Boxed, i.e. we had effectively Vec<Box<Item>>. The rationale here is
that we had raw pointers and pointer stability was important to enforce.

But we have fewer raw pointers now - only the parent pointers - and we can be
confident that the Ast will not change or move after construction. So remove
this intermediate Box, simplifying some logic and reducing ast size by ~5%.

This slows down Ast construction because we're still constructing
the Box and moving things in and out of it - that will be addressed in
subsequent commits.
2025-04-20 17:53:42 -07:00
Peter Ammon
3b3063287b ast: Add some comments about raw pointers and stability 2025-04-20 17:53:42 -07:00
Peter Ammon
be88d103ba ast: Use boxed slice instead of vec for list nodes
This saves a decent amount of memory, both because we no longer have excess
capacity sitting around that we'll never use, and because we no longer need
to store the "capacity" value.
2025-04-20 17:53:42 -07:00
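
The saving described in be88d103ba can be checked with the standard library alone. In this small sketch, `handle_sizes` and `freeze` are hypothetical helper names and `u64` stands in for a list item type.

```rust
use std::mem::size_of;

/// Sizes of the two list handles: a Vec carries pointer, length and capacity,
/// while a boxed slice is a fat pointer carrying only pointer and length.
pub fn handle_sizes() -> (usize, usize) {
    (size_of::<Vec<u64>>(), size_of::<Box<[u64]>>())
}

/// Convert a finished list into a boxed slice. `into_boxed_slice` drops any
/// excess capacity, which is safe once the list will no longer grow.
pub fn freeze(items: Vec<u64>) -> Box<[u64]> {
    items.into_boxed_slice()
}
```

The trade-off is that a boxed slice cannot be pushed to, which fits a tree that does not change after construction.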
Peter Ammon
271a85571d Add a benchmark for AST construction
Run with `cargo +nightly bench --features=benchmark`
2025-04-20 17:53:42 -07:00
Peter Ammon
5f0584a6e6 Teach fish_indent to emit basic parse tree size metrics
A good basis to begin optimization.
2025-04-20 17:53:42 -07:00
David Adam
22d2dd6c90 Merge branch 'Integration_4.0.2' 2025-04-21 07:55:38 +08:00
Peter Ammon
485a6fa859 Fix a redundant import 2025-04-20 16:34:10 -07:00
Fabian Boehm
ca8416f18d docs/bind: Fix typo
Fixes #11408
2025-04-20 21:54:57 +02:00
David Adam
f1456f9707 Release 4.0.2 (tag: 4.0.2) 2025-04-20 21:11:52 +08:00
Daniel Rainer
9d904e1113 Improve profiling output
Indicate the units of the durations (microseconds).

Right-align the durations for better readability.

Use `format!` instead of `fprintf` for more flexible formatting.

Write to `File` instead of raw fd.

Closes #11396
2025-04-18 20:22:30 +02:00
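
As a rough illustration of the formatting changes listed in 9d904e1113, right-aligned microsecond columns can be produced with `format!` width specifiers and written to any `Write` implementor, such as a `File`. The function names, column widths and header text below are assumptions, not fish's actual profile format.

```rust
use std::io::{self, Write};

/// Format one profile row with both durations right-aligned in 10-character columns.
pub fn format_profile_line(self_micros: u64, total_micros: u64, command: &str) -> String {
    format!("{self_micros:>10} {total_micros:>10} {command}")
}

/// Write a header naming the unit, followed by one line per row.
pub fn write_profile<W: Write>(out: &mut W, rows: &[(u64, u64, &str)]) -> io::Result<()> {
    writeln!(out, "{:>10} {:>10} {}", "Time (us)", "Sum (us)", "Command")?;
    for &(self_micros, total_micros, command) in rows {
        writeln!(out, "{}", format_profile_line(self_micros, total_micros, command))?;
    }
    Ok(())
}
```

Because `write_profile` is generic over `Write`, the same code can target a `std::fs::File` in production and a `Vec<u8>` in tests.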
David Adam
c88e6827b7 CHANGELOG: work on 4.0.2 2025-04-19 00:06:31 +08:00
Lucas Garron
3d7d57d612 Add completions for iconutil (macOS).
Closes #11392
2025-04-18 18:05:49 +02:00
exploide
fb314b28ff completions(tcpdump): suppress stderr + updates 2025-04-18 17:17:39 +02:00
Daniel Rainer
aa01f984b7 Use File as arg to print_profile 2025-04-17 11:46:35 +02:00
Daniel Rainer
b8bd3a25d7 Improve error handling 2025-04-17 11:46:35 +02:00
Daniel Rainer
01e7ba4b3a Reduce scope of raw_fd 2025-04-17 11:46:35 +02:00
Daniel Rainer
e5c953ea92 Replace read_loop with more idiomatic code 2025-04-17 11:46:35 +02:00
Daniel Rainer
70d682a110 Set file permissions via stdlib method 2025-04-17 11:46:35 +02:00
Daniel Rainer
21e284e548 Rename last_read_file to last_read_file_id
This is done to match the field's type.
2025-04-17 11:46:35 +02:00
Daniel Rainer
c2b8ee5554 Replace fstat with File::metadata() where possible 2025-04-17 11:46:35 +02:00
Daniel Rainer
5e8276ed15 Change file_id_for_fd to file_id_for_file 2025-04-17 11:46:35 +02:00
Johannes Altmanninger
bc3e3ae029 builtin read: always handle out-of-range codepoints (Rust port regression)
As mentioned in
https://github.com/fish-shell/fish-shell/pull/9688#discussion_r1155089596,
commit b77d1d0e2b (Stop crashing on invalid Unicode input, 2024-02-27), Rust's
char type doesn't support arbitrary 32-bit values.  Out-of-range Unicode
codepoints would cause crashes.  That commit addressed this by converting
the encoded bytes (e.g. UTF-8) to special private-use-area characters that
fish knows about.  It didn't bother to update the code path in builtin read
that relies on mbrtowc as well.

Fix that. Move and rename parse_codepoint() and rename/reorder its input/output
parameters.

Fixes #11383

(cherry picked from commit d9ba27f58f)
2025-04-16 11:33:15 +02:00
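
The underlying constraint is easy to demonstrate: `char::from_u32` rejects values above U+10FFFF as well as surrogates. The sketch below shows the general idea of falling back to private-use-area characters for the raw bytes. The base value `0xF600` and the helper names are assumptions made for illustration; the commit message only says that fish uses "special private-use-area characters".

```rust
/// Assumed base of the private-use range used to carry raw bytes.
/// The constant fish really uses may differ.
const ENCODE_DIRECT_BASE: u32 = 0xF600;

/// Map a raw byte to a private-use-area character (U+F600..=U+F6FF here).
pub fn encode_byte(byte: u8) -> char {
    char::from_u32(ENCODE_DIRECT_BASE + u32::from(byte))
        .expect("private-use-area values are valid chars")
}

/// Return the decoded character if `value` is a valid Unicode scalar value;
/// otherwise return one private-use character per original input byte.
pub fn decode_or_encode(value: u32, raw: &[u8]) -> Vec<char> {
    match char::from_u32(value) {
        Some(c) => vec![c],
        None => raw.iter().map(|&b| encode_byte(b)).collect(),
    }
}
```

For a valid value such as `0x41` the result is the single character `'A'`; for `0x123456` the raw bytes are preserved one private-use character each instead of causing a crash.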
Johannes Altmanninger
3191ac13e5 Reduce parse_codepoint responsibilities, fixing alt in single-byte locale?
This also changes the single-byte locale code path to treat keyboard input
like "\x1ba" as alt-a instead of "escape,a".  I can't off-hand reproduce
a problem with "LC_ALL=C fish_key_reader", I guess we always use a UTF-8
locale if available?

(cherry picked from commit b061178606)
2025-04-16 11:28:28 +02:00
Johannes Altmanninger
4f810809c8 Fix builtin test assigning wrong range to "! -d /" (Rust port regression)
Fixes #11387

(cherry picked from commit c740c656a8)
2025-04-16 11:25:35 +02:00
Johannes Altmanninger
4e85366416 builtin commandline: minor cleanup 2025-04-16 11:24:33 +02:00
Johannes Altmanninger
0284292392 builtin read to pass through invalid UTF-8; reader to ignore invalid codepoints
Two issues:

1. typing the codepoint 0x123456 into fish_key_reader:

	$ fish_key_reader -cV
	# decoded from: \xf4\xa3\x91
	bind \xf4 'do something'
	# decoded from: 
	bind \xa3 'do something'
	# decoded from: 
	bind \x91 'do something'

The invalid codepoint is represented in its original encoding, which leaks
to the UI. This was more or less intentionally added by b77d1d0e2b (Stop
crashing on invalid Unicode input, 2024-02-27).  That commit rendered it
as a replacement byte, but that was removed for other reasons in e25a1358e6
(Work around broken rendering of pasted multibyte chars in non-UTF-8-ish
locale, 2024-08-03).

We no longer insert such (PUA) codepoints into the commandline.  The "bind"
commands above would work, however.  I don't think this is something we want
to support.  Discard invalid codepoints in the reader, so they can't be
bound and fish_key_reader shows nothing.

2. builtin read silently drops invalid encodings. This builtin is not really
suited to read binary data (#11383 is an error scenario), but I guess it can
be bent to do that.  Some of its code paths use str2wcstring which passes
through e.g. invalid UTF-8.  The read-one-char-at-a-time code path doesn't.
Fix this.
2025-04-16 11:24:33 +02:00
Johannes Altmanninger
d9ba27f58f builtin read: always handle out-of-range codepoints (Rust port regression)
As mentioned in
https://github.com/fish-shell/fish-shell/pull/9688#discussion_r1155089596,
commit b77d1d0e2b (Stop crashing on invalid Unicode input, 2024-02-27), Rust's
char type doesn't support arbitrary 32-bit values.  Out-of-range Unicode
codepoints would cause crashes.  That commit addressed this by converting
the encoded bytes (e.g. UTF-8) to special private-use-area characters that
fish knows about.  It didn't bother to update the code path in builtin read
that relies on mbrtowc as well.

Fix that. Move and rename parse_codepoint() and rename/reorder its input/output
parameters.

Note that the behavior is still wrong if builtin read can't decode the
input; see the next commit.

Fixes #11383
2025-04-16 11:24:33 +02:00
Johannes Altmanninger
b061178606 Reduce parse_codepoint responsibilities, fixing alt in single-byte locale?
This also changes the single-byte locale code path to treat keyboard input
like "\x1ba" as alt-a instead of "escape,a".  I can't off-hand reproduce
a problem with "LC_ALL=C fish_key_reader", I guess we always use a UTF-8
locale if available?
2025-04-16 11:24:33 +02:00
Johannes Altmanninger
a63633edea Remove redundant code in parse_codepoint 2025-04-16 11:24:33 +02:00
Johannes Altmanninger
c740c656a8 Fix builtin test assigning wrong range to "! -d /" (Rust port regression)
Fixes #11387
2025-04-16 11:24:33 +02:00
Johannes Altmanninger
721c9a2c14 completions/set: add some special variables 2025-04-16 11:24:33 +02:00
Johannes Altmanninger
7337bfee47 completions/set: sort 2025-04-16 11:24:33 +02:00
Johannes Altmanninger
5076cfbd71 Fix a case where path canonicalization leaks trailing slash
As reported in
https://matrix.to/#/!YLTeaulxSDauOOxBoR:matrix.org/$BDVmBtBgtKCj45dVfS36rP7Y6Fo7E4uBg1vcH9IIIQg

	tmux new-session -c "" fish -C 'echo $PWD'

prints

	/home/fishuser/

This is because our path canonicalization function only
removes trailing slashes if there were duplicate slashes in the string.
Else (for the case above), we end up with "trailing == len"
which means we ignore trailing slashes.

I don't think this was intended by 24f1da7f30 (Add a fancy new
paths_are_equivalent function to test for equivalent paths instead of merely
equal ones, 2013-08-27). Fix it.
2025-04-16 11:24:33 +02:00
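
The intended behavior described in 5076cfbd71 can be sketched as below: collapse runs of slashes and strip a trailing slash even when the path had no duplicate slashes. This is a simplified stand-in written for illustration; `make_canonical` is a hypothetical name, and the real function works with a read-head and a write-head rather than building a new string.

```rust
/// Collapse duplicate slashes and remove a trailing slash, keeping a lone "/".
pub fn make_canonical(path: &str) -> String {
    let mut out = String::with_capacity(path.len());
    let mut prev_was_slash = false;
    for c in path.chars() {
        let is_slash = c == '/';
        // Skip a slash that directly follows another slash.
        if !(is_slash && prev_was_slash) {
            out.push(c);
        }
        prev_was_slash = is_slash;
    }
    // Strip the trailing slash unconditionally, not only when duplicates were seen.
    if out.len() > 1 && out.ends_with('/') {
        out.pop();
    }
    out
}
```

With this sketch, the path from the report, `/home/fishuser/`, becomes `/home/fishuser`, while `/` on its own is left unchanged.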
Johannes Altmanninger
2d506245be Rename confusing variables in path_make_canonical
The terms leading/trailing for the read-head and write-head are reasonable
but confusing in this context where trailing (slash) has another meaning.
2025-04-16 11:24:33 +02:00
Johannes Altmanninger
afa517d907 doc/terminal-compatibility: document cursor shaping sequence
While at it, also document the command to reset the shape to the default,
which we should probably use.  See foot commit 49034bb7 (csi: let CSI 0 q mean
"switch to user configured cursor style", 2019-07-22).

As of today, the XTerm documentation is not clear on this; in
XTerm itself, CSI 0 q may actually change the cursor because it has an
additional default cursor that is configured differently.
2025-04-16 11:24:33 +02:00
Johannes Altmanninger
123b262e97 fish_jj_prompt: remove not-so-useful bits
Things like branch and tag name can take up a lot of space on the screen. The
empty status may be useful but we're still looking for evidence.  For now let's
keep only the conflict status, which is fairly familiar from the Git prompt.

See also #11183
2025-04-16 11:24:33 +02:00
Johannes Altmanninger
267b16235d Fix transient prompt mode for single-line prompts
Extend a hack for multi-line prompts to the new transient prompt code path.
This fixes transient prompt with single-line prompts; added a test case.

While at it, add a test that covers the need for this hack.

Patch-by: kerty <g.kabakov@inbox.ru>

See https://github.com/fish-shell/fish-shell/pull/11153#issuecomment-2801014723
2025-04-16 11:22:14 +02:00