This also cleans up and removes unnecessary usage of FFI-oriented `feature_metadata_t`,
which is only used from Rust code after `builtins/status` was ported.
Largely routine but for the trampolines in iothread.h and iothread.cpp which
were a real PITA to get correct w/ all their variants.
Integration is complete with all old code ripped out and the tests using the
rust version of the code.
The translation is fairly direct though it adds some duplication, for example
there are multiple "match" statements that mimic function overloading.
Rust has no overloading, and we cannot have generic methods in the Node trait
(due to a Rust limitation, the error is like "cannot be made into an object")
so we include the type name in method names.
Give clients like "indent_visitor_t" a Rust companion ("IndentVisitor")
that takes care of the AST traversal while the AST consumption remains
in C++ for now. In future, "IndentVisitor" should absorb the entirety of
"indent_visitor_t". This pattern requires that "fish_indent" be exposed
includable header to the CXX bridge.
Alternatively, we could define FFI wrappers for recursive AST traversal.
Rust requires we separate the AST visitors for "mut" and "const"
scenarios. Take this opportunity to concretize both visitors:
The only client that requires mutable access is the populator. To match the
structure of the C++ populator which makes heavy use of function overloading,
we need to add a bunch of functions to the trait. Since there is no other
mutable visit, this seems acceptable.
The "const" visitors never use "will_visit_fields_of()" or
"did_visit_fields_of()", so remove them (though this is debatable).
Like in the C++ implementation, the AST nodes themselves are largely defined
via macros. Union fields like "Statement" and "ArgumentOrRedirection"
do currently not use macros but may in future.
This commit also introduces a precedent for a type that is defined in one
CXX bridge and used in another one - "ParseErrorList". To make this work
we need to manually define "ExternType".
There is one annoyance with CXX: functions that take explicit lifetime
parameters require to be marked as unsafe. This makes little sense
because functions that return `&Foo` with implicit lifetime can be
misused the same way on the C++ side.
One notable change is that we cannot directly port "find_block_open_keyword()"
(which is used to compute an error) because it relies on the stack of visited
nodes. We cannot modify a stack of node references while we do the "mut"
walk. Happily, an idiomatic solution is easy: we can tell the AST visitor
to backtrack to the parent node and create the error there.
Since "node_t::accept_base" is no longer a template we don't need the
"node_visitation_t" trampoline anymore.
The added copying at the FFI boundary makes things slower (memcpy dominates
the profile) but it's not unusable, which is good news:
$ hyperfine ./fish.{old,new}" -c 'source ../share/completions/git.fish'"
Benchmark 1: ./fish.old -c 'source ../share/completions/git.fish'
Time (mean ± σ): 195.5 ms ± 2.9 ms [User: 190.1 ms, System: 4.4 ms]
Range (min … max): 193.2 ms … 205.1 ms 15 runs
Benchmark 2: ./fish.new -c 'source ../share/completions/git.fish'
Time (mean ± σ): 677.5 ms ± 62.0 ms [User: 665.4 ms, System: 10.0 ms]
Range (min … max): 611.7 ms … 805.5 ms 10 runs
Summary
'./fish.old -c 'source ../share/completions/git.fish'' ran
3.47 ± 0.32 times faster than './fish.new -c 'source ../share/completions/git.fish''
Leftovers:
- Enum variants are still snakecase; I didn't get around to changing this yet.
- "ast_type_to_string()" still returns a snakecase name. This could be
changed since it's not user visible.
wcs2string converts a wide string to a narrow one. The result is
null-terminated and may also contain interior null-characters.
std::string allows this.
Rust's null-terminated string, CString, does not like interior null-characters.
This means we will need to use Vec<u8> or OsString for the places where we
use interior null-characters.
On the other hand, we want to use CString for places that require a
null-terminator, because other Rust types don't guarantee the null-terminator.
Turns out there is basically no overlap between the two use cases, so make
it two functions. Their equivalents in Rust will have the same name, so
we'll only need to adjust the type when porting.
This is early work but I guess there's no harm in pushing it?
Some thoughts on the conventions:
Types that live only inside Rust follow Rust naming convention
("FeatureMetadata").
Types that live on both sides of the language boundary follow the existing
naming ("feature_flag_t").
The alternative is to define a type alias ("using feature_flag_t =
rust::FeatureFlag") but that doesn't seem to be supported in "[cxx::bridge]"
blocks. We could put it in a header ("future_feature_flags.h").
"feature_metadata_t" is a variant of "FeatureMetadata" that can cross
the language boundary. This has the advantage that we can avoid tainting
"FeatureMetadata" with "CxxString" and such. This is an experimental approach,
probably not what we should do in general.
The initial port of feature flags requires a global initialization. Since
fish_indent accesses feature flags, let's make sure to initialize them here.
In future, we can stop initializing things fish_indent doesn't need (like
the topic monitor) but that's no big deal. Global initialization should
always be a benign addition.
This works around an autocxx limitations where different types cannot
have the same name even if they live in different namespace.
ast::job_t conflicts with job_t.
This reverts commit 3d8f98c395.
In addition to the issues mentioned on the GitHub page for this commit,
it also broke the CentOS 7 build.
Note one can locally test the CentOS 7 build via:
./docker/docker_run_tests.sh ./docker/centos7.Dockerfile
Be more careful with sign extension issues stemming from the differences in how
an untyped literal is promoted to an integer vs how a typed (and signed) `char`
is promoted to an integer.
Also convert some `const[expr] static xxx` to `const[expr] xxx` where it makes
sense to let the compiler deduce on its own whether or not to allocate storage
for a constant variable rather than imposing our view that it should have STATIC
storage set aside for it.
A few call sites were not making use of the `XXX_LEN` definitions and were
calling `strlen(XXX)` - these have been updated to use `const_strlen(XXX)`
instead.
I'm not sure if any toolchains will have raise any issues with these changes...
CI will tell!
Let's hope this doesn't causes build failures for e.g. musl: I just
know it's good on macOS and our Linux CI.
It's been a long time.
One fix this brings, is I discovered we #include assert.h or cassert
in a lot of places. If those ever happen to be in a file that doesn't
include common.h, or we are before common.h gets included, we're
unawaringly working with the system 'assert' macro again, which
may get disabled for debug builds or at least has different
behavior on crash. We undef 'assert' and redefine it in common.h.
Those were all eliminated, except in one catch-22 spot for
maybe.h: it can't include common.h. A fix might be to
make a fish_assert.h that *usually* common.h exports.
1. Bravely use a real enum for has_arg, despite the warnings.
2. Use some C++11 initializers so we don't have to pass an int for this
parameter.
No functional change expected here.
This introduces a new variable, $fish_color_option, that can be used
to highlight options differently.
Options are tokens starting with `-`, but only up to (and including!)
the first `--`.
Fixes#8292.
In many cases we currently discard escaped newlines, since they
are often unnecessary (when used around &|;). Escaped newlines
are useful for structuring argument lists. Allow them for variable
assignments since they are similar.
Closes#7955
fish_indent used to increment the indentation level whenever we saw an escaped
newline. This broke because of recent changes to parse_util_compute_indents().
Since parse_util_compute_indents() function already indents continuations
there is not much to do for fish_indent - we can simply query the indentation
level of the newline. Reshuffle the code since we need to pass the offset
of the newline. Maybe this can even be simplified further.
Fixes#7720
This may slightly improve performance by allowing the compiler greater
visibility into what is happing on top of not executing at runtime in
some hot paths, but more importantly, it gets rid of magic constants in a
few different places.
This introduces a new variable $fish_color_keyword that will be used
to highlight keywords. If it's not defined, we fall back on
$fish_color_command as before.
An issue here is that most of our keywords have this weird duality of
also being builtins *if* executed without an argument or with
`--help`.
This means that e.g.
if
is highlighted as a command until you start typing
if t
and then it turns keyword.
Now that we have multiple clients of count_preceding_backslashes, factor
it out from fish_indent into wcstringutil.h, and then use the shared
implementation.
It could be nice to use a heuristic for this in future, but for now let's
stick to the old behavior so we can keep formatting scripts without occasional
bad formatting changes.
A heuristic could also be used to break lines after |, && or || but I don't
think there is much need for that at the moment.
Closes#7252
This used to be used to determine which token contained the cursor, so
as to highlight potential paths. But now we highlight all potential paths,
so we can remove the field.
Also return the number of failed files.
I decided to *just* print the filenames (newline-separated because
NULLs are annoying here) to make it easier to deal with.
See #7251.
Prior to this change, when emitting gap text (comments, newlines, etc),
fish_indent would use the indentation of the text at the end of the gap.
But this has the wrong result for this case:
begin
command
# comment
end
as the comment would get the indent of the 'end'. Instead use the indent
computed for the gap text itself.
Addresses one case of #7252.
Prior to this change, fish would "resolve" highlight specs to rgb colors
right before use. This requires a series of variable lookups; profiling
showed 30% of draw time was spent here.
Switch to caching these (within a single redraw only).
fish_color_match is a variable which controls syntax highlighting for
matching quotes and parens, but only with interactive `read` with shell
highlighting disabled. It seems unlikely that anybody cares about this.
This switches fish_indent from parsing with parse_tree
to the new ast.
This is the most difficult transition because the new ast retains less
lexical information than the old parse tree. The strategy is:
1. Use parse_util_compute_indents to compute indenting for each token.
2. Compute the "gap text" between the text of significant tokens. This
contains whitespace, comments, etc.
3. "Fix up" the gap text while leaving the significant tokens alone.
This is the first commit of a series intended to replace the existing
"parse tree" machinery. It adds a new abstract syntax tree and uses a more
normal recursive descent parser.
Initially there are no users of the new ast. The following commits will
replace parse_tree -> ast for all usages.