mirror of
https://github.com/fish-shell/fish-shell.git
synced 2026-05-30 03:01:15 -03:00
Allow { } for command grouping, like begin / end
For compound commands we already have begin/end but
> it is long, which is not convenient for the command line
> it is different than {} which shell users have been using for >50 years
The difference from {} can break muscle memory and add extra steps
when I'm trying to write simple commands that work in any shell.
Fix that by embracing the traditional style too.
---
Since { and } have always been special syntax in fish, we can also
allow
{ }
{ echo }
which I find intuitive even without having used a shell that supports
this (like zsh). The downside is that this doesn't work in some other
shells. The upside is in aesthetics and convenience (this is for
interactive use). Not completely sure about this.
---
This implementation adds a hack to the tokenizer: '{' is usually a
brace expansion. Make it a compound command when in command position
(not something the tokenizer would normally know). We need to disable
this when parsing a freestanding argument list (as in 'complete somecmd
-a "{true,false}"'). It's not really clear what "read -t" should do.
For now, keep the existing behavior (don't parse compound statements).
Add another hack to increase backwards compatibility: parse something
like "{ foo }" as a brace statement only if it has a space after
the opening brace. This style is less likely to be used for brace
expansion. Perhaps we can change this in future (I'll make a PR).
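A minimal sketch of the two tokenizer hacks described above, in Rust.
This is not fish's tokenizer: Tok, opens_brace_statement and tokenize
are hypothetical names, words are split on whitespace and ';' only, and
treating a '{' at the very end of the input as a plain string is an
assumption.

```rust
/// Hypothetical token kinds for this sketch (not fish's real `TokenType`).
#[derive(Debug, PartialEq)]
enum Tok {
    LeftBrace,
    RightBrace,
    Str(String),
    End,
}

/// Assumed heuristic: '{' opens a brace statement only if whitespace follows it.
fn opens_brace_statement(next: Option<char>) -> bool {
    next.is_some_and(|c| c.is_whitespace())
}

/// Tokenize `src`. With `argument_list` set (think `complete -a "..."`),
/// nothing is ever in command position, so braces stay ordinary strings.
fn tokenize(src: &str, argument_list: bool) -> Vec<Tok> {
    let chars: Vec<char> = src.chars().collect();
    let mut out = Vec::new();
    let mut at_command = !argument_list;
    let mut depth = 0usize; // number of currently open brace statements
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c == ' ' || c == '\t' {
            i += 1;
        } else if c == ';' || c == '\n' {
            out.push(Tok::End);
            at_command = !argument_list; // a new statement starts here
            i += 1;
        } else if c == '{' && at_command && opens_brace_statement(chars.get(i + 1).copied()) {
            out.push(Tok::LeftBrace);
            depth += 1;
            i += 1; // the next word is still in command position
        } else if c == '}' && depth > 0 {
            out.push(Tok::RightBrace);
            depth -= 1;
            at_command = false;
            i += 1;
        } else {
            // Ordinary word, including brace expansions such as `{a,b}`.
            let start = i;
            while i < chars.len() && !matches!(chars[i], ' ' | '\t' | ';' | '\n') {
                i += 1;
            }
            out.push(Tok::Str(chars[start..i].iter().collect()));
            at_command = false;
        }
    }
    out
}
```

In this sketch, tokenize("{ echo hi }", false) yields a left brace, two
strings and a right brace, while tokenize("{true,false}", true) and
tokenize("{echo,hi}", false) each yield a single string token.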
Use separate terminal token types for braces; we could make the
left brace an ordinary string token but since string tokens undergo
unescaping during expansion etc., every such place would need to know
whether it's dealing with a command or an argument. Certainly possible
but it seems simpler (especially for tab-completions) to strip braces
in the parser. We could change this.
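To illustrate the design choice above: with dedicated brace tokens the
parser can consume the braces itself, so nothing downstream has to ask
whether a '{' string is a command or an argument. A hedged sketch, not
fish's Populator; Tok, Stmt, parse and parse_jobs are hypothetical
names, and arguments or redirections after '}' are ignored.

```rust
/// Hypothetical token kinds for this sketch (not fish's real `ParseTokenType`).
#[derive(Debug, PartialEq)]
enum Tok {
    LeftBrace,
    RightBrace,
    Str(String),
    End,
}

/// Hypothetical statement tree: braces are gone after parsing.
#[derive(Debug, PartialEq)]
enum Stmt {
    Command(Vec<String>),
    Brace(Vec<Stmt>),
}

/// Parse a job list. `in_brace` says whether a closing '}' is expected.
fn parse_jobs(toks: &[Tok], pos: &mut usize, in_brace: bool) -> Result<Vec<Stmt>, String> {
    let mut jobs = Vec::new();
    loop {
        match toks.get(*pos) {
            None => {
                return if in_brace {
                    Err("expected '}'".into())
                } else {
                    Ok(jobs)
                };
            }
            // Statement separators carry no content of their own.
            Some(Tok::End) => *pos += 1,
            Some(Tok::RightBrace) => {
                if !in_brace {
                    return Err("'}' does not match any '{'".into());
                }
                *pos += 1;
                return Ok(jobs);
            }
            Some(Tok::LeftBrace) => {
                *pos += 1;
                jobs.push(Stmt::Brace(parse_jobs(toks, pos, true)?));
            }
            Some(Tok::Str(_)) => {
                // Collect the command word and its arguments.
                let mut words = Vec::new();
                while let Some(Tok::Str(w)) = toks.get(*pos) {
                    words.push(w.clone());
                    *pos += 1;
                }
                jobs.push(Stmt::Command(words));
            }
        }
    }
}

/// Parse a whole token stream as a top-level job list.
fn parse(toks: &[Tok]) -> Result<Vec<Stmt>, String> {
    let mut pos = 0;
    parse_jobs(toks, &mut pos, false)
}
```

Feeding the tokens of "{ echo hi }" to parse returns one Brace
statement wrapping one Command; a lone '}' or an unclosed '{' returns
an error.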
---
In future we could allow the following alternative syntax (which is
invalid today).
if true {
}
if true; {
}
Closes #10895
Closes #10898
@@ -3,9 +3,11 @@ fish 4.1.0 (released ???)

 Notable improvements and fixes
 ------------------------------
+- Compound commands (``begin; echo 1; echo 2; end``) can now be abbreviated using braces (``{ echo 1; echo 2 }``), like in other shells.

 Deprecations and removed features
 ---------------------------------
+- Tokens like ``{ echo, echo }`` in command position are no longer interpreted as brace expansion but as a compound command.

 Scripting improvements
 ----------------------
@@ -9,6 +9,7 @@ Synopsis
 .. synopsis::

     begin; [COMMANDS ...]; end
+    { [COMMANDS ...] }

 Description
 -----------
@@ -21,6 +22,8 @@ The block is unconditionally executed. ``begin; ...; end`` is equivalent to ``if

 ``begin`` does not change the current exit status itself. After the block has completed, ``$status`` will be set to the status returned by the most recent command.

+Some other shells only support the ``{ [COMMANDS ...] ; }`` notation.
+
 The **-h** or **--help** option displays help about using this command.

 Example
@@ -53,7 +53,7 @@ lexer_rules = [
     # Hack: treat the "[ expr ]" alias of builtin test as command token (not as grammar
     # metacharacter). This works because we write it without spaces in the grammar (like
     # "[OPTIONS]").
-    (r"\. |! |\[ | \]", Name.Constant),
+    (r"\. |! |\[ | \]|\{ | \}", Name.Constant),
     # Statement separators.
     (r"\n", Text.Whitespace),
     (r";", Punctuation),
118
src/ast.rs
@@ -21,7 +21,7 @@
 use crate::tests::prelude::*;
 use crate::tokenizer::{
     variable_assignment_equals_pos, TokFlags, TokenType, Tokenizer, TokenizerError,
-    TOK_ACCEPT_UNFINISHED, TOK_CONTINUE_AFTER_ERROR, TOK_SHOW_COMMENTS,
+    TOK_ACCEPT_UNFINISHED, TOK_ARGUMENT_LIST, TOK_CONTINUE_AFTER_ERROR, TOK_SHOW_COMMENTS,
 };
 use crate::wchar::prelude::*;
 use std::borrow::Cow;
@@ -203,6 +203,9 @@ fn as_begin_header(&self) -> Option<&BeginHeader> {
|
||||
fn as_block_statement(&self) -> Option<&BlockStatement> {
|
||||
None
|
||||
}
|
||||
fn as_brace_statement(&self) -> Option<&BraceStatement> {
|
||||
None
|
||||
}
|
||||
fn as_if_clause(&self) -> Option<&IfClause> {
|
||||
None
|
||||
}
|
||||
@@ -321,6 +324,9 @@ fn as_mut_begin_header(&mut self) -> Option<&mut BeginHeader> {
|
||||
fn as_mut_block_statement(&mut self) -> Option<&mut BlockStatement> {
|
||||
None
|
||||
}
|
||||
fn as_mut_brace_statement(&mut self) -> Option<&mut BraceStatement> {
|
||||
None
|
||||
}
|
||||
fn as_mut_if_clause(&mut self) -> Option<&mut IfClause> {
|
||||
None
|
||||
}
|
||||
@@ -1028,6 +1034,9 @@ macro_rules! set_parent_of_union_field {
|
||||
} else if matches!($self.$field_name, StatementVariant::BlockStatement(_)) {
|
||||
$self.$field_name.as_mut_block_statement().parent = Some($self);
|
||||
$self.$field_name.as_mut_block_statement().set_parents();
|
||||
} else if matches!($self.$field_name, StatementVariant::BraceStatement(_)) {
|
||||
$self.$field_name.as_mut_brace_statement().parent = Some($self);
|
||||
$self.$field_name.as_mut_brace_statement().set_parents();
|
||||
} else if matches!($self.$field_name, StatementVariant::IfStatement(_)) {
|
||||
$self.$field_name.as_mut_if_statement().parent = Some($self);
|
||||
$self.$field_name.as_mut_if_statement().set_parents();
|
||||
@@ -1247,11 +1256,12 @@ impl CheckParse for JobConjunction {
|
||||
fn can_be_parsed(pop: &mut Populator<'_>) -> bool {
|
||||
let token = pop.peek_token(0);
|
||||
// These keywords end a job list.
|
||||
token.typ == ParseTokenType::string
|
||||
token.typ == ParseTokenType::left_brace
|
||||
|| (token.typ == ParseTokenType::string
|
||||
&& !matches!(
|
||||
token.keyword,
|
||||
ParseKeyword::kw_case | ParseKeyword::kw_end | ParseKeyword::kw_else
|
||||
)
|
||||
))
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1399,6 +1409,37 @@ fn as_mut_block_statement(&mut self) -> Option<&mut BlockStatement> {
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Default, Debug)]
|
||||
pub struct BraceStatement {
|
||||
parent: Option<*const dyn Node>,
|
||||
/// The opening brace, in command position.
|
||||
pub left_brace: TokenLeftBrace,
|
||||
/// List of jobs in this block.
|
||||
pub jobs: JobList,
|
||||
/// The closing brace.
|
||||
pub right_brace: TokenRightBrace,
|
||||
/// Arguments and redirections associated with the block.
|
||||
pub args_or_redirs: ArgumentOrRedirectionList,
|
||||
}
|
||||
implement_node!(BraceStatement, branch, brace_statement);
|
||||
implement_acceptor_for_branch!(
|
||||
BraceStatement,
|
||||
(left_brace: (TokenLeftBrace)),
|
||||
(jobs: (JobList)),
|
||||
(right_brace: (TokenRightBrace)),
|
||||
(args_or_redirs: (ArgumentOrRedirectionList)),
|
||||
);
|
||||
impl ConcreteNode for BraceStatement {
|
||||
fn as_brace_statement(&self) -> Option<&BraceStatement> {
|
||||
Some(self)
|
||||
}
|
||||
}
|
||||
impl ConcreteNodeMut for BraceStatement {
|
||||
fn as_mut_brace_statement(&mut self) -> Option<&mut BraceStatement> {
|
||||
Some(self)
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Default, Debug)]
|
||||
pub struct IfClause {
|
||||
parent: Option<*const dyn Node>,
|
||||
@@ -1772,7 +1813,10 @@ fn can_be_parsed(pop: &mut Populator<'_>) -> bool {
|
||||
// Check that the argument to and/or is a string that's not help. Otherwise
|
||||
// it's either 'and --help' or a naked 'and', and not part of this list.
|
||||
let next_token = pop.peek_token(1);
|
||||
next_token.typ == ParseTokenType::string && !next_token.is_help_argument
|
||||
matches!(
|
||||
next_token.typ,
|
||||
ParseTokenType::string | ParseTokenType::left_brace
|
||||
) && !next_token.is_help_argument
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1894,7 +1938,7 @@ fn can_be_parsed(pop: &mut Populator<'_>) -> bool {
         // What is the token after it?
         match pop.peek_type(1) {
             // We have `a= cmd` and should treat it as a variable assignment.
-            ParseTokenType::string => true,
+            ParseTokenType::string | ParseTokenType::left_brace => true,
             // We have `a=` which is OK if we are allowing incomplete, an error otherwise.
             ParseTokenType::terminate => pop.allow_incomplete(),
             // We have e.g. `a= >` which is an error.
@@ -1966,6 +2010,8 @@ fn can_be_parsed(pop: &mut Populator<'_>) -> bool {
|
||||
define_token_node!(TokenBackground, background);
|
||||
define_token_node!(TokenConjunction, andand, oror);
|
||||
define_token_node!(TokenPipe, pipe);
|
||||
define_token_node!(TokenLeftBrace, left_brace);
|
||||
define_token_node!(TokenRightBrace, right_brace);
|
||||
define_token_node!(TokenRedirection, redirection);
|
||||
|
||||
define_keyword_node!(DecoratedStatementDecorator, kw_command, kw_builtin, kw_exec);
|
||||
@@ -2236,6 +2282,7 @@ pub enum StatementVariant {
|
||||
None,
|
||||
NotStatement(Box<NotStatement>),
|
||||
BlockStatement(Box<BlockStatement>),
|
||||
BraceStatement(Box<BraceStatement>),
|
||||
IfStatement(Box<IfStatement>),
|
||||
SwitchStatement(Box<SwitchStatement>),
|
||||
DecoratedStatement(DecoratedStatement),
|
||||
@@ -2253,6 +2300,7 @@ fn accept<'a>(&'a self, visitor: &mut dyn NodeVisitor<'a>, reversed: bool) {
|
||||
StatementVariant::None => panic!("cannot visit null statement"),
|
||||
StatementVariant::NotStatement(node) => node.accept(visitor, reversed),
|
||||
StatementVariant::BlockStatement(node) => node.accept(visitor, reversed),
|
||||
StatementVariant::BraceStatement(node) => node.accept(visitor, reversed),
|
||||
StatementVariant::IfStatement(node) => node.accept(visitor, reversed),
|
||||
StatementVariant::SwitchStatement(node) => node.accept(visitor, reversed),
|
||||
StatementVariant::DecoratedStatement(node) => node.accept(visitor, reversed),
|
||||
@@ -2265,6 +2313,7 @@ fn accept_mut(&mut self, visitor: &mut dyn NodeVisitorMut, reversed: bool) {
|
||||
StatementVariant::None => panic!("cannot visit null statement"),
|
||||
StatementVariant::NotStatement(node) => node.accept_mut(visitor, reversed),
|
||||
StatementVariant::BlockStatement(node) => node.accept_mut(visitor, reversed),
|
||||
StatementVariant::BraceStatement(node) => node.accept_mut(visitor, reversed),
|
||||
StatementVariant::IfStatement(node) => node.accept_mut(visitor, reversed),
|
||||
StatementVariant::SwitchStatement(node) => node.accept_mut(visitor, reversed),
|
||||
StatementVariant::DecoratedStatement(node) => node.accept_mut(visitor, reversed),
|
||||
@@ -2292,6 +2341,12 @@ pub fn as_block_statement(&self) -> Option<&BlockStatement> {
|
||||
_ => None,
|
||||
}
|
||||
}
|
||||
pub fn as_brace_statement(&self) -> Option<&BraceStatement> {
|
||||
match self {
|
||||
StatementVariant::BraceStatement(node) => Some(node),
|
||||
_ => None,
|
||||
}
|
||||
}
|
||||
pub fn as_if_statement(&self) -> Option<&IfStatement> {
|
||||
match self {
|
||||
StatementVariant::IfStatement(node) => Some(node),
|
||||
@@ -2316,6 +2371,7 @@ fn embedded_node(&self) -> &dyn NodeMut {
|
||||
StatementVariant::None => panic!("cannot visit null statement"),
|
||||
StatementVariant::NotStatement(node) => &**node,
|
||||
StatementVariant::BlockStatement(node) => &**node,
|
||||
StatementVariant::BraceStatement(node) => &**node,
|
||||
StatementVariant::IfStatement(node) => &**node,
|
||||
StatementVariant::SwitchStatement(node) => &**node,
|
||||
StatementVariant::DecoratedStatement(node) => node,
|
||||
@@ -2333,6 +2389,12 @@ fn as_mut_block_statement(&mut self) -> &mut BlockStatement {
|
||||
_ => panic!(),
|
||||
}
|
||||
}
|
||||
fn as_mut_brace_statement(&mut self) -> &mut BraceStatement {
|
||||
match self {
|
||||
StatementVariant::BraceStatement(node) => node,
|
||||
_ => panic!(),
|
||||
}
|
||||
}
|
||||
fn as_mut_if_statement(&mut self) -> &mut IfStatement {
|
||||
match self {
|
||||
StatementVariant::IfStatement(node) => node,
|
||||
@@ -2371,6 +2433,7 @@ pub fn ast_type_to_string(t: Type) -> &'static wstr {
|
||||
Type::function_header => L!("function_header"),
|
||||
Type::begin_header => L!("begin_header"),
|
||||
Type::block_statement => L!("block_statement"),
|
||||
Type::brace_statement => L!("brace_statement"),
|
||||
Type::if_clause => L!("if_clause"),
|
||||
Type::elseif_clause => L!("elseif_clause"),
|
||||
Type::elseif_clause_list => L!("elseif_clause_list"),
|
||||
@@ -2629,13 +2692,17 @@ impl<'a> TokenStream<'a> {
|
||||
// The maximum number of lookahead supported.
|
||||
const MAX_LOOKAHEAD: usize = 2;
|
||||
|
||||
fn new(src: &'a wstr, flags: ParseTreeFlags) -> Self {
|
||||
fn new(src: &'a wstr, flags: ParseTreeFlags, freestanding_arguments: bool) -> Self {
|
||||
let mut flags = TokFlags::from(flags);
|
||||
if freestanding_arguments {
|
||||
flags |= TOK_ARGUMENT_LIST;
|
||||
}
|
||||
Self {
|
||||
lookahead: [ParseToken::new(ParseTokenType::invalid); Self::MAX_LOOKAHEAD],
|
||||
start: 0,
|
||||
count: 0,
|
||||
src,
|
||||
tok: Tokenizer::new(src, TokFlags::from(flags)),
|
||||
tok: Tokenizer::new(src, flags),
|
||||
comment_ranges: vec![],
|
||||
}
|
||||
}
|
||||
@@ -2931,6 +2998,20 @@ fn did_visit_fields_of<'a>(&'a mut self, node: &'a dyn NodeMut, flow: VisitResul
|
||||
return;
|
||||
};
|
||||
|
||||
let token = &error.token;
|
||||
// To-do: maybe extend this to other tokenizer errors?
|
||||
if token.typ == ParseTokenType::tokenizer_error
|
||||
&& token.tok_error == TokenizerError::closing_unopened_brace
|
||||
{
|
||||
parse_error_range!(
|
||||
self,
|
||||
token.range(),
|
||||
ParseErrorCode::unbalancing_brace,
|
||||
"%s",
|
||||
<TokenizerError as Into<&wstr>>::into(token.tok_error)
|
||||
);
|
||||
}
|
||||
|
||||
// We believe the node is some sort of block statement. Attempt to find a source range
|
||||
// for the block's keyword (for, if, etc) and a user-presentable description. This
|
||||
// is used to provide better error messages. Note at this point the parse tree is
|
||||
@@ -2989,7 +3070,7 @@ fn did_visit_fields_of<'a>(&'a mut self, node: &'a dyn NodeMut, flow: VisitResul
|
||||
} else {
|
||||
parse_error!(
|
||||
self,
|
||||
error.token,
|
||||
token,
|
||||
ParseErrorCode::generic,
|
||||
"Expected %ls, but found %ls",
|
||||
keywords_user_presentable_description(error.allowed_keywords),
|
||||
@@ -3095,7 +3176,7 @@ fn new(
             flags,
             semis: vec![],
             errors: vec![],
-            tokens: TokenStream::new(src, flags),
+            tokens: TokenStream::new(src, flags, top_type == Type::freestanding_argument_list),
             top_type,
             unwinding: false,
             any_error: false,
@@ -3550,6 +3631,19 @@ fn got_error(slf: &mut Populator<'_>) -> StatementVariant {
|
||||
|
||||
fn new_decorated_statement(slf: &mut Populator<'_>) -> StatementVariant {
|
||||
let embedded = slf.allocate_visit::<DecoratedStatement>();
|
||||
if !slf.unwinding && slf.peek_token(0).typ == ParseTokenType::left_brace {
|
||||
parse_error!(
|
||||
slf,
|
||||
slf.peek_token(0),
|
||||
ParseErrorCode::generic,
|
||||
"Expected %s, but found %ls",
|
||||
token_type_user_presentable_description(
|
||||
ParseTokenType::end,
|
||||
ParseKeyword::none
|
||||
),
|
||||
slf.peek_token(0).user_presentable_description()
|
||||
);
|
||||
}
|
||||
StatementVariant::DecoratedStatement(*embedded)
|
||||
}
|
||||
|
||||
@@ -3557,6 +3651,9 @@ fn new_decorated_statement(slf: &mut Populator<'_>) -> StatementVariant {
|
||||
// This may happen if we just have a 'time' prefix.
|
||||
// Construct a decorated statement, which will be unsourced.
|
||||
self.allocate_visit::<DecoratedStatement>();
|
||||
} else if self.peek_token(0).typ == ParseTokenType::left_brace {
|
||||
let embedded = self.allocate_visit::<BraceStatement>();
|
||||
return StatementVariant::BraceStatement(embedded);
|
||||
} else if self.peek_token(0).typ != ParseTokenType::string {
|
||||
// We may be unwinding already; do not produce another error.
|
||||
// For example in `true | and`.
|
||||
@@ -3957,6 +4054,8 @@ fn from(token_type: TokenType) -> Self {
|
||||
TokenType::oror => ParseTokenType::oror,
|
||||
TokenType::end => ParseTokenType::end,
|
||||
TokenType::background => ParseTokenType::background,
|
||||
TokenType::left_brace => ParseTokenType::left_brace,
|
||||
TokenType::right_brace => ParseTokenType::right_brace,
|
||||
TokenType::redirect => ParseTokenType::redirection,
|
||||
TokenType::error => ParseTokenType::tokenizer_error,
|
||||
TokenType::comment => ParseTokenType::comment,
|
||||
@@ -4042,6 +4141,7 @@ pub enum Type {
|
||||
function_header,
|
||||
begin_header,
|
||||
block_statement,
|
||||
brace_statement,
|
||||
if_clause,
|
||||
elseif_clause,
|
||||
elseif_clause_list,
|
||||
|
||||
@@ -22,6 +22,7 @@
|
||||
use std::ops::Range;
|
||||
|
||||
/// Which part of the commandbuffer are we operating on.
|
||||
#[derive(Eq, PartialEq)]
|
||||
enum TextScope {
|
||||
String,
|
||||
Job,
|
||||
@@ -103,6 +104,7 @@ fn replace_part(
|
||||
fn write_part(
|
||||
parser: &Parser,
|
||||
range: Range<usize>,
|
||||
range_is_single_token: bool,
|
||||
cut_at_cursor: bool,
|
||||
token_mode: Option<TokenMode>,
|
||||
buffer: &wstr,
|
||||
@@ -121,19 +123,8 @@ fn write_part(
|
||||
return;
|
||||
};
|
||||
|
||||
let buff = &buffer[range];
|
||||
let mut tok = Tokenizer::new(buff, TOK_ACCEPT_UNFINISHED);
|
||||
let mut args = vec![];
|
||||
while let Some(token) = tok.next() {
|
||||
if cut_at_cursor && token.end() >= pos {
|
||||
break;
|
||||
}
|
||||
if token.type_ != TokenType::string {
|
||||
continue;
|
||||
}
|
||||
|
||||
let token_text = tok.text_of(&token);
|
||||
|
||||
let mut add_token = |token_text: &wstr| {
|
||||
match token_mode {
|
||||
TokenMode::Expanded => {
|
||||
const COMMANDLINE_TOKENS_MAX_EXPANSION: usize = 512;
|
||||
@@ -175,7 +166,26 @@ fn write_part(
|
||||
args.push(Completion::from_completion(unescaped));
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
let buff = &buffer[range];
|
||||
if range_is_single_token {
|
||||
add_token(buff);
|
||||
} else {
|
||||
let mut tok = Tokenizer::new(buff, TOK_ACCEPT_UNFINISHED);
|
||||
while let Some(token) = tok.next() {
|
||||
if cut_at_cursor && token.end() >= pos {
|
||||
break;
|
||||
}
|
||||
if token.type_ != TokenType::string {
|
||||
continue;
|
||||
}
|
||||
|
||||
let token_text = tok.text_of(&token);
|
||||
add_token(token_text);
|
||||
}
|
||||
};
|
||||
|
||||
for arg in args {
|
||||
streams.out.appendln(arg.completion);
|
||||
}
|
||||
@@ -642,6 +652,7 @@ pub fn commandline(parser: &Parser, streams: &mut IoStreams, args: &mut [&wstr])
|
||||
write_part(
|
||||
parser,
|
||||
range,
|
||||
buffer_part == TextScope::Token,
|
||||
cut_at_cursor,
|
||||
token_mode,
|
||||
current_buffer,
|
||||
|
||||
@@ -76,6 +76,9 @@ struct PrettyPrinterState<'source, 'ast> {
|
||||
// present in the ast.
|
||||
gaps: Vec<SourceRange>,
|
||||
|
||||
// Sorted set of source offsets of brace statements that span multiple lines.
|
||||
multi_line_brace_statement_locations: Vec<usize>,
|
||||
|
||||
// The sorted set of source offsets of nl_semi_t which should be set as semis, not newlines.
|
||||
// This is computed ahead of time for convenience.
|
||||
preferred_semi_locations: Vec<usize>,
|
||||
@@ -120,11 +123,14 @@ fn new(source: &'source wstr, do_indent: bool) -> Self {
|
||||
// Start with true to ignore leading empty lines.
|
||||
gap_text_mask_newline: true,
|
||||
gaps: vec![],
|
||||
multi_line_brace_statement_locations: vec![],
|
||||
preferred_semi_locations: vec![],
|
||||
errors: None,
|
||||
},
|
||||
};
|
||||
zelf.state.gaps = zelf.compute_gaps();
|
||||
zelf.state.multi_line_brace_statement_locations =
|
||||
zelf.compute_multi_line_brace_statement_locations();
|
||||
zelf.state.preferred_semi_locations = zelf.compute_preferred_semi_locations();
|
||||
zelf
|
||||
}
|
||||
@@ -224,6 +230,23 @@ fn compute_preferred_semi_locations(&self) -> Vec<usize> {
|
||||
}
|
||||
}
|
||||
|
||||
// `{ x; y; }` gets semis if the input uses semis and it spans only one line.
|
||||
for node in Traversal::new(self.ast.top()) {
|
||||
let Some(brace_statement) = node.as_brace_statement() else {
|
||||
continue;
|
||||
};
|
||||
if self
|
||||
.state
|
||||
.multi_line_brace_statement_locations
|
||||
.binary_search(&brace_statement.source_range().start())
|
||||
.is_err()
|
||||
{
|
||||
for job in &brace_statement.jobs {
|
||||
job.semi_nl.as_ref().map(&mut mark_semi_from_input);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// `x ; and y` gets semis if it has them already, and they are on the same line.
|
||||
for node in Traversal::new(self.ast.top()) {
|
||||
let Some(job_list) = node.as_job_list() else {
|
||||
@@ -259,9 +282,41 @@ fn compute_preferred_semi_locations(&self) -> Vec<usize> {
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
result.sort_unstable();
|
||||
result
|
||||
}
|
||||
|
||||
fn compute_multi_line_brace_statement_locations(&self) -> Vec<usize> {
|
||||
let mut result = vec![];
|
||||
let newline_offsets: Vec<usize> = self
|
||||
.state
|
||||
.source
|
||||
.char_indices()
|
||||
.filter_map(|(i, c)| (c == '\n').then_some(i))
|
||||
.collect();
|
||||
let mut next_newline = 0;
|
||||
for node in Traversal::new(self.ast.top()) {
|
||||
let Some(brace_statement) = node.as_brace_statement() else {
|
||||
continue;
|
||||
};
|
||||
while next_newline != newline_offsets.len()
|
||||
&& newline_offsets[next_newline] < brace_statement.source_range().start()
|
||||
{
|
||||
next_newline += 1;
|
||||
}
|
||||
let contains_newline = next_newline != newline_offsets.len() && {
|
||||
let newline_offset = newline_offsets[next_newline];
|
||||
assert!(newline_offset >= brace_statement.source_range().start());
|
||||
newline_offset < brace_statement.source_range().end()
|
||||
};
|
||||
if contains_newline {
|
||||
result.push(brace_statement.source_range().start());
|
||||
}
|
||||
}
|
||||
assert!(result.is_sorted_by(|l, r| Some(l.cmp(r))));
|
||||
result
|
||||
}
|
||||
}
|
||||
|
||||
impl<'source, 'ast> PrettyPrinterState<'source, 'ast> {
|
||||
@@ -617,6 +672,42 @@ fn visit_semi_nl(&mut self, node: &dyn ast::Token) {
|
||||
}
|
||||
}
|
||||
|
||||
fn is_multi_line_brace(&self, node: &dyn ast::Token) -> bool {
|
||||
node.parent()
|
||||
.unwrap()
|
||||
.as_brace_statement()
|
||||
.is_some_and(|brace_statement| {
|
||||
self.multi_line_brace_statement_locations
|
||||
.binary_search(&brace_statement.source_range().start())
|
||||
.is_ok()
|
||||
})
|
||||
}
|
||||
fn visit_left_brace(&mut self, node: &dyn ast::Token) {
|
||||
let range = node.source_range();
|
||||
let flags = self.gap_text_flags_before_node(node.as_node());
|
||||
if self.is_multi_line_brace(node) && !self.at_line_start() {
|
||||
self.emit_newline();
|
||||
}
|
||||
self.current_indent = self.indent(range.start());
|
||||
self.emit_space_or_indent(flags);
|
||||
self.output.push('{');
|
||||
}
|
||||
fn visit_right_brace(&mut self, node: &dyn ast::Token) {
|
||||
let range = node.source_range();
|
||||
let flags = self.gap_text_flags_before_node(node.as_node());
|
||||
self.emit_gap_text_before(range, flags);
|
||||
if self.is_multi_line_brace(node) {
|
||||
self.current_indent = self.indent(range.start());
|
||||
if !self.at_line_start() {
|
||||
self.emit_newline();
|
||||
}
|
||||
self.emit_space_or_indent(flags);
|
||||
self.output.push('}');
|
||||
} else {
|
||||
self.emit_node_text(node.as_node());
|
||||
}
|
||||
}
|
||||
|
||||
fn visit_redirection(&mut self, node: &ast::Redirection) {
|
||||
// No space between a redirection operator and its target (#2899).
|
||||
let Some(orange) = node.oper.range() else {
|
||||
@@ -684,11 +775,12 @@ fn visit(&mut self, node: &'_ dyn Node) {
|
||||
return;
|
||||
}
|
||||
if let Some(token) = node.as_token() {
|
||||
if token.token_type() == ParseTokenType::end {
|
||||
self.visit_semi_nl(token);
|
||||
return;
|
||||
match token.token_type() {
|
||||
ParseTokenType::end => self.visit_semi_nl(token),
|
||||
ParseTokenType::left_brace => self.visit_left_brace(token),
|
||||
ParseTokenType::right_brace => self.visit_right_brace(token),
|
||||
_ => self.emit_node_text(node),
|
||||
}
|
||||
self.emit_node_text(node);
|
||||
return;
|
||||
}
|
||||
match node.typ() {
|
||||
|
||||
@@ -20,6 +20,7 @@
|
||||
use crate::reader::{reader_pop, reader_push, reader_readline};
|
||||
use crate::tokenizer::Tokenizer;
|
||||
use crate::tokenizer::TOK_ACCEPT_UNFINISHED;
|
||||
use crate::tokenizer::TOK_ARGUMENT_LIST;
|
||||
use crate::wcstringutil::split_about;
|
||||
use crate::wcstringutil::split_string_tok;
|
||||
use crate::wutil;
|
||||
@@ -644,7 +645,7 @@ pub fn read(parser: &Parser, streams: &mut IoStreams, argv: &mut [&wstr]) -> Opt
         }

         if opts.tokenize {
-            let mut tok = Tokenizer::new(&buff, TOK_ACCEPT_UNFINISHED);
+            let mut tok = Tokenizer::new(&buff, TOK_ACCEPT_UNFINISHED | TOK_ARGUMENT_LIST);
             if opts.array {
                 // Array mode: assign each token as a separate element of the sole var.
                 let mut tokens = vec![];
@@ -13,6 +13,7 @@
|
||||
ast::unescape_keyword,
|
||||
common::charptr2wcstring,
|
||||
reader::{get_quote, is_backslashed},
|
||||
tokenizer::is_brace_statement,
|
||||
util::wcsfilecmp,
|
||||
wutil::sprintf,
|
||||
};
|
||||
@@ -663,7 +664,20 @@ fn perform_for_commandline_impl(&mut self, cmdline: WString) {
|
||||
|
||||
// Get all the arguments.
|
||||
let mut tokens = Vec::new();
|
||||
{
|
||||
let proc_range =
|
||||
parse_util_process_extent(&cmdline, position_in_statement, Some(&mut tokens));
|
||||
let start = proc_range.start;
|
||||
if start != 0
|
||||
&& cmdline.as_char_slice()[start - 1] == '{'
|
||||
&& (start == cmdline.len()
|
||||
|| !is_brace_statement(cmdline.as_char_slice().get(start).copied()))
|
||||
{
|
||||
// We don't want to suggest commands here, since this command line parses as
|
||||
// brace expansion.
|
||||
return;
|
||||
}
|
||||
}
|
||||
let actual_token_count = tokens.len();
|
||||
|
||||
// Hack: fix autosuggestion by removing prefixing "and"s #6249.
|
||||
|
||||
@@ -1,8 +1,9 @@
 //! Functions for syntax highlighting.
 use crate::abbrs::{self, with_abbrs};
 use crate::ast::{
-    self, Argument, Ast, BlockStatement, BlockStatementHeaderVariant, DecoratedStatement, Keyword,
-    Leaf, List, Node, NodeVisitor, Redirection, Token, Type, VariableAssignment,
+    self, Argument, Ast, BlockStatement, BlockStatementHeaderVariant, BraceStatement,
+    DecoratedStatement, Keyword, Leaf, List, Node, NodeVisitor, Redirection, Token, Type,
+    VariableAssignment,
 };
 use crate::builtins::shared::builtin_exists;
 use crate::color::RgbColor;
@@ -869,6 +870,9 @@ fn visit_token(&mut self, tok: &dyn Token) {
|
||||
ParseTokenType::end | ParseTokenType::pipe | ParseTokenType::background => {
|
||||
role = HighlightRole::statement_terminator
|
||||
}
|
||||
ParseTokenType::left_brace | ParseTokenType::right_brace => {
|
||||
role = HighlightRole::keyword;
|
||||
}
|
||||
ParseTokenType::andand | ParseTokenType::oror => role = HighlightRole::operat,
|
||||
ParseTokenType::string => {
|
||||
// Assume all strings are params. This handles e.g. the variables a for header or
|
||||
@@ -1063,6 +1067,12 @@ fn visit_block_statement(&mut self, block: &BlockStatement) {
|
||||
self.visit(&block.end);
|
||||
self.pending_variables.truncate(pending_variables_count);
|
||||
}
|
||||
fn visit_brace_statement(&mut self, brace_statement: &BraceStatement) {
|
||||
self.visit(&brace_statement.left_brace);
|
||||
self.visit(&brace_statement.args_or_redirs);
|
||||
self.visit(&brace_statement.jobs);
|
||||
self.visit(&brace_statement.right_brace);
|
||||
}
|
||||
}
|
||||
|
||||
/// Return whether a string contains a command substitution.
|
||||
@@ -1121,6 +1131,7 @@ fn visit(&mut self, node: &'a dyn Node) {
|
||||
self.visit_decorated_statement(node.as_decorated_statement().unwrap())
|
||||
}
|
||||
Type::block_statement => self.visit_block_statement(node.as_block_statement().unwrap()),
|
||||
Type::brace_statement => self.visit_brace_statement(node.as_brace_statement().unwrap()),
|
||||
// Default implementation is to just visit children.
|
||||
_ => self.visit_children(node),
|
||||
}
|
||||
|
||||
@@ -66,6 +66,8 @@ pub enum ParseTokenType {
|
||||
// Terminal types.
|
||||
string,
|
||||
pipe,
|
||||
left_brace,
|
||||
right_brace,
|
||||
redirection,
|
||||
background,
|
||||
andand,
|
||||
@@ -135,6 +137,7 @@ pub enum ParseErrorCode {
|
||||
unbalancing_end, // end outside of block
|
||||
unbalancing_else, // else outside of if
|
||||
unbalancing_case, // case outside of switch
|
||||
unbalancing_brace, // } outside of {
|
||||
bare_variable_assignment, // a=b without command
|
||||
andor_in_pipeline, // "and" or "or" after a pipe
|
||||
}
|
||||
@@ -207,6 +210,8 @@ pub fn to_wstr(self) -> &'static wstr {
|
||||
ParseTokenType::background => L!("ParseTokenType::background"),
|
||||
ParseTokenType::end => L!("ParseTokenType::end"),
|
||||
ParseTokenType::pipe => L!("ParseTokenType::pipe"),
|
||||
ParseTokenType::left_brace => L!("ParseTokenType::lbrace"),
|
||||
ParseTokenType::right_brace => L!("ParseTokenType::rbrace"),
|
||||
ParseTokenType::redirection => L!("ParseTokenType::redirection"),
|
||||
ParseTokenType::string => L!("ParseTokenType::string"),
|
||||
ParseTokenType::andand => L!("ParseTokenType::andand"),
|
||||
@@ -426,6 +431,8 @@ pub fn token_type_user_presentable_description(
|
||||
ParseTokenType::pipe => L!("a pipe").to_owned(),
|
||||
ParseTokenType::redirection => L!("a redirection").to_owned(),
|
||||
ParseTokenType::background => L!("a '&'").to_owned(),
|
||||
ParseTokenType::left_brace => L!("a '{'").to_owned(),
|
||||
ParseTokenType::right_brace => L!("a '}'").to_owned(),
|
||||
ParseTokenType::andand => L!("'&&'").to_owned(),
|
||||
ParseTokenType::oror => L!("'||'").to_owned(),
|
||||
ParseTokenType::end => L!("end of the statement").to_owned(),
|
||||
@@ -529,7 +536,3 @@ pub fn parse_error_offset_source_start(errors: &mut ParseErrorList, amt: usize)
 /// Error message for a command like `time foo &`.
 pub const ERROR_TIME_BACKGROUND: &str =
     "'time' is not supported for background jobs. Consider using 'command time'.";
-
-/// Error issued on { echo; echo }.
-pub const ERROR_NO_BRACE_GROUPING: &str =
-    "'{ ... }' is not supported for grouping commands. Please use 'begin; ...; end'";
@@ -28,9 +28,9 @@
 use crate::operation_context::OperationContext;
 use crate::parse_constants::{
     parse_error_offset_source_start, ParseError, ParseErrorCode, ParseErrorList, ParseKeyword,
-    ParseTokenType, StatementDecoration, CALL_STACK_LIMIT_EXCEEDED_ERR_MSG,
-    ERROR_NO_BRACE_GROUPING, ERROR_TIME_BACKGROUND, FAILED_EXPANSION_VARIABLE_NAME_ERR_MSG,
-    ILLEGAL_FD_ERR_MSG, INFINITE_FUNC_RECURSION_ERR_MSG, WILDCARD_ERR_MSG,
+    ParseTokenType, StatementDecoration, CALL_STACK_LIMIT_EXCEEDED_ERR_MSG, ERROR_TIME_BACKGROUND,
+    FAILED_EXPANSION_VARIABLE_NAME_ERR_MSG, ILLEGAL_FD_ERR_MSG, INFINITE_FUNC_RECURSION_ERR_MSG,
+    WILDCARD_ERR_MSG,
 };
 use crate::parse_tree::{LineCounter, NodeRef, ParsedSourceRef};
 use crate::parse_util::parse_util_unescape_wildcards;
@@ -162,6 +162,9 @@ fn eval_statement(
|
||||
StatementVariant::BlockStatement(block) => {
|
||||
self.run_block_statement(ctx, block, associated_block)
|
||||
}
|
||||
StatementVariant::BraceStatement(brace_statement) => {
|
||||
self.run_begin_statement(ctx, &brace_statement.jobs)
|
||||
}
|
||||
StatementVariant::IfStatement(ifstat) => {
|
||||
self.run_if_statement(ctx, ifstat, associated_block)
|
||||
}
|
||||
@@ -363,10 +366,6 @@ fn handle_command_not_found(
             }
         }

-        if cmd.as_char_slice().first() == Some(&'{' /*}*/) {
-            error.push_utfstr(&wgettext!(ERROR_NO_BRACE_GROUPING));
-        }
-
         // Here we want to report an error (so it shows a backtrace).
         // If the handler printed text, that's already shown, so error will be empty.
         report_error_formatted!(
@@ -569,6 +568,7 @@ fn job_is_simple_block(&self, job: &ast::JobPipeline) -> bool {
|
||||
// type safety (in case we add more specific statement types).
|
||||
match &job.statement.contents {
|
||||
StatementVariant::BlockStatement(stmt) => no_redirs(&stmt.args_or_redirs),
|
||||
StatementVariant::BraceStatement(stmt) => no_redirs(&stmt.args_or_redirs),
|
||||
StatementVariant::SwitchStatement(stmt) => no_redirs(&stmt.args_or_redirs),
|
||||
StatementVariant::IfStatement(stmt) => no_redirs(&stmt.args_or_redirs),
|
||||
StatementVariant::NotStatement(_) | StatementVariant::DecoratedStatement(_) => {
|
||||
@@ -688,6 +688,7 @@ fn populate_job_process(
|
||||
self.populate_not_process(ctx, job, proc, not_statement)
|
||||
}
|
||||
StatementVariant::BlockStatement(_)
|
||||
| StatementVariant::BraceStatement(_)
|
||||
| StatementVariant::IfStatement(_)
|
||||
| StatementVariant::SwitchStatement(_) => {
|
||||
self.populate_block_process(ctx, proc, statement, specific_statement)
|
||||
@@ -852,6 +853,7 @@ fn populate_block_process(
|
||||
// TODO: args_or_redirs should be available without resolving the statement type.
|
||||
let args_or_redirs = match specific_statement {
|
||||
StatementVariant::BlockStatement(block_statement) => &block_statement.args_or_redirs,
|
||||
StatementVariant::BraceStatement(brace_statement) => &brace_statement.args_or_redirs,
|
||||
StatementVariant::IfStatement(if_statement) => &if_statement.args_or_redirs,
|
||||
StatementVariant::SwitchStatement(switch_statement) => &switch_statement.args_or_redirs,
|
||||
_ => panic!("Unexpected block node type"),
|
||||
@@ -1593,6 +1595,9 @@ fn run_1_job(
|
||||
StatementVariant::BlockStatement(block_statement) => {
|
||||
self.run_block_statement(ctx, block_statement, associated_block)
|
||||
}
|
||||
StatementVariant::BraceStatement(brace_statement) => {
|
||||
self.run_begin_statement(ctx, &brace_statement.jobs)
|
||||
}
|
||||
StatementVariant::IfStatement(ifstmt) => {
|
||||
self.run_if_statement(ctx, ifstmt, associated_block)
|
||||
}
|
||||
@@ -1923,6 +1928,7 @@ enum Globspec {
|
||||
fn type_is_redirectable_block(typ: ast::Type) -> bool {
|
||||
[
|
||||
ast::Type::block_statement,
|
||||
ast::Type::brace_statement,
|
||||
ast::Type::if_statement,
|
||||
ast::Type::switch_statement,
|
||||
]
|
||||
@@ -1961,6 +1967,9 @@ fn profiling_cmd_name_for_redirectable_block(
|
||||
BlockStatementHeaderVariant::None => panic!("Unexpected block header type"),
|
||||
}
|
||||
}
|
||||
StatementVariant::BraceStatement(brace_statement) => {
|
||||
brace_statement.left_brace.source_range().start()
|
||||
}
|
||||
StatementVariant::IfStatement(ifstmt) => {
|
||||
ifstmt.if_clause.condition.job.source_range().end()
|
||||
}
|
||||
|
||||
@@ -93,6 +93,7 @@ fn from(err: TokenizerError) -> Self {
|
||||
}
|
||||
TokenizerError::unterminated_slice => ParseErrorCode::tokenizer_unterminated_slice,
|
||||
TokenizerError::unterminated_escape => ParseErrorCode::tokenizer_unterminated_escape,
|
||||
// To-do: maybe also unbalancing brace?
|
||||
_ => ParseErrorCode::tokenizer_other,
|
||||
}
|
||||
}
|
||||
|
||||
@@ -415,6 +415,8 @@ fn job_or_process_extent(
|
||||
| TokenType::background
|
||||
| TokenType::andand
|
||||
| TokenType::oror
|
||||
| TokenType::left_brace
|
||||
| TokenType::right_brace
|
||||
if (token.type_ != TokenType::pipe || process) =>
|
||||
{
|
||||
if tok_begin >= pos {
|
||||
@@ -1049,9 +1051,9 @@ fn visit(&mut self, node: &'a dyn Node) {
     dec = if switchs.end.has_source() { 1 } else { 0 };
 }
 Type::token_base => {
-    if node.parent().unwrap().typ() == Type::begin_header
-        && node.as_token().unwrap().token_type() == ParseTokenType::end
-    {
+    let token_type = node.as_token().unwrap().token_type();
+    let parent_type = node.parent().unwrap().typ();
+    if parent_type == Type::begin_header && token_type == ParseTokenType::end {
         // The newline after "begin" is optional, so it is part of the header.
         // The header is not in the indented block, so indent the newline here.
         if node.source(self.src) == "\n" {
|
||||
@@ -1059,6 +1061,11 @@ fn visit(&mut self, node: &'a dyn Node) {
|
||||
dec = 1;
|
||||
}
|
||||
}
|
||||
// if token_type == ParseTokenType::right_brace && parent_type == Type::brace_statement
|
||||
// {
|
||||
// inc = 1;
|
||||
// dec = 1;
|
||||
// }
|
||||
}
|
||||
_ => (),
|
||||
}
|
||||
@@ -1229,6 +1236,15 @@ pub fn parse_util_detect_errors_in_ast(
|
||||
}
|
||||
errored |=
|
||||
detect_errors_in_block_redirection_list(&block.args_or_redirs, &mut out_errors);
|
||||
} else if let Some(brace_statement) = node.as_brace_statement() {
|
||||
// If our closing brace had no source, we are unsourced.
|
||||
if !brace_statement.right_brace.has_source() {
|
||||
has_unclosed_block = true;
|
||||
}
|
||||
errored |= detect_errors_in_block_redirection_list(
|
||||
&brace_statement.args_or_redirs,
|
||||
&mut out_errors,
|
||||
);
|
||||
} else if let Some(ifs) = node.as_if_statement() {
|
||||
// If our 'end' had no source, we are unsourced.
|
||||
if !ifs.end.has_source() {
|
||||
@@ -1780,15 +1796,28 @@ fn detect_errors_in_block_redirection_list(
     args_or_redirs: &ast::ArgumentOrRedirectionList,
     out_errors: &mut Option<&mut ParseErrorList>,
 ) -> bool {
-    if let Some(first_arg) = get_first_arg(args_or_redirs) {
+    let Some(first_arg) = get_first_arg(args_or_redirs) else {
+        return false;
+    };
+    if args_or_redirs
+        .parent()
+        .unwrap()
+        .as_brace_statement()
+        .is_some()
+    {
         return append_syntax_error!(
             out_errors,
             first_arg.source_range().start(),
             first_arg.source_range().length(),
-            END_ARG_ERR_MSG
+            RIGHT_BRACE_ARG_ERR_MSG
         );
     }
-    false
+    append_syntax_error!(
+        out_errors,
+        first_arg.source_range().start(),
+        first_arg.source_range().length(),
+        END_ARG_ERR_MSG
+    )
 }

 /// Given a string containing a variable expansion error, append an appropriate error to the errors
|
||||
@@ -1898,6 +1927,7 @@ pub fn parse_util_expand_variable_error(
|
||||
|
||||
/// Error message for arguments to 'end'
|
||||
const END_ARG_ERR_MSG: &str = "'end' does not take arguments. Did you forget a ';'?";
|
||||
const RIGHT_BRACE_ARG_ERR_MSG: &str = "'}' does not take arguments. Did you forget a ';'?";
|
||||
|
||||
/// Error message when 'time' is in a pipeline.
|
||||
const TIME_IN_PIPELINE_ERR_MSG: &str =
|
||||
|
||||
@@ -299,6 +299,22 @@ fn detect_argument_errors(src: &str) -> Result<(), ParserTestErrorBits> {
|
||||
detect_errors!("true || \n") == Err(ParserTestErrorBits::INCOMPLETE),
|
||||
"unterminated conjunction not reported properly"
|
||||
);
|
||||
|
||||
assert!(
|
||||
detect_errors!("begin ; echo hi; }") == Err(ParserTestErrorBits::ERROR),
|
||||
"closing of unopened brace statement not reported properly"
|
||||
);
|
||||
|
||||
assert_eq!(
|
||||
detect_errors!("begin {"), // }
|
||||
Err(ParserTestErrorBits::INCOMPLETE),
|
||||
"brace after begin not reported properly"
|
||||
);
|
||||
assert_eq!(
|
||||
detect_errors!("a=b {"), // }
|
||||
Err(ParserTestErrorBits::INCOMPLETE),
|
||||
"brace after variable override not reported properly"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
@@ -604,6 +620,8 @@ macro_rules! validate {
|
||||
validate!("case", ParseErrorCode::unbalancing_case);
|
||||
validate!("if true ; case ; end", ParseErrorCode::unbalancing_case);
|
||||
|
||||
validate!("begin ; }", ParseErrorCode::unbalancing_brace);
|
||||
|
||||
validate!("true | and", ParseErrorCode::andor_in_pipeline);
|
||||
|
||||
validate!("a=", ParseErrorCode::bare_variable_assignment);
|
||||
|
||||
@@ -31,6 +31,43 @@ fn test_tokenizer() {
|
||||
assert!(t.next().is_none());
|
||||
}
|
||||
|
||||
{
|
||||
let s = L!("{ echo");
|
||||
let mut t = Tokenizer::new(s, TokFlags(0));
|
||||
|
||||
let token = t.next(); // {
|
||||
assert!(token.is_some());
|
||||
let token = token.unwrap();
|
||||
assert_eq!(token.type_, TokenType::left_brace);
|
||||
assert_eq!(token.length, 1);
|
||||
assert_eq!(t.text_of(&token), "{");
|
||||
|
||||
let token = t.next(); // echo
|
||||
assert!(token.is_some());
|
||||
let token = token.unwrap();
|
||||
assert_eq!(token.type_, TokenType::string);
|
||||
assert_eq!(token.offset, 2);
|
||||
assert_eq!(token.length, 4);
|
||||
assert_eq!(t.text_of(&token), "echo");
|
||||
|
||||
assert!(t.next().is_none());
|
||||
}
|
||||
|
||||
{
|
||||
let s = L!("{echo, foo}");
|
||||
let mut t = Tokenizer::new(s, TokFlags(0));
|
||||
let token = t.next().unwrap();
|
||||
assert_eq!(token.type_, TokenType::string);
|
||||
assert_eq!(token.length, 11);
|
||||
assert!(t.next().is_none());
|
||||
}
|
||||
{
|
||||
let s = L!("{ echo; foo}");
|
||||
let mut t = Tokenizer::new(s, TokFlags(0));
|
||||
let token = t.next().unwrap();
|
||||
assert_eq!(token.type_, TokenType::left_brace);
|
||||
}
|
||||
|
||||
let s = L!(concat!(
|
||||
"string <redirection 2>&1 'nested \"quoted\" '(string containing subshells ",
|
||||
"){and,brackets}$as[$well (as variable arrays)] not_a_redirect^ ^ ^^is_a_redirect ",
|
||||
|
||||
101
src/tokenizer.rs
@@ -1,14 +1,16 @@
 //! A specialized tokenizer for tokenizing the fish language. In the future, the tokenizer should be
 //! extended to support marks, tokenizing multiple strings and disposing of unused string segments.

+use crate::ast::unescape_keyword;
 use crate::common::valid_var_name_char;
 use crate::future_feature_flags::{feature_test, FeatureFlag};
 use crate::parse_constants::SOURCE_OFFSET_INVALID;
+use crate::parser_keywords::parser_keywords_is_subcommand;
 use crate::redirection::RedirectionMode;
 use crate::wchar::prelude::*;
 use libc::{STDIN_FILENO, STDOUT_FILENO};
 use nix::fcntl::OFlag;
-use std::ops::{BitAnd, BitAndAssign, BitOr, BitOrAssign, Not};
+use std::ops::{BitAnd, BitAndAssign, BitOr, BitOrAssign, Not, Range};
 use std::os::fd::RawFd;

 /// Token types. XXX Why this isn't ParseTokenType, I'm not really sure.
|
||||
@@ -26,6 +28,10 @@ pub enum TokenType {
|
||||
oror,
|
||||
/// End token (semicolon or newline, not literal end)
|
||||
end,
|
||||
/// opening brace of a compound statement
|
||||
left_brace,
|
||||
/// closing brace of a compound statement
|
||||
right_brace,
|
||||
/// redirection token
|
||||
redirect,
|
||||
/// send job to bg token
|
||||
@@ -146,6 +152,10 @@ fn bitor_assign(&mut self, rhs: Self) {
|
||||
/// Make an effort to continue after an error.
|
||||
pub const TOK_CONTINUE_AFTER_ERROR: TokFlags = TokFlags(8);
|
||||
|
||||
/// Consumers want to treat all tokens as arguments, so disable special handling at
|
||||
/// command-position.
|
||||
pub const TOK_ARGUMENT_LIST: TokFlags = TokFlags(16);
|
||||
|
||||
impl From<TokenizerError> for &'static wstr {
|
||||
fn from(err: TokenizerError) -> Self {
|
||||
match err {
|
||||
@@ -178,7 +188,7 @@ fn from(err: TokenizerError) -> Self {
     wgettext!("Unexpected '[' at this location")
 }
 TokenizerError::closing_unopened_brace => {
-    wgettext!("Unexpected '}' for unopened brace expansion")
+    wgettext!("Unexpected '}' for unopened brace")
 }
 TokenizerError::unterminated_brace => {
     wgettext!("Unexpected end of string, incomplete parameter expansion")
|
||||
@@ -234,6 +244,9 @@ pub fn set_length(&mut self, value: usize) {
|
||||
pub fn end(&self) -> usize {
|
||||
self.offset() + self.length()
|
||||
}
|
||||
pub fn range(&self) -> Range<usize> {
|
||||
self.offset()..self.end()
|
||||
}
|
||||
pub fn set_error_offset_within_token(&mut self, value: usize) {
|
||||
self.error_offset_within_token = value.try_into().unwrap();
|
||||
}
|
||||
@@ -248,6 +261,11 @@ pub fn set_error_length(&mut self, value: usize) {
|
||||
}
|
||||
}
|
||||
|
||||
struct BraceStatementParser {
|
||||
at_command_position: bool,
|
||||
unclosed_brace_statements: usize,
|
||||
}
|
||||
|
||||
/// The tokenizer struct.
|
||||
pub struct Tokenizer<'c> {
|
||||
/// A pointer into the original string, showing where the next token begins.
|
||||
@@ -256,6 +274,8 @@ pub struct Tokenizer<'c> {
|
||||
start: &'c wstr,
|
||||
/// Whether we have additional tokens.
|
||||
has_next: bool,
|
||||
/// Parser state regarding brace statements. None if reading an argument list.
|
||||
brace_statement_parser: Option<BraceStatementParser>,
|
||||
/// Whether incomplete tokens are accepted.
|
||||
accept_unfinished: bool,
|
||||
/// Whether comments should be returned.
|
||||
@@ -270,6 +290,10 @@ pub struct Tokenizer<'c> {
|
||||
on_quote_toggle: Option<&'c mut dyn FnMut(usize)>,
|
||||
}
|
||||
|
||||
pub(crate) fn is_brace_statement(next_char: Option<char>) -> bool {
|
||||
next_char.map_or(true, |next| next.is_ascii_whitespace() || next == ';')
|
||||
}
|
||||
|
||||
impl<'c> Tokenizer<'c> {
|
||||
/// Constructor for a tokenizer. b is the string that is to be tokenized. It is not copied, and
|
||||
/// should not be freed by the caller until after the tokenizer is destroyed.
|
||||
@@ -297,6 +321,12 @@ fn new_impl(
|
||||
token_cursor: 0,
|
||||
start,
|
||||
has_next: true,
|
||||
brace_statement_parser: (!(flags & TOK_ARGUMENT_LIST)).then_some(
|
||||
BraceStatementParser {
|
||||
at_command_position: true,
|
||||
unclosed_brace_statements: 0,
|
||||
},
|
||||
),
|
||||
accept_unfinished: flags & TOK_ACCEPT_UNFINISHED,
|
||||
show_comments: flags & TOK_SHOW_COMMENTS,
|
||||
show_blank_lines: flags & TOK_SHOW_BLANK_LINES,
|
||||
@@ -368,7 +398,8 @@ fn next(&mut self) -> Option<Self::Item> {
     .get(self.token_cursor + 1)
     .copied();
 let buff = &self.start[self.token_cursor..];
-match this_char {
+let mut at_cmd_pos = false;
+let token = match this_char {
     '\0'=> {
         self.has_next = false;
         None
|
||||
@@ -380,6 +411,7 @@ fn next(&mut self) -> Option<Self::Item> {
|
||||
result.offset = start_pos as u32;
|
||||
result.length = 1;
|
||||
self.token_cursor += 1;
|
||||
at_cmd_pos = true;
|
||||
// Hack: when we get a newline, swallow as many as we can. This compresses multiple
|
||||
// subsequent newlines into a single one.
|
||||
if !self.show_blank_lines {
|
||||
@@ -393,6 +425,38 @@ fn next(&mut self) -> Option<Self::Item> {
|
||||
}
|
||||
Some(result)
|
||||
}
|
||||
'{' if self.brace_statement_parser.as_ref()
|
||||
.is_some_and(|parser| parser.at_command_position)
|
||||
&& is_brace_statement(self.start.as_char_slice().get(self.token_cursor + 1).copied())
|
||||
=>
|
||||
{
|
||||
self.brace_statement_parser.as_mut().unwrap().unclosed_brace_statements += 1;
|
||||
let mut result = Tok::new(TokenType::left_brace);
|
||||
result.offset = start_pos as u32;
|
||||
result.length = 1;
|
||||
self.token_cursor += 1;
|
||||
at_cmd_pos = true;
|
||||
Some(result)
|
||||
}
|
||||
'}' => {
|
||||
let brace_count = self.brace_statement_parser.as_mut()
|
||||
.map(|parser| &mut parser.unclosed_brace_statements);
|
||||
if brace_count.as_ref().map_or(true, |count| **count == 0) {
|
||||
return Some(self.call_error(
|
||||
TokenizerError::closing_unopened_brace,
|
||||
self.token_cursor,
|
||||
self.token_cursor,
|
||||
Some(1),
|
||||
1,
|
||||
));
|
||||
}
|
||||
brace_count.map(|count| *count -= 1);
|
||||
let mut result = Tok::new(TokenType::right_brace);
|
||||
result.offset = start_pos as u32;
|
||||
result.length = 1;
|
||||
self.token_cursor += 1;
|
||||
Some(result)
|
||||
}
|
||||
'&'=> {
|
||||
if next_char == Some('&') {
|
||||
// && is and.
|
||||
@@ -400,6 +464,7 @@ fn next(&mut self) -> Option<Self::Item> {
|
||||
result.offset = start_pos as u32;
|
||||
result.length = 2;
|
||||
self.token_cursor += 2;
|
||||
at_cmd_pos = true;
|
||||
Some(result)
|
||||
} else if next_char == Some('>') || next_char == Some('|') {
|
||||
// &> and &| redirect both stdout and stderr.
|
||||
@@ -409,12 +474,14 @@ fn next(&mut self) -> Option<Self::Item> {
|
||||
result.offset = start_pos as u32;
|
||||
result.length = redir.consumed as u32;
|
||||
self.token_cursor += redir.consumed;
|
||||
at_cmd_pos = next_char == Some('|');
|
||||
Some(result)
|
||||
} else {
|
||||
let mut result = Tok::new(TokenType::background);
|
||||
result.offset = start_pos as u32;
|
||||
result.length = 1;
|
||||
self.token_cursor += 1;
|
||||
at_cmd_pos = true;
|
||||
Some(result)
|
||||
}
|
||||
}
|
||||
@@ -425,6 +492,7 @@ fn next(&mut self) -> Option<Self::Item> {
|
||||
result.offset = start_pos as u32;
|
||||
result.length = 2;
|
||||
self.token_cursor += 2;
|
||||
at_cmd_pos = true;
|
||||
Some(result)
|
||||
} else if next_char == Some('&') {
|
||||
// |& is a bashism; in fish it's &|.
|
||||
@@ -437,6 +505,7 @@ fn next(&mut self) -> Option<Self::Item> {
|
||||
result.offset = start_pos as u32;
|
||||
result.length = pipe.consumed as u32;
|
||||
self.token_cursor += pipe.consumed;
|
||||
at_cmd_pos = true;
|
||||
Some(result)
|
||||
}
|
||||
}
|
||||
@@ -489,16 +558,31 @@ fn next(&mut self) -> Option<Self::Item> {
|
||||
result.offset = start_pos as u32;
|
||||
result.length = redir_or_pipe.consumed as u32;
|
||||
self.token_cursor += redir_or_pipe.consumed;
|
||||
at_cmd_pos = redir_or_pipe.is_pipe;
|
||||
Some(result)
|
||||
}
|
||||
}
|
||||
None => {
|
||||
// Not a redirection or pipe, so just a string.
|
||||
Some(self.read_string())
|
||||
let s = self.read_string();
|
||||
at_cmd_pos = self.brace_statement_parser.as_ref()
|
||||
.is_some_and(|parser| parser.at_command_position) && {
|
||||
let text = self.text_of(&s);
|
||||
parser_keywords_is_subcommand(&unescape_keyword(
|
||||
TokenType::string,
|
||||
text)
|
||||
) ||
|
||||
variable_assignment_equals_pos(text).is_some()
|
||||
};
|
||||
Some(s)
|
||||
}
|
||||
}
|
||||
}
|
||||
};
|
||||
if let Some(parser) = self.brace_statement_parser.as_mut() {
|
||||
parser.at_command_position = at_cmd_pos;
|
||||
}
|
||||
token
|
||||
}
|
||||
}
|
||||
|
||||
@@ -675,13 +759,8 @@ fn process_opening_quote(
         );
     }
     if brace_offsets.pop().is_none() {
-        return self.call_error(
-            TokenizerError::closing_unopened_brace,
-            self.token_cursor,
-            self.token_cursor,
-            Some(1),
-            1,
-        );
+        // Let the caller throw an error.
+        break;
     }
     if brace_offsets.is_empty() {
         mode &= !TOK_MODE_CURLY_BRACES;
|
||||
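Review note on the tokenizer change: the rule that decides whether a leading `{` opens a compound statement or stays a brace expansion can be tried in isolation. The following is a minimal sketch, not the patch itself. `is_brace_statement` mirrors the function added in this diff, while `LeadingBrace` and `classify_leading_brace` are hypothetical names introduced only for illustration. The sketch assumes the `{` is already at command position and ignores the `TOK_ARGUMENT_LIST` flag and the unclosed-brace counter that the real tokenizer tracks.

```rust
/// Same rule as the `is_brace_statement` added in this diff: a `{` opens a
/// compound statement only when followed by ASCII whitespace, `;`, or end of input.
fn is_brace_statement(next_char: Option<char>) -> bool {
    next_char.map_or(true, |next| next.is_ascii_whitespace() || next == ';')
}

/// Hypothetical result type for this sketch (not part of the patch).
#[derive(Debug, PartialEq)]
enum LeadingBrace {
    /// `{` would be tokenized as a `left_brace` token.
    Statement,
    /// `{` would stay part of an ordinary string token (brace expansion).
    Expansion,
}

/// Hypothetical helper: classify the first character of `src`, assuming it
/// sits at command position. Returns `None` if `src` does not start with `{`.
fn classify_leading_brace(src: &str) -> Option<LeadingBrace> {
    let mut chars = src.chars();
    if chars.next()? != '{' {
        return None;
    }
    Some(if is_brace_statement(chars.next()) {
        LeadingBrace::Statement
    } else {
        LeadingBrace::Expansion
    })
}
```

Under these assumptions, `classify_leading_brace("{ echo")` yields `Statement` and `classify_leading_brace("{echo, foo}")` yields `Expansion`, which matches the token types asserted in the tokenizer tests of this change.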
@@ -51,3 +51,158 @@ end
|
||||
|
||||
echo {a(echo ,)b}
|
||||
#CHECK: {a,b}
|
||||
|
||||
e{cho,cho,cho}
|
||||
# CHECK: echo echo
|
||||
|
||||
## Compound commands
|
||||
|
||||
{ echo compound; echo command; }
|
||||
# CHECK: compound
|
||||
# CHECK: command
|
||||
|
||||
{;echo -n start with\ ; echo semi; }
|
||||
# CHECK: start with semi
|
||||
|
||||
{ echo no semi }
|
||||
# CHECK: no semi
|
||||
|
||||
# Ambiguous cases
|
||||
|
||||
{ echo ,comma;}
|
||||
# CHECK: ,comma
|
||||
|
||||
PATH= {echo no space}
|
||||
# CHECKERR: fish: Unknown command: '{echo no space}'
|
||||
# CHECKERR: {{.*}}/braces.fish (line {{\d+}}):
|
||||
# CHECKERR: PATH= {echo no space}
|
||||
# CHECKERR: ^~~~~~~~~~~~~~^
|
||||
|
||||
PATH= {echo comma, no space;}
|
||||
# CHECKERR: fish: Unknown command: 'echo comma'
|
||||
# CHECKERR: {{.*}}/braces.fish (line {{\d+}}):
|
||||
# CHECKERR: PATH= {echo comma, no space;}
|
||||
# CHECKERR: ^~~~~~~~~~~~~~~~~~~~~~^
|
||||
|
||||
# Ambiguous case with no space
|
||||
{echo,hello}
|
||||
# CHECK: hello
|
||||
|
||||
# Trailing tokens
|
||||
set -l fish (status fish-path)
|
||||
$fish -c '{ :; } true'
|
||||
# CHECKERR: fish: '}' does not take arguments. Did you forget a ';'?
|
||||
# CHECKERR: { :; } true
|
||||
# CHECKERR: ^~~^
|
||||
|
||||
; { echo semi; }
|
||||
# CHECK: semi
|
||||
|
||||
a=b { echo $a; }
|
||||
# CHECK: b
|
||||
|
||||
time { :; }
|
||||
# CHECKERR:
|
||||
# CHECKERR: {{_+}}
|
||||
# CHECKERR: Executed in {{.*}}
|
||||
# CHECKERR: usr time {{.*}}
|
||||
# CHECKERR: sys time {{.*}}
|
||||
|
||||
true & { echo background; }
|
||||
# CHECK: background
|
||||
|
||||
true && { echo conjunction; }
|
||||
# CHECK: conjunction
|
||||
|
||||
true; and { echo and; }
|
||||
# CHECK: and
|
||||
|
||||
true | { echo pipe; }
|
||||
# CHECK: pipe
|
||||
|
||||
true 2>| { echo stderrpipe; }
|
||||
# CHECK: stderrpipe
|
||||
|
||||
false || { echo disjunction; }
|
||||
# CHECK: disjunction
|
||||
|
||||
false; or { echo or; }
|
||||
# CHECK: or
|
||||
|
||||
begin { echo begin }
|
||||
end
|
||||
# CHECK: begin
|
||||
|
||||
not { false; true }
|
||||
echo $status
|
||||
# CHECK: 1
|
||||
|
||||
! { false }
|
||||
echo $status
|
||||
# CHECK: 0
|
||||
|
||||
if { set -l a true; $a && true }
|
||||
echo if-true
|
||||
end
|
||||
# CHECK: if-true
|
||||
|
||||
{
|
||||
set -l condition true
|
||||
while $condition
|
||||
{
|
||||
echo while
|
||||
set condition false
|
||||
}
|
||||
end
|
||||
}
|
||||
# CHECK: while
|
||||
|
||||
{ { echo inner}
|
||||
echo outer}
|
||||
# CHECK: inner
|
||||
# CHECK: outer
|
||||
|
||||
{
|
||||
|
||||
echo leading blank lines
|
||||
}
|
||||
# CHECK: leading blank lines
|
||||
|
||||
complete foo -a '123 456'
|
||||
complete -C 'foo {' | sed 1q
|
||||
# CHECK: {{\{.*}}
|
||||
|
||||
complete -C '{'
|
||||
echo nothing
|
||||
# CHECK: nothing
|
||||
complete -C '{ ' | grep ^if\t
|
||||
# CHECK: if{{\t}}Evaluate block if condition is true
|
||||
|
||||
$fish -c '{'
|
||||
# CHECKERR: fish: Expected a '}', but found end of the input
|
||||
|
||||
PATH= "{"
|
||||
# CHECKERR: fish: Unknown command: '{'
|
||||
# CHECKERR: {{.*}}/braces.fish (line {{\d+}}):
|
||||
# CHECKERR: PATH= "{"
|
||||
# CHECKERR: ^~^
|
||||
|
||||
$fish -c 'builtin {'
|
||||
# CHECKERR: fish: Expected end of the statement, but found a '{'
|
||||
# CHECKERR: builtin {
|
||||
# CHECKERR: ^
|
||||
|
||||
$fish -c 'command {'
|
||||
# CHECKERR: fish: Expected end of the statement, but found a '{'
|
||||
# CHECKERR: command {
|
||||
# CHECKERR: ^
|
||||
|
||||
$fish -c 'exec {'
|
||||
# CHECKERR: fish: Expected end of the statement, but found a '{'
|
||||
# CHECKERR: exec {
|
||||
# CHECKERR: ^
|
||||
|
||||
$fish -c 'begin; }'
|
||||
# CHECKERR: fish: Unexpected '}' for unopened brace
|
||||
# CHECKERR: begin; }
|
||||
# CHECKERR: ^
|
||||
|
||||
@@ -25,13 +25,6 @@ command -v nonexistent-command-1234
 echo $status
 #CHECK: 127

-
-{ echo; echo }
-# CHECKERR: {{.*}}: Unknown command: '{ echo; echo }'
-# CHECKERR: {{.*}}: '{ ... }' is not supported for grouping commands. Please use 'begin; ...; end'
-# CHECKERR: { echo; echo }
-# CHECKERR: ^~~~~~~~~~~~~^
-
 set -g PATH .
 echo banana > foobar
 foobar --banana
|
||||
|
||||
@@ -321,7 +321,7 @@ $fish -c 'echo {'
 #CHECKERR: echo {
 #CHECKERR: ^
 $fish -c 'echo {}}'
-#CHECKERR: fish: Unexpected '}' for unopened brace expansion
+#CHECKERR: fish: Unexpected '}' for unopened brace
 #CHECKERR: echo {}}
 #CHECKERR: ^
 printf '<%s>\n' ($fish -c 'command (asd)' 2>&1)
|
||||
|
||||
@@ -424,6 +424,66 @@ echo 'begin
|
||||
# CHECK: {{^}} first-indented-word \
|
||||
# CHECK: {{^}} second-indented-word
|
||||
|
||||
{
|
||||
echo '{ no semi }'
|
||||
# CHECK: { no semi }
|
||||
echo '{ semi; }'
|
||||
# CHECK: { semi; }
|
||||
|
||||
echo '{ multi; no semi }'
|
||||
# CHECK: { multi; no semi }
|
||||
echo '{ multi; semi; }'
|
||||
# CHECK: { multi; semi; }
|
||||
|
||||
echo '{ conj && no semi }'
|
||||
# CHECK: { conj && no semi }
|
||||
echo '{ conj && semi; }'
|
||||
# CHECK: { conj && semi; }
|
||||
|
||||
echo '{ }'
|
||||
# CHECK: { }
|
||||
echo '{ ; }'
|
||||
# CHECK: { }
|
||||
|
||||
echo '
|
||||
{
|
||||
echo \\
|
||||
# continuation comment
|
||||
}'
|
||||
# CHECK: {
|
||||
# CHECK: {{^ }}echo \
|
||||
# CHECK: {{^ }}# continuation comment
|
||||
# TODO: This is currently broken; so is the begin/end equivalent.
|
||||
# CHECK: {{^ [}]}}
|
||||
|
||||
echo '{ { } }'
|
||||
# CHECK: { { } }
|
||||
|
||||
echo '
|
||||
{
|
||||
|
||||
{
|
||||
}
|
||||
|
||||
}
|
||||
'
|
||||
# CHECK: {{^\{$}}
|
||||
# CHECK: {{^ \{$}}
|
||||
# CHECK: {{^ \}$}}
|
||||
# CHECK: {{^\}$}}
|
||||
|
||||
echo '
|
||||
{ level 1; {
|
||||
level 2 } }
|
||||
'
|
||||
# TODO Should add a line break here.
|
||||
# CHECK: {{^{ level 1$}}
|
||||
# CHECK: {{^ \{$}}
|
||||
# CHECK: {{^ level 2$}}
|
||||
# CHECK: {{^ \}$}}
|
||||
# CHECK: {{^\}$}}
|
||||
} | $fish_indent
|
||||
|
||||
echo 'multiline-\\
|
||||
-word' | $fish_indent --check
|
||||
echo $status #CHECK: 0
|
||||
|
||||