builtin read: --tokenize-raw option

Users have tried to get a list of all tokens -- including operators
-- using "commandline --tokens-raw".  That one has been deprecated
by cc2ca60baa (commandline.rst: deprecate --tokens-raw option,
2025-05-05).  Part of the reason is that the above command is broken
for multi-line tokens.

Let's support this use case in a way that's less ambiguous.

Closes #11084
Johannes Altmanninger
2025-05-05 14:49:01 +02:00
parent f3b27e8d11
commit c3626a3031
12 changed files with 159 additions and 37 deletions

@@ -83,8 +83,12 @@ The following options control how much is read and how it is stored:
 **-n** or **--nchars** *NCHARS*
 Makes ``read`` return after reading *NCHARS* characters or the end of the line, whichever comes first.
-**-t** -or **--tokenize**
-Causes read to split the input into variables by the shell's tokenization rules. This means it will honor quotes and escaping. This option is of course incompatible with other options to control splitting like **--delimiter** and does not honor :envvar:`IFS` (like fish's tokenizer). It saves the tokens in the manner they'd be passed to commands on the commandline, so e.g. ``a\ b`` is stored as ``a b``. Note that currently it leaves command substitutions intact along with the parentheses.
+**-t**, **--tokenize** or **--tokenize-raw**
+Causes read to split the input into variables by the shell's tokenization rules.
+This means it will honor quotes and escaping.
+This option is of course incompatible with other options to control splitting like **--delimiter** and does not honor :envvar:`IFS` (like fish's tokenizer).
+The **-t** -or **--tokenize** variants perform quote removal, so e.g. ``a\ b`` is stored as ``a b``.
+However variables and command substitutions are not expanded.
 **-a** or **--list**
 Stores the result as a list in a single variable. This option is also available as **--array** for backwards compatibility.