fish-shell

mirror of https://github.com/fish-shell/fish-shell.git synced 2026-06-12 23:01:16 -03:00

Fabian Boehm 7988cff6bd Increase the string chunk size to increase performance

This is a *tiny* commit code-wise, but the explanation is a bit
longer.

When I made string read in chunks, I picked a chunk size from bash's
read, under the assumption that they had picked a good one.

It turns out that, on the (Linux) systems I've tested, that's simply not
true.

My tests show that a bigger chunk size of up to 4096 is better *across
the board*:

- It's better with very large inputs
- It's equal-to-slightly-better with small inputs
- It's equal-to-slightly-better even if we quit early
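To make the trade-off in the list above concrete, here is a minimal sketch of chunked line reading with a configurable chunk size. It is not fish's actual implementation: `read_lines_chunked` and `chunked_result_t` are hypothetical names, and a `std::istream` stands in for the file descriptor that `string` reads from. The sketch only shows that a larger chunk needs fewer read calls for the same input.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical illustration only; this is not fish's real code.
struct chunked_result_t {
    std::vector<std::string> lines;  // complete lines found in the input
    std::size_t reads = 0;           // number of read calls that returned data
};

// Read `in` in blocks of `chunk_size` bytes and split the data on '\n'.
chunked_result_t read_lines_chunked(std::istream &in, std::size_t chunk_size) {
    chunked_result_t result;
    std::vector<char> buf(chunk_size);
    std::string pending;  // partial line carried over between chunks

    for (;;) {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        std::streamsize got = in.gcount();
        if (got <= 0) break;  // nothing more to read
        ++result.reads;
        pending.append(buf.data(), static_cast<std::size_t>(got));

        // Emit every complete line that is now available.
        std::size_t pos;
        while ((pos = pending.find('\n')) != std::string::npos) {
            result.lines.push_back(pending.substr(0, pos));
            pending.erase(0, pos + 1);
        }
    }
    // A final line without a trailing newline still counts.
    if (!pending.empty()) result.lines.push_back(pending);
    return result;
}
```

For a 12-byte input such as `"foo\nbar\nbaz\n"`, a chunk size of 4 takes three reads while a chunk size of 128 takes one. Both produce the same three lines.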

My test setup:

0. Create various fish builds with various sizes for
STRING_CHUNK_SIZE, name them "fish-$CHUNKSIZE".
1. Download the npm package names from
https://github.com/nice-registry/all-the-package-names/blob/master/names.json (I
used commit 87451ea77562a0b1b32550124e3ab4a657bf166c, so it's 46.8MB)
2. Extract the names so we get a line-based version:

```fish
jq '.[]' names.json | string trim -c '"' >/tmp/all
```

3. Create various sizes of random extracts:

```fish
for f in 10000 1000 500 50
    shuf /tmp/all | head -n $f > /tmp/$f
end
```

(the idea here is to defeat any form of pattern in the input).

4. Run benchmarks (with `$f` set to one of the files from step 3):

```fish
hyperfine -w 3 ./fish-{128,512,1024,2048,4096}"
    -c 'for i in (seq 1000)
            string match -re foot < $f
        end; true'"
```

(reduce the seq size for the larger files so you don't have to wait
for hours - the idea here is to have some time running string and not
just fish startup time)

This shows results like the following:

```
Summary
'./fish-2048     -c 'for i in (seq 1000)
          string match -re foot < /tmp/500
      end; true'' ran
  1.01 ± 0.02 times faster than './fish-4096     -c 'for i in (seq 1000)
          string match -re foot < /tmp/500
      end; true''
  1.02 ± 0.03 times faster than './fish-1024     -c 'for i in (seq 1000)
          string match -re foot < /tmp/500
      end; true''
  1.08 ± 0.03 times faster than './fish-512     -c 'for i in (seq 1000)
          string match -re foot < /tmp/500
      end; true''
  1.47 ± 0.07 times faster than './fish-128     -c 'for i in (seq 1000)
          string match -re foot < /tmp/500
      end; true''
```

So we see a real difference up to 1024, and after that the
returns are marginal. We therefore stick with 1024 because of the memory
trade-off.
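The diminishing returns can be read off the summary above by dividing neighbouring ratios. The sketch below does that arithmetic; `stepwise_gains` is a hypothetical helper, and dividing the mean ratios ignores the ± uncertainties hyperfine reports, so the results are rough indications only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration. `slowdown[i]` is how many times slower
// build i ran than the fastest build, ordered from the smallest chunk size to
// the largest. Returns the speed-up gained at each step to the next size.
std::vector<double> stepwise_gains(const std::vector<double> &slowdown) {
    std::vector<double> gains;
    for (std::size_t i = 1; i < slowdown.size(); ++i) {
        gains.push_back(slowdown[i - 1] / slowdown[i]);
    }
    return gains;
}
```

Feeding in the /tmp/500 figures for 128, 512, 1024, 2048 and 4096 (1.47, 1.08, 1.02, 1.00, 1.01) gives roughly 1.36, 1.06, 1.02 and 0.99. Most of the gain comes from the step to 512, with a smaller gain to 1024, and everything past that is within the measurement noise.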

----

Fun extra:

Comparisons with `grep` (GNU grep 3.7) are *weird*, because you get
both

```
'./fish-4096 -c 'for i in (seq 100); string match -re foot < /tmp/500; end; true'' ran
11.65 ± 0.23 times faster than 'fish -c 'for i in (seq 100); command grep foot /tmp/500; end''
```

and

```
'fish -c 'for i in (seq 2); command grep foot /tmp/all; end'' ran
66.34 ± 3.00 times faster than './fish-4096 -c 'for i in (seq 2);
string match -re foot < /tmp/all; end; true''
100.05 ± 4.31 times faster than './fish-128 -c 'for i in (seq 2);
string match -re foot < /tmp/all; end; true''
```

Basically, if you *can* give grep a lot of work at once (~40MB in this
case), it'll churn through it like butter. But if you have to call it
a lot, string beats it by virtue of cheating.

2022-08-15 20:16:12 +02:00

builtins

Increase the string chunk size to increase performance

2022-08-15 20:16:12 +02:00

widecharwidth

Update widecharwidth

2022-02-14 22:19:28 +01:00

ast_node_types.inc

Introduce a new fish ast

2020-07-04 14:58:02 -07:00

ast.cpp

Make tokenizer delimiter errors one long

2022-08-12 18:38:47 +02:00

ast.h

Correct a misleading comment

2022-07-02 11:30:59 -07:00

autoload.cpp

Make ESCAPE_ALL the default and call its inverse ESCAPE_NO_PRINTABLES

2022-07-27 11:24:35 +02:00

autoload.h

Reimplement autosuggestion-triggered completion loading

2022-06-19 15:15:17 -07:00

builtin.cpp

Remove the intern'd strings component

2022-08-13 12:51:36 -07:00

builtin.h

Remove the intern'd strings component

2022-08-13 12:51:36 -07:00

color.cpp

Simplify ASSERT_SORT_ORDER

2021-07-15 13:15:24 -07:00

color.h

Refactor color.h/color.cpp

2021-02-08 15:16:21 -06:00

common.cpp

Use Unicode symbols for rendering control characters in pager

2022-08-13 21:11:31 +02:00

common.h

Switch filenames from intern'd strings to shared_ptr

2022-08-13 12:51:36 -07:00

complete.cpp

complete: Don't load completions if command isn't in $PATH

2022-08-11 17:05:32 +02:00

complete.h

Highlight shell commands in history pager

2022-08-13 21:11:31 +02:00

enum_map.h

enum_map stuff to enum_map.h

2021-10-01 03:39:43 -07:00

enum_set.h

Tighten up includes, some typedefs -> using

2021-09-21 18:05:53 -07:00

env_dispatch.cpp

Reset the read byte limit to the default when unset

2022-08-09 19:59:10 +02:00

env_dispatch.h

Try to rationalize universal variable syncing

2022-05-30 14:09:06 -07:00

env_universal_common.cpp

Stop migrating legacy uvar paths

2022-03-17 18:15:11 +01:00

env_universal_common.h

Stop migrating legacy uvar paths

2022-03-17 18:15:11 +01:00

env.cpp

clang-format C++ files

2022-07-27 10:05:41 +02:00

env.h

set --show: Show the originally inherited value, if any

2022-06-27 20:33:26 +02:00

event.cpp

Make ESCAPE_ALL the default and call its inverse ESCAPE_NO_PRINTABLES

2022-07-27 11:24:35 +02:00

event.h

Stop removing unfired one-shot handlers

2022-06-06 12:18:29 -07:00

exec.cpp

Switch filenames from intern'd strings to shared_ptr

2022-08-13 12:51:36 -07:00

exec.h

exec.h: remove unused declaration

2021-09-24 09:30:25 -07:00

expand.cpp

Add command substitution error length

2022-08-12 18:38:47 +02:00

expand.h

Rationalize tilde unexpansion

2022-04-10 13:41:21 -07:00

fallback.cpp

Add wcwidth non_characters

2022-08-12 17:25:31 +02:00

fallback.h

Remove wcsndup and wcslcpy

2022-03-17 18:15:11 +01:00

fd_monitor.cpp

Correct bug causing early teardown of fd_monitor

2022-03-31 20:41:58 -07:00

fd_monitor.h

Correct bug causing early teardown of fd_monitor

2022-03-31 20:41:58 -07:00

fds.cpp

Allow for EWOULDBLOCK instead of EAGAIN

2022-07-23 23:16:44 +02:00

fds.h

Allow using poll() to check for readability

2022-01-02 16:36:33 -08:00

fish_indent.cpp

Clean up woption

2022-04-02 11:28:30 -07:00

fish_key_reader.cpp

Clean up woption

2022-04-02 11:28:30 -07:00

fish_test_helper.cpp

Teach fish_test_helper to sigint_self

2022-05-28 16:08:17 -07:00

fish_tests.cpp

Remove the intern'd strings component

2022-08-13 12:51:36 -07:00

fish_version.cpp

Fix build

2021-09-21 18:33:14 -07:00

fish_version.h

Revert "Generate FISH_BUILD_VERSION info for cmake builds"

2018-01-08 22:28:10 -08:00

fish.cpp

Remove the intern'd strings component

2022-08-13 12:51:36 -07:00

flog.cpp

Migrate remaining calls from debug_safe to FLOGF_SAFE

2021-07-05 15:47:56 -07:00

flog.h

Migrate remaining calls from debug_safe to FLOGF_SAFE

2021-07-05 15:47:56 -07:00

function.cpp

Remove the intern'd strings component

2022-08-13 12:51:36 -07:00

function.h

Switch filenames from intern'd strings to shared_ptr

2022-08-13 12:51:36 -07:00

future_feature_flags.cpp

clang-format C++ files

2022-06-01 10:02:09 -07:00

future_feature_flags.h

Force stderr-nocaret feature flag on

2022-04-15 13:42:38 +02:00

global_safety.h

Replace a bunch of ASSERT_IS_MAIN_THREAD

2022-06-20 12:31:36 -07:00

highlight.cpp

Initialize variable

2022-05-11 21:28:26 +02:00

highlight.h

clang-format C++ files

2022-06-01 10:02:09 -07:00

history_file.cpp

Correct a cast when measuring history file size

2022-04-01 10:25:05 -07:00

history_file.h

Introduce noncopyable_t and nonmovable_t

2021-07-23 11:19:42 -07:00

history.cpp

Advance pager history search with Control-R/Control-S

2022-07-30 23:27:24 +02:00

history.h

Advance pager history search with Control-R/Control-S

2022-07-30 23:27:24 +02:00

input_common.cpp

Replace a bunch of ASSERT_IS_MAIN_THREAD

2022-06-20 12:31:36 -07:00

input_common.h

Add Control+R incremental history search in pager

2022-07-30 23:27:24 +02:00

input.cpp

Add Control+R incremental history search in pager

2022-07-30 23:27:24 +02:00

input.h

Use dedicated variable to configure selection size

2022-07-30 09:49:07 -07:00

io.cpp

Allow for EWOULDBLOCK instead of EAGAIN

2022-07-23 23:16:44 +02:00

io.h

clang-format C++ files

2022-06-01 10:02:09 -07:00

iothread.cpp

Remove iothread drain flag

2022-06-19 15:15:20 -07:00

iothread.h

Remove iothread drain flag

2022-06-19 15:15:20 -07:00

job_group.cpp

Remove cancellation groups

2022-03-20 14:39:00 -07:00

job_group.h

Remove cancellation groups

2022-03-20 14:39:00 -07:00

kill.cpp

Make the kill ring thread-safe

2021-04-21 17:37:44 -07:00

kill.h

Implementation of variable with killring entries

2021-04-21 16:39:29 -07:00

lru.h

Reimplement autosuggestion-triggered completion loading

2022-06-19 15:15:17 -07:00

maybe.h

maybe.h: reference header new

2021-08-17 18:57:16 -05:00

null_terminated_array.cpp

Rework null terminated arrays

2021-03-28 15:31:25 -07:00

null_terminated_array.h

Introduce noncopyable_t and nonmovable_t

2021-07-23 11:19:42 -07:00

operation_context.cpp

Allow specifying a limit on number of expansion in operation_context

2020-12-22 12:38:51 -08:00

operation_context.h

Expand more when performing history path detection

2021-01-08 12:58:34 -08:00

output.cpp

Fix tparm kludge

2022-03-14 15:36:17 +01:00

output.h

Tighten up includes, some typedefs -> using

2021-09-21 18:05:53 -07:00

pager.cpp

Highlight shell commands in history pager

2022-08-13 21:11:31 +02:00

pager.h

Highlight shell commands in history pager

2022-08-13 21:11:31 +02:00

parse_constants.h

Highlight history searches correctly (#9066)

2022-07-13 16:48:04 +02:00

parse_execution.cpp

Pass location of the *command* node without decorators

2022-08-12 18:38:47 +02:00

parse_execution.h

parse_execution: remove unused 'job' parameters

2022-04-07 09:36:54 -07:00

parse_tree.cpp

Don't skip caret for some errors

2022-08-12 18:38:47 +02:00

parse_tree.h

Rename EXEC_ERR_MSG to INVALID_PIPELINE_CMD_ERR_MSG

2022-03-31 15:49:15 -07:00

parse_util.cpp

Add length to the parse_util syntax errors

2022-08-12 18:38:47 +02:00

parse_util.h

Fix regression expanding \$()

2022-04-03 15:54:08 +02:00

parser_keywords.cpp

docs: list reserved keywords

2022-06-16 19:45:55 +10:00

parser_keywords.h

Remove unused functions, members (and a variable)

2022-04-09 10:10:44 -07:00

parser.cpp

Remove the intern'd strings component

2022-08-13 12:51:36 -07:00

parser.h

Switch filenames from intern'd strings to shared_ptr

2022-08-13 12:51:36 -07:00

path.cpp

If relative path was used, use it

2022-08-15 20:01:50 +02:00

path.h

Rationalize path-getting

2022-04-23 15:24:27 -07:00

postfork.cpp

clang-format C++ files

2022-07-27 10:05:41 +02:00

postfork.h

Refactor tty transfer to be more deliberate

2022-03-19 14:48:36 -07:00

print_help.cpp

Run clang-format on all files

2019-10-13 15:50:48 -07:00

print_help.h

restyle proc module to match project style

2016-05-02 22:07:58 -07:00

proc.cpp

Generate job & process exit events for background jobs

2022-07-30 10:06:33 -07:00

proc.h

Check for waitstatus orientation via cmake

2022-07-24 16:40:33 +02:00

re.cpp

Remove usage of PCRE2_SUBSTITUTE_LITERAL

2022-07-10 11:17:19 -07:00

re.h

Remove usage of PCRE2_SUBSTITUTE_LITERAL

2022-07-10 11:17:19 -07:00

reader.cpp

Remove the intern'd strings component

2022-08-13 12:51:36 -07:00

reader.h

Use env_dispatch to update cursor selection mode

2022-07-30 09:49:07 -07:00

redirection.cpp

Collapse io_data switch statements

2019-12-29 15:51:22 -08:00

redirection.h

Introduce noncopyable_t and nonmovable_t

2021-07-23 11:19:42 -07:00

screen.cpp

trivial cleanup

2022-04-08 17:59:09 -07:00

screen.h

Restyle codebase with clang-format

2021-11-08 12:21:11 -08:00

signal.cpp

Allow trapping SIGINT and SIGTERM in scripts

2022-05-28 17:44:13 -07:00

signal.h

Restyle codebase with clang-format

2021-11-08 12:21:11 -08:00

termsize.cpp

Include <termios.h> instead of <sys/termios.h>.

2021-03-02 12:05:07 +01:00

termsize.h

Eliminate the termsize handling from common.h

2020-06-07 20:00:42 -07:00

timer.cpp

clang-format C++ files

2022-06-01 10:02:09 -07:00

timer.h

Pass some parameters by reference/move

2021-03-21 19:41:36 +01:00

tinyexpr.cpp

Merge branch 'master' into te-refactor

2022-03-13 11:24:31 +01:00

tinyexpr.h

math: Use wchar

2020-12-14 22:54:53 +01:00

tokenizer.cpp

Make tokenizer delimiter errors one long

2022-08-12 18:38:47 +02:00

tokenizer.h

Make tokenizer delimiter errors one long

2022-08-12 18:38:47 +02:00

topic_monitor.cpp

Allow for EWOULDBLOCK instead of EAGAIN

2022-07-23 23:16:44 +02:00

topic_monitor.h

Revert "Replace some simple loops with STL algorithms"

2022-04-09 12:12:16 -07:00

trace.cpp

Make ESCAPE_ALL the default and call its inverse ESCAPE_NO_PRINTABLES

2022-07-27 11:24:35 +02:00

trace.h

Cache if tracing is enabled

2021-10-28 19:39:30 +02:00

utf8.cpp

Change C casts to C++ ones

2020-05-01 13:30:56 -07:00

utf8.h

[clang-tidy] Fix inconsistent declarations

2019-11-25 14:13:33 -08:00

util.cpp

wcsfilecmp: Stop actually computing the numbers

2021-10-07 17:57:52 +02:00

util.h

Return glob ordering to pre-3.1 state

2020-02-14 19:06:19 +01:00

wait_handle.cpp

Refactor wait handles

2021-05-17 15:25:21 -07:00

wait_handle.h

Mild refactoring of wait handles

2021-10-28 10:37:43 -07:00

wcstringutil.cpp

Remove unused functions, members (and a variable)

2022-04-09 10:10:44 -07:00

wcstringutil.h

Fix compile error on OpenBSD

2022-08-04 08:13:19 +02:00

wgetopt.cpp

Make arguments to builtins const

2021-03-28 15:31:25 -07:00

wgetopt.h

Clean up woption

2022-04-02 11:28:30 -07:00

wildcard.cpp

Clean up wildcard_has

2021-11-27 12:48:04 -08:00

wildcard.h

Clean up wildcard_has

2021-11-27 12:48:04 -08:00

wutil.cpp

Remove sys/mount.h include

2022-07-24 12:24:42 +02:00

wutil.h

Just remove the dumb comment.

2022-07-17 14:41:35 -07:00