Skip to content

builtin commandline: --tokenize to expand strings, to avoid need for eval#10212

Merged
krobelus merged 3 commits into
fish-shell:masterfrom
krobelus:completions-to-use-expanded-tokens
Jan 27, 2024
Merged

builtin commandline: --tokenize to expand strings, to avoid need for eval#10212
krobelus merged 3 commits into
fish-shell:masterfrom
krobelus:completions-to-use-expanded-tokens

Conversation

@krobelus
Copy link
Copy Markdown
Contributor

@krobelus krobelus commented Jan 14, 2024

NOTE for 3rd party completion scripts: you should migrate from commandline -o to commandline -x. For example use this pattern to support both fish before and after this change:

set -l tokens
if commandline -x >/dev/null 2>&1
    set tokens (commandline -xpc)
else
    set tokens (commandline -opc)
end

Issue #10194 reports Cobra completions do

set -l args (commandline -opc)
eval $args[1] __complete $args[2..] (commandline -ct | string escape)

The intent behind "eval" is to expand variables and tildes in "$args".
Fair enough. Several of our own completions do the same, see below.

The problem with "commandline -o" + "eval" is that the former already removes
quotes that are significant for "eval". This becomes a problem if $args
contains quoted () or {}, for example this command will wrongly execute a
command substituion:

git --work-tree='(launch-missiles)' <TAB>

It is possible to escape the string the tokens before running eval, but
then there will be no expansion of variables etc. The problem is that
"commandline -o" only unescapes tokens so they end up in a weird state
in-between source and expanded version.

Remove the need for "eval" by making "commandline -o" expand things like
variables and braces I don't think there is any use case where not expanding is
better. This enables custom completion scripts to be aware of shell variables
without eval, see the added test for completions to "make -C $var/some/dir ".

Some completion scripts use eval without escaping the tokens. This is
already broken in some ways but this changes adds more cases of breakage :
when a variable expansion contains spaces or other shell metacharacters,
"eval" will do the wrong thing.
There are some third-party uses of "eval (commandline -opc)". Besides Cobra
that's at least pip.

The new expansion skips command substitutions, in the hope that this matches
user expectation better than running them. They are passed through as-is
(instead of cancelling or expanding to nothing) to make custom completion
scripts work well in the common case.

@faho
Copy link
Copy Markdown
Member

faho commented Jan 14, 2024

I'm in favor of having a way to get the expanded tokens, but I'm not sure

  1. this is correct in general for all uses of commandline --tokenize - consider bindings
  2. this is okay for completions that currently use eval - like our git completions (tho they do try to avoid some obvious issues)

The latter one is what you're aluding to with if a variable expansion contains spaces or other shell metacharacters, "eval" will do the wrong thing.

The change here is that now you get double-expansion - $bar"foo" with $bar set to (echo hahaha) will turn into (echo hahaha)foo and then pass to eval and actually run the echo hahaha that wasn't run before.

The obvious alternative is to add a second option, commandline --expanded-tokens, or commandline --expand --tokenize.

@krobelus
Copy link
Copy Markdown
Contributor Author

I'm in favor of having a way to get the expanded tokens, but I'm not sure

  1. this is correct in general for all uses of commandline --tokenize - consider bindings

I don't think we can find a case where the current behavior is more useful.

  1. this is okay for completions that currently use eval - like our git completions (tho they do try to avoid some obvious issues)

The latter one is what you're aluding to with if a variable expansion contains spaces or other shell metacharacters, "eval" will do the wrong thing.

The change here is that now you get double-expansion - $bar"foo" with $bar set to (echo hahaha) will turn into (echo hahaha)foo and then pass to eval and actually run the echo hahaha that wasn't run before.

I think eval (commandline -opc) is not a very common idiom because eval is bad practice here. It was always a workaround.
Completions that use eval were already vulnerable but yeah, this change amplifies it.
But pushing people to get off eval for completions is a good thing.

The obvious alternative is to add a second option, commandline --expanded-tokens, or commandline --expand --tokenize.

FWIW, if we want two modes, then, besides an expansion mode we could have one that gives raw tokens (without today's bogus unescaping). But I wouldn't add that without a use case and I don't think we can find one.

My hope is that changing existing behavior will be a net positive because it gives the improved behavior to third party completions for free.
Most "forwarding" uses of commandline -opc do not call eval but forward them as unexpanded argv, like myprogram complete (commandline -opc).
The expansion behavior is a strict improvement here.
For example Rust's clap library provides dynamic completions with

r#"complete -x -c {bin} -a "("'{COMPLETER}'" complete --shell fish -- (commandline --current-process --tokenize --cut-at-cursor) (commandline --current-token))""#

Essentially everything that is broken by this change was already broken in similar ways; we can fix those first.
So far we only know of Cobra. Once they merge https://github.com/spf13/cobra/pull/2095/files, their logic will be safe both before and after this change.

I'm not really sure how to search for third party code. What command line completion frameworks other than clap/Cobra, or other notable third party scripts can we find?

@faho
Copy link
Copy Markdown
Member

faho commented Jan 17, 2024

What command line completion frameworks other than clap/Cobra, or other notable third party scripts can we find?

The pip completions also eval the output of commandline -o.

(for the record, pipenv does not, at least the version I have)

rclone currently does, but it uses cobra. I'm not sure what updating them would entail.

stack doesn't, starship doesn't.

Those are the first five of our | source completions I could get my hands on.

@krobelus krobelus force-pushed the completions-to-use-expanded-tokens branch from 30ab47b to a825123 Compare January 22, 2024 06:12
@krobelus
Copy link
Copy Markdown
Contributor Author

I prefer to break users here (short sharp shock) but it sounds like there's an objection so I'll add a separate flag?

@krobelus krobelus force-pushed the completions-to-use-expanded-tokens branch 2 times, most recently from 64dbed3 to 183e699 Compare January 22, 2024 06:52
@faho
Copy link
Copy Markdown
Member

faho commented Jan 22, 2024

I prefer to break users here (short sharp shock) but it sounds like there's an objection so I'll add a separate flag?

I would prefer that, yes. The issue is that it wouldn't be a "short sharp shock", it would appear to work in most cases, and then in edge cases you get a face full of eval.

And that's not just for completion generation projects, but also e.g. users who keep a modified version of our own git completions.

@faho
Copy link
Copy Markdown
Member

faho commented Jan 23, 2024

Note: For bindings we probably need a "raw" mode, where you get the full token with quotes and such intact.

Otherwise if you try to make a binding act on tokens it will be wrong as soon as any quotes enter the picture.

Now we have three possible modes, which makes me think we might want it to be one option with arguments.

@krobelus
Copy link
Copy Markdown
Contributor Author

Technically it's three but the current behavior should never be needed, I'd hide it from documentation, only mention it in the changelog.

I don't know of a use case for getting the list of raw tokens but it does make much more sense than unescaped ones.
I wonder how bindings would want to use them.
commandline -ct is raw but that's kind of unavoidable. Orthodox completion scripts don't look at that token.. but sometimes it's a necessity.


I realized read --tokenize is weird in the same way as commandline --tokenize; it does unescaping but leave variables.
We should probably update it the same way (add a new flag).

@faho
Copy link
Copy Markdown
Member

faho commented Jan 23, 2024

I don't know of a use case for getting the list of raw tokens but it does make much more sense than unescaped ones.

"Insert the last token again".

Currently my best attempt is

function last_arg
    set -l arg (commandline -opc)[-1]
    commandline -i -- $arg
end

bind \cg last_arg

but that fails if the last token is quoted, like "foo bar". Add string escape and then it'll turn it into 'foo bar' with single-quotes.

Generally any time you want to iterate over the commandline to transform it you need it not unescaped.

I realized read --tokenize is weird in the same way as commandline --tokenize; it does unescaping but leave variables.

read --tokenize is mainly useful for reading quotes - turn "foo bar" baz into foo bar and baz. Picture reading from an ini-style file, or we use it for figuring out $EDITOR
or $PAGER, where you might have escaped spaces (non-standard but they do work in some tools, I think systemd had some pretty complicated logic here).

In those cases you typically do not want variables and globs to be expanded.

@krobelus
Copy link
Copy Markdown
Contributor Author

both makes sense, thanks.
So commandline flags --expand-tokens and --raw-tokens or --tokens=expanded and --tokens=raw. But if we do the latter -x should still be an abbreviation for --tokens=expanded

@faho
Copy link
Copy Markdown
Member

faho commented Jan 23, 2024

So commandline flags --expand-tokens and --raw-tokens or --tokens=expanded and --tokens=raw. But if we do the latter -x should still be an abbreviation for --tokens=expanded

The latter sounds fine. Closest short opt we can do for "raw" is -R or -w.

Issue fish-shell#10194 reports Cobra completions do

    set -l args (commandline -opc)
    eval $args[1] __complete $args[2..] (commandline -ct | string escape)

The intent behind "eval" is to expand variables and tildes in "$args".
Fair enough. Several of our own completions do the same, see the next commit.

The problem with "commandline -o" + "eval" is that the former already
removes quotes that are  relevant for "eval". This becomes a problem if $args
contains quoted () or {}, for example this command will wrongly execute a
command substituion:

    git --work-tree='(launch-missiles)' <TAB>

It is possible to escape the string the tokens before running eval, but
then there will be no expansion of variables etc.  The problem is that
"commandline -o" only unescapes tokens so they end up in a weird state
somewhere in-between what the user typed and the expanded version.

Remove the need for "eval" by introducing "commandline -x" which expands
things like variables and braces. This enables custom completion scripts to
be aware of shell variables without eval, see the added test for completions
to "make -C $var/some/dir ".

This means that essentially all third party scripts should migrate from
"commandline -o" to "commandline -x". For example

    set -l tokens
    if commandline -x >/dev/null 2>&1
        set tokens (commandline -xpc)
    else
        set tokens (commandline -opc)
    end

Since this is mainly used for completions, the expansion skips command
substitutions.  They are passed through as-is (instead of cancelling or
expanding to nothing) to make custom completion scripts work reasonably well
in the common case. Of course there are cases where we would want to expand
command substitutions here, so I'm not sure.
Fix cases like

    eval my-cmd (commandline -o)
    complete -C "my-cmd $(commandline -o)"

In both cases, we spuriously evaluate tokens like "(inside-quoted-string)"
as command substitutions. Fix this by escaping the strings.  The momentarily
regresses the intended purpose of "eval" -- to expand variables -- but the
next commit will fix that.
This gives us more accurate completions because completion scripts get
expanded paths
@krobelus krobelus force-pushed the completions-to-use-expanded-tokens branch from 183e699 to 29f35d6 Compare January 27, 2024 08:31
@krobelus krobelus merged commit a961847 into fish-shell:master Jan 27, 2024
@krobelus krobelus added this to the fish next-3.x milestone Jan 27, 2024
krobelus added a commit to krobelus/cobra that referenced this pull request Apr 26, 2024
We capture the commandline tokens using fish's "commandline --tokenize" (-o).
Given a commandline of

	some-cobra-command 'some argument $(123)' "$HOME"

into three arguments while removing the quotes

	some-cobra-command
	some argument $(123)
	$HOME

Later we pass "some argument $(123)" without quotes to the shell's
"eval". This is wrong and causes spurious evaluation of the parenthesis
as command substitution.

Fix this by escaping the arguments.

The upcoming fish 3.8 has a new flag, "commandline -x"
which will expand variables like $HOME properly, see
fish-shell/fish-shell#10212 Use that if
available.

Reproduce the issue by pasting the completion script at
fish-shell/fish-shell#10194 (comment)
to a file "grafana-manager.fish" and running

	function grafana-manager; end
	source grafana-manager.fish

Then type (without pressing Enter)

	grafana-manager public-dashboards delete --organization-id 3 --dashboard-name "k8s (public)" <TAB>

Fixes fish-shell/fish-shell#10194
krobelus added a commit to krobelus/clap that referenced this pull request Jan 19, 2025
In fish, the command line

	cargo b --manifest-path $PWD/clap_complete/Cargo.toml --example

wrongly gets example-name completion $PWD/Cargo.toml instead of
$PWD/clap_complete/Cargo.toml.

This is because fish's "commandline --tokenize" option produces tokens
like "$PWD/clap_complete/Cargo.toml", which clap cannot see through.

Starting in the upcoming fish version 4, there is a new option,
"--tokens-expanded"[^1] to output the expanded arguments, matching
what the program would see during execution.

If an expansion of a token like {1,2,3}{1,2,3} would produce too
many results, the unexpanded token is forwarded, thus falling back to
existing behavior. Similarly, command substitutions ("$()") present
on the command line are forwarded as-is, because they are not deemed
safe to expand given that the user hasn't agreed to running them.

Use this option if available, to fix the above example.

While at it, break up this overlong line.  Add redundant semicolons,
in case a preexisting completion accidentally joins these lines with
spaces[^2], like

	if set -l completions (COMPLETE=fish myapp 2>/dev/null)
		echo "$completions" | source
	end

I have not updated clap_complete/tests/snapshots/register_dynamic.fish
(not sure how?) or tested this but I'm happy to do so.

[^1]: fish-shell/fish-shell#10212
[^2]: fish-shell/fish-shell#11046 (comment)
krobelus added a commit to krobelus/clap that referenced this pull request Jan 19, 2025
In fish, the command line

	cargo b --manifest-path $PWD/clap_complete/Cargo.toml --example

wrongly gets example-name completions from $PWD/Cargo.toml instead of
$PWD/clap_complete/Cargo.toml.

This is because fish's "commandline --tokenize" option produces tokens
like "$PWD/clap_complete/Cargo.toml", which clap cannot see through.

Starting in the upcoming fish version 4, there is a new option,
"--tokens-expanded"[^1] to output the expanded arguments, matching
what the program would see during execution.

If an expansion of a token like {1,2,3}{1,2,3} would produce too
many results, the unexpanded token is forwarded, thus falling back to
existing behavior. Similarly, command substitutions ("$()") present
on the command line are forwarded as-is, because they are not deemed
safe to expand given that the user hasn't agreed to running them.

Use this option if available, to fix the above example.

I have not checked if any existing clap user tries to expand variables
on their own. If this is a real possibility, we could let users
opt into the new behavior.

While at it, break up this overlong line.  Add redundant semicolons,
in case a preexisting completion accidentally joins these lines with
spaces[^2], like

	if set -l completions (COMPLETE=fish myapp 2>/dev/null)
		echo "$completions" | source
	end

I have not updated clap_complete/tests/snapshots/register_dynamic.fish
(not sure how?) or tested this but I'm happy to do so.

[^1]: fish-shell/fish-shell#10212
[^2]: fish-shell/fish-shell#11046 (comment)
@github-actions github-actions Bot locked as resolved and limited conversation to collaborators Jan 19, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants