create_manpage_completions: handle groff \X'...' device control escapes

help2man 1.50 added \X'tty: link URL' hyperlink escapes to generated
man pages. coreutils 9.10 is the first widely-deployed package to ship
these, and it broke completion generation for most of its commands
(only 17/106 man pages parsed successfully).

The escape wraps option text like this:

  \X'tty: link https://example.com/a'\fB\-a, \-\-all\fP\X'tty: link'

Two places needed fixing:

- remove_groff_formatting() didn't strip \X'...', so Type1-4 parsers
  extracted garbage option names like "--all\X'tty"

- Deroffer.esc_char_backslash() didn't recognize \X, falling through
  to the generic single-char escape which stripped only the \, leaving
  "X'tty: link ...'" as literal text. Option lines then started with
  X instead of -, so TypeDeroffManParser's is_option() check failed.

Also handle \Z'...' (zero-width string) which has identical syntax.

Closes #12578
This commit is contained in:
r-vdp
2026-03-28 11:23:54 +01:00
committed by Johannes Altmanninger
parent 14ce56d2a5
commit 7bd37dfe55
3 changed files with 60 additions and 0 deletions

View File

@@ -501,6 +501,19 @@ class Deroffer:
return True
return False
def device_control(self):
# groff \X'...' device control escape (and \Z'...' zero-width).
# help2man 1.50+ uses \X'tty: link URL' for hyperlinks.
# We just skip the entire escape.
if self.str_at(1) in "XZ" and self.str_at(2) == "'":
self.skip_char(3)
while self.str_at(0) and self.str_at(0) != "'":
self.skip_char()
if self.str_at(0) == "'":
self.skip_char()
return True
return False
def var(self):
reg = ""
s0s1 = self.s[0:2]
@@ -650,6 +663,8 @@ class Deroffer:
return self.size()
elif c in "hvwud":
return self.numreq()
elif c in "XZ":
return self.device_control()
elif c in "n*":
return self.var()
elif c == "(":
@@ -1314,6 +1329,9 @@ def built_command(options, description):
def remove_groff_formatting(data):
# Strip groff \X'...' device control escapes (help2man 1.50+ hyperlinks)
# and \Z'...' zero-width escapes.
data = re.sub(r"\\[XZ]'[^']*'", "", data)
data = data.replace("\\fI", "")
data = data.replace("\\fP", "")
data = data.replace("\\f1", "")