why does -vi- set the hi bit when expanding `%' and `#'?
Dominic Dunlop
domo at riddle.UUCP
Mon Jan 16 20:09:39 AEST 1989
[Already it's hard to keep track of who's quoting whom in this thread.
Sorry if I've got it wrong...]
In article <450 at oglvee.UUCP> norm at oglvee.UUCP (Norman Joseph) writes:
[Stuff about vi setting the high bit of each character in the filenames it
produces when expanding `%' and `#' on shell command lines omitted.]
>In article <15219 at mimsy.UUCP>, chris at mimsy.UUCP (Chris Torek) writes:
>> vi believes that by setting bit 7, it is quoting the file name,
>> so that if you are editing the file `foo*bar.c', the command
>>
>> !echo %
>>
>> produces [in effect]
>>
>> !echo \f\o\o\*\b\a\r\.\c
>>
>> in shell-internal-quoting format (bit 7 set).
>
>Maybe I'm just thick, or maybe I was home sick the day they explained
>``shell-internal-quoting format'' to everyone, but would some kind
>soul who knows what Chris is talking about care to fill me in? (E-mail
>would be fine. I'm sure people are falling asleep even as we speak :^).
>Is this the same as quoting sh meta-characters with '\'?
^^^^
Yes, except that, strictly, the backslash can be used to quote any character:
it's just that the quoting is a no-op on any character other than a
metacharacter. (Yes, this topic has scope for soporific semantic pedantry.)
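For anyone who would rather see the equivalence than read about it, here is a rough sketch in Python of the scheme Chris describes: quoting a character means setting bit 7 of its byte, and a quoted byte behaves as if it had a backslash in front of it. The function names are made up for illustration, and this only models the idea; it is not taken from the vi or sh sources.

```python
QUOTE_BIT = 0x80  # bit 7: the "this character is quoted" flag


def quote_bit7(name: bytes) -> bytes:
    """Mark every byte of a 7-bit file name as quoted by setting bit 7."""
    return bytes(b | QUOTE_BIT for b in name)


def to_backslash_form(internal: bytes) -> str:
    """Show an internally quoted string in the equivalent backslash notation."""
    out = []
    for b in internal:
        ch = chr(b & 0x7F)  # the character itself, without the flag
        out.append("\\" + ch if b & QUOTE_BIT else ch)
    return "".join(out)


quoted = quote_bit7(b"foo*bar.c")
print(to_backslash_form(quoted))  # \f\o\o\*\b\a\r\.\c
```

Only the `*` actually needed protecting here; on the other characters the quoting is the no-op mentioned above.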
>Is this
>something I need to care about beyond being curious?
No. Apart from anything else, it's obsolescent, and its use by
applications software has been deprecated for A Long Time (this deprecation
having been broadcast in the same way as information about the `feature'
itself -- that is, by word of mouth). As I understand it, we finally get
to say goodbye to bit seven internal quoting with the System V, release 4
version of the shell. It's possible that it's been eliminated in V.3.1 and
later as well. Comments, anybody?
Why has it gone? Because it's a real pain in the butt for users of
character sets which require all eight bits of a byte in order to represent
all alphabetic characters. This turns out to mean most Europeans.
(Asian character sets are something else again.) Having the shell
interpret that eighth bit as a quote, then clear it, mangles text which
includes characters (usually accented letters) which ANSI didn't think
of all those years ago.
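A small Python sketch of the damage, assuming ISO 8859-1 as the eight-bit character set (my choice of example; any set that uses the top half of the byte suffers the same way). The helper name is hypothetical and simply models a shell that treats bit 7 as a quote flag and clears it.

```python
def strip_quote_bit(data: bytes) -> bytes:
    """Model a shell that reads bit 7 as a quote flag and then clears it."""
    return bytes(b & 0x7F for b in data)


# In ISO 8859-1 (assumed here), e-acute is the single byte 0xE9.
original = "café".encode("latin-1")
mangled = strip_quote_bit(original)

# 0xE9 with bit 7 cleared is 0x69, which is plain ASCII 'i'.
print(mangled.decode("ascii"))  # cafi
```

The accented letter is not merely lost: it silently turns into a different, perfectly valid character, so the shell ends up looking for the wrong file name.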
The 1003.2 working group of the IEEE is drafting a standard for the shell
command language. I don't have it to hand, but, as I recall, it
effectively outlaws eighth bit quoting in the shell.