Josef Hinteregger reports that the CF_UNICODETEXT data
stored by Windows in the clipboard is encoded in UTF-16; that is, it
can contain surrogate pairs representing characters outside the BMP.
PuTTY does not currently interpret those pairs. Of course, if it
did, it would also have to be able to write them out again when
copying to the clipboard itself or talking to the Windows GDI text
display primitives, etc.
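To make the surrogate-pair arithmetic concrete, here is a minimal sketch in C. The helper names are hypothetical and are not taken from the PuTTY source; they only illustrate the standard UTF-16 mapping between a pair of 16-bit units and a code point above U+FFFF.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */
#define IS_HIGH_SURROGATE(u) ((u) >= 0xD800 && (u) <= 0xDBFF)
#define IS_LOW_SURROGATE(u)  ((u) >= 0xDC00 && (u) <= 0xDFFF)

/* Combine a high/low surrogate pair into one code point. */
uint32_t combine_surrogates(uint16_t hi, uint16_t lo)
{
    assert(IS_HIGH_SURROGATE(hi) && IS_LOW_SURROGATE(lo));
    return 0x10000 + (((uint32_t)(hi - 0xD800) << 10) |
                      (uint32_t)(lo - 0xDC00));
}

/* Split a code point outside the BMP into a surrogate pair. */
void split_to_surrogates(uint32_t cp, uint16_t *hi, uint16_t *lo)
{
    assert(cp >= 0x10000 && cp <= 0x10FFFF);
    cp -= 0x10000;
    *hi = (uint16_t)(0xD800 + (cp >> 10));   /* top 10 bits */
    *lo = (uint16_t)(0xDC00 + (cp & 0x3FF)); /* bottom 10 bits */
}
```

For example, U+1D11E (MUSICAL SYMBOL G CLEF) corresponds to the pair 0xD834, 0xDD1E, and the two functions convert between the two forms.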
I can see two sensible ways to solve this. One is to do translation to and from UTF-16 on the platform side of the front end interface, if necessary for a given platform. So we would stop using wchar_t anywhere in the front end interface: change the prototypes of get_clip(), write_clip(), luni_send() and do_text() (and any others I've missed) so that they use arrays of unsigned int (containing UTF-32) rather than wchar_t.
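As a rough sketch of what the platform-side translation in this first option might look like, the following C function converts a UTF-16 buffer (such as clipboard data on Windows) into an array of unsigned int holding UTF-32. The function name and signature are assumptions made for illustration, not PuTTY's actual interface, and replacing unpaired surrogates with U+FFFD is just one possible policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical platform-side helper: decode inlen UTF-16 units from
 * 'in' into UTF-32 values in 'out'. The caller must provide room for
 * at least inlen elements in 'out'. Returns the number of code points
 * written. Unpaired surrogates become U+FFFD (an assumed policy).
 */
size_t utf16_to_utf32(const uint16_t *in, size_t inlen, unsigned int *out)
{
    size_t i = 0, n = 0;

    assert(in != NULL && out != NULL);

    while (i < inlen) {
        unsigned int u = in[i++];

        if (u >= 0xD800 && u <= 0xDBFF && i < inlen &&
            in[i] >= 0xDC00 && in[i] <= 0xDFFF) {
            /* Valid high+low pair: combine into one code point. */
            u = 0x10000 + ((u - 0xD800) << 10) + (in[i++] - 0xDC00);
        } else if (u >= 0xD800 && u <= 0xDFFF) {
            /* Lone surrogate: substitute the replacement character. */
            u = 0xFFFD;
        }
        out[n++] = u;
    }
    return n;
}
```

With a helper along these lines, only a front end whose native text type is UTF-16 would need to call it before handing data to the core, which would then see UTF-32 throughout.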
The other option is to do the UTF-32/UTF-16 translation on the cross-platform side (terminal.c and ldisc.c), and have it enabled or disabled by a #define in each platform header file. (The #define is certainly necessary: partly because we shouldn't accept UTF-16 surrogate pairs coming from front ends which are supposed to be speaking UTF-32, on the usual grounds of not accepting redundant encodings, but mostly because terminal.c will obviously need to know whether to send surrogate pairs when sending Unicode data back to the front end!)
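A minimal sketch of how such a compile-time switch could gate the core-side translation is shown below. The macro name FE_USES_UTF16, the fe_char type and the function name are all hypothetical, chosen for this illustration; the page does not say what the real code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: a platform header would define this if the front end
 * exchanges text as UTF-16; other platforms would leave it undefined. */
#define FE_USES_UTF16

#ifdef FE_USES_UTF16
typedef uint16_t fe_char;   /* front end speaks UTF-16 */
#else
typedef uint32_t fe_char;   /* front end speaks UTF-32 */
#endif

/*
 * Encode one code point for the front end. 'out' must have room for
 * two elements. Returns the number of elements written.
 */
size_t encode_for_frontend(uint32_t cp, fe_char *out)
{
    assert(cp <= 0x10FFFF);
#ifdef FE_USES_UTF16
    if (cp >= 0x10000) {
        /* Outside the BMP: emit a surrogate pair. */
        cp -= 0x10000;
        out[0] = (fe_char)(0xD800 + (cp >> 10));
        out[1] = (fe_char)(0xDC00 + (cp & 0x3FF));
        return 2;
    }
#endif
    /* BMP character, or a UTF-32 front end: one element suffices. */
    out[0] = (fe_char)cp;
    return 1;
}
```

The receiving direction would be gated the same way: only when the macro is defined would the core combine incoming surrogate pairs, and otherwise it would treat surrogate values from the front end as invalid input.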
I'm currently inclined to the first option, because UTF-16 is nasty and I'd rather it complicated only those front ends that had to deal with it than that it complicated the core code.
Update, 2009-03: fixed (using the second option).