=head1 NAME
perlunicode - Unicode support in Perl
=head1 DESCRIPTION
If you haven't already, before reading this document, you should become
familiar with both L<perlunitut> and L<perluniintro>.
Unicode aims to B<UNI>-fy the en-B<CODE>-ings of all the world's
character sets into a single Standard. For quite a few of the various
coding standards that existed when Unicode was first created, converting
from each to Unicode essentially meant adding a constant to each code
point in the original standard, and converting back meant just
subtracting that same constant. For ASCII and ISO-8859-1, the constant
is 0. For ISO-8859-5 (Cyrillic), the constant is 864; for Hebrew
(ISO-8859-8), it's 1264; Thai (ISO-8859-11), 3424; and so forth. This
made it easy to do the conversions, and facilitated the adoption of
Unicode.
And it worked; nowadays, those legacy standards are rarely used. Most
everyone uses Unicode.
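
The constant-offset conversion described above can be sketched in a few
lines of Perl. This is only an illustration: the C<%offset_for> hash and
the two helper subs are hypothetical names, the offsets are the ones
quoted in the text, and the sketch assumes the code point passed in is
one of the letters for which the constant holds (a few symbols in these
standards do not follow it).

```perl
use strict;
use warnings;

# Offsets quoted in the text; the hash name is an assumption for this sketch.
my %offset_for = (
    'ISO-8859-1'  => 0,       # Latin-1 maps straight onto Unicode
    'ISO-8859-5'  => 864,     # Cyrillic
    'ISO-8859-11' => 3424,    # Thai
);

# Hypothetical helper: legacy code point -> Unicode code point.
sub legacy_to_unicode {
    my ($charset, $code_point) = @_;
    return $code_point + $offset_for{$charset};
}

# Hypothetical helper: Unicode code point -> legacy code point.
sub unicode_to_legacy {
    my ($charset, $code_point) = @_;
    return $code_point - $offset_for{$charset};
}

# 0xB0 in ISO-8859-5 is CYRILLIC CAPITAL LETTER A.
printf "U+%04X\n", legacy_to_unicode('ISO-8859-5', 0xB0);     # U+0410

# 0xA1 in ISO-8859-11 is THAI CHARACTER KO KAI.
printf "U+%04X\n", legacy_to_unicode('ISO-8859-11', 0xA1);    # U+0E01

# Converting back subtracts the same constant.
printf "0x%02X\n", unicode_to_legacy('ISO-8859-5', 0x0410);   # 0xB0
```

In real programs, use the core L<Encode> module rather than hand-rolled
arithmetic, since it also handles the code points that do not follow the
constant.
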
Unicode is a comprehensive standard. It specifies many things outside
the scope of Perl, such as how to display sequences of characters. For
a full discussion of all aspects of Unicode, see
L<https://www.unicode.org>.
=head2 Important Caveats
Even though some of this section may not be understandable to you on
first reading, we think it's important enough to highlight some of the
gotchas before delving further, so here goes:
Unicode support is an extensive requirement. While Perl does not
implement the Unicode standard or the accompanying technical reports
from cover to cover, Perl does support many Unicode features.
Also, the use of Unicode may present security issues that aren't
obvious; see L</Security Implications of Unicode> below.
=over 4
=item Safest if you C<use feature 'unicode_strings'>