=encoding utf8
=head1 NAME
perl5180delta - what is new for perl v5.18.0
=head1 DESCRIPTION
This document describes differences between the v5.16.0 release and the v5.18.0
release.
If you are upgrading from an earlier release such as v5.14.0, first read
L<perl5160delta>, which describes differences between v5.14.0 and v5.16.0.
=head1 Core Enhancements
=head2 New mechanism for experimental features
Newly-added experimental features will now require this incantation:
    no warnings "experimental::feature_name";
    use feature "feature_name";  # would warn without the previous line
There is a new warnings category, called "experimental", containing
warnings that the L<feature> pragma emits when enabling experimental
features.
Newly-added experimental features will also be given special warning IDs,
which consist of "experimental::" followed by the name of the feature. (The
plan is to extend this mechanism eventually to all warnings, to allow them
to be enabled or disabled individually, and not just by category.)
By saying
    no warnings "experimental::feature_name";
you are taking responsibility for any breakage that future changes to, or
removal of, the feature may cause.
Since some features (like C<~~> or C<my $_>) now emit experimental warnings,
and you may want to disable them in code that is also run on perls that do not
recognize these warning categories, consider using the C<if> pragma like this:
    no if $] >= 5.018, warnings => "experimental::feature_name";
Existing experimental features may begin emitting these warnings, too. Please
consult L<perlexperiment> for information on which features are considered
experimental.
=head2 Hash overhaul
Changes to the implementation of hashes in perl v5.18.0 will be one of the most
visible changes to the behavior of existing code.
By default, two distinct hash variables with identical keys and values may now
provide their contents in a different order where it was previously identical.
When you encounter these changes, the key to cleaning up after them is to accept
that B<hashes are unordered collections> and to act accordingly.
=head3 Hash randomization
The seed used by Perl's hash function is now random. This means that the
order in which keys and values are returned from functions like C<keys()>,
C<values()>, and C<each()> will differ from run to run.
This change was introduced to make Perl's hashes more robust to algorithmic
complexity attacks, and also because we discovered that it exposes hash
ordering dependency bugs and makes them easier to track down.
Toolchain maintainers might want to invest in additional infrastructure to
test for things like this. Running tests several times in a row and then
comparing results will make it easier to spot hash order dependencies in
code. Authors are strongly encouraged not to expose the key order of
Perl's hashes to insecure audiences.
Further, every hash has its own iteration order, which should make it much
more difficult to determine what the current hash seed is.
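Code that needs stable output can sort the keys before iterating, rather than relying on whatever order the hash returns. A minimal sketch, using a made-up C<%color_of> hash for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample data; any hash behaves the same way.
my %color_of = (apple => "red", banana => "yellow", cherry => "dark red");

# keys() may return a different order on every run, so never rely on it.
my @unordered = keys %color_of;

# Sorting the keys gives a stable order that does not depend on the hash seed.
my @ordered = sort keys %color_of;

print "$_ => $color_of{$_}\n" for @ordered;
```

The same approach applies to tests that compare serialized hashes: sort first, then compare.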
=head3 New hash functions
Perl v5.18 includes support for multiple hash functions and changes
the default (to ONE_AT_A_TIME_HARD). You can choose a different
algorithm by defining a symbol at compile time. For a current list,
consult the F<INSTALL> document. Note that as of Perl v5.18 we can
only recommend use of the default or SIPHASH. All the others are
known to have security issues and are for research purposes only.
=head3 PERL_HASH_SEED environment variable now takes a hex value
C<PERL_HASH_SEED> no longer accepts an integer as a parameter;
instead the value is expected to be a binary value encoded in a hex
string, such as "0xf5867c55039dc724". This is to make the
infrastructure support hash seeds of arbitrary lengths, which might
exceed that of an integer. (SipHash uses a 16 byte seed.)
=head3 PERL_PERTURB_KEYS environment variable added
The C<PERL_PERTURB_KEYS> environment variable allows one to control the level of
randomization applied to C<keys> and friends.
When C<PERL_PERTURB_KEYS> is 0, perl will not randomize the key order at all. The
chance that the order returned by C<keys> changes due to an insert will be the
same as in previous perls, basically only when the bucket size is changed.
When C<PERL_PERTURB_KEYS> is 1, perl will randomize keys in a non-repeatable
way. The chance that the order returned by C<keys> changes due to an insert will
be very high. This is the most secure and default mode.
When C<PERL_PERTURB_KEYS> is 2, perl will randomize keys in a repeatable way.
Repeated runs of the same program should produce the same output every time.
C<PERL_HASH_SEED> implies a non-default C<PERL_PERTURB_KEYS> setting. Setting
C<PERL_HASH_SEED=0> (exactly one 0) implies C<PERL_PERTURB_KEYS=0> (hash key
randomization disabled); setting C<PERL_HASH_SEED> to any other value implies
C<PERL_PERTURB_KEYS=2> (deterministic and repeatable hash key randomization).
Specifying C<PERL_PERTURB_KEYS> explicitly to a different level overrides this
behavior.
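The interaction between the two variables can be checked by running a child perl twice with the same seed. The sketch below assumes a perl that honors these environment variables and a platform where list-form pipe C<open> works; the helper name C<key_order_with_seed> is made up for this example.

```perl
use strict;
use warnings;

# Run a child perl with a fixed seed and return the key order it reports.
sub key_order_with_seed {
    my ($seed) = @_;
    local $ENV{PERL_HASH_SEED} = $seed;      # hex string, as required since v5.18
    delete local $ENV{PERL_PERTURB_KEYS};    # let PERL_HASH_SEED imply the mode
    open my $child, "-|", $^X, "-e",
        'my %h; @h{"a" .. "j"} = (); print join ",", keys %h'
        or die "cannot run $^X: $!";
    my $order = do { local $/; <$child> };
    close $child;
    return $order;
}

# A non-zero seed implies PERL_PERTURB_KEYS=2, so both runs should agree.
my $first_run  = key_order_with_seed("0xf5867c55039dc724");
my $second_run = key_order_with_seed("0xf5867c55039dc724");

print "first:  $first_run\nsecond: $second_run\n";
print "key order is repeatable\n" if $first_run eq $second_run;
```

This is mainly useful for reproducing an order-dependent bug; production code should still not depend on any particular order.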
=head3 Hash::Util::hash_seed() now returns a string
Hash::Util::hash_seed() now returns a string instead of an integer. This
is to make the infrastructure support hash seeds of arbitrary lengths
which might exceed that of an integer. (SipHash uses a 16 byte seed.)
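Because the return value is now a string of raw seed bytes, code that used to print it as a number needs to format it differently. A minimal sketch, for debugging only, since the seed should not be exposed to untrusted parties:

```perl
use strict;
use warnings;
use Hash::Util qw(hash_seed);

# hash_seed() returns the raw seed bytes as a string, so unpack it to hex
# for display instead of treating it as a number.
my $seed_hex = unpack "H*", hash_seed();

print "hash seed: 0x$seed_hex\n";
```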
=head3 Output of PERL_HASH_SEED_DEBUG has been changed
The environment variable PERL_HASH_SEED_DEBUG now makes perl show both the
hash function perl was built with, I<and> the seed, in hex, in use for that
process. Code parsing this output, should it exist, must change to accommodate
the new format. Example of the new format:
    $ PERL_HASH_SEED_DEBUG=1 ./perl -e1
    HASH_FUNCTION = MURMUR3 HASH_SEED = 0x1476bb9f
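Code that parses this output could be adapted along the following lines. This is only a sketch: it assumes the line contains exactly the two fields shown in the example above, and the variable names are chosen for illustration.

```perl
use strict;
use warnings;

# Sample line copied from the example above.
my $debug_line = "HASH_FUNCTION = MURMUR3 HASH_SEED = 0x1476bb9f";

# Capture the hash function name and the hex-encoded seed.
my ($hash_function, $hash_seed) =
    $debug_line =~ /HASH_FUNCTION = (\S+) HASH_SEED = (0x[0-9a-fA-F]+)/
    or die "unrecognized PERL_HASH_SEED_DEBUG output: $debug_line";

print "function: $hash_function\n";
print "seed:     $hash_seed\n";
```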
=head2 Upgrade to Unicode 6.2
Perl now supports Unicode 6.2. A list of changes from Unicode
6.1 is at L<http://www.unicode.org/versions/Unicode6.2.0>.
=head2 Character name aliases may now include non-Latin1-range characters
It is possible to define your own names for characters for use in
C<\N{...}>, C<charnames::vianame()>, etc. These names can now be
composed of characters from the whole Unicode range. This allows for
names to be in your native language, and not just English. Certain
restrictions apply to the characters that may be used (you can't define
a name that has punctuation in it, for example). See L<charnames/CUSTOM ALIASES>.
=head2 New DTrace probes
The following new DTrace probes have been added:
=over 4
=item *
C<op-entry>
=item *
C<loading-file>
=item *
C<loaded-file>
=back
=head2 C<${^LAST_FH}>
This new variable provides access to the filehandle that was last read.
This is the handle used by C<$.> and by C<tell> and C<eof> without
arguments.
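As an illustration, the variable can be used to keep reading from whichever handle was read last. The sketch below uses an in-memory file as a stand-in for a real one; the variable names are made up.

```perl
use strict;
use warnings;

# An in-memory file stands in for a real one in this sketch.
my $text = "first line\nsecond line\n";
open my $in, "<", \$text or die "cannot open in-memory file: $!";

my $first  = <$in>;                     # $in is now the last-read handle
my $second = readline(${^LAST_FH});     # continues reading from that same handle

print "line $. was: $second";
```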
=head2 Regular Expression Set Operations
This is an B<experimental> feature to allow matching against the union,
intersection, etc., of sets of code points, similar to
L<Unicode::Regex::Set>. It can also be used to extend C</x> processing
to [bracketed] character classes, and as a replacement of user-defined
properties, allowing more complex expressions than they do. See
L<perlrecharclass/Extended Bracketed Character Classes>.
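A small sketch of the C<(?[ ... ])> syntax, using set difference on plain ASCII classes. The sub name C<is_all_consonants> is made up for this example, and the warning category is disabled with the C<if> pragma so that the code also loads on perls that do not know it.

```perl
use strict;
use warnings;
no if $] >= 5.018, warnings => "experimental::regex_sets";

# Set difference: lowercase ASCII letters minus the vowels.
my $consonants_only = qr/\A(?[ [a-z] - [aeiou] ])+\z/;

sub is_all_consonants { return $_[0] =~ $consonants_only ? 1 : 0 }

print "rhythm: ", is_all_consonants("rhythm"), "\n";
print "audio:  ", is_all_consonants("audio"),  "\n";
```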
=head2 Lexical subroutines
This new feature is still considered B<experimental>. To enable it:
    use 5.018;
    no warnings "experimental::lexical_subs";
    use feature "lexical_subs";
You can now declare subroutines with C<state sub foo>, C<my sub foo>, and
C<our sub foo>. (C<state sub> requires that the "state" feature be
enabled, unless you write it as C<CORE::state sub foo>.)
C<state sub> creates a subroutine visible within the lexical scope in which
it is declared. The subroutine is shared between calls to the outer sub.
C<my sub> declares a lexical subroutine that is created each time the
enclosing block is entered. C<state sub> is generally slightly faster than
C<my sub>.
C<our sub> declares a lexical alias to the package subroutine of the same
name.
For more information, see L<perlsub/Lexical Subroutines>.
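A minimal sketch of a C<my sub> used as a private helper. The names C<total_with_tax> and C<with_tax>, and the 25% rate, are invented for illustration.

```perl
use 5.018;
use warnings;
no warnings "experimental::lexical_subs";
use feature "lexical_subs";

sub total_with_tax {
    my @prices = @_;

    # Lexical helper: visible only inside this block, not through the package.
    my sub with_tax { return $_[0] * 1.25 }

    my $total = 0;
    $total += with_tax($_) for @prices;
    return $total;
}

say total_with_tax(4, 8);
say __PACKAGE__->can("with_tax") ? "visible in package" : "not in package";
```

Because the helper never enters the symbol table, it cannot clash with a sub of the same name elsewhere in the package.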
=head2 Computed Labels
The loop controls C<next>, C<last> and C<redo>, and the special C<dump>
operator, now allow arbitrary expressions to be used to compute labels at run
time. Previously, any argument that was not a constant was treated as the
empty string.
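For instance, the label to leave can now be held in a variable. A short sketch with made-up names:

```perl
use strict;
use warnings;

my $loop_to_leave = "ROW";    # label name chosen at run time
my @visited;

ROW: for my $row (1 .. 3) {
    for my $col (1 .. 3) {
        # The label is computed from a variable rather than written literally.
        last $loop_to_leave if $row * $col == 4;
        push @visited, "$row,$col";
    }
}

print "@visited\n";
```

The outer loop is left as soon as the product reaches 4, so no cell after C<2,1> is visited.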
=head2 More CORE:: subs
Several more built-in functions have been added as subroutines to the
CORE:: namespace - namely, those non-overridable keywords that can be
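implemented without custom parsers: C<defined>, C<delete>, C<exists>,
C<glob>, C<pos>, C<prototype>, C<scalar>, C<split>, C<study>, and C<undef>.
As some of these have prototypes, C<prototype('CORE::...')> has been
changed to not make a distinction between overridable and non-overridable
keywords. This is to make C<prototype('CORE::pos')> consistent with
C<prototype(&CORE::pos)>.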
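The consistency described above can be checked by asking for the prototype both by name and through a code reference. This sketch only compares the two results and makes no assumption about the exact prototype string:

```perl
use strict;
use warnings;

# Look up the prototype of pos by name and through a reference to the sub.
my $by_name = prototype("CORE::pos");
my $by_ref  = prototype(\&CORE::pos);

if (defined $by_name && defined $by_ref && $by_name eq $by_ref) {
    print "both lookups report the same prototype\n";
}
```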
=head2 C<kill> with negative signal names
C<kill> has always allowed a negative signal number, which kills the
process group instead of a single process. It has also allowed signal
names. But it did not behave consistently, because negative signal names
were treated as 0. Now negative signal names like C<-INT> are supported
and treated the same way as -2 [perl #112990].
=head1 Security
=head2 See also: hash overhaul
Some of the changes in the L<hash overhaul|/"Hash overhaul"> were made to
enhance security. Please read that section.
=head2 C<Storable> security warning in documentation
The documentation for C<Storable> now includes a section which warns readers
of the danger of accepting Storable documents from untrusted sources. The
short version is that deserializing certain types of data can lead to loading
modules and other code execution. This behavior is documented and intentional,
but it opens an attack vector for malicious entities.
=head2 C<Locale::Maketext> allowed code injection via a malicious template
If users could provide a translation string to Locale::Maketext, this could be
used to invoke arbitrary Perl subroutines available in the current process.
This has been fixed, but it is still possible to invoke any method provided by
C<Locale::Maketext> itself or a subclass that you are using. One of these
methods in turn will invoke the Perl core's C<sprintf> subroutine.
In summary, allowing users to provide translation strings without auditing
them is a bad idea.
This vulnerability is documented in CVE-2012-6329.
=head2 Avoid calling memset with a negative count
Poorly written perl code that allows an attacker to specify the count to perl's
C<x> string repeat operator can already cause a memory exhaustion
denial-of-service attack. A flaw in versions of perl before v5.15.5 can escalate
that into a heap buffer overrun; coupled with versions of glibc before 2.16, it
possibly allows the execution of arbitrary code.
The flaw addressed by this fix has been assigned the identifier CVE-2012-5195
and was researched by Tim Brown.
=head1 Incompatible Changes
=head2 See also: hash overhaul
Some of the changes in the L<hash overhaul|/"Hash overhaul"> are not fully
compatible with previous versions of perl. Please read that section.
=head2 An unknown character name in C<\N{...}> is now a syntax error
Previously, it warned, and the Unicode REPLACEMENT CHARACTER was
substituted. Unicode now recommends that this situation be a syntax
error. Also, the previous behavior led to some confusing warnings and
behaviors, and since the REPLACEMENT CHARACTER has no use other than as
a stand-in for some unknown character, any code that has this problem is
buggy.
=head2 Formerly deprecated characters in C<\N{}> character name aliases are now errors
Since v5.12.0, it has been deprecated to use certain characters in
user-defined C<\N{...}> character names. These now cause a syntax
error. For example, it is now an error to begin a name with a digit,
such as in
    my $undraftable = "\N{4F}";    # Syntax error!
or to have commas anywhere in the name. See L<charnames/CUSTOM ALIASES>.
=head2 C<\N{BELL}> now refers to U+1F514 instead of U+0007
Unicode 6.0 reused the name "BELL" for a different code point than it
traditionally had meant. Since Perl v5.14, use of this name still
referred to U+0007, but would raise a deprecation warning. Now, "BELL"
refers to U+1F514, and the name for U+0007 is "ALERT". All the
functions in L<charnames> have been correspondingly updated.
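A quick way to see the new mapping is to print the code point behind each name. A minimal sketch:

```perl
use strict;
use warnings;

my $alert = "\N{ALERT}";    # U+0007, the control character formerly named BELL
my $bell  = "\N{BELL}";     # U+1F514 as of v5.18

printf "ALERT is U+%04X\n", ord $alert;
printf "BELL  is U+%04X\n", ord $bell;
```

Code that used C<\N{BELL}> to ring the terminal bell needs C<\N{ALERT}>, C<\a>, or C<\x07> instead.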
=head2 New Restrictions in Multi-Character Case-Insensitive Matching in Regular Expression Bracketed Character Classes
Unicode has now withdrawn its previous recommendation for regular
expressions to automatically handle cases where a single character can
match multiple characters case-insensitively, for example, the letter
LATIN SMALL LETTER SHARP S and the sequence C<ss>. This is because
it turns out to be impracticable to do this correctly in all
circumstances. Because Perl has tried to do this as best it can, it
will continue to do so. (We are considering an option to turn it off.)
However, a new restriction is being added on such matches when they
occur in [bracketed] character classes. People were specifying
things such as C</[\0-\xff]/i>, and being surprised that it matches the
two character sequence C<ss> (since LATIN SMALL LETTER SHARP S occurs in
this range). This behavior is also inconsistent with using a
property instead of a range: C<\p{Block=Latin1}> also includes LATIN
SMALL LETTER SHARP S, but C</[\p{Block=Latin1}]/i> does not match C<ss>.
The new rule is that for there to be a multi-character case-insensitive
match within a bracketed character class, the character must be
explicitly listed, and not as an end point of a range. This more
closely obeys the Principle of Least Astonishment. See
L<perlrecharclass/Bracketed Character Classes>. Note that a bug [perl
#89774], now fixed as part of this change, prevented the previous
behavior from working fully.
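The new rule can be illustrated by matching C<ss> against a class that lists the sharp s explicitly and against one that merely covers it with a range. The variable names in this sketch are invented:

```perl
use strict;
use warnings;

# Sharp s is listed explicitly, so the multi-character fold applies.
my $listed_explicitly =
    "ss" =~ /\A[aeioust\N{LATIN SMALL LETTER SHARP S}]\z/i ? 1 : 0;

# Sharp s is only covered by a range, so "ss" no longer matches.
my $covered_by_range = "ss" =~ /\A[\0-\x{ff}]\z/ui ? 1 : 0;

print "explicit: $listed_explicitly, range: $covered_by_range\n";
```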
=head2 Explicit rules for variable names and identifiers
Due to an oversight, single character variable names in v5.16 were
completely unrestricted. This opened the door to several kinds of
insanity. As of v5.18, these now follow the rules of other identifiers,
in addition to accepting characters that match the C<\p{POSIX_Punct}>
property.
There is no longer any difference in the parsing of identifiers
specified by using braces versus without braces. For instance, perl
used to allow C<${foo:bar}> (with a single colon) but not C<$foo:bar>.
Now that both are handled by a single code path, they are both treated
the same way: both are forbidden. Note that this change is about the
range of permissible literal identifiers, not other expressions.
=head2 Vertical tabs are now whitespace
No one could recall why C<\s> didn't match C<\cK>, the vertical tab.
Now it does. Given the extreme rarity of that character, very little
breakage is expected. That said, here's what it means:
C<\s> in a regex now matches a vertical tab in all circumstances.
Literal vertical tabs in a regex literal are ignored when the C</x>
modifier is used.
Leading vertical tabs, alone or mixed with other whitespace, are now
ignored when interpreting a string as a number. For example:
    $dec = " \cK \t 123";
    $hex = " \cK \t 0xF";
    say 0 + $dec;    # was 0 with warning, now 123
    say int $dec;    # was 0, now 123
    say oct $hex;    # was 0, now 15
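The two regex-related points can be sketched in the same way. The variable names are made up, and the pattern is built from a string so that it contains a literal vertical tab:

```perl
use strict;
use warnings;

my $vertical_tab = "\cK";

# \s now matches a vertical tab.
my $matches_space = $vertical_tab =~ /\A\s\z/ ? 1 : 0;

# A literal vertical tab inside the pattern is skipped under /x.
my $pattern = "a${vertical_tab}b";
my $ignored_under_x = "ab" =~ /\A$pattern\z/x ? 1 : 0;

print "matches \\s: $matches_space, ignored under /x: $ignored_under_x\n";
```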
=head2 C</(?{})/> and C</(??{})/> have been heavily reworked
The implementation of this feature has been almost completely rewritten.
Although its main intent is to fix bugs, some behaviors, especially
related to the scope of lexical variables, will have changed. This is
described more fully in the L</Selected Bug Fixes> section.
=head2 Stricter parsing of substitution replacement
It is no longer possible to abuse the way the parser parses C<s///e> like
this:
    %_=(_,"Just another ");
    $_="Perl hacker,\n";
    s//_}->{_/e;print
=head2 C<given> now aliases the global C<$_>
Instead of assigning to an implicit lexical C<$_>, C<given> now makes the
global C<$_> an alias for its argument, just like C<foreach>. However, it
still uses lexical C<$_> if there is lexical C<$_> in scope (again, just like
C<foreach>) [perl #114020].
=head2 The smartmatch family of features are now experimental
Smart match, added in v5.10.0 and significantly revised in v5.10.1, has been
a regular point of complaint. Although there are a number of ways in which
it is useful, it has also proven problematic and confusing for both users and
implementors of Perl. There have been a number of proposals on how to best
address the problem. It is clear that smartmatch is almost certainly either
going to change or go away in the future. Relying on its current behavior
is not recommended.
Warnings will now be issued when the parser sees C<~~>, C<given>, or C<when>.
To disable these warnings, you can add this line to the appropriate scope:
    no if $] >= 5.018, warnings => "experimental::smartmatch";
Consider, though, replacing the use of these features, as they may change
behavior again before becoming stable.
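For the common case of testing whether a string appears in a list, a plain C<grep> is one possible replacement. This is only a sketch with made-up data, and it covers string equality only, not the full smartmatch dispatch table:

```perl
use strict;
use warnings;

my @fruits = qw(apple banana cherry);    # hypothetical data
my $wanted = "banana";

# Instead of: $wanted ~~ @fruits
my $is_listed = grep { $_ eq $wanted } @fruits;

print "found $wanted\n" if $is_listed;
```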
=head2 Lexical C<$_> is now experimental
Since it was introduced in Perl v5.10, it has caused much confusion with no
obvious solution:
=over
=item *
Various modules (e.g., List::Util) expect callback routines to use the
global C<$_>. C