What’s the safest way to iterate through the keys of a Perl hash?
The rule of thumb is to use the function most suited to your needs.
If you just want the keys and do not plan to ever read any of the values, use keys():
foreach my $key (keys %hash) { ... }
If you just want the values, use values():
foreach my $val (values %hash) { ... }
If you need the keys and the values, use each():
keys %hash; # reset the internal iterator so a prior each() doesn't affect the loop
while(my($k, $v) = each %hash) { ... }
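As a rough, runnable sketch of the three forms side by side (the sample hash and the variable names @names, $total and %copy are made up for illustration and are not part of the original answer):

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for this illustration.
my %hash = (apple => 3, pear => 5, plum => 7);

# Keys only. Sorted so the output order is stable.
my @names = sort keys %hash;

# Values only.
my $total = 0;
$total += $_ for values %hash;

# Keys and values together.
my %copy;
keys %hash;    # reset the internal iterator before using each()
while (my ($k, $v) = each %hash) {
    $copy{$k} = $v;
}

print "@names\n";    # apple pear plum
print "$total\n";    # 15
```

Hash order is not defined, which is why the keys are sorted before printing.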
If you plan to add or remove keys during the iteration, other than deleting the key most recently returned by each(), then you must not use each(). For example, this code to create a new set of uppercase keys with doubled values works fine using keys(), because the list of keys is built once before the loop starts:
%h = (a => 1, b => 2);
foreach my $k (keys %h)
{
    $h{uc $k} = $h{$k} * 2;
}
producing the expected resulting hash:
(a => 1, A => 2, b => 2, B => 4)
But using each() to do the same thing:
%h = (a => 1, b => 2);
keys %h;
while(my($k, $v) = each %h)
{
    $h{uc $k} = $h{$k} * 2; # BAD IDEA!
}
produces incorrect results in hard-to-predict ways. For example:
(a => 1, A => 2, b => 2, B => 8)
This, however, is safe:
keys %h;
while(my($k, $v) = each %h)
{
    if(...)
    {
        delete $h{$k}; # This is safe
    }
}
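To make the safe case concrete, here is a minimal sketch with an actual condition filled in. The data and the "delete odd values" rule are assumptions picked for the example, not something from the original code:

```perl
use strict;
use warnings;

# Hypothetical data: drop every entry whose value is odd.
my %h = (a => 1, b => 2, c => 3, d => 4);

keys %h;    # reset the internal iterator
while (my ($k, $v) = each %h) {
    # Only the key that each() just returned is deleted, which is the
    # one kind of modification that is allowed during the loop.
    delete $h{$k} if $v % 2;
}

print join(",", sort keys %h), "\n";    # b,d
```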
All of this is described in the Perl documentation:
% perldoc -f keys
% perldoc -f each
There is another caveat with each(). The iterator is bound to the hash, not to the calling context, which means each() is not re-entrant. For example, if you loop over a hash with each() and print the hash inside the loop, Perl will internally reset the iterator, making this code loop endlessly:
my %hash = ( a => 1, b => 2, c => 3, );
while ( my ($k, $v) = each %hash ) {
    print %hash;
}
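One way around this, sketched here as an illustration rather than the only fix, is to loop over the list returned by keys(). That list is built once up front, so reading the whole hash inside the loop cannot restart the loop. The counter $passes and the array @flat are hypothetical names added for the demonstration:

```perl
use strict;
use warnings;

my %hash = (a => 1, b => 2, c => 3);

my $passes = 0;
foreach my $k (sort keys %hash) {
    # Reads the whole hash, as print %hash does in the loop above.
    # Harmless here because foreach already holds its own list of keys.
    my @flat = %hash;
    $passes++;
}

print "$passes\n";    # 3
```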