What’s the difference between iterating over a file with foreach or while in Perl?
I have a file handle FILE in Perl, and I want to iterate over all the lines in the file. Is there a difference between the following?
while (<FILE>) {
    # do something
}
and
foreach (<FILE>) {
    # do something
}
For most purposes, you probably won’t notice a difference. However, foreach evaluates <FILE> in list context, so it reads every line of the file into a list (not an array) before the loop body runs for the first time, whereas while reads one line at a time. Because foreach uses more memory and spends processing time up front, while is generally the recommended way to iterate over the lines of a file.
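One way to see this for yourself is to ask, inside the first pass of each loop, whether the handle has already reached end-of-file. The following is only a small sketch: the helper name at_eof_on_first_pass and the three-line in-memory sample are made up for illustration and are not part of the question.

```perl
use strict;
use warnings;

# Hypothetical three-line sample held in memory, so no real file is needed.
my $data = "one\ntwo\nthree\n";

# Hypothetical helper: returns 1 if the handle is already at end-of-file
# when the loop body runs for the first time, 0 otherwise.
sub at_eof_on_first_pass {
    my ($style) = @_;
    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
    my $at_eof;
    if ($style eq 'while') {
        # Scalar context: only the first line has been read so far.
        while (my $line = <$fh>) {
            $at_eof = eof($fh) ? 1 : 0;
            last;
        }
    }
    else {
        # List context: every line is read before the body runs.
        foreach my $line (<$fh>) {
            $at_eof = eof($fh) ? 1 : 0;
            last;
        }
    }
    close $fh;
    return $at_eof;
}

printf "while:   at EOF on first pass? %s\n",
    at_eof_on_first_pass('while') ? 'yes' : 'no';
printf "foreach: at EOF on first pass? %s\n",
    at_eof_on_first_pass('foreach') ? 'yes' : 'no';
```

With the while loop the handle should still have unread lines on the first pass, while with foreach the whole input has already been consumed.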
EDIT (via Schwern): The foreach loop is equivalent to this:
my @lines = <$fh>;
for my $line (@lines) {
    ...
}
It’s unfortunate that Perl doesn’t optimize this special case as it does with the range operator (1..10).
For example, if I read /usr/share/dict/words with a for loop and with a while loop, and have each program sleep when it’s done, I can use ps to see how much memory each process is consuming. As a control, I’ve included a program that opens the file but does nothing with it.
USER    PID   %CPU %MEM VSZ    RSS   TT   STAT STARTED TIME    COMMAND
schwern 73019 0.0  1.6  625552 33688 s000 S    2:47PM  0:00.24 perl -wle open my $fh, shift; for(<$fh>) { 1 } print "Done"; sleep 999 /usr/share/dict/words
schwern 73018 0.0  0.1  601096 1236  s000 S    2:46PM  0:00.09 perl -wle open my $fh, shift; while(<$fh>) { 1 } print "Done"; sleep 999 /usr/share/dict/words
schwern 73081 0.0  0.1  601096 1168  s000 S    2:55PM  0:00.00 perl -wle open my $fh, shift; print "Done"; sleep 999 /usr/share/dict/words
The for program is consuming almost 33 megs of real memory (the RSS column, reported in kilobytes) to store the contents of my 2.4 meg /usr/share/dict/words. The while loop only stores one line at a time, consuming just 70k more than the control for line buffering.
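As a rough sketch of the line-at-a-time style, the loop below keeps only the current line and two running values in memory, however large the input is. The helper name scan_lines and the sample data are hypothetical; the sample is read from an in-memory handle so the snippet runs on its own, and a real path could be passed to open instead.

```perl
use strict;
use warnings;

# Hypothetical helper: scans a handle line by line and returns
# (line count, longest line). Nothing but the current line and
# the two running values is held in memory.
sub scan_lines {
    my ($fh) = @_;
    my ($count, $longest) = (0, '');
    while (my $line = <$fh>) {
        chomp $line;
        $count++;
        $longest = $line if length($line) > length($longest);
    }
    return ($count, $longest);
}

# Made-up sample input; replace \$sample with a file path to scan a real file.
my $sample = "apple\nbanana\nfig\n";
open my $fh, '<', \$sample or die "Cannot open input: $!";
my ($count, $longest) = scan_lines($fh);
close $fh;

print "$count lines, longest: $longest\n";
```

Writing the loop as while (my $line = <$fh>) rather than relying on $_ also keeps the current line in a lexical variable, which avoids surprises if code called from the loop body changes $_.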