When we are speaking we often use words such as "it" to refer to the thing we are currently talking about. For example, "My computer has just had a PSU failure.
It is on fire." You could say that "it" refers to the current topic, which we assigned in the previous sentence.
Anyone who's read many of my ramblings will know that one thing that interests me is the use of features of natural language in programming languages. What if we could express the idea of the current topic in a programming language, though?
Turns out that is exactly what the $_ variable in Perl is for. It is sometimes also known as the "default variable". If you have read many Perl scripts you will probably have come across things like this:
chomp;
s/\[b\](.*?)\[\/b\]/<b>$1<\/b>/;
print;
The question that people often ask on seeing this is: chomp what? Bind a substitution to what? Print what? A fairly substantial number of Perl built-ins, when invoked with a parameter missing, will use the default variable $_ instead. You could rewrite the above as:
chomp $_;
$_ =~ s/\[b\](.*?)\[\/b\]/<b>$1<\/b>/;
print $_;
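To convince yourself the two forms really are the same, here is a small sketch that wraps each one in a sub and runs both on the same input. The sub names convert_implicit and convert_explicit are made up for this illustration; they are not Perl built-ins.

```perl
use strict;
use warnings;

# Hypothetical helper: relies on the default variable throughout.
sub convert_implicit {
    local $_ = shift;                   # make the argument the current topic
    chomp;                              # same as: chomp $_
    s/\[b\](.*?)\[\/b\]/<b>$1<\/b>/;    # same as: $_ =~ s/.../.../
    return $_;
}

# Hypothetical helper: the same steps with a named variable.
sub convert_explicit {
    my $line = shift;
    chomp $line;
    $line =~ s/\[b\](.*?)\[\/b\]/<b>$1<\/b>/;
    return $line;
}

my $input = "[b]badger[/b]\n";
print convert_implicit($input), "\n";   # <b>badger</b>
print convert_explicit($input), "\n";   # <b>badger</b>
```

The local in the first sub is there so that the caller's own $_ is left as it was when the sub returns.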
The next question is, how do you set the current topic? Since $_ is a variable, you can assign to it:
$_ = "[b]badger[/b]\n";
However, there are some constructs that will assign a value to it automatically. For example:
for ('A'..'Z') {
    print;
}
This will print the alphabet in uppercase. What is going on? Basically, ('A'..'Z') creates a list containing all the letters from A to Z. (The quotes matter: bare A..Z only works when use strict is switched off.) You could have put a number of values separated by commas, an array or a mixture of these between the parentheses. Once this list has been created, the loop iterates through its elements. The first element of the list is "A", the second is "B" and so on. Perl automatically makes $_ refer to the current element on each iteration. Note that you could have used foreach here too.
There is, however, a neater way to write the above. Because the body of the loop is a single simple statement, we can get away with writing it like this:
print for ('A'..'Z');
## OR ##
print foreach ('A'..'Z');
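As a rough sketch of the "mixture" case, the loop below walks over a range, a literal value and an array in one go, and then uses the one-line form on the array alone. The array contents and the variable names @animals, @seen and @shouted are invented for the example.

```perl
use strict;
use warnings;

my @animals = ('badger', 'mushroom');   # made-up sample data
my @seen;

# A range, a single value and an array are flattened into one list.
for ('A'..'C', 'snake', @animals) {
    push @seen, $_;                     # $_ is the current element
}
print join(',', @seen), "\n";           # A,B,C,snake,badger,mushroom

# The one-line form: the statement runs once per element.
my @shouted;
push @shouted, uc($_) for @animals;
print "@shouted\n";                     # BADGER MUSHROOM
```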
The same automatic assignment happens when you read from a file handle with <$fh> (often called the diamond operator, which returns one line from a file at a time) as the sole condition of a while loop. Imagine $fh contains a file handle. We could chomp and print every line in the file like this:
while (<$fh>) {
    chomp;
    print;
}
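If you want to try this without a file to hand, here is a minimal sketch that opens a file handle on a string instead. The text in $text is made up, and the lines are collected into @lines so there is something to inspect afterwards.

```perl
use strict;
use warnings;

# An in-memory "file" keeps the sketch self-contained; the contents are invented.
my $text = "first\nsecond\nthird\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";

my @lines;
while (<$fh>) {          # shorthand for: while (defined($_ = <$fh>))
    chomp;               # strips the trailing newline from $_
    push @lines, $_;
}
close($fh);

print join('|', @lines), "\n";   # first|second|third
```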
If you want to perform the same operation on many variables, you can use a similar trick:
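for ($a, $b, $c, $d) {
    s/'/\\'/g;
}
This escapes all ' characters by putting a backslash in front of them, which is a good idea if you're going to be feeding any of those variables into an SQL query. It works because $_ is an alias for each variable in turn, so the substitution changes the variables themselves. You wouldn't want to open a hole for an SQL injection attack now, would you? (An SQL injection attack exploits programs that place user-supplied data into an SQL query without validating or sanitizing it, by supplying data that changes the meaning of the query.) Bear in mind that quoting rules differ between databases, so bound placeholders or your database driver's own quoting function are the safer choice where they are available.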
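Here is a small sketch of that trick in action, showing that the loop edits the variables themselves rather than copies. The variable names and values are hypothetical, and the backslash style of escaping is only an assumption about what your database expects.

```perl
use strict;
use warnings;

# Made-up values for illustration only.
my ($name, $town, $note) = ("O'Brien", "King's Lynn", "plain");

for ($name, $town, $note) {
    s/'/\\'/g;       # $_ aliases each variable, so the variable itself changes
}

print "$name\n";     # O\'Brien
print "$town\n";     # King\'s Lynn
print "$note\n";     # plain
```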
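$_ is actually a global variable, but for and foreach localize it for the duration of the loop, much as if it had been declared with local. Inside the loop it refers to the current element; once the loop finishes, $_ goes back to whatever it held before (the enclosing loop's topic, if there is one). Be aware that while (<$fh>) does not do this: it assigns to the global $_ directly. For example: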
for (2..20) {
    for (1..10) {
        print;
    }
    print;
}
This will print the numbers from 1 to 10 followed by a 2, then 1 to 10 again followed by a 3, and so on, all run together because print adds no separators of its own.
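A cut-down sketch of the same idea, with smaller ranges and the output gathered into a string so it is easier to read. It also checks what $_ holds after the for loops have finished, and after a while loop that reads from a file handle, since the two behave differently. The names $out, $after_for and $after_while are just for this illustration.

```perl
use strict;
use warnings;

$_ = "outer";                  # set the global topic by hand

my $out = '';
for (2..3) {
    for (1..3) {
        $out .= $_;            # inner loop's element
    }
    $out .= "[$_]";            # outer loop's element is visible again
}
my $after_for = $_;            # for restored the previous value

print "$out\n";                # 123[2]123[3]
print "$after_for\n";          # outer

# while (<$fh>) assigns to the global $_ without saving it first.
my $text = "a\nb\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
while (<$fh>) { }              # read to the end, doing nothing with each line
close($fh);

my $after_while = defined $_ ? $_ : 'undef';
print "$after_while\n";        # undef
```

So a for loop tidies up after itself, while a while loop over a file handle leaves $_ changed for whoever comes next.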
Using $_ and its related tricks is a good way to shorten and neaten up your code, and saves you a little typing. However, remember to take readability into account. If it is a script that you will be doing a lot of work with in the future, consider whether using $_ now will save you time in the long run. You may be better off assigning to a "named" variable that you know won't be accidentally overwritten by a nested loop or a called function during later alterations. As always, there's more than one way to do it. Choose the appropriate one.
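For comparison, here is a sketch of nested loops written with named loop variables instead of $_. The names $row, $col and @totals are invented for the example; the point is only that each loop's value stays reachable by its own name however deeply the loops are nested.

```perl
use strict;
use warnings;

my @totals;
for my $row (2..3) {
    for my $col (1..3) {
        # Both loop values are available at once, with no ambiguity.
        push @totals, $row * $col;
    }
}

print "@totals\n";   # 2 4 6 3 6 9
```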