Perl one-liners
Perl provides several command-line options that can be combined (see Perl as a command line tool
for details):
- -e (execute) instructs the Perl interpreter that the next
argument is a Perl statement to be compiled and run. If -e is given, Perl will not look for a script filename in the
argument list; it takes the argument that follows -e as the text of the script.
- -n causes Perl to assume the following loop around your script, which makes it iterate over filename arguments
somewhat like sed -n or awk:
while (<>) { ... } # your script goes here
You need to supply your own print statement(s) to have output printed; nothing is printed by default. See -p to have
lines printed. If a file named by an argument cannot be opened for some reason, Perl warns you about it and moves on to the
next file.
- -p is essentially -n with a print statement added (it presupposes -n). This option
causes Perl to assume the following loop around your script, which makes it iterate over filename arguments somewhat like sed:
while (<>) {
... # your script
} continue {
print or die "-p destination: $!\n";
}
- -a turns on autosplit mode using the default pattern (whitespace, like
split ' '). -a implicitly sets -n. It performs a split into the @F array as the
first thing inside the while loop produced by the option -n
or -p. For example:
perl -ane 'print pop(@F), "\n";'
is equivalent to
while (<>) { @F = split(' '); print pop(@F), "\n"; }
It is important to know that an alternate delimiter (which can be any
regular expression, unlike Unix cut) for split may be specified
using option -F.
- -F pattern specifies the pattern to split on for -a. -F implicitly sets both -a and -n. The pattern may be surrounded
by //, "", or ''; otherwise it will be put in single quotes. You can't use literal whitespace in the pattern.
- -i[extension] specifies that files processed by the <> construct are to be
edited in-place. Perl does this by renaming the input file, opening the output
file by the original name, and selecting that output file as the default for
print() statements. The extension, if supplied, is used to modify the name of
the old file to make a backup copy, following these rules:
- If no extension is supplied, and your system supports it, the original file is kept open without a name while the output is redirected to a new
file with the original filename. When Perl exits, cleanly or not, the
original file is unlinked.
- If the extension doesn't contain a *, then it is appended to
the end of the current filename as a suffix. If the extension does contain one
or more * characters, then each * is replaced with
the current filename. In Perl terms, you could think of this as:
($backup = $extension) =~ s/\*/$file_name/g;
- If a file named by an argument cannot be opened for some reason, Perl
warns you about it and moves on to the next file. Note that the lines are
printed automatically. An error occurring during printing is treated as
fatal.
Some one-liners are so useful that they essentially convert Perl into a versatile command-line utility (see
Perl as a command line tool for additional information), for example a replacement for the cut
command that is more powerful and flexible than cut itself. Often they can be put in your profile as aliases or
functions.
Many one-liners are interesting as an art of creating powerful and elegant
Perl regular expressions that solve an important task while being very compact.
And creating useful one-liners is an art, in the same sense that
Donald Knuth considers programming
to be an art (see the introduction to
TAOCP and his Turing lecture).
Some of them demonstrate great inventiveness and take language constructs to
their limits.
Here is a dozen one-liners that I have found especially useful:
- Replace pattern1 with another pattern (pattern2) globally inside the file and create a backup:
perl -i.bak -pe 's/pattern1/pattern2/g' inputFile
- dos2unix emulation using the tr function:
perl -i.bak -pe 'tr/\r//d' filename
This one-liner strips the carriage returns out of a file, turning a DOS
file (which ends lines in both carriage returns and linefeeds) into a Unix file
(which uses only linefeeds). It is basically the equivalent of the Unix
command:
tr -d "\015"
This functionality is useful on SLES 10 and SLES 11, which do not include
the dos2unix utility by default. See more information at
Conversion of files from Windows to Unix format and
unix2dos (conversion from Unix to Windows format). You can convert a
Unix file back to Windows format too:
perl -i.bak -pe 's/\n/\r\n/' filename
- Delete a particular line or set of lines (grep-style) in the file
(a poor man's editor):
perl -i.bak -ne 'next if ($_ =~ /pattern_for_deletion/); print;' filename
- Delete several lines from the beginning of a file. For
example, the one-liner below deletes the first 10 lines:
perl -i.bak -ne 'print unless 1 .. 10' foo.txt
- Delete the last line of a file:
perl -i.bak -e '@x=<>; pop(@x); print @x'
- Delete the lines between a pattern pattern1 and pattern2 (inclusive), keeping the rest:
perl -i.bak -ne 'print unless /pattern1/ .. /pattern2/' filename
for example
perl -i.bak -ne 'print unless /^START$/ .. /^END$/' filename
- Print file with line numbers
perl -ne 'print "$.\t$_";' filename
- Print balance of quotes in each line (useful for finding missing
quotes in Perl and other scripts)
perl -ne '$q=($_=~tr/"//); print"$.\t$q\t$_";' filename
- The same thing but with open and close curly brackets (might be helpful
for finding missing braces):
perl -ne '$o+=($_=~tr/{//);$c+=($_=~tr/}//); $b=$o-$c; print"$.\t$b\t$_";' < myjoin3_lines.pl
- Emulation of the Unix cut utility in Perl. Option -a
(autosplit mode) converts each line into the array @F. The default pattern
for splitting is whitespace (spaces or tabs), and unlike cut it accommodates
consecutive spaces and tabs mixed with spaces. Here's a simple one-line
script that will print out the first word of every line, but also skip
any line beginning with a # because it's a comment line.
perl -nae 'next if /^#/; print "$F[0]\n"'
Here is an example of how to print the first and the last fields:
perl -lane 'print "$F[0]:$F[-1]"'
See below (More on emulation of the Unix cut utility in Perl) for more information. See also the cut command.
- To capitalize the first letter on the line and convert the
other letters to lowercase. Here are two variants, one simple
and one idiomatic:
perl -pe 's/(\w)(.*)$/\U$1\L$2/'
perl -pe 's/\w.+/\u\L$&/'
The second one belongs to
Matz Kindahl and simultaneously belongs to the class of Perl idioms.
It is more difficult to understand, so the first version is preferable. Beware of
Perl authors who prefer the second variant to the first ;-)
- In-place editing one-liners:
# in-place edit of *.c files changing all [instances of the word] foo to bar
perl -p -i.bak -e 's/\bfoo\b/bar/g' *.c
# change all the isolated oldvar occurrences to newvar
perl -i.bak -pe 's{\boldvar\b}{newvar}g' *.[chy]
Tom Christiansen once posted a list of one-line Perl programs to do many common command-line
tasks.
# simplest one-liner program:
perl -e 'print "hello world!\n"'
# add first and penultimate columns
perl -lane 'print $F[0] + $F[-2]'
# print just lines 15 to 17
perl -ne 'print if 15 .. 17' *.pod
# in-place edit of *.c files changing all foo to bar
perl -p -i.bak -e 's/\bfoo\b/bar/g' *.c
# command-line that prints the first 50 lines (cheaply)
perl -pe 'exit if $. > 50' f1 f2 f3 ...
# delete first 10 lines
perl -i.old -ne 'print unless 1 .. 10' foo.txt
# change all the isolated oldvar occurrences to newvar
perl -i.old -pe 's{\boldvar\b}{newvar}g' *.[chy]
# command-line that reverses the whole file by lines
perl -e 'print reverse <>' file1 file2 file3 ....
# find palindromes
perl -lne 'print if $_ eq reverse' /usr/dict/words
# command-line that reverse all the bytes in a file
perl -0777e 'print scalar reverse <>' f1 f2 f3 ...
# command-line that reverses the whole file by paragraphs
perl -00 -e 'print reverse <>' file1 file2 file3 ....
# increment all numbers found in these files
perl -i.tiny -pe 's/(\d+)/ 1 + $1 /ge' file1 file2 ....
# command-line that shows each line with its characters backwards
perl -nle 'print scalar reverse $_' file1 file2 file3 ....
# delete all but lines between START and END
perl -i.old -ne 'print unless /^START$/ .. /^END$/' foo.txt
# binary edit (careful!)
perl -i.bak -pe 's/Mozilla/Slopoke/g' /usr/local/bin/netscape
# look for dup words
perl -0777 -ne 'print "$.: doubled $1\n" while /\b(\w+)\b\s+\b\1\b/gi'
# command-line that prints the last 50 lines (expensively)
perl -e '@lines = <>; print @lines[ $#lines - 50 .. $#lines ]' f1 f2 f3 ...
In 2003 Teodor Zlatanov, who authored a series of interesting articles about Perl on the IBM developerWorks
site, published an article,
Cultured Perl: One-liners 102. Among other interesting one-liners he provided a
collection of useful examples of using ranges
in Perl one-liners, as well as in-place editing:
Listing 3: Printing a range of lines
# 1. just lines 15 to 17
perl -ne 'print if 15 .. 17'
# 2. just lines NOT between line 10 and 20
perl -ne 'print unless 10 .. 20'
# 3. lines between START and END
perl -ne 'print if /^START$/ .. /^END$/'
# 4. lines NOT between START and END
perl -ne 'print unless /^START$/ .. /^END$/'
A problem with the first one-liner in Listing 3 is that it will go through the whole file, even if the necessary range has already been covered. The third one-liner does not have that problem, because it will print all the lines between the START and
END markers. If there are eight sets of START/END markers, the third one-liner
will print the lines inside all eight sets.
Preventing the inefficiency of the first one-liner is easy: just use the $. variable,
which tells you the current line number. Start printing if $. is over 15, and exit if
$. is greater than 17.
... ... ...
Listing 5: In-place editing
# 1. in-place edit of *.c files changing all foo to bar
perl -p -i.bak -e 's/\bfoo\b/bar/g' *.c
# 2. delete first 10 lines
perl -i.old -ne 'print unless 1 .. 10' foo.txt
# 3. change all the isolated oldvar occurrences to newvar
perl -i.old -pe 's{\boldvar\b}{newvar}g' *.[chy]
# 4. increment all numbers found in these files
perl -i.tiny -pe 's/(\d+)/ 1 + $1 /ge' file1 file2 ....
# 5. delete all but lines between START and END
perl -i.old -ne 'print unless /^START$/ .. /^END$/' foo.txt
# 6. binary edit (careful!)
perl -i.bak -pe 's/Mozilla/Slopoke/g' /usr/local/bin/netscape
Matz Kindahl - Collection of Perl programs
These are one-liners that might be of use. Some of them are from the net, and some are ones that I have had to write for
some simple task. Where Perl 5 is required, perl5
is used.
perl -ne '$n += $_; print $n if eof'
perl5 -ne '$n += $_; END { print "$n\n" }'
- To sum numbers on a stream, where each number appears on a line by itself. That kind of output is what you get
from cut(1), if you cut out a numerical field from an output.
There is also a C program called sigma that does this faster.
- perl5 -pe 's/(\w)(.*)$/\U$1\L$2/'
perl5 -pe 's/\w.+/\u\L$&/'
- To capitalize the first letter on the line and convert the other letters to lowercase. The last one
is much nicer, and also faster.
-
perl -e 'dbmopen(%H,".vacation",0666); printf("%-50s: %s\n", $K, scalar(localtime(unpack("L",$V)))) while (($K,$V) = each(%H));'
- Well, it is a one-liner. :)
You can use it to examine who wrote you a letter while you were on vacation. It examines the file that
vacation(1) produces.
-
perl5 -p000e 'tr/ \t\n\r/ /;s/(.{50,72})\s/$1\n/g;$_.="\n"x2'
- This piece will read paragraphs from the standard input and reformat them in such a manner that every line is
between 50 and 72 characters wide. It will only break a line at a whitespace and not in the middle of a word.
perl5 -pe 's#\w+#ucfirst lc reverse $&#eg'
- This piece will read lines from the standard input and transform them into the Zafir language used by Zafir's
troops, i.e. "Long Live Zafir!" becomes "Gnol Evil Rifaz!" (for some reason they always talk using capital letters).
Andrew Johnson and I posted slightly different versions, and we both split the string unnecessarily. This one avoids
splitting the string.
-
perl -pe '$_ = " $_ "; tr/ \t/ /s; $_ = substr($_,1,-1)'
- This piece will remove spaces at the beginning and end of a line and squeeze all other sequences of spaces into
one single space.
This was one of the "challenges" from
comp.lang.perl.misc
that occurs frequently; I am just unable to resist those. :)
The following samples are deliberately pretty exotic ones :-)
- Printing all capitalized words
perl -ne 'push@w,/(\b[A-Z]\S*?\b)/g;END{print"@w"}' file
- Separate the header and body of a mail message into strings:
while (<>) {
    $in_header = 1 .. /^$/;
    $in_body   = /^$/ .. eof();
}
- Count the lines of pod and code in a Perl program:
@a=(0,0); while (<>) { ++$a[ not m/^=\w+/s .. m/^=cut/s ] } printf "%d pod lines, %d code lines\n", @a;
- Transpose a two-dimensional array:
@matrix_t = map{my$x=$_;[map {$matrix[$_][$x]} 0..$#matrix]}0..$#{$matrix[0]};
- 20170320 : Cultured Perl One-liners 102 ( www.ibm.com, Mar 20, 2017 )
- 20170320 : The Birth of a One-Liner ( The Perl Journal, Fall 1998 )
- 20120118 : I am looking forward to learn perl ( LinkedIn )
- 20120118 : Famous Perl One-Liners Explained, Part VII: Handy Regular Expressions ( www.catonmat.net )
- 20060521 : Matz Kindahl - Collection of Perl programs ( May 21, 2006 )
- 20041115 : Perl one-liners for the bioinformatician ( Nov 15, 2004 )
- 20041008 : Re: perl one-liner for grep ( Oct 8, 2004 )
- 20040917 : chmod all files in a directory ( Sep 17, 2004 )
- 20040316 : Comma-delimit your output ( Mar 16, 2004 )
One-liners 102
More one-line Perl scripts
Teodor Zlatanov
Published on March 12, 2003
This article, as regular readers may have guessed, is the sequel to
"One-liners 101," which appeared in a previous installment of "Cultured Perl".
The earlier article is an absolute requirement for understanding the
material here, so please take a look at it before you continue.
The goal of this article, as with its predecessor, is to show legible
and reusable code, not necessarily the shortest or most efficient version
of a program. With that in mind, let's get to the code!
Tom Christiansen's list
Tom Christiansen posted a list of one-liners on Usenet years ago, and
that list is still interesting and useful for any Perl programmer. We
will look at the more complex one-liners from the list; the full list is
available in the file tomc.txt (see Related topics
to download this file). The list overlaps slightly
with the "One-liners 101" article, and I will try to point out those intersections.
Awk is commonly used for basic tasks such as breaking up text into
fields; Perl excels at text manipulation by design. Thus, we come to our
first one-liner, intended to add two columns in the text input to the
script.
Listing 1. Like awk?
# add first and penultimate columns
# NOTE the equivalent awk script:
# awk '{i = NF - 1; print $1 + $i}'
perl -lane 'print $F[0] + $F[-2]'
So what does it do? The magic is in the switches. The -n and -a
switches make the script a wrapper around input that
splits the input on whitespace into the @F array; the -e
switch adds an extra statement into the wrapper. The code of
interest actually produced is:
Listing 2: The full Monty
while (<>)
{
    @F = split(' ');
    print $F[0] + $F[-2]; # offset -2 means "2nd to last element of the array"
}
Another common task is to print the contents of a file between two
markers or between two line numbers.
Listing 3: Printing a range of lines
# 1. just lines 15 to 17
perl -ne 'print if 15 .. 17'
# 2. just lines NOT between line 10 and 20
perl -ne 'print unless 10 .. 20'
# 3. lines between START and END
perl -ne 'print if /^START$/ .. /^END$/'
# 4. lines NOT between START and END
perl -ne 'print unless /^START$/ .. /^END$/'
A problem with the first one-liner in Listing 3 is that it will go
through the whole file, even if the necessary range has already
been covered. The third one-liner does not have that problem,
because it will print all the lines between the START and END
markers. If there are eight sets of START/END markers, the third
one-liner will print the lines inside all eight sets.
Preventing the inefficiency of the first one-liner is easy: just use
the $. variable, which tells you the current line. Start
printing if $. is over 15 and exit if $. is greater than 17.
Listing 4: Printing a numeric range of lines more efficiently
# just lines 15 to 17, efficiently
perl -ne 'print if $. >= 15; exit if $. >= 17;'
Enough printing, let's do some editing. Needless to say, if you are
experimenting with one-liners, especially ones intended to
modify data, you should keep backups. You wouldn't be the first
programmer to think a minor modification couldn't possibly make a
difference to a one-liner program; just don't make that assumption while
editing the Sendmail configuration or your mailbox.
Listing 5: In-place editing
# 1. in-place edit of *.c files changing all foo to bar
perl -p -i.bak -e 's/\bfoo\b/bar/g' *.c
# 2. delete first 10 lines
perl -i.old -ne 'print unless 1 .. 10' foo.txt
# 3. change all the isolated oldvar occurrences to newvar
perl -i.old -pe 's{\boldvar\b}{newvar}g' *.[chy]
# 4. increment all numbers found in these files
perl -i.tiny -pe 's/(\d+)/ 1 + $1 /ge' file1 file2 ....
# 5. delete all but lines between START and END
perl -i.old -ne 'print unless /^START$/ .. /^END$/' foo.txt
# 6. binary edit (careful!)
perl -i.bak -pe 's/Mozilla/Slopoke/g' /usr/local/bin/netscape
Why does 1 .. 10 specify line numbers 1 through 10? Read
the "perldoc perlop" manual page. Basically, the .. operator
iterates through a range. Thus, the script does not count 10 lines,
it counts 10 iterations of the loop generated by the -n
switch (see "perldoc perlrun" and Listing 2 for an example of that loop).
The magic of the -i switch is that it replaces each file
in @ARGV with the version produced by the script's output on
that file. Thus, the -i switch makes Perl into an editing
text filter. Do not forget to use the backup option to the
-i switch. Following the i with an extension will
make a backup of the edited file using that extension.
Note how the -p and -n switches are used. The
-n switch is used when you want explicitly to print out
data. The -p switch implicitly inserts a print $_
statement in the loop produced by the -n switch. Thus, the
-p switch is better for full processing of a file,
while the -n switch is better for selective file
processing, where only specific data needs to be printed.
Examples of in-place editing can also be found in the "One-liners 101" article.
Reversing the contents of a file is not a common task, but the
following one-liners show that the -n and -p
switches are not always the best choice when processing an entire file.
Listing 6: Reversal of files' fortunes
# 1. command-line that reverses the whole input by lines
#    (printing each line in reverse order)
perl -e 'print reverse <>' file1 file2 file3 ....
# 2. command-line that shows each line with its characters backwards
perl -nle 'print scalar reverse $_' file1 file2 file3 ....
# 3. find palindromes in the /usr/dict/words dictionary file
perl -lne '$_ = lc $_; print if $_ eq reverse' /usr/dict/words
# 4. command-line that reverses all the bytes in a file
perl -0777e 'print scalar reverse <>' f1 f2 f3 ...
# 5. command-line that reverses each paragraph in the file but prints
#    them in order
perl -00 -e 'print reverse <>' file1 file2 file3 ....
The -0 (zero) flag is very useful if you want to read a
full paragraph or a full file into a single string. (It also works with
any character number, so you can use a special character as a marker.) Be
careful when reading a full file in one command (-0777),
because a large file will use up all your memory. If you need to read the
contents of a file backwards (for instance, to analyze a log in reverse
order), use the CPAN module File::ReadBackwards. Also see
"One-liners 101," which shows an example of log analysis with
File::ReadBackwards.
Note the similarity between the first and second scripts in Listing 6.
The first one, however, is completely different from the second one. The
difference lies in using <> in scalar context (as -n does in
the second script) or list context (as the first script does).
The third script, the palindrome detector, did not originally have the
$_ = lc $_; segment. I added that to catch those palindromes
like "Bob" that are not the same backwards.
My addition can be written as $_ = lc; as well, but
explicitly stating the subject of the lc() function makes
the one-liner more legible, in my opinion.
Paul Joslin's list
Paul Joslin was kind enough to send me some of his one-liners for this
article.
Listing 7: Rewrite with a random number
# replace string XYZ with a random number less than 611 in these files
perl -i.bak -pe "s/XYZ/int rand(611)/e" f1 f2 f3
This is a filter that replaces XYZ with a random number
less than 611 (that number is arbitrarily chosen). Remember the
rand() function returns a random number between 0 and its
argument.
Note that XYZ will be replaced by a different
random number every time, because the substitution evaluates "int
rand(611)" every time.
Listing 8: Revealing the files' base nature
# 1. Run basename on contents of file
perl -pe "s@.*/@@gio" INDEX
# 2. Run dirname on contents of file
perl -pe 's@^(.*/)[^/]+@$1\n@' INDEX
# 3. Run basename on contents of file
perl -MFile::Basename -ne 'print basename $_' INDEX
# 4. Run dirname on contents of file
perl -MFile::Basename -ne 'print dirname $_' INDEX
One-liners 1 and 2 came from Paul, while 3 and 4 were my rewrites of
them with the File::Basename module. Their purpose is simple, but any
system administrator will find these one-liners useful.
Listing 9: Moving or renaming, it's all the same in UNIX
# 1. write command to mv dirs XYZ_asd to Asd
# (you may have to preface each '!' with a '\' depending on your shell)
ls | perl -pe 's!([^_]+)_(.)(.*)!mv $1_$2$3 \u$2\E$3!gio'
# 2. Write a shell script to move input from xyz to Xyz
ls | perl -ne 'chop; printf "mv $_ %s\n", ucfirst $_;'
For regular users or system administrators, renaming files based on a
pattern is a very common task. The scripts above will do two kinds of
job: either remove the file name portion up to the _
character, or change each filename so that its first letter is uppercased
according to the Perl ucfirst() function.
There is a UNIX utility called "mmv" by Vladimir Lanin that may also
be of interest. It allows you to rename files based on simple patterns,
and it's surprisingly powerful. See the Related topics
section for a link to this utility.
Some of mine
The following is not a one-liner, but it's a pretty useful script that
started as a one-liner. It is similar to Listing 7 in that it replaces a
fixed string, but the trick is that the replacement itself for the fixed
string becomes the fixed string the next time.
The idea came from a newsgroup posting a long time ago, but I haven't
been able to find the original version. The script is useful in case you need
to replace one IP address with another in all your system files -- for
instance, if your default router has changed. The script includes
$0 (in UNIX, usually the name of the script) in the list of files
to rewrite.
As a one-liner it ultimately proved too complex, and the messages
regarding what is about to be executed are necessary when system files
are going to be modified.
Listing 10: Replace one IP address with another one
#!/usr/bin/perl -w
use Regexp::Common qw/net/;   # provides the regular expressions for IP matching

my $replacement = shift @ARGV;   # get the new IP address
die "You must provide $0 with a replacement string for the IP 111.111.111.111"
    unless $replacement;

# we require that $replacement be JUST a valid IP address
die "Invalid IP address provided: [$replacement]"
    unless $replacement =~ m/^$RE{net}{IPv4}$/;

# replace the string in each file
foreach my $file ($0, qw[/etc/hosts /etc/defaultrouter /etc/ethers], @ARGV)
{
    # note that we know $replacement is a valid IP address, so this is
    # not a dangerous invocation
    my $command = "perl -p -i.bak -e 's/111.111.111.111/$replacement/g' $file";
    print "Executing [$command]\n";
    system($command);
}
Note the use of the Regexp::Common module, an indispensable resource
for any Perl programmer today. Without Regexp::Common, you will be
wasting a lot of time trying to match a number or other common patterns
manually, and you're likely to get it wrong.
Conclusion
Thanks to Paul Joslin for sending me his list of one-liners. And in
the spirit of conciseness that one-liners inspire, I'll refer you to "
One-liners
101
" for some closing thoughts on one-line Perl scripts.
Articles by Teodor Zlatanov
- Git gets demystified and Subversion control (Aug 27, 2009)
- Build simple photo-sharing with Amazon cloud and Perl (Apr 06, 2009)
- developerWorks: Use IMAP with Perl, Part 2 (May 26, 2005)
- developerWorks: Complex Layered Configurations with AppConfig (Apr 11, 2005)
- developerWorks: Perl 6 Grammars and Regular Expressions (Nov 09, 2004)
- developerWorks: Genetic Algorithms Simulate a Multi-Celled Organism (Oct 28, 2004)
- developerWorks: Cultured Perl: Managing Linux Configuration Files (Jun 15, 2004)
- developerWorks: Cultured Perl: Fun with MP3 and Perl, Part 2 (Feb 09, 2004)
- developerWorks: Cultured Perl: Fun with MP3 and Perl, Part 1 (Dec 16, 2003)
- developerWorks: Inversion Lists with Perl (Oct 27, 2003)
- developerWorks: Cultured Perl: One-Liners 102 (Mar 21, 2003)
- developerWorks: Developing cfperl, From the Beginning (Jan 22, 2003)
- IBM developerWorks: Using the xinetd program for system administration (Nov 28, 2001)
- IBM developerWorks: Reading and writing Excel files with Perl (Sep 30, 2001)
- IBM developerWorks: Automating UNIX system administration with Perl (Jul 22, 2001)
- IBM developerWorks: A programmer's Linux-oriented setup - Optimizing your machine for your needs (Mar 25, 2001)
- IBM developerWorks: Cultured Perl: Debugging Perl with ease (Nov 23, 2000)
- IBM developerWorks: Cultured Perl: Review of Programming Perl, Third Edition (Sep 17, 2000)
- IBM developerWorks: Cultured Perl: Writing Perl programs that speak English Using Parse::RecDescent (Aug 05, 2000)
- IBM developerWorks: Perl: Small observations about the big picture (Jul 02, 2000)
- IBM developerWorks: Parsing with Perl modules (Apr 30, 2000)
Learning Perl has been my most enjoyable experience working as a systems administrator. Perl
isn't the easiest language to master; most programmers have to rewire their brains somewhat to "think
Perl." Once programmers have learned the basics, they have a powerful and rich tool for attacking
most any task. But you know this already.
In the old days, if you wrote a program to perform data manipulation on some file, there were
standard operations that had to be implemented to access the file's data. Your program would have
to open the file, read each record and process it, decide what to do with the newly manipulated
data, and close the file. Perl doesn't let you avoid any of these steps, but by employing some of
Perl's unique features, you can express your programs much more concisely, and they'll be faster,
too.
In this article, we'll take a simple task and show how familiarity with Perl idioms can reduce
the size and complexity of the solution. Our task is to display the lines of a file that are neither
comments nor blank, and here's our first attempt:
#!/usr/bin/perl -w
# Obtain filename from the first argument.
$file = $ARGV[0];
# Open the file - if it can't be opened, terminate program
# and print an error message.
open INFILE, $file or die("\nCannot open file $!\n");
# For each record in the file, read it in and process it.
while (defined($line = <INFILE>)) {
# Grab the first one and two characters of each line.
$firstchar = substr($line,0,1);
$firsttwo = substr($line,0,2);
# If the line does NOT begin with a #! (we want to see
# any bang operators) but the first character does begin
# with a # (we don't want to see any # comments), skip it.
if ($firsttwo ne "#!" && $firstchar eq "#") { next; }
# Or, if the line consists of only a newline (i.e. it's
# a blank line), skip it.
elsif ($firstchar eq "\n") { next; }
# Otherwise display the line to standard output (i.e.
# your terminal).
else { print $line; }
# Proceed to next record.
}
# When finished processing records, be a good programmer
# and close the input file.
close INFILE;
This script works just fine, but it's pretty large: you have to look at a lot of lines to figure
out what it does. Let's streamline this code step by step until we're left with the bare essentials.
First, while (<>) opens the files provided on the command line and reads input lines
without you having to explicitly assign them to a variable. Let's remove the comments and change
the Perl script to use this feature.
#!/usr/bin/perl -w
while (<>) {
$firstchar = substr($_,0,1);
$firsttwo = substr($_,0,2);
if ($firsttwo ne "#!" && $firstchar eq "#") {
next;
} elsif ($firstchar eq "\n") {
next;
} else {
print $_;
}
}
As each line is read, it is stored in the scalar $_. We changed our calls to substr
(which extracts or replaces part of a string) and the print statement
to use this internal variable.
We can even make the while loop implicit as well. The -n switch wraps your program inside a loop:
LINE: while (<>) { your code }.
So we can shorten our little program even more:
#!/usr/bin/perl -wn
$firstchar = substr($_,0,1);
$firsttwo = substr($_,0,2);
if ($firsttwo ne "#!" && $firstchar eq "#") { next; }
elsif ($firstchar eq "\n") { next; }
else { print $_; }
In Perl, there's more than one way to do nearly anything, even good old conditionals. We can use
an alternate form, and the fact that our loop is now named LINE, to rewrite our program with
even less punctuation:
#!/usr/bin/perl -wn
$firstchar = substr($_,0,1);
$firsttwo = substr($_,0,2);
next LINE if $firsttwo ne "#!" && $firstchar eq "#";
next LINE if $firstchar eq "\n";
print $_;
The 'next LINE' commands aren't executed unless their if statements are true.
The intermediate variables $firstchar and $firsttwo make sense if they're going
to be used repeatedly, but for our program they aren't. They require unnecessary amounts of time
and memory. So let's eliminate them by using the substr function on the left side of the
comparisons:
#!/usr/bin/perl -wn
next LINE if substr($_,0,2) ne "#!" && substr($_,0,1) eq "#";
next LINE if substr($_,0,1) eq "\n";
print $_;
Our Perl program is now down to three lines of code (not counting the #! line). By
combining the two ifs into one compound if, I can reduce the program to two lines:
#!/usr/bin/perl -wn
next LINE if (substr($_,0,2) ne "#!"
&& substr($_,0,1) eq "#") || substr($_,0,1) eq "\n";
print $_;
That 'next LINE' statement won't fit in one column, but as usual There's Always More
Than One Way To Shorten It. Using the match (m//) operator, you can construct regular expressions,
which determine whether a string matches a pattern. Some simple regular expressions relevant to
our task:
m/^#!/    Does the string begin (^) with '#!'?
m/^#/     Does the string begin (^) with '#'?
m/^\n$/   Does the string begin (^) with a newline (\n) and
          end ($) with it too?
The =~ and !~ operators are used to test whether the pattern on the right applies
to the string on the left. $string =~ /^#/ is true if $string begins with a #,
and $string !~ /^#/ is true if it doesn't. The program can now be shortened even further:
#!/usr/bin/perl -wn
next LINE if ($_ !~ m/^#!/ && $_ =~ m/^#/) || $_ =~ m/^\n$/;
print $_;
What if there are blank lines with whitespace preceding the newline? Then m/^\n$/ won't
match, and the line will be displayed, which isn't what we want to happen. Inside a pattern, Perl
can test for a whitespace character with \s, which matches not only spaces but tabs and
carriage returns as well. Inside a pattern, you can specify how much you want of something with
a quantifier. The quantifiers are:
* 0 or more times
+ 1 or more times
? 0 or 1 time
{x,y} at least x but not more than y times
Since we might have any amount of extraneous whitespace, even none, * fits the bill.
\s* means zero or more whitespace characters. Added into the matching operator, our program
now reads:
#!/usr/bin/perl -wn
next LINE if ($_ !~ m/^#!/ && $_ =~ m/^#/) || $_ =~ m/^\s*\n$/;
print $_;
Perl often uses $_ as a default variable for its operators. It does this both with pattern
matches and print:
#!/usr/bin/perl -wn
next LINE if (!m/^#!/ && m/^#/) || m/^\s*\n$/;
print;
If we're applying a pattern match to $_, we can leave off the m in m//
matches:
#!/usr/bin/perl -wn
next LINE if (!/^#!/ && /^#/) || /^\s*\n$/;
print;
We can combine these two lines into one by using unless:
#!/usr/bin/perl -wn
print unless !/^#!/ && /^#/ || /^\s*\n$/;
Finally, we can execute this program directly from the command line, with the -e flag.
We can even trim the semicolon because it's the last statement of a block.
% perl -wne 'print unless !/^#!/ && /^#/ || /^\s*\n$/'
The result is a script that is starting to look like the others here in TPJ. Once you get used
to these idioms, you'll spill out streamlined code like this without thinking. There are probably
some Perl hackers out there who will come up with further optimizations to this code.
Have fun!
__END__
Q: Hi, I'm looking forward to learning Perl. I'm a systems administrator (Unix). I'm interested
in an online course; any recommendations would be highly appreciated. Syed
A: I used to teach sysadmins Perl in a corporate environment, and I can tell you that the main danger
in learning Perl for a system administrator is the overcomplexity that many Perl books blatantly sell.
In this sense anything written by Randal L. Schwartz is suspect, and Learning Perl is a horrible
book to start with. I wonder how many sysadmins dropped Perl after trying to learn from this book.
See http://www.softpanorama.org/Bookshelf/perl.shtml
It might be that the best way is first to try to replace awk in your scripts with Perl, and only
then gradually start writing full-blown Perl scripts. For inspiration you can look at collections of
Perl one-liners, but please beware that some (many) of them are way too clever to be useful. Useless
overcomplexity rules here too.
I would also recommend avoiding the OO features of Perl that many books oversell. A lot can be done
using regular Algol-style programming with subroutines and by translating awk into Perl. OO has
its uses, but like many other programming paradigms it is oversold.
Perl is very well integrated into Unix (better than any of its competitors), and because of this it
opens for sysadmins levels of productivity simply incomparable with those achievable using the
shell. You can automate a lot of routine work and enhance existing monitoring systems and the like
with ease if you know Perl well.
109. Match something that looks like an IP address.
/^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/
This regex doesn't guarantee that the thing that got matched is in fact a valid IP. All it does
is match something that looks like an IP: a one-to-three-digit number followed by a dot, three
times, then a final number. For example, it matches the valid IP 81.198.240.140, but it also
matches an invalid IP such as 923.844.1.999.
Here is how it works. The ^ at the beginning of regex is an anchor that matches the
beginning of string. Next \d{1,3} matches one, two or three consecutive digits. The
\. matches a dot. The $ at the end is an anchor that matches the end of the string.
It's important to use both ^ and $ anchors, otherwise strings like foo213.3.1.2bar
would also match.
This regex can be simplified by grouping the first three repeated \d{1,3}\. expressions:
/^(\d{1,3}\.){3}\d{1,3}$/
119. Replace all <b> tags with <strong>
$html =~ s|<(/)?b>|<$1strong>|g
Here I assume that the HTML is in the variable $html. The <(/)?b> matches
the opening and closing <b> tags, captures the optional closing-tag slash in group
$1, and then replaces the matched tag with either <strong> or </strong>, depending
on whether it was an opening or closing tag.
These are one-liners that might be of use. Some of them are from the net and some are ones that I
have had to use for some simple task. If Perl 5 is required, perl5 is used.
- perl -ne '$n += $_; print $n if eof'
perl5 -ne '$n += $_; END { print "$n\n" }'
- To sum numbers on a stream, where each number appears on a line by itself. That kind of
output is what you get from cut(1) if you cut out a numerical field.
There is also a C program called
sigma that does this faster.
- perl5 -pe 's/(\w)(.*)$/\U$1\L$2/'
perl5 -pe 's/\w.+/\u\L$&/'
- To capitalize the first letter on the line and convert the other letters to
lowercase. The last one is much nicer, and also faster.
- perl -e 'dbmopen(%H,".vacation",0666);printf("%-50s: %s\n",$K,scalar(localtime(unpack("L",$V))))while(($K,$V)=each(%H))'
- Well, it is a one-liner. :)
You can use it to examine who wrote you a letter while you were on vacation. It examines the
file that vacation(1)
produces.
- perl5 -p000e 'tr/ \t\n\r/ /;s/(.{50,72})\s/$1\n/g;$_.="\n"x2'
- This piece will read paragraphs from the standard input and reformat them so that
every line is between 50 and 72 characters wide. It will only break a line at whitespace,
never in the middle of a word.
- perl5 -pe 's#\w+#ucfirst lc reverse $&#eg'
- This piece will read lines from the standard input and transform them into the Zafir language
used by Zafir's troops, i.e. "Long Live Zafir!" becomes "Gnol Evil Rifaz!" (for some reason they
always talk using capital letters).
Andrew Johnson and I posted slightly different versions, and we both split the string unnecessarily.
This one avoids splitting the string.
- perl -pe '$_ = " $_ "; tr/ \t/ /s; $_ = substr($_,1,-1)'
- This piece will remove spaces at the beginning and end of a line and squeeze all other sequences
of spaces into one single space.
This was one of the "challenges" from comp.lang.perl.misc
that occurs frequently; I am just unable to resist those. :)
In the Perl spirit of "Programming is fun", here are some one-liners
that might actually be useful. Please
mail me yours; the best one-liner writer
wins a Perl magnetic poetry kit.
Contest closes July 31st, 2000. Please note that it is me personally running this competition, not
NRCC, CBR or IMB. "Best" is subjective, and will be determined by an open vote.
Take a multiple-sequence FASTA file and print the non-redundant
subset, with the description lines for identical sequences concatenated with ';'s. Not a small
one-liner, but close enough.
perl -ne 'BEGIN{$/=">";$"=";"}($d,$_)=/(.*?)\n(.+?)>?$/s;push
@{$h{lc()}},$d if$_;END{for(keys%h){print">@{$h{$_}}$_"}}' filename
Split a multi-sequence FastA file into individual files named after
their description lines.
perl -ne 'BEGIN{$/=">"}if(/^\s*(\S+)/){open(F,">$1")||warn"$1
write failed:$!\n";chomp;print F ">", $_}'
Take a blast output and print all of the gi's matched, one per line.
perl -pe 'next unless ($_) = /^>gi\|(\d+)/;$_.="\n"' filename
Filter all repeats of length 4 or greater from a FASTA input file.
This one is thanks to Lincoln Stein and Gustavo Glusman's discussions on the
bio-perl mailing list.
perl -pe 'BEGIN{$_=<>;print;undef$/}s/((.+?)\2{3,})/"N"x
length$1/eg' filename
By : anonymous
( Fri Oct 8 01:39:43 2004 )
Try using the -P option with grep. This enables Perl regular expressions
in grep, e.g.
grep -P "\S+\s+\S+" file
By : anonymous ( Fri Sep 17 20:22:02 2004 )
perl -e 'chmod 0000, $_ while <*>'
By : anonymous
( Tue Mar 16 15:05:27 2004 )
perl -wne 'BEGIN{$" = ","} @fields = split/\s+/;
print "@fields\n";'
Softpanorama Recommended
- One-liner program - Wikipedia, the free encyclopedia
- Famous Perl One-Liners Explained, Part I: File Spacing
- Famous Perl One-Liners Explained, Part VII: Handy Regular Expressions
- one-liners by Jeff Bay
- Essential One-Liners
- Perl Scripts And One Liners - www.socher.org
- Perl Regex One-Liner Cookbook
- TPJ One Liners - The Perl Journal, Fall 1998
- Perl tricks by Neil Kandalgaonkar
- Introduction to Perl one-liners - good coders code, great coders reuse
Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org
Last modified: September 30, 2020