This page introduces the bare bones of the Perl programming language in a hands-on fashion. There are many examples the reader can cut and paste into his/her programming environment and experiment with.
Of the multitude of Perl features, this page concentrates on those likely to be of use for the beginning CGI programmer.
Basic Perl statements and variables
Reading data from the keyboard
If statements and relational operators
Using logical operators
List of Perl operators
Arrays
Repetition and "looping constructs"
Associative arrays
Subroutines and modular programs
Miscellaneous built-in Perl functions
Pattern matches and regular expressions
Programming with Perl objects
Some final thoughts
Basic Perl statements and variables
To begin with, you can build a very simple program that does something
obvious to the user by using the Perl print statement:
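A minimal sketch of such a program, assuming the Perl interpreter is installed at /usr/bin/perl (the path varies from system to system):

```perl
#!/usr/bin/perl
print "Hello world!";
```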
The first line tells the operating system where to find the Perl interpreter. The second line causes the string "Hello world!" to appear on your screen. The second line is composed of a command, print, and an "argument", "Hello world!", and is terminated by a semicolon (;).
Use the pico editor to create a file named hello.pl containing this program. Type pico followed by the file name at the UNIX command line prompt, which is probably a dollar sign ($).
After you have entered the program and saved it to the file hello.pl, run it by using a command like:
perl hello.pl
You should see "Hello world!" appear on your screen, followed by the next command prompt, as in:
falcon:/homef/imajhawk$ perl hello.pl
Hello world!falcon:/homef/imajhawk$
This is somewhat infelicitous, because the prompt ends up on the same line as the output string. To pretty things up, change the print string to "Hello world!\n", and rerun the program. You will then see something like:
falcon:/homef/imajhawk$ perl hello.pl
Hello world!
falcon:/homef/imajhawk$
This time the program prints a "newline" (represented by "\n") after printing "Hello world!", so that the next command prompt appears on its own line. If you use two "\n" sequences you will create a blank line between the line containing "Hello world!" and the command prompt. Each newline character forces the writing cursor to drop down one line.
Another way to print the same string is to use a print statement like:
This statement uses the "concatenation operator" (.) to join the two strings into a single string containing the same characters as the original.
Yet another way to print this string is to assign it to a Perl "variable", say $GREETING, and then print that variable as in:
The variable $GREETING is simply a place to store the string for later use. Variables can actually be included within print strings as in:
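A minimal sketch of this kind of "interpolation", assuming the greeting is stored with its trailing newline. The "use strict", "use warnings" and "my" lines are additions that make Perl report misspelled variable names; they are not required for the program to run:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Store the string in a scalar variable for later use.
my $GREETING = "Hello world!\n";

# Inside double quotes, Perl replaces $GREETING with its current value.
print "What I want to say to you is: $GREETING";
```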
which will produce something like:
falcon:/homef/imajhawk$ perl hello.pl
What I want to say to you is: Hello world!
falcon:/homef/imajhawk$
Note that $GREETING is a "scalar" variable, a variable holding only
one value at a time. Scalar variable names always begin with a
dollar sign ($).
Note also that Perl variables are not "typed". Variable type is determined
by use or context.
Reading data from the keyboard
You can read data from the keyboard into a Perl variable by using a program like:
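A minimal sketch of such a program. The "defined" guard is an addition: it skips the print when there is no input at all (end-of-file):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read one line, including its trailing newline, from standard input.
my $line = <STDIN>;

# Echo the line to standard output.
print STDOUT $line if defined $line;
```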
This program will read one line of input from the "standard input" and write that line to the "standard output" which happens to be the screen. Both "STDIN" and "STDOUT" are optional in this context.
You can prompt the user for input by using a print statement followed immediately by a read, as in:
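A minimal sketch, assuming the prompt and reply wording shown here (both are illustrative). The line that substitutes an empty string at end-of-input is an addition that keeps the sketch quiet when no input is available:

```perl
#!/usr/bin/perl
use strict;
use warnings;

print "What is your age? ";        # write the prompt; no newline, so the reply follows it
my $line = <STDIN>;                # wait for the reply, which ends with a newline
$line = "" unless defined $line;   # end-of-input: treat it as an empty reply
chop($line);                       # remove the last character (the newline)
print "You say you are $line years old.\n";
exit;                              # leave the program (optional here)
```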
When this program runs it will write the prompt string to the screen and wait for you to type in your age and press the Return key. When the input line is placed into the $line variable it ends with a newline representing the Return key press. The Perl chop() function removes the last character of a string, and is used here to eliminate the newline at the end of the reply.
Note the use of the # sign to include comments in the program.
Any text to the right of a # sign will be ignored by the Perl interpreter.
Note also the use of the Perl exit statement to leave the program.
exit is optional in this program because there were no more
statements to execute, but it is necessary in some programs.
If statements and relational operators
You can evaluate the data input from the user (data validation) by using an "if" statement and a comparison as in:
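A minimal sketch, assuming an illustrative message. To keep it self-contained, $line is assigned directly here rather than read from the keyboard:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "142";   # stands in for an age typed at the keyboard

# The block in braces runs only when the comparison is true.
if ($line > 130) {
    print "Are you really $line years old?\n";
}
```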
The if statement allows you to compare two numbers and respond accordingly. In this case the number stored in the variable $line is compared with the number 130 to see whether $line is greater than 130. If the user enters a number that is greater than 130, the message expressing skepticism will be printed.
The whole expression, $line > 130 is a "conditional expression", and the general syntax (form) for an if statement may be represented as:
Another way to handle this validation is to use an if...else... statement like:
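A minimal sketch, again with illustrative messages and with $line assigned directly instead of being read from the keyboard:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "42";   # stands in for an age typed at the keyboard

if ($line > 130) {
    print "Are you really $line years old?\n";    # skepticism
} else {
    print "You say you are $line years old.\n";   # acceptance
}
```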
This time EITHER the message of skepticism OR the message of acceptance will be printed, but NOT both.
You can compare two string values for equality by using the "eq" relational operator, and you can compare two string values for inequality by using the "ne" relational operator, as in:
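A minimal sketch, assuming the name is already stored in $line and using illustrative messages:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "Megen";   # stands in for a name typed at the keyboard

# eq is true when the two strings are identical.
if ($line eq "Megen") {
    print "Hi Megen!\n";
}

# ne is true when the two strings differ.
if ($line ne "Megen") {
    print "You are not Megen.\n";
}
```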
FYI: relational operators return 1 for true and "" for false.
Using logical operators
A user may also enter a negative number, which would be another invalid entry.
You can test against both of these invalid entry possibilities by
using a logical OR operator (||), as in:
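A minimal sketch, with an illustrative message and $line assigned directly:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "-5";   # stands in for an age typed at the keyboard

# The test is true if EITHER comparison is true.
if ($line < 0 || $line > 130) {
    print "I do not believe that you are $line years old.\n";
}
```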
You may also use "or" as the logical OR operator, "&&" or "and" as the logical AND operator, and "!" or "not" as the logical NOT.
For example, you might ask:
to identify a believable response. Actually, to be safe you might use parentheses around every comparison as in:
to make sure the expression gets evaluated the way you expect it to.
In fact, you MUST use parens in the Megen example, as in:
The line
will not work because "! $line" will be evaluated as one operation,
the result of which will be compared with "Megen". In general, you can
define arbitrarily complex conditional expressions by using the comparison
and logical operators with appropriate (matched) sets of parentheses.
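The precedence problem described above can be seen in a sketch like the following, where the name "Jeff" and the messages are illustrative. Recent versions of Perl may print a precedence warning for the second test when warnings are enabled:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "Jeff";   # stands in for a name typed at the keyboard

# Correct: the parentheses force the comparison to happen before the negation.
if (!($line eq "Megen")) {
    print "You are not Megen.\n";
}

# Incorrect: "! $line" is evaluated first. It yields "" (or 1 when $line is
# empty), and neither value is equal to "Megen", so this test never succeeds.
if (! $line eq "Megen") {
    print "This line is never printed.\n";
}
```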
List of Perl operators
A partial list of Perl operators includes those in the table below.
The list is presented in (approximate) descending precedence order.
That is, if an expression contains two or more operators, Perl will
apply the highest ones in the list first. (Some of the operators in
this list share the same order of precedence, but have been shown
on different lines for clarity.) Knowing the precedence order of an
operator may help you decide whether to use parentheses in your
conditional expressions.
Unary operators | |
---|---|
! | logical not |
+ | signify positive numeric value |
- | signify negative numeric value |
Binding operators | |
=~ | true if the pattern match succeeds |
!~ | true if the pattern match fails |
Arithmetic and string operators | |
* / | arithmetic multiply, divide |
+ - . | arithmetic add, subtract, string concatenation |
Compare two numbers | |
> | true if the first is greater than the second |
< | true if the first is less than the second |
<= | true if the first is less than or equal to the second |
>= | true if the first is greater than or equal to the second |
Compare two string values | |
eq | true if they are identical |
ne | false if they are identical |
Compare two numbers | |
== | true if they are equal |
!= | false if they are equal |
Logical operators | |
&& | logical and |
|| | logical or |
Arrays
An array holds an ordered list of scalar values. The examples in this section use an array named @fruits that holds a list of fruit names. You can print the value of any element of the list by specifying its location within the list. That is,
print $fruits[2];
will print "bananas". Note! You might have expected $fruits[2] to get you "pears", but item numbering in Perl lists begins with 0. Note also that array names begin with "@", but that single elements within an array (i.e., $fruits[2]) are usually scalars, so references to them begin with "$".
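A minimal sketch of such an array. The contents and their order are assumptions inferred from the surrounding text:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A list of three scalars; positions are numbered 0, 1, 2.
my @fruits = ("apples", "pears", "bananas");

# The whole array is @fruits, but one element is a scalar, so it starts with "$".
print "$fruits[2]\n";
```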
It is also possible to assign the values of one array to another, as in:
In this case the value of $red_fruit will become "apples", $green_fruit will become "pears", etc.
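A minimal sketch of this kind of list assignment. The name $yellow_fruit and the array contents are assumptions:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @fruits = ("apples", "pears", "bananas");

# Each scalar on the left receives the element in the same position on the right.
my ($red_fruit, $green_fruit, $yellow_fruit) = @fruits;

print "red: $red_fruit, green: $green_fruit, yellow: $yellow_fruit\n";
```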
Repetition and "looping constructs"
It would be impractical if not impossible to print out the values of large arrays using single print statements. Perl provides special
constructs for performing repetitive tasks such as this.
For example, to print the value of each list element within @fruits you can write a foreach loop like this:
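A minimal sketch, assuming the same three-element array and a loop variable named $fruit:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @fruits = ("apples", "pears", "bananas");

# $fruit takes the value of each element in turn; the loop ends after the last one.
foreach my $fruit (@fruits) {
    print "$fruit\n";
}
```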
This foreach statement repeats the following process for each element in the array @fruits:
With the first approach you need not know exactly how many elements are contained in a list. An alternative approach requires foreknowledge of the number of elements and employs a "for loop":
This approach implements the following process:
You might also use a "while loop" to accomplish the same thing as in:
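A minimal sketch, assuming a three-element array and a counter named $i:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @fruits = ("apples", "pears", "bananas");

my $i = 0;                    # start with the first element
while ($i < 3) {              # repeat for as long as the test is true
    print "$fruits[$i]\n";
    $i++;                     # move on to the next element
}
```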
While loops are especially useful when a program cannot know at the outset how many repetitions it must make. This would be the case when reading information from a file (as you will see later) or from a network connection, etc.
Note that Perl also provides an until loop, which repeats for as long as its
condition is false, and that a while or until loop may be followed by an optional continue block.
Associative arrays
Perl provides another data structure, called a "hash" or "associative array",
to keep track of lists of things. A hash is an array that indexes
each element with a string rather than a number. For example, an
associative array named %fruit_colors can be defined as:
Once again, the name of an associative array begins with a percent sign (%), but a single element within the array is referenced with a name beginning with a dollar sign ($). You can then print the contents of this associative array with a foreach loop like:
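A minimal sketch that defines the hash one element at a time and then prints it. The fruit names and colors are assumptions taken from the sample data used later on this page. Note that keys() returns the index values in no particular order:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each element is indexed by a string (the "key") rather than a number.
my %fruit_colors;
$fruit_colors{"apple"}  = "red";
$fruit_colors{"pear"}   = "green";
$fruit_colors{"banana"} = "yellow";

# keys() returns every index value in the hash.
foreach my $fruit (keys(%fruit_colors)) {
    print "color of $fruit is $fruit_colors{$fruit}\n";
}
```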
Here the Perl function keys() examines the hash and finds every index value. Within the loop, the color of each fruit is printed directly from the array.
You can also assign values to a hash from a list of pairs of strings, using a Perl statement like:
or even
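A sketch of both forms, assuming the same fruit data. The second hash name, %same_colors, is hypothetical and exists only so the two forms can be compared:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A plain list is read as key, value, key, value, ...
my %fruit_colors = ("apple", "red", "pear", "green", "banana", "yellow");

# The => operator is a comma that also quotes the word to its left,
# which makes the pairing easier to see.
my %same_colors = (apple => "red", pear => "green", banana => "yellow");

print "apple is $fruit_colors{'apple'} and $same_colors{'apple'}\n";
```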
Subroutines and modular programming
The next program does what an earlier one does, but with a "modular"
structure. This program has two "subroutines" that may be called
from a "main" program, or "mainline".
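A minimal sketch of such a program. The name print_age and the variables $line and $age come from the discussion that follows; the name print_prompt and the message wording are assumptions:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Mainline: call each subroutine in turn.
print_prompt();
my $line = <STDIN>;
$line = "" unless defined $line;   # end-of-input: treat it as an empty reply
chop($line);
print_age($line);

# Needs no information from the mainline.
sub print_prompt {
    print "What is your age? ";
}

# Receives the age as its first (and only) argument.
sub print_age {
    my $age = $_[0];               # private copy, visible only inside this subroutine
    print "You say you are $age years old.\n";
}
```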
Note that the first subroutine needs no information from the mainline to do its work. The second one, however, is called with an "argument," the age value entered by the user and stored in $line. The subroutine "call"
places a pointer to $line in the first element of a special array called @_, and the information in $line can be copied to or modified from the subroutine as $_[0], just as you would access the first element of any "non-special" array.
However, rather than use $line itself, print_age stores the value of $line in a variable called $age that is NOT available to any other subroutine OR to the mainline. $age is "isolated" from the rest of the program through the use of the "my" statement.
(The information passed to the subroutine is said to be passed "by reference" since the elements of @_ are aliases for the variables being passed: assigning a new value to $_[0] inside the subroutine changes $line in the mainline.)
Modular programs are typically easier to understand, debug, and modify because their structure is usually clear and data isolation keeps subroutines from changing each other's data.
Note that the subroutines in the example above are used in contexts that do not require the subroutine to possess or return a value. A subroutine used to produce a value that is used within a calling statement is called a "function". For example, the subroutine credible() returns the value "yes" or the value "no".
The function receives $line as an argument, compares it with
bounding values, 0 and 130, and returns the string "no" or the string "yes".
The last value evaluated within the function is the value that will be
returned by the function, unless an explicit return statement is used. A Perl function can return a scalar or a
list.
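A sketch of how credible() might be written. Only the name, the bounds 0 and 130, and the return values "yes" and "no" come from the text; the explicit return statements and the calling code are assumptions:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns "no" for an age outside the range 0 to 130, and "yes" otherwise.
sub credible {
    my $age = $_[0];
    if (($age < 0) || ($age > 130)) {
        return "no";
    }
    return "yes";
}

my $line = "42";   # stands in for an age typed at the keyboard

# The value returned by the function is used directly in the calling statement.
if (credible($line) eq "yes") {
    print "You say you are $line years old.\n";
} else {
    print "I do not believe that you are $line years old.\n";
}
```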
Miscellaneous built-in Perl functions
This section presents some frequently used functions included in the
language definition, and expands previous information with examples.
There are many other functions available within Perl. See the Perl manual
or tutorials for descriptions.
For example, suppose you have a file, called "fruit-colors.txt", containing the following lines:
You can "read" that file using a program like:
The "<" signifies the program intends to read from the file. That is, the program intends to move data FROM the file TO the program.
The following program can be used to create fruit-colors.txt:
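A minimal sketch, assuming the file holds one "fruit color" pair per line. The ">" mode opens the file for writing, replacing any existing contents. The three-argument form of open and the filehandle variable $out are choices made for this sketch:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ">" signifies that data will move FROM the program TO the file.
open(my $out, ">", "fruit-colors.txt")
    or die "Cannot open fruit-colors.txt for writing: $!";

print $out "apple red\n";
print $out "pear green\n";
print $out "banana yellow\n";

close($out);
```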
The next program opens a connection, called a "pipe", to a process running the cat command. It then reads the results back to the script for processing:
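A minimal sketch, assuming a UNIX system with the cat command available. The first few lines create the sample file so that the sketch runs on its own; the filehandle names are hypothetical:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Setup: create the sample file.
open(my $out, ">", "fruit-colors.txt") or die "Cannot create fruit-colors.txt: $!";
print $out "apple red\npear green\nbanana yellow\n";
close($out);

# The trailing "|" tells Perl to run the command and connect its
# output to the filehandle, so it can be read like a file.
open(my $pipe, "cat fruit-colors.txt |") or die "Cannot run cat: $!";
while (my $line = <$pipe>) {
    print $line;
}
close($pipe);
```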
which will yield:
apple red
pear green
banana yellow
The same result may be accomplished by a slight variation of the previous example, namely:
Note the use of the backtick (`) in the print command. This special character is used to identify the command that should be sent to the UNIX shell for execution.
A similar and quite useful alternative is to use:
which has the advantage that the program may manipulate the results of the shell command before sending it on to the user.
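The contents of a file like this can also be loaded into a hash. A minimal sketch, assuming the %fruit_colors hash used earlier; the first few lines create the sample file so that the sketch runs on its own, and chomp() is used in place of chop() because it removes only a trailing newline:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Setup: create the sample file.
open(my $out, ">", "fruit-colors.txt") or die "Cannot create fruit-colors.txt: $!";
print $out "apple red\npear green\nbanana yellow\n";
close($out);

my %fruit_colors;

open(my $in, "<", "fruit-colors.txt") or die "Cannot open fruit-colors.txt: $!";
while (my $line = <$in>) {
    chomp($line);                                  # drop the trailing newline
    my ($fruit, $color) = split(/ /, $line, 2);    # split at the first space only
    $fruit_colors{$fruit} = $color;                # make an entry in the hash
}
close($in);

foreach my $fruit (keys(%fruit_colors)) {
    print "color of $fruit is $fruit_colors{$fruit}\n";
}
```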
This program will read each line from the file and split that line into two parts at the first space " ". The first part of the line will be stored in the variable called $fruit, and the second part will be stored in $color. Those two variables are then used to make an entry into the hash %fruit_colors. The result of running this program will be:
color of pear is green
color of banana is yellow
color of apple is red
Pattern matches and regular expressions
Consider the example presented earlier which attempted to recognize the name of a user named "Megen". The program compared the
information entered by a user with the string "Megen".
Of course, the user might enter her name as "Megen Smith", in which case the comparison with "Megen" fails even though the user's name
really is Megen. One way around this is to search for the string "Megen" within the string submitted instead of testing for equality.
You can use a Perl "pattern match" to do just that as follows:
Here the binding operator =~ binds a "pattern match" to a specific variable ($line in this case). In this context if the user enters a string that contains the substring "Megen" anywhere within it, she will be greeted as "Megen". The pattern match will return either a "true" or "false", just as does a conditional expression used within an if statement. If the pattern match is successful the binding operation will return a "true" logical value.
However, you really can't be sure whether Megen will type her name with a leading capital "M". If she types "megen" or "MEGEN SMITH", the if statement above will not respond with "Hi Megen!" To make sure case sensitivity doesn't confuse things, you could simply add the letter "i" after the search string, making it: m/Megen/i. Note also that the "m" is optional in this context.
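A minimal sketch contrasting the two forms, with $line assigned directly rather than read from the keyboard:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "MEGEN SMITH";   # stands in for a name typed at the keyboard

# Case-sensitive: "MEGEN" does not contain "Megen", so nothing is printed.
if ($line =~ m/Megen/) {
    print "Hi Megen! (exact case)\n";
}

# Case-insensitive: the trailing "i" lets "MEGEN" match; the "m" is omitted.
if ($line =~ /Megen/i) {
    print "Hi Megen!\n";
}
```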
It is also possible to modify the values in a variable using binding operators. The statement
$line =~ s/Megen/megen/;
will "substitute" the string "megen" for the first (leftmost) occurrence of the string "Megen" within the variable $line. Adding the "g" (global) modifier, as in
$line =~ s/Megen/megen/g;
will substitute "megen" for every occurrence of "Megen".
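The difference between the two forms can be seen in a short sketch. The variable names and the sample string are hypothetical:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $first_only = "Megen met Megen";
$first_only =~ s/Megen/megen/;      # changes the leftmost occurrence only

my $every_one = "Megen met Megen";
$every_one =~ s/Megen/megen/g;      # "g" changes every occurrence

print "$first_only\n";
print "$every_one\n";
```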
The patterns used within "m" and "s" pattern matches may include a (bewildering?) variety of special characters that have special meanings within the context. For example, the period (.) represents any single character (other than a newline); it is a single-character wildcard. The asterisk (*) represents repetition of the preceding character. That is, "a*" will match zero or more occurrences of the letter "a", and "a+" will match one or more occurrences of the letter "a".
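A few of these special characters in action, using hypothetical test strings:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# "." matches any single character, so c.t matches "cut" (and "cat", "cot", ...).
print "c.t matches cut\n" if "cut" =~ /c.t/;

# "a*" matches zero or more a's, so ba*d matches "bd" as well as "baaad".
print "ba*d matches bd\n" if "bd" =~ /ba*d/;
print "ba*d matches baaad\n" if "baaad" =~ /ba*d/;

# "a+" requires at least one a, so ba+d matches "baaad" but not "bd".
print "ba+d matches baaad\n" if "baaad" =~ /ba+d/;
print "ba+d does not match bd\n" unless "bd" =~ /ba+d/;
```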
These pattern match strings are known as "regular expressions".
The set of special characters and the rules for using them constitute
a pattern matching language that has been thoroughly studied by
the theoretical computing community for many years.
See the Perl tutorials for more information.
Programming with Perl objects
An "object" is a special kind of data structure that combines data and
program components (e.g., subroutines and functions) that use that data.
The "object model" provides a way to structure programs to make them
more readable, usable, reliable, and re-usable.
Most programming models distinguish between data and the programs using that data, but either do not provide data isolation, implement isolation using complex rules, or inhibit flexibility with their isolation policies.
Most programming languages also provide constructs for modular programming. The object model further encourages or even dictates modular program design.
The Perl features presented to this point are adequate for writing powerful, efficient Perl programs. These features can all be used within the overarching object model.
A "class" is a collection of variable definitions and subroutines that define any object that is a member of that class. Data values contained within an object are called "properties" and subroutines defined within an object are called "methods".
Here is a definition of a class of objects called "person" that would normally be stored in a file called person.pm:
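A sketch of what such a class definition might look like, assuming a constructor that takes a favorite color and a weight and stores them under the hash keys "Favorite Color" and "Weight". The exact statements are assumptions based on the description that follows:

```perl
# person.pm
package person;

use strict;
use warnings;

# Constructor: called as person->new($color, $weight).
sub new {
    my ($class, $color, $weight) = @_;    # private copies of the three arguments

    my $self = {};                        # reference to a new, empty, anonymous hash
    $self->{"Favorite Color"} = $color;   # first property
    $self->{"Weight"}         = $weight;  # second property

    bless($self, $class);                 # mark the hash as an object of this class
    return $self;
}

1;   # a file loaded as a module must end with a true value
```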
The method "new" is a special method that can be used to "construct" a new instantiation of the class "person". Three arguments are passed to new through the @_ array (as is common with subroutine invocations), and new uses the line
to assign each argument to a corresponding scalar variable reserved for use only by this execution of the subroutine new().
The statement
defines an empty, anonymous hash, with $self as a hard reference to the hash. new then assigns values to the two object properties within the new object. For example, the statement
assigns the color argument to the "Favorite Color" hash element, the value $weight to the "Weight" hash element and again stores the same hard reference into $self.
The new() function then declares $self to be a pointer to an object of the specified class by using the Perl bless() function. This is a critical step, since an object in Perl is nothing more than "a blessed thingie." Finally, new returns $self to the calling routine.
You can create a new "instance" of a person with a Perl statement like:
or
Both of these statements will define an object with a favorite color of "blue" and a weight of "160" and store a pointer to that object in the variable $jeff.
You can then print information from the object using statements like:
or
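Putting these steps together, a sketch of creating an object and printing its properties might look like the following. It assumes the constructor described above, repeated here in the same file so that the sketch runs on its own; normally the class would be loaded from person.pm instead:

```perl
#!/usr/bin/perl
use strict;
use warnings;

package person;

sub new {
    my ($class, $color, $weight) = @_;
    my $self = {};
    $self->{"Favorite Color"} = $color;
    $self->{"Weight"}         = $weight;
    bless($self, $class);
    return $self;
}

package main;

# Create a new instance; $jeff holds a reference to the object.
# (The older form "new person(...)" asks for the same method call.)
my $jeff = person->new("blue", 160);

# Two equivalent ways to reach a property through the reference.
print "Favorite color: $jeff->{'Favorite Color'}\n";
print "Weight: ", $$jeff{"Weight"}, "\n";
```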
The next example shows a simple program to tell users their favorite color. It uses the class definition provided above in person.pm.
The next version of the same program stores the object locations in a hash, named %list, so they may be more easily found.
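A sketch of this approach, assuming the constructor described above (repeated here so that the sketch runs on its own). Apart from Jeff's color and weight, the names and values are hypothetical, and $name is assigned directly rather than read from the keyboard:

```perl
#!/usr/bin/perl
use strict;
use warnings;

package person;

sub new {
    my ($class, $color, $weight) = @_;
    my $self = {};
    $self->{"Favorite Color"} = $color;
    $self->{"Weight"}         = $weight;
    bless($self, $class);
    return $self;
}

package main;

# Index each object by the person's name.
my %list;
$list{"Jeff"}  = person->new("blue", 160);
$list{"Megen"} = person->new("green", 125);

my $name = "Megen";   # stands in for a name typed at the keyboard

if (exists($list{$name})) {
    print "$name, your favorite color is $list{$name}->{'Favorite Color'}.\n";
} else {
    print "Sorry, I do not know you, $name.\n";
}
```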
You can easily imagine how to modify this program to read colors and weights
from a file or a database rather than define them within the program itself.
Some final thoughts
Perl includes many more "features" and functions. Some consider it a
veritable kitchen sink of a language in that features seem to appear
willy-nilly, with no integrating rationale.
Others consider it a godsend and frequently discover features that
meet unique needs in special circumstances, and thereby demonstrate a
"utilitarian" rationale. Some even consider it both.
Because of this conglomeration of features, Perl programs can be very hard to comprehend, which makes them difficult to modify or maintain, and sometimes even difficult to write.
Restricting your programs to a consistent structure and a subset of the available Perl statements should help keep your programs tractable.
Michael Grobe grobe@ku.edu