



 Sunday, February 04, 2007

Perl - Building Web Clients

The following is a short tutorial on web programming in Perl I wrote several years ago.  This type of programming was my first foray into the guts of the web.  Writing tools at the protocol level forced me to gain a deep understanding of HTTP and Web Architecture, which has been extremely helpful to me since.


These examples show how to use Perl's 'LWP' (libwww-perl) modules to make requests to a web server. The libwww-perl collection is a set of Perl modules which provides a simple and consistent application programming interface to the web.


Using 'LWP' to do an HTTP GET Request:

This will request a page (http://www.example.com in this example) and store the entire contents of the response in the '$response' object.
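#!/usr/bin/perl

use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

my $useragent = LWP::UserAgent->new;
my $request   = HTTP::Request->new(GET => 'http://www.example.com');
my $response  = $useragent->simple_request($request);

# print the status line, headers, and body of the response
print $response->as_string;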

(* Use "$useragent->request" instead of "$useragent->simple_request" to follow server redirects.)
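
To see why the choice between the two calls matters, it can help to look at a redirect response directly. The following is only a sketch: it builds a response object by hand so it runs without a network connection, and the 302 status, the Location URL, and the 'describe_response' helper are made up for illustration rather than taken from a real server or from LWP itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Response;

# Hand-built response standing in for what simple_request() might return
# when a server redirects; the status and URL are illustrative assumptions.
my $response = HTTP::Response->new(
    302, 'Found',
    [ Location => 'http://www.example.com/new-home' ],
);

# Hypothetical helper: summarize how a response should be interpreted.
sub describe_response {
    my ($res) = @_;
    return 'redirect to ' . $res->header('Location') if $res->is_redirect;
    return 'success'                                  if $res->is_success;
    return 'failed: ' . $res->status_line;
}

print describe_response($response), "\n";
```

With "simple_request" you get the redirect response itself, as above, and have to request the new location yourself. With "request" the user agent does that for you and returns the final response.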


Working With Cookies:

Here is the HTTP header returned when the same kind of request is sent to the main Google page:
(first part of the 'print $response->as_string();' output)
Date: Mon, 14 Apr 2003 18:38:28 GMT
Server: GWS/2.0
Content-Length: 2691
Content-Type: text/html
Content-Type: text/html; charset=ISO-8859-1
Client-Date: Mon, 14 Apr 2003 18:38:29 GMT
Client-Peer: 216.239.57.99:80
Client-Response-Num: 1
Connection: Close
Set-Cookie: PREF=ID=48fd767576ebd920:TM=1050345508:LM=1050345508:S=qLA8i5XyvLX37lG6;
expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com
Title: Google

Notice the "Set-Cookie:" line in the header. It tells your web browser to store a cookie and send it back in the HTTP header of subsequent requests to this server. In this case the cookie doesn't do much, but for a site that requires a login, this is how the server knows who you are and maintains a session.

In Perl, cookies can be handled for you by using the HTTP::Cookies module.

You first need to construct the object to contain your cookies:
$cookie_jar = HTTP::Cookies->new;

After an HTTP request is sent, you can extract the cookie from the response header:
$cookie_jar->extract_cookies($response);

Once the cookie is stored in your cookie_jar, it needs to be sent back to the server in the header of every subsequent HTTP request. This is done by adding the following command after you build each request:
$cookie_jar->add_cookie_header($request);
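
The three steps can be tried together without touching the network. This is a minimal sketch, assuming a hand-built response in place of a real server reply; the cookie name and value (SESSION=abc123) and the example.com URLs are invented for the demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Cookies;
use HTTP::Request;
use HTTP::Response;

my $cookie_jar = HTTP::Cookies->new;

# First exchange: a request plus a hand-built response that sets a cookie.
# The cookie name and value are made up for illustration.
my $first_request = HTTP::Request->new(GET => 'http://www.example.com/');
my $response = HTTP::Response->new(
    200, 'OK',
    [ 'Set-Cookie' => 'SESSION=abc123; path=/' ],
);

# extract_cookies() reads the originating request to learn the host and path,
# so a response built by hand needs it attached explicitly.
$response->request($first_request);
$cookie_jar->extract_cookies($response);

# Second exchange: the jar adds a Cookie header for the matching host.
my $next_request = HTTP::Request->new(GET => 'http://www.example.com/account');
$cookie_jar->add_cookie_header($next_request);

print $next_request->header('Cookie'), "\n";
```

Responses returned by the user agent already carry their request, so the explicit '$response->request(...)' call is only needed here because the response was constructed by hand.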


Now for the whole thing in a script:


The following script will make a request to the main Google page and store the cookie it receives. It will then make a request to Google to change the default language (a user preference) to Spanish. Google will return a new cookie, which we store and send with another request to the main Google page. Google will recognize the information stored in our cookie and return the page in Spanish.
#!/usr/bin/perl

use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Cookies;

# construct objects
my $useragent  = LWP::UserAgent->new;
my $cookie_jar = HTTP::Cookies->new;

# send request for main Google page
my $request  = HTTP::Request->new(GET => 'http://www.google.com');
my $response = $useragent->simple_request($request);

# extract cookie from response header
$cookie_jar->extract_cookies($response);

# set user preference on Google to Spanish language
# (the URL is built by concatenation so it contains no whitespace)
$request = HTTP::Request->new(GET => 'http://www.google.com/setprefs?'
    . 'submit2=Save+Preferences+&hl=es&lt=all&safe=images&num=10'
    . '&q=&prev=http%3A%2F%2Fwww.google.com%2F&ie=UTF-8&oe=UTF-8');
$cookie_jar->add_cookie_header($request);
$response = $useragent->simple_request($request);

# extract new cookie from response header
$cookie_jar->extract_cookies($response);

# send request for main Google page (will return Spanish Google page)
$request = HTTP::Request->new(GET => 'http://www.google.com');
$cookie_jar->add_cookie_header($request);
$response = $useragent->simple_request($request);

# print the full response to verify the cookie works (some text is now in Spanish)
print $response->as_string;
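
The extract and add steps in the script are done by hand for each request. LWP::UserAgent also accepts a 'cookie_jar' attribute, in which case the user agent performs both steps itself. Here is a rough sketch of that variation, not a drop-in replacement for the script above: the 'fetch_in_sequence' helper is a made-up name, the URLs come from the command line, and it uses '$ua->get', which follows redirects.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;

# Attach the jar to the user agent so it extracts and re-sends cookies itself.
my $useragent = LWP::UserAgent->new(cookie_jar => HTTP::Cookies->new);

# Hypothetical helper: fetch each URL in order with the same user agent, so
# cookies set by an earlier response are sent with later requests.
# Returns the last response (undef if no URLs were given).
sub fetch_in_sequence {
    my ($ua, @urls) = @_;
    my $response;
    $response = $ua->get($_) for @urls;
    return $response;
}

# Only touch the network when URLs are given on the command line.
if (@ARGV) {
    my $response = fetch_in_sequence($useragent, @ARGV);
    print $response->as_string;
}
```

Run it with two or more URLs from the same site as arguments. Any cookie set by the first response is then sent along with the later requests.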
