I often need to automate tasks in web based applications. I like to do this at the protocol level by simulating a real user's interactions via HTTP. Python comes with two built-in modules for this: urllib (higher level Web interface) and httplib (lower level HTTP interface).
However, I usually don't use either of these. I prefer to use Joe Gregario's excellent httplib2 module (btw, I really wish this could make its way into Python's Standard Library). It is a much richer library and has a lot of nice features for dealing with HTTP.
When automating something, you often need to "login" to maintain some sort of session/state with the server. This is usually achieved with form-based authentication. You post a form to the server, and it responds with a cookie in the incoming HTTP header. You need to pass this cookie back to the server in subsequent requests to maintain state or to keep a session alive.
Here is an example of how to deal with cookies when doing your HTTP Post.
First, lets import the modules we will use:
import urllib import httplib2
Now, lets define the data we will need: In this case, we are doing a form post with 2 fields representing a username and a password.
url = 'http://www.example.com/login' body = {'USERNAME': 'foo', 'PASSWORD': 'bar'} headers = {'Content-type': 'application/x-www-form-urlencoded'}
Now we can send the HTTP request:
http = httplib2.Http() response, content = http.request(url, 'POST', headers=headers, body=urllib.urlencode(body))
At this point, our "response" variable contains a dictionary of HTTP header fields that were returned by the server. If a cookie was returned, you would see a "set-cookie" field containing the cookie value. We want to take this value and put it into the outgoing HTTP header for our subsequent requests:
headers['Cookie'] = response['set-cookie']
Now we can send a request using this header and it will contain the cookie, so the server can recognize us.
So... here is the whole thing in a script. We login to a site and then make another request using the cookie we received:
#!/usr/bin/env python import urllib import httplib2 http = httplib2.Http() url = 'http://www.example.com/login' body = {'USERNAME': 'foo', 'PASSWORD': 'bar'} headers = {'Content-type': 'application/x-www-form-urlencoded'} response, content = http.request(url, 'POST', headers=headers, body=urllib.urlencode(body)) headers = {'Cookie': response['set-cookie']} url = 'http://www.example.com/home' response, content = http.request(url, 'GET', headers=headers)
Copyright © 2006-2008 Corey Goldberg
Disclaimer The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.