
AS OF MAY 2008, THIS BLOG IS NO LONGER BEING UPDATED.
Visit the new blog at: http://coreygoldberg.blogspot.com



 Tuesday, November 27, 2007

Python - Extracting Files From Zip Archives

Here is a way to unzip files in Python.  If you have a zip containing multiple files, you can unzip it like this:

import zipfile

fh = open('foo.zip', 'rb')
z = zipfile.ZipFile(fh)
for name in z.namelist():
    # write each archive member to a file of the same name
    outfile = open(name, 'wb')
    outfile.write(z.read(name))
    outfile.close()
fh.close()
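
The loop above assumes a flat archive: it will fail on members that live in subdirectories, because the target directories are never created. On later Python versions (ZipFile.extractall() arrived in 2.6, after this post was written), a minimal sketch that handles nested paths could look like the following. The function name extract_zip and the dest_dir parameter are my own, for illustration only.

```python
import os
import zipfile


def extract_zip(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir; return the member names."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        # extractall() creates any missing subdirectories under dest_dir
        archive.extractall(dest_dir)
    return names
```

Calling extract_zip('foo.zip', 'out') would unpack everything under an 'out' directory. Only do this with archives you trust.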
 Monday, November 26, 2007

wxPython - Hello World!

Here is a simple example for those getting started with Python GUI Programming, wxWidgets, and the wxPython Bindings.

This small program will display a Frame and the static text "Hello World!", positioned with a BoxSizer.

Output looks like this:



#!/usr/bin/env python

import wx

class Application(wx.Frame):
    def __init__(self, parent):
        wx.Frame.__init__(self, parent, -1, 'My GUI', size=(300, 200))
        panel = wx.Panel(self)
        sizer = wx.BoxSizer(wx.VERTICAL)
        panel.SetSizer(sizer)
        txt = wx.StaticText(panel, -1, 'Hello World!')
        sizer.Add(txt, 0, wx.TOP|wx.LEFT, 20)
        self.Centre()
        self.Show(True)

app = wx.App(0)
Application(None)
app.MainLoop()
 Monday, November 19, 2007

FSF Releases GNU Affero General Public License

The Free Software Foundation just released the final version of the GNU Affero General Public License (GNU AGPL).  This license covers software that is hosted on a computer network (SaaS - Software as a Service).  The regular GNU GPL only covers software distribution, so you are able to run modified GPL code on a network server without releasing your modified source code.  The GNU AGPL prohibits this and ensures source code for hosted software is made available.

from FSF:

"The Free Software Foundation (FSF) today published the GNU Affero General Public License version 3 (GNU AGPLv3). This is a new license; it is based on version 3 of the GNU General Public License (GNU GPLv3), but has an additional term to allow users who interact with the licensed software over a network to receive the source for that program."

It will be interesting to see which projects adopt this license and what its effects will be.  I can imagine that commercial companies would be very hesitant to use AGPL code.

 Thursday, November 15, 2007

Mike Kelly On Vendor Hype

Mike Kelly wrote an excellent piece about dealing with product/service vendors.  Go read the full post:

I think I’m ready to close the dialog… 

It is a rant, but the points he makes are right on the mark. If you want to cut through the crap that vendors like to spew, Mike has some great points and observations to think about. He defines some of the common sales terminology and what it actually means. Vendors should take a close look at this.

I have been in the meetings he describes, where slicked-up sales guys try to hawk some product that will not only solve all of your problems, but will also cure world hunger and save baby seals. This is usually followed by some fancy PowerPoint and the same talking points that you have already read in their website's FAQ. Half of the time it takes everything I have to not stand up and just yell: "just give me the damn white paper and tell me the price already!".

My favorite observations he makes about terminology:

“Additional value add”:

"It’s a superfluous phrase. I’m tempted to ask, “What services or products do you provide that don’t add value? I just want to know so I can be sure I’m not paying for any of those.” I hope all your services add value. If they don’t, don’t offer them. When you use terms like “additional value add,” in my mind you become the used car salesman of your industry."

“Opening a dialog”:

"The last thing I want to do is “open a dialog” about my problems. If I put a specific problem in front of you, I’m interested in specific solutions. Put the person in front of me who can help me understand what we need to do, and what you can do to help. If you’re not that person, I’ll find another vendor who is. I don’t have time to dialog, I have problems."

Nice post Mike!

 Wednesday, November 14, 2007

Regex Capture Groups In Python and Perl

I am a Python programmer and ex-Perl hacker.

Regular Expressions are possibly the quintessential feature of Perl and are directly part of the language syntax.

Rather than being part of the syntax, Python's Regular Expressions are available via the 're' module. For some reason, I had some trouble figuring out matching groups when I first started using them.

Here are examples of extracting capture groups in both Perl and Python.

Let's say we have a string containing a date: '11/14/2007', and we want to capture only the year from this string.

A regex to match this format might be something like this:

[0-9]{2}/[0-9]{2}/[0-9]{4}

We can then put parentheses around the piece we want to extract (the 4-digit year) to denote a capture group.

So now our regex would look like this:

[0-9]{2}/[0-9]{2}/([0-9]{4})


Perl Example:

$foo = '11/14/2007';

if ($foo =~ m^[0-9]{2}/[0-9]{2}/([0-9]{4})^) {
    print $1;
}

output:

2007

* Note the string we captured ended up in the special variable $1


Python Example:

import re

foo = '11/14/2007'

match = re.search(r'[0-9]{2}/[0-9]{2}/([0-9]{4})', foo)
if match:
    print(match.group(1))

output:

2007

* Note the string we captured ended up in a match object, which can be accessed with the 'group()' method.
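
The same idea extends to several groups at once. Here is a small sketch, not from the original example, that captures month, day, and year using Python's named groups. The names DATE_RE and parse_date are made up for illustration.

```python
import re

# Same date pattern as above, with one named capture group per field
DATE_RE = re.compile(r'(?P<month>[0-9]{2})/(?P<day>[0-9]{2})/(?P<year>[0-9]{4})')


def parse_date(text):
    """Return (month, day, year) as strings, or None if no date is found."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    # group() accepts either a group number or a group name
    return match.group('month'), match.group('day'), match.group('year')
```

Named groups are still numbered left to right, so match.group(3) and match.group('year') would return the same string here.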

 Tuesday, November 13, 2007

Lintel (Linux/Intel) Dominates Supercomputers

Pretty interesting...

via BetaNews article:

"Twice each year, the rankings of 500 of the world's supercomputers are assessed by the University of Mannheim in association with Berkeley National Laboratory and the University of Tennessee, Knoxville. Their figures are then sorted by tested clusters' maximal observed peak performance, in gigaflops."
"Intel-based processors walked away with one, if not two, lions' shares worth of the Top 500 list, with a staggering 354 total systems."
"460 of the Top 500 systems were running one flavor of Linux or another, including all of the Top 10."
 Thursday, November 08, 2007

A Quick Guide To GPLv3

The FSF just posted this:

A Quick Guide to GPLv3

A very nice high-level overview of the current GPL and what it means.

 Wednesday, November 07, 2007

Python - Processing Large Text Files One Line At A Time

I want to process some very large text files one line at a time.  Normally when I process text files, I slurp them into a list using the readlines() method.   However, sometimes the files are huge and it isn't feasible or optimal to read the entire content into memory upfront.   In this case, it makes sense to process them one line at a time.

The best solution I can come up with is this:


fh = open('foo.txt', 'r')
line = fh.readline()
while line:
    # do something here
    line = fh.readline()

It doesn't feel very pythonic/idiomatic.  Anyone have a better solution?


Update
Thanks to the comments below, I found a few different ways to do it. The best and most Pythonic way seems to be this:


for line in open('foo.txt', 'r'):
    pass  # do something with line here

Python file objects support the iterator protocol, so you can just open the file and loop over it.   This does the same thing as the while loop calling readline(), reading one line at a time rather than the whole file, but it is more compact.
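
For completeness, here is a minimal sketch of the same pattern wrapped in a function, assuming a Python version with the with statement (2.6 or later), so the file is closed even if processing fails. The function name count_matching_lines and the needle parameter are hypothetical; they just give the loop something to do.

```python
def count_matching_lines(path, needle):
    """Count the lines in the file at path that contain the substring needle."""
    count = 0
    with open(path, 'r') as fh:
        # iterating over the file object yields one line at a time,
        # so the whole file is never held in memory at once
        for line in fh:
            if needle in line:
                count += 1
    return count
```

Something like count_matching_lines('foo.txt', 'ERROR') would then scan a large file without slurping it into a list first.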


Done With Bloglines (So Long, And Thanks For All The Fish)

I was a Bloglines user for several years.  I liked the old-school frame interface and it generally met all of my needs to keep up to date with the hundreds of feeds I read regularly.

Recently, Bloglines released a Beta of their new feed reader.  It uses lots of AJAX and is significantly different than the classic version.  Since this will entail learning a new web-based feed reader, I thought it would be a good time to check out Google Reader.  So I exported my OPML and gave it a try.

First impressions are very good. I like the interface and keyboard shortcuts a lot. So.. now I'm hooked and it is the feed reader I will use going forward.

The only gripe I have is that Google Reader doesn't display the Favicon (little graphic icon) next to each feed.  All you get is a generic blue icon.  What's up with that?

 Tuesday, November 06, 2007

Extreme Linux Performance Monitoring And Tuning

I just came across a great site with lots of papers related to performance monitoring and tuning for Linux:

http://www.ufsdump.org

One paper I especially liked:

Extreme Linux Performance Monitoring And Tuning

"The purpose of this document is to describe how to monitor Linux operating systems for performance.  This paper examines how to interpret common Linux performance tool output.  After collecting this output, the paper describes how to make conclusions about performance bottlenecks."

Lots of great info!

 Friday, November 02, 2007

Is Wal-Mart's $200 Linux-based PC "Unacceptably Low End"?

Wal-Mart unveiled its $200 Linux-based PC.

from the Wired blog:

"It has a 1.5 Ghz VIA C7 CPU embedded in a Mini-ITX motherboard, 512MB of RAM and an 80GB hard drive. Normally, this would simply mark it as unacceptably low-end for use with modern software."

I'm not so sure about "unacceptably low-end".  The specs on this PC are substantially better than my home machine.  I have a box at home that I primarily use for web surfing.  It was an old castaway Windows NT machine from an old job.  I run Ubuntu (with Gnome) on it, and it works like a charm.  It's a 933MHz P3 with 256MB of RAM and a dog-slow hard drive.

So.. with superior specs, I think the Wal-Mart machine would be a great PC for basic home use.

.. though the "VIA C7" chip scares me a bit.  Any idea how it stacks up against a similarly spec'ed Intel or AMD?
