goldb.org home

AS OF MAY 2008, THIS BLOG IS NO LONGER BEING UPDATED.
Visit the new blog at: http://coreygoldberg.blogspot.com



 Monday, November 26, 2007

wxPython - Hello World!

Here is a simple example for those getting started with Python GUI Programming, wxWidgets, and the wxPython Bindings.

This small program will display a Frame and the static text "Hello World!", positioned with a BoxSizer.

Output looks like this:



#!/usr/bin/env python

import wx

class Application(wx.Frame):
    def __init__(self, parent):
        wx.Frame.__init__(self, parent, -1, 'My GUI', size=(300, 200))
        panel = wx.Panel(self)
        sizer = wx.BoxSizer(wx.VERTICAL)
        panel.SetSizer(sizer)
        txt = wx.StaticText(panel, -1, 'Hello World!')
        sizer.Add(txt, 0, wx.TOP|wx.LEFT, 20)
        self.Centre()
        self.Show(True)

app = wx.App(0)
Application(None)
app.MainLoop()
 Monday, November 19, 2007

FSF Releases GNU Affero General Public License

The Free Software Foundation just released the final version of the GNU Affero General Public License (GNU AGPL).  This license covers software that is hosted on a computer network (SaaS - Software as a Service).  The regular GNU GPL only covers software distribution, so you are able to run modified GPL code on a network server without releasing your modified source code.  The GNU AGPL closes this loophole and ensures source code for hosted software is made available.

from FSF:

"The Free Software Foundation (FSF) today published the GNU Affero General Public License version 3 (GNU AGPLv3). This is a new license; it is based on version 3 of the GNU General Public License (GNU GPLv3), but has an additional term to allow users who interact with the licensed software over a network to receive the source for that program."

It will be interesting to see which projects adopt this license and what its effects will be.  I can imagine that commercial companies would be very hesitant to use AGPL code.

 Thursday, November 15, 2007

Mike Kelly On Vendor Hype

Mike Kelly wrote an excellent piece about dealing with product/service vendors.  Go read the full post:

I think I’m ready to close the dialog… 

It is a rant, but the points he makes are right on the mark. If you want to cut through the crap that vendors like to spew, Mike has some great points and observations to think about. He defines some of the common sales terminology and explains what the terms actually mean. Vendors should take a close look at this.

I have been in the meetings he described, where slicked-up sales guys try to hawk some product that will not only solve all of your problems, but will also cure world hunger and save baby seals. This is usually followed by some fancy PowerPoint and the same talking points that you have already read in their website's FAQ. Half of the time it takes everything I have to not stand up and just yell: "just give me the damn white paper and tell me the price already!".

My favorite observations he makes about terminology:

“Additional value add”:

"It’s a superfluous phrase. I’m tempted to ask, “What services or products do you provide that don’t add value? I just want to know so I can be sure I’m not paying for any of those.” I hope all your services add value. If they don’t, don’t offer them. When you use terms like “additional value add,” in my mind you become the used car salesman of your industry."

“Opening a dialog”:

"The last thing I want to do is “open a dialog” about my problems. If I put a specific problem in front of you, I’m interested in specific solutions. Put the person in front of me who can help me understand what we need to do, and what you can do to help. If you’re not that person, I’ll find another vendor who is. I don’t have time to dialog, I have problems."

Nice post Mike!

 Wednesday, November 14, 2007

Regex Capture Groups In Python and Perl

I am a Python programmer and ex-Perl hacker.

Regular Expressions are possibly the quintessential feature of Perl and are directly part of the language syntax.

Rather than being part of the syntax, Python's regular expressions are available via the 're' module. For some reason, I had some trouble figuring out matching groups when I first started using Python's regular expressions.

Here are examples of extracting capture groups in both Perl and Python.

Let's say we have a string containing a date: '11/14/2007', and we want to capture only the year from this string.

A regex to match this format might be something like this:

[0-9]{2}/[0-9]{2}/[0-9]{4}

We can then put parentheses around the piece we want to extract (the 4-digit year) to denote a capture group.

So now our regex would look like this:

[0-9]{2}/[0-9]{2}/([0-9]{4})


Perl Example:

$foo = '11/14/2007';

if ($foo =~ m^[0-9]{2}/[0-9]{2}/([0-9]{4})^) {
    print $1;
}

output:

2007

* Note the string we captured ended up in the special variable $1


Python Example:

import re

foo = '11/14/2007'

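# a raw string keeps the pattern safe if backslashes are added later
match = re.search(r'[0-9]{2}/[0-9]{2}/([0-9]{4})', foo)
if match:
    print(match.group(1))  # group(1) is the first capture group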

output:

2007

* Note the string we captured ended up in a match object, which can be accessed with the 'group()' method.
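
As a small extension of the Python example above, here is a sketch that captures all three parts of the date. The variable names are only illustrative. It uses `groups()` to get every capture group as a tuple, and the `(?P<name>...)` syntax to give groups names instead of numbers.

```python
import re

foo = '11/14/2007'

# Three capture groups, numbered left to right starting at 1.
match = re.search(r'([0-9]{2})/([0-9]{2})/([0-9]{4})', foo)
if match:
    month, day, year = match.groups()
    print(year)

# The same pattern with named groups, looked up by name.
named = re.search(
    r'(?P<month>[0-9]{2})/(?P<day>[0-9]{2})/(?P<year>[0-9]{4})', foo)
if named:
    print(named.group('year'))
```

Named groups can make longer patterns easier to read, since the code no longer depends on counting parentheses.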

 Tuesday, November 13, 2007

Lintel (Linux/Intel) Dominates Supercomputers

Pretty interesting...

via BetaNews article:

"Twice each year, the rankings of 500 of the world's supercomputers are assessed by the University of Mannheim in association with Berkeley National Laboratory and the University of Tennessee, Knoxville. Their figures are then sorted by tested clusters' maximal observed peak performance, in gigaflops."
"Intel-based processors walked away with one, if not two, lions' shares worth of the Top 500 list, with a staggering 354 total systems."
"460 of the Top 500 systems were running one flavor of Linux or another, including all of the Top 10."
 Thursday, November 08, 2007

A Quick Guide To GPLv3

The FSF just posted this:

A Quick Guide to GPLv3

A very nice high level overview of the current GPL and what it means.

 Wednesday, November 07, 2007

Python - Processing Large Text Files One Line At A Time

I want to process some very large text files one line at a time.  Normally when I process text files, I slurp them into a list using the readlines() method.   However, sometimes the files are huge and it isn't feasible or optimal to read the entire content into memory upfront.   In this case, it makes sense to process them one line at a time.

The best solution I can come up with is this:


fh = open('foo.txt', 'r')
line = fh.readline()
while line:
    # do something here
    line = fh.readline()

It doesn't feel very pythonic/idiomatic.  Anyone have a better solution?


Update
Thanks to the comments below, I found a few different ways to do it. The best and most Pythonic way seems to be this:


for line in open('foo.txt', 'r'):
    pass  # do something here

Python file objects support the iterator protocol, so you can just open it and go.   This is the same as using a while loop and calling readline() but more compact.
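
One variation worth considering is a sketch that wraps the same iteration in a `with` block, so the file is closed as soon as the loop finishes. This assumes Python 2.6 or later. The function name `count_nonblank_lines` and the temporary demo file are made up for illustration.

```python
import os
import tempfile


def count_nonblank_lines(path):
    """Count non-blank lines, holding only one line in memory at a time."""
    count = 0
    with open(path, 'r') as fh:
        for line in fh:  # the file object yields one line per iteration
            if line.strip():
                count += 1
    return count


# Build a small throwaway file so the sketch can run on its own.
fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as fh:
    fh.write('first\n\nsecond\nthird\n')

demo_count = count_nonblank_lines(demo_path)
print(demo_count)
os.remove(demo_path)
```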


Done With Bloglines (So Long, And Thanks For All The Fish)

I was a Bloglines user for several years.  I liked the old-school frame interface and it generally met all of my needs to keep up to date with the hundreds of feeds I read regularly.

Recently, Bloglines released a Beta of their new feed reader.  It uses lots of AJAX and is significantly different than the classic version.  Since this will entail learning a new web-based feed reader, I thought it would be a good time to check out Google Reader.  So I exported my OPML and gave it a try.

First impressions are very good. I like the interface and keyboard shortcuts a lot. So.. now I'm hooked and it is the feed reader I will use going forward.

The only gripe I have is that Google Reader doesn't display the Favicon (little graphic icon) next to each feed.  All you get is a generic blue icon.  What's up with that?

 Tuesday, November 06, 2007

Extreme Linux Performance Monitoring And Tuning

I just came across a great site with lots of papers related to performance monitoring and tuning for Linux:

http://www.ufsdump.org

One paper I especially liked:

Extreme Linux Performance Monitoring And Tuning

"The purpose of this document is to describe how to monitor Linux operating systems for performance.  This paper examines how to interpret common Linux performance tool output.  After collecting this output, the paper describes how to make conclusions about performance bottlenecks."

Lots of great info!

 Friday, November 02, 2007

Is Wal-Mart's $200 Linux-based PC "Unacceptably Low End"?

Wal-Mart unveiled its $200 Linux-based PC.

from the Wired blog:

"It has a 1.5 Ghz VIA C7 CPU embedded in a Mini-ITX motherboard, 512MB of RAM and an 80GB hard drive. Normally, this would simply mark it as unacceptably low-end for use with modern software."

I'm not so sure about "unacceptably low-end".  The specs on this PC are substantially better than my home machine.  I have a box at home that I primarily use for web surfing.  It was an old castaway Windows NT machine from an old job.  I run Ubuntu (with Gnome) on it, and it works like a charm.  It's a 933MHz P3 with 256MB of RAM and a dog slow hard drive.

So.. with superior specs, I think the Wal-Mart machine would be a great PC for basic home use.

.. though the "VIA C7" chip scares me a bit.  Any idea how it stacks up against a similarly spec'ed Intel or AMD?

 Wednesday, October 31, 2007

Which Version Of Python Ships With Mac OS X Leopard?

I am not a Mac user, but in case anyone is interested in knowing which version of Python ships with OS X Leopard, the answer is Python 2.5.


Learn The Ideals And History Of Free And Open Source Software

There are lots of resources available online to learn about Free and Open Source Software.

If you want to understand the essence and ideals of this movement, a great start would be to read the following 4 books. After reading these, you will have a good grasp of the history and philosophy of freedom in the technology world.

 Wednesday, October 24, 2007

Python - List Comprehensions Leak Variables

One thing to remember when using List Comprehensions is that they "leak" their temporary iteration variable to the outside.

What does that mean?

In the following example, we still have access to 'x' after we run the list comprehension.

foo = ['a', 'b', 'c']
my_list = [x for x in foo]
print x

output:
>> c

This behaviour is different from how a Generator Expression works. We could have written the List Comprehension using a Generator Expression like this:

my_list = list(x for x in foo)

Now, the temporary variable we used is not accessible from outside the scope of the expression.

foo = ['a', 'b', 'c']
my_list = list(x for x in foo)
print x

output:
>> NameError: name 'x' is not defined

Note: This is fixed in Python 3000 (Python 3), where a list comprehension gets its own scope, just like a generator expression.
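
Here is a small sketch to check which behaviour a given interpreter has. The names `letters`, `item` and `leaked` are only illustrative. It runs a list comprehension, then tries to reference the loop variable afterwards.

```python
import sys

letters = ['a', 'b', 'c']
copied = [item for item in letters]

# Referencing the loop variable after the comprehension either works
# (Python 2, where it leaks) or raises NameError (Python 3).
try:
    item
    leaked = True
except NameError:
    leaked = False

print('Python %d.%d leaks the loop variable: %s'
      % (sys.version_info[0], sys.version_info[1], leaked))
```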

 Monday, October 22, 2007

OpenSTA 1.4.4 Release (Open Source HTTP Performance Test Tool)

The OpenSTA team has announced the release of version 1.4.4.

OpenSTA is a distributed software testing architecture designed around CORBA.  The applications that make up the current OpenSTA toolset were designed to be used by performance testing practitioners for web load testing.

Info:
http://portal.opensta.org/index.php?name=News&file=article&sid=51

Download:
http://opensta.org/download.html

Congrats and thanks to Bernie Velivis, Daniel Sutcliffe, and Jerome Delemarche for making this release possible.




 Thursday, October 18, 2007

Charts And Graphs - Modern Solutions

To all the chart/graph/plot/visualization weenies out there...
Here is a great overview of some modern charting and graphing technologies.

Some options I will be exploring:

 Sunday, October 14, 2007

Python - Simple Multithreaded HTTP Load Generator/Timer

This is a module for generating concurrent requests to an HTTP server.  Each thread makes HTTP GET requests to a single URL at the specified interval.  Threads are added over a given rampup time if you want to generate increasing load.  Response times are printed to STDOUT.  Can be used for cursory performance benchmarking or load testing a web resource.

load_generator.py module

sample usage:


#!/usr/bin/env python

from load_generator import LoadManager

lm = LoadManager()
lm.msg = ('www.example.com', '/')
lm.start(threads=5, interval=2, rampup=2)
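
To give a feel for the approach described above, here is a rough sketch of the general technique. It is not the actual load_generator.py source, and the names `http_get` and `run_load` (and the `iterations` parameter) are made up for illustration. Worker threads are started gradually over the rampup period, and each one times its requests and sleeps for the interval between them.

```python
import threading
import time

try:
    from http.client import HTTPConnection  # Python 3
except ImportError:
    from httplib import HTTPConnection  # Python 2


def http_get(host, path):
    """Issue one HTTP GET request and read the full response body."""
    conn = HTTPConnection(host, timeout=10)
    try:
        conn.request('GET', path)
        conn.getresponse().read()
    finally:
        conn.close()


def run_load(request_func, threads=5, interval=2, rampup=2, iterations=3):
    """Start `threads` workers spread evenly over `rampup` seconds.

    Each worker calls request_func() `iterations` times, sleeping
    `interval` seconds between calls. Returns the elapsed times in seconds.
    """
    timings = []
    lock = threading.Lock()

    def worker():
        for _ in range(iterations):
            start = time.time()
            request_func()
            elapsed = time.time() - start
            with lock:  # guard the shared list across threads
                timings.append(elapsed)
            time.sleep(interval)

    workers = []
    spacing = float(rampup) / threads
    for _ in range(threads):
        t = threading.Thread(target=worker)
        t.start()
        workers.append(t)
        time.sleep(spacing)  # stagger thread starts to ramp up the load
    for t in workers:
        t.join()
    return timings


# Example call (makes real network requests, so it is left commented out):
# for t in run_load(lambda: http_get('www.example.com', '/')):
#     print('%.3f secs' % t)
```

Passing the request in as a function keeps the timing logic separate from the HTTP details, which also makes it easy to try the harness with a dummy function first. Error handling is left out: a request that raises simply ends that worker thread.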
 Wednesday, October 10, 2007

Twelve Networking Truths - Good, Fast, Cheap: Pick Any Two

I love reading old RFCs.

One of my favorites is RFC 1925 - The Twelve Networking Truths:

The Fundamental Truths

  1. It Has To Work.
  2. No matter how hard you push and no matter what the priority, you can't increase the speed of light.
    (corollary) No matter how hard you try, you can't make a baby in much less than 9 months. Trying to speed this up *might* make it slower, but it won't make it happen any quicker.
  3. With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea. It is hard to be sure where they are going to land, and it could be dangerous sitting under them as they fly overhead.
  4. Some things in life can never be fully appreciated nor understood unless experienced firsthand. Some things in networking can never be fully understood by someone who neither builds commercial networking equipment nor runs an operational network.
  5. It is always possible to agglutinate multiple separate problems into a single complex interdependent solution. In most cases this is a bad idea.
  6. It is easier to move a problem around (for example, by moving the problem to a different part of the overall network architecture) than it is to solve it.
    (corollary) It is always possible to add another level of indirection.
  7. It is always something.
    (corollary) Good, Fast, Cheap: Pick any two (you can't have all three).
  8. It is more complicated than you think.
  9. For all resources, whatever it is, you need more.
    (corollary) Every networking problem always takes longer to solve than it seems like it should.
  10. One size never fits all.
  11. Every old idea will be proposed again with a different name and a different presentation, regardless of whether it works.
  12. In protocol design, perfection has been reached not when there is nothing left to add, but when there is nothing left to take away.
 Tuesday, October 09, 2007

Forrester Report - Evaluating Functional Testing Solutions

I was tipped off about this report in a recent forum discussion.

Report from Forrester Research:
    Evaluating Functional Testing Solutions (Powerpoint)

It is a nice overview of automated test execution and popular functional testing tools. It also gives an overview of the top tool vendors.


Google StockQuote Gadget - Back By Popular Demand!

A while back, I created a Google Gadget for displaying stock quotes and daily charts.  It was very popular and I was getting more than 12,000 page views per day.  I was scraping data from Google Finance and then received a takedown notice from Google.  So.. I took down the service that I was running.

Well... it seems the gadget was really popular and I have received lots of emails asking to revive the service.

And now... it's baaaack (using data from Yahoo)

- Add my gadget to your Google Personalized Homepage
- Add my gadget to your own web page


Have at it!

 Monday, October 08, 2007

Asleep On The Job

This is embarrassing.

After a long day of hacking, I fell asleep at my desk.  This has *never* happened to me at work.

Of course, somebody got a picture of it on their camera phone:

 Wednesday, September 26, 2007

Python - Tk Graph Example

I found a snippet to draw bar graphs in Python using Tk:
http://www.daniweb.com/code/snippet583.html

The output looks like this:


Here is a modified version that creates a bar graph in a Tk panel:

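try:
    import Tkinter as tk  # Python 2
except ImportError:
    import tkinter as tk  # Python 3

def graph_points(seq, width=375, height=325):
    root = tk.Tk()
    c = tk.Canvas(root, width=width, height=height, bg='white')
    c.pack()
    y_stretch = 15  # pixels of bar height per data unit
    y_gap = 20      # margin between the bars and the canvas bottom
    x_stretch = 10  # horizontal space between bars
    x_width = 20    # width of each bar
    x_gap = 20      # left margin
    for x, y in enumerate(seq):
        # top-left (x0, y0) and bottom-right (x1, y1) corners of the bar
        x0 = x * x_stretch + x * x_width + x_gap
        y0 = height - (y * y_stretch + y_gap)
        x1 = x * x_stretch + x * x_width + x_width + x_gap
        y1 = height - y_gap
        c.create_rectangle(x0, y0, x1, y1, fill="red")
        # label each bar with its value, just above the top-left corner
        c.create_text(x0+2, y0, anchor=tk.SW, text=str(y))
    root.mainloop()

data = (18, 15, 10, 7, 5, 4, 2, 5, 8, 10, 13)
graph_points(data)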
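
The bar geometry in graph_points can be checked without opening a window. Here is a small sketch that pulls the same arithmetic into a helper. The name `bar_coords` is made up for illustration, and the defaults mirror the constants used above.

```python
def bar_coords(x, y, height, y_stretch=15, y_gap=20,
               x_stretch=10, x_width=20, x_gap=20):
    """Return (x0, y0, x1, y1) for the bar at index x with value y.

    Canvas y coordinates grow downward, so bar heights are
    subtracted from the canvas height.
    """
    x0 = x * x_stretch + x * x_width + x_gap
    y0 = height - (y * y_stretch + y_gap)
    x1 = x0 + x_width
    y1 = height - y_gap
    return x0, y0, x1, y1


print(bar_coords(0, 18, 325))  # first bar of the sample data
print(bar_coords(1, 15, 325))  # second bar
```

With the sample data and a canvas height of 325, the first bar spans (20, 35) to (40, 305) and the second spans (50, 80) to (70, 305).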
 Monday, September 24, 2007

Give One Get One - I Want My OLPC XO

OK.. looks like I need to plunk down $400 to get an XO.  When you buy one, they will give one away to a child in a developing nation.  Do some good, and get my own OLPC?  Too good to pass up.

For 2 weeks only, starting Nov. 12:  www.xogiving.org

 Monday, September 17, 2007

Old-School Pair Programming And My Inclination To Become A Tester

My first programming course was as an undergrad freshman in 1993. It was the basic introductory programming class for CS majors. The course was pretty difficult and was a great filter to separate the real CS students from the wannabes. About one-third of the students dropped the class, and out of the remaining two-thirds, many changed majors after this course was complete.

The course was taught using Scheme, with SICP as the textbook. We programmed on a VAX cluster with Ultrix (DEC's Unix flavor) as the Operating System. We had to learn the Unix shells, vi, and all sorts of fun stuff to get us up and running.

The computer lab ("the cluster") was a large sterile room with rows of green-screen dumb terminals. I remember our professor told us that the VAX had 128 MB of memory and I was blown away by how huge that was (my rippin' fast PC had 4 MB at the time).

Spending hours in the computer lab was no fun at all. But I was one of the lucky ones. I owned a brand new 486-DX33 PC running Windows 3.1. I had a blazing fast 14.4 kbps Zoom modem and could use Procomm Plus to dial into the VAX and program from the comfort of my own dorm room. I also found a Scheme interpreter that ran on DOS, giving me further options to do my work offline.

The programming assignments were brutal. All-nighters were the norm. Collaboration on the assignments was encouraged, but we were all expected to turn in our own original work. I found a fellow student that I got along with well and we decided to work together (unfortunately, I don't even remember his name... all I remember is that he was a lot smarter than me).

So the basic workflow was that we would get together, work out the basics of the assignment, get most of the algorithms working, then each take the code and finish it on our own. Since I had the bad-ass PC, we would work in my room. Two things quickly became apparent: He was a much better programmer than me, but I had a better eye for subtle details and debugging. Eventually we settled into a pattern where he would do the programming and I would look over his shoulder to give advice and input. Every few minutes, he would shoot a copy of the code to my dot-matrix printer. I would grab it, go through it line by line, and mark errors with my red pen and hand-write parts of the code that weren't correct. I would then hand the printout back to him and let him enter the changes. We iterated like this until we had all of the core code working.

For some reason, that instinct for attention to detail and debugging has always stuck with me. Because of that, my career has always been influenced by testing. I am a developer/tester, rather than just a developer. Most of the impact I have had in all of my jobs is from creating test tools and bringing in new ways to test software.

Just an interesting observation. I wonder how many others were naturally drawn to testing as soon as they started writing code?


Python - Yahoo Stock Quote Module

Last week I wrote a small Python module for retrieving stock prices.

It used screen scraping to get data from Google Finance.  Yahoo offers stock data in a much more digestible form, which allowed me to get values without screen scraping and regular expressions.  So, I wrote a module based around this.

This new module, ystockquote, is much more comprehensive and exposes a Python API for retrieving all sorts of stock data from Yahoo Finance.  It contains the following functions:

  • get_all(symbol)
  • get_price(symbol)
  • get_change(symbol)
  • get_volume(symbol)
  • get_avg_daily_volume(symbol)
  • get_stock_exchange(symbol)
  • get_market_cap(symbol)
  • get_book_value(symbol)
  • get_ebitda(symbol)
  • get_dividend_per_share(symbol)
  • get_dividend_yield(symbol)
  • get_earnings_per_share(symbol)
  • get_52_week_high(symbol)
  • get_52_week_low(symbol)
  • get_50day_moving_avg(symbol)
  • get_200day_moving_avg(symbol)
  • get_price_earnings_ratio(symbol)
  • get_price_earnings_growth_ratio(symbol)
  • get_price_sales_ratio(symbol)
  • get_price_book_ratio(symbol)
  • get_short_ratio(symbol)

Sample Usage:


>>> import ystockquote
>>> print ystockquote.get_price('GOOG')
529.46
>>> print ystockquote.get_all('MSFT')
{'stock_exchange': '"NasdaqNM"', 'market_cap': '268.6B', 
'200day_moving_avg': '29.2879', '52_week_high': '31.84', 
'price_earnings_growth_ratio': '1.45', 'price_sales_ratio': '5.33',
'price': '28.65', 'earnings_per_share': '1.423', 
'50day_moving_avg': '28.7981', 'avg_daily_volume': '55579700',
'volume': '25330856', '52_week_low': '26.48', 'short_ratio': '1.60', 
'price_earnings_ratio': '28.65', 'dividend_yield': '1.38', 
'dividend_per_share': '0.40', 'price_book_ratio': '8.76', 
'ebitda': '20.441B', 'change': '-0.39', 'book_value': '3.315'}

The module is available here:  http://www.goldb.org/ystockquote.html
