
AS OF MAY 2008, THIS BLOG IS NO LONGER BEING UPDATED.
Visit the new blog at: http://coreygoldberg.blogspot.com



 Tuesday, May 15, 2007

Litigate vs. Innovate: Free Advice for the Litigious

Jonathan Schwartz (CEO of Sun Microsystems) posted an excellent article describing Sun's stark choice of how to re-invent itself.  They stepped towards Free software and embraced Open Source.  Microsoft is taking a much different stance.  They are asserting patent claims over many pieces of the GNU/Linux system.

Jonathan gives some great advice in his Free Advice for the Litigious:

"No amount of fear can stop the rise of free media, or free software (they are the same, after all). The community is vastly more innovative and powerful than a single company. And you will never turn back the clock on elementary school students and developing economies and aid agencies and fledgling universities - or the Fortune 500 - that have found value in the wisdom of the open source community. Open standards and open source software are literally changing the face of the planet - creating opportunity wherever the network can reach."

Can you hear us *now*?

#    Comments [0] |
 Friday, May 11, 2007

Mnesia - Scalable Data Persistence in Erlang

SlideAway - There is a world outside of Ruby on Rails:

"Who needs Oracle/Mysql when you have Mnesia, a free, distributed, in memory database ? The ability to store native Erlang structures out of the box is so liberating: suddenly the need for your object-database mapping layer almost vanishes (well, not 100% to be fairly honest, but a big chunk of it: no need to create a 1-to-n relationship or a n-to-n relationship and a mapping table in many simple cases)

Not to mention that Mnesia supports table replication and is fully distributed, with the ability to add new 'nodes' on the fly. All of this out of the box ! (did I mention it was free too ?) This makes scaling up almost a joke. Compare this to the usual nightmares (and cost) of trying to implement a distributed Mysql/Oracle."


Awesome.

#    Comments [0] |

Mike Shaver on New RIA Tools vs. Web Standards

Via The high cost of some free tools (Mike Shaver):

"If you choose a platform that needs tools, if you give up the viral soft collaboration of View Source and copy-and-paste mashups and being able to jam jQuery in the hole that used to have Prototype in it, you lose what gave the web its distributed evolution and incrementalism. You lose what made the web great, and what made the web win. If someone tells you that their platform is the web, only better, there is a very easy test that you can use:

When the tool spits out some bundle of shining Deployment-Ready Code Artifact, do you get something that can be mashed up, styled, scripted, indexed by search engines, read aloud by screen readers, read by humans, customized with greasemonkey, reformatted for mobile devices, machine-translated, excerpted, transcluded, edited live with tools like Firebug? Or do you get a chunk of dead code with some scripted frills about the edges, frozen in time and space, until you need to update it later and have to figure out how to get the same tool setup you had before, and hope that the platform is still getting security and feature updates? (I’m talking to you, pre-VB.NET Visual Basic developers.)"

All hail "View Source".

#    Comments [0] |
 Thursday, May 10, 2007

Sticky ToolLook - Tools and System Performance with Corey Goldberg

I was recently interviewed by Joseph McAllister for his Sticky ToolLook newsletter.  Sticky ToolLook is an extension of StickyMinds.com and Better Software magazine.  I mostly talk about performance testing and tools.

The article can be found here: http://www.stickyminds.com/stickytoollook/index.asp?cd=5/10/2007


Transcript:


A Word with the Wise:
Tools and System Performance with Corey Goldberg
by Joseph McAllister

Corey Goldberg is a Boston-based software engineer who focuses on performance engineering and tool development. He also contributes to open source projects and has developed some of his own, such as WebInject. I spoke with him earlier this year about his passion for the craft of software tools.

Joseph McAllister:
What makes you passionate about performance test tools?

Corey Goldberg:
I am passionate about system performance, so tools are an integral part of that. Performance is an interesting and diverse space. It touches technology in so many ways, and the skill set it requires allows me to be close to several different technical areas at once: development, testing, analysis, design, operations, etc.

The thinking becomes pervasive, though. For example, last night I was standing in line for movie tickets, and all I could think about was how they could improve the queuing system to get better sales throughput. I regularly have conversations with my colleagues about service performance and input/output contention at our local burrito joint.

JM:
Is there a particular type of performance tool that is more "fun" to use? Is there a type that tends to offer more results?

CG:
The various types of tools all work together to form your full tool set or test suite. They all have their own fun parts. Load generation can be complex, as it involves software development alongside workload modeling. But it is also the fun part where you get to slam load through a system and watch it react.

The most satisfying tools to build are analysis and monitoring tools, especially tools with real-time monitoring and graphing. This enables you to look inside your test runs or production system and actually see what is happening in real time. Complex data sets and metrics collected from deep within a system are transformed into informative graphs as things happen. That is pretty exciting to work with.

JM:
Is there a clear division between the commercial tools you've used and the tools you've written? Do you prefer one or the other?

CG:
Yes and no. First, for terminology, I like to think of things in terms of proprietary vs. free tools. Proprietary tools tend to force you into a certain pattern of use and often don't provide the flexibility to change or extend them in ways you might want.

Lately I have been building a lot of my own tools that work alongside some commercial tools. I recently developed a reporting and analysis suite that replaced the analytics in a commercial tool we were using.

Commercial tools also offer some rich features that are sometimes not feasible to re-create in a reasonable amount of time. So you have to remember that building your own tools is only worthwhile if it is cost effective.

JM:
Describe your open source test tool, WebInject.

CG:
WebInject is a test tool that I developed in Perl that is used for functional testing of Web applications/services and ad-hoc monitoring of HTTP response times. It can run as its own GUI application with real-time graphing capabilities or can be integrated as a plug-in with other tools.

I was doing this type of stuff in various scripts for years, so I packaged a lot of it together and made a more generic interface that can be used across a variety of projects. I thought others might be interested in it, so I set up a SourceForge project and released it in January 2004.

The basic concept is that you define test cases in XML files that are fed through WebInject and executed against your system under test. WebInject provides a basic harness/framework that includes HTTP transport, parsing, cookie handling, authentication, SSL, etc. It gives you real-time response timing as well as functional verification using regex-based content verification and HTTP status codes.

I spend all of my time on newer tools these days, but I still keep on top of WebInject enough to facilitate others' using it and posting patches/updates to it. Other test tools have progressed a lot in the past few years, so I am sure there are lots of new options for doing this type of testing. Oddly, WebInject has become somewhat entrenched in monitoring systems. Most of the users lately seem to be people running Nagios (an open source monitoring system) that need an intelligent Web plug-in/agent.

JM:
What is your favorite element of creating and distributing an open source software tool?

CG:
Community feedback feels really good. I like sharing and collaborating. The feedback is also tremendously useful in terms of discovering bugs and offering suggestions, advice, or even patches of working code. I care about my craft, and I realize the only way to advance is through open collaboration.

I am also pretty influenced by the free software movement and do some volunteer work with the GNU Project. I have some core beliefs about the ethics of software freedom. Creating and distributing my own GPL-licensed software is my own little way to help that cause.
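
To make the test-case idea from the interview a bit more concrete, here is a minimal sketch in Python (WebInject itself is written in Perl). The XML layout, the attribute names, and every function name below are invented for this illustration and are not WebInject's actual format or API. The HTTP call is replaced by a stand-in function so the example runs offline.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical test case layout, made up for this sketch.
TEST_CASES_XML = r"""
<testcases>
    <case id="1" url="http://example.com/login" expect_status="200" expect_regex="Welcome, \w+" />
    <case id="2" url="http://example.com/missing" expect_status="200" expect_regex="Welcome" />
</testcases>
"""


def load_cases(xml_text):
    """Parse the XML into a list of plain dicts, one per test case."""
    root = ET.fromstring(xml_text)
    return [dict(case.attrib) for case in root.findall('case')]


def run_case(case, fetch):
    """Run one case; fetch(url) must return (status_code, body_text)."""
    status, body = fetch(case['url'])
    status_ok = status == int(case['expect_status'])
    content_ok = re.search(case['expect_regex'], body) is not None
    return status_ok and content_ok


def fake_fetch(url):
    """Stand-in for a real HTTP client so the sketch runs without a network."""
    pages = {'http://example.com/login': (200, 'Welcome, corey')}
    return pages.get(url, (404, 'Not Found'))


for case in load_cases(TEST_CASES_XML):
    result = 'PASS' if run_case(case, fake_fetch) else 'FAIL'
    print(case['id'], result)
```

Swapping fake_fetch for a function that performs a real request and times it would give the response-time side of the picture as well.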

#    Comments [0] |
 Wednesday, May 09, 2007

PerfLog - Performance Analysis Tool for Web Server Logs (Python)

I wrote a small tool that I have found useful.  It is a Python script that parses and analyzes web log files (in W3C Extended Log File Format).  It creates an HTML report with data and PNG images showing graphs of things like: request throughput, error rates, HTTP method distribution, content type distribution, time-series, etc.

Many log parsing/analysis tools exist, but I was looking for something more specific to Performance than something a webmaster would want to look at.

The script is pretty basic. It was very useful for my own needs, but others might want to modify it.  If anyone has good suggestions to add to it, I am willing to enhance it at some point (or just grab my code and hack it yourself if you know Python).
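
For readers who have not worked with this log format, here is a rough sketch of the kind of parsing involved. It is not PerfLog's code: the function names and the sample log are made up, and counting any status of 400 or above as an error is an assumption of this example. The only format detail it relies on is the "#Fields:" directive, which names the space-separated columns that follow.

```python
from collections import Counter

SAMPLE_LOG = """\
#Version: 1.0
#Fields: date time cs-method cs-uri-stem sc-status
2007-05-09 10:00:01 GET /index.html 200
2007-05-09 10:00:01 GET /missing.html 404
2007-05-09 10:00:02 POST /login 200
2007-05-09 10:00:02 GET /report 500
"""


def parse_w3c_log(lines):
    """Yield one dict per request, keyed by the names in the #Fields directive."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith('#'):
            # Directive line; only #Fields matters for parsing.
            if line.startswith('#Fields:'):
                fields = line[len('#Fields:'):].split()
            continue
        yield dict(zip(fields, line.split()))


def summarize(entries):
    """Return (method counts, requests per second, error rate)."""
    methods = Counter()
    per_second = Counter()
    errors = 0
    total = 0
    for entry in entries:
        total += 1
        methods[entry['cs-method']] += 1
        per_second[entry['date'] + ' ' + entry['time']] += 1
        if int(entry['sc-status']) >= 400:
            errors += 1
    error_rate = float(errors) / total if total else 0.0
    return methods, per_second, error_rate


methods, per_second, error_rate = summarize(parse_w3c_log(SAMPLE_LOG.splitlines()))
print(dict(methods))
print(dict(per_second))
print(error_rate)
```

The per-second counter is the raw material for a throughput time-series; feeding it to a plotting library is a separate step.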


Project Home

Features

  • Produces metrics and graphs from web logs (W3C Extended Log File Format)
  • Useful during performance testing and analysis
  • Output is created in XHTML/CSS with embedded PNG images
  • PerfLog is written in Python and uses Matplotlib for graphs and plotting

License

Project Info

Requirements

  • Python 2.4+
  • Matplotlib (requires Numeric or Numpy)

Platforms

  • Cross-Platform.  PerfLog will run on any system that supports Python and Matplotlib.
#    Comments [1] |
 Thursday, May 03, 2007

Mark Pilgrim on Vendor-Specific Hype

Mark Pilgrim speaks the truth about this hype going on with the new announcements of proprietary/vendor specific web stacks and runtimes (Microsoft Silverlight, Adobe Apollo, etc).  Don't get fooled again!:

"Y’all have fun. Play with your vendor-specific runtimes. Don’t call me when you wake up one morning with a pink line in the round window and your BFF vendor won’t return your calls. If you need me (but of course you won’t), I’ll be holed up in my drab unpainted toolshed around the corner, quietly building applications on the web that works."

Love it.

#    Comments [0] |
 Monday, April 30, 2007

I Am LISP?

I just took the "Which Programming Language Are You?" quiz. Was hoping to be Python.

Apparently I am LISP?

You are Lisp.  Very few people like you (Probably because you use too many parenthesis (You better stop it (Reallly)))
Which Programming Language are You?

#    Comments [2] |
 Thursday, April 19, 2007

Linus Torvalds on Competition by Technical Merit

I saw this message from Linus on the LKML and I thought it was well stated.  I love the way Linus runs the crazy bazaar of Linux Kernel development.  He stays true to technical merit and essentially bases all of his decisions on it (though sometimes this is in conflict with the ethics of Free Software).

Linus Torvalds from the Linux Kernel Mailing List:

"One of the most motivating things there *is* in open source is "personal pride".

It's a really good thing, and it means that if somebody shows that your code is flawed in some way (by, for example, making a patch that people claim gets better behaviour or numbers), any *good* programmer that actually cares about his code will obviously suddenly be very motivated to out-do the out-doer!

Does this mean that there will be tension and rivalry? Hell yes. But that's kind of the point. Life is a game, and if you aren't in it to win, what the heck are you still doing here?

As long as it's reasonably civil (I'm not personally a huge believer in being too polite or "politically correct", so I think the "reasonably" is more important than the "civil" part!), and as long as the end result is judged on TECHNICAL MERIT, it's all good.

We don't want to play politics. But encouraging peoples competitive feelings? Oh, yes."
#    Comments [0] |
 Wednesday, April 18, 2007

Microsoft Silverlight - Flash Killer? Lose the Geeks, Lose the Battle

Microsoft has renamed "WPF/E" to "Silverlight":

"Silverlight is a cross-browser, cross-platform plug-in for delivering the next generation of media experiences and rich interactive applications (RIAs) for the Web."

It looks like Microsoft is pushing this technology aggressively to CDNs and content distributors:

"Early supporters of the new platform include Akamai, Brightcove, Eyeblaster, Limelight, Major League Baseball, Navisite, Netflix, Skinkers, Sonic Solutions, SyncCast, Tarari, Telestream, Winnov, and more."

Silverlight will work on Windows and Mac OSX.  OK.. so no Linux support?  I think if Microsoft hopes to supplant Flash, it truly needs to be cross platform (not just Windows and OSX).

from the Silverlight FAQ:

"Microsoft is gathering feedback from customers like you on Silverlight and to help determine which platforms should be supported in the future."

Better hop to it boys.. With the proliferation of GNU/Linux, pushing a presentation framework that doesn't run on it is a large oversight.

You need the geeks on board.. lose the geeks.. lose the battle.

#    Comments [2] |
 Wednesday, April 11, 2007

Radview WebLOAD goes Open Source!

OK, this is huge news: www.webload.org

The commercial performance/load test tool market is dominated by large proprietary commercial vendors (HP/Mercury, Borland/Segue, etc). Radview has a nice product called WebLOAD that competes in the space.

As of this morning, Radview announced they have released WebLOAD OS, an open source version of WebLOAD. It is full-on GPL licensed (no fake open source). I already browsed their source tree. They have a Subversion repository.. code is in C and C++.

The Open Source performance/load test tool market doesn't offer many choices. Currently the most popular tools are JMeter and OpenSTA.

This will be exciting. I wonder how well Radview will deal with the community on this. Though if it's not good, GNU GPL certainly allows forking :)

more to come...

#    Comments [3] |
 Tuesday, April 10, 2007

Python and IEC - Stupid-Simple Windows Browser Automation

I have been using IEC lately for automating repetitive administrative tasks within my company:

IEC.py - Automating Internet Explorer with Python

IEC is a simple library with a nice API for automating an IE browser. I found it simple to work with for basic automation needs. I have also used it as the core of a small UI testing framework.

From Mayukh Bose:

IEC is a python library designed to help you automate and control an Internet Explorer window. You can use this library to navigate to web pages, read the values of various HTML elements, set the values of checkboxes, text boxes, radio buttons etc., click on buttons and submit forms.

Yeah I know.. pretty lame it only works with IE, but in the environment I was working in, the applications ran on *IE Only*.


A personal story:

My company is very analytical and detail oriented when it comes to tracking/planning project resource allocation. We track all sorts of projections, budgets, resources, etc. The workflow is basically: some business guys (no idea what they actually do) take data from some reports and enter them into some arcane hosted tracking software. This is done by entering copious amounts of data into web form after web form. Then they submit the form to run a report. Once that is finished, they cut & paste the data into MS Excel. Then they take the Excel spreadsheet and follow some wild sequence of copying, cutting, pasting, converting, running macros, graphing, etc. At the end of this, a few images are produced so some whiz-bang graphs can go into a monthly PowerPoint... wow.

So... I wrote a Python script that takes their input data, drives a web browser to do the report, screen scrapes the result, processes it, generates some fancy graphs with Matplotlib, and presents a web page with the results.  End result: Converted a multi-hour manual process into the click of an icon and 20 seconds of processing.

I could have done this with HTTP directly, but this UI automation technique made it very quick to develop; and it looked impressive ("whoa it's like.. making my browser move on its own").


To use IEC, you need the Python for Windows Extensions. If you use the ActiveState Python distribution, these are already included.

I used to use ActiveState Python for Windows programming (because I was a big fan of ActiveState Perl, where the installer and PPM package manager rocked). I recently spent close to an hour trying to get SSL (HTTPS) to work with ActiveState.  I couldn't get it to work, so I ditched it for the standard Python distro.


--
Happy Hacking.

#    Comments [4] |
 Monday, April 09, 2007

Geo Location Mashup - Python, Yahoo Maps AJAX API

Mapping User Metro Concentration by IP Address

I just posted this: http://www.goldb.org/geo_maps

It is a tutorial/example showing how to create a geolocation mashup by generating HTML/JavaScript code from a Python script.  The resulting code is an HTML page with embedded JavaScript that you can open with your browser.  It works with the Yahoo Maps AJAX API to plot markers at specified locations.  I also explain how this technique can be used to create a [near] real-time map of user concentration based on IP addresses.
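
The general shape of the technique, a Python script that writes out an HTML page with its data embedded as JavaScript, can be sketched roughly as follows. This is not the code from the tutorial: the function names, the sample coordinates, and the page template are made up, the IP-to-metro lookup is assumed to have happened already, and the call into the mapping API is deliberately left as a comment rather than guessed at.

```python
import json
from collections import Counter

PAGE_TEMPLATE = """<html>
<head><title>User Concentration</title></head>
<body>
<div id="map"></div>
<script type="text/javascript">
// Marker data generated by the Python script.
var markers = %s;
// A real page would loop over `markers` here and hand each one
// to the mapping API of your choice.
</script>
</body>
</html>
"""


def build_markers(locations):
    """Collapse (metro, lat, lon) hits into one marker per metro with a count."""
    counts = Counter(locations)
    return [
        {'metro': metro, 'lat': lat, 'lon': lon, 'users': n}
        for (metro, lat, lon), n in sorted(counts.items())
    ]


def build_page(locations):
    """Return the HTML page as a string with the marker data embedded."""
    return PAGE_TEMPLATE % json.dumps(build_markers(locations))


# Hypothetical hits, one tuple per user whose IP was already resolved.
hits = [
    ('Boston', 42.36, -71.06),
    ('Boston', 42.36, -71.06),
    ('Chicago', 41.88, -87.63),
]
print(build_page(hits))
```

Regenerating the page on a schedule from fresh log data is one way to get the near real-time effect.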

... feedback welcome.


It generates cool AJAXy eye-candy like this:

and this:

Since I use the AJAX control, the rendered map has a zooming, panning, dynamic, tiled interface.  Pretty Slick.

#    Comments [1] |
 Sunday, April 08, 2007

Got Scalability? Compute/Storage Grids

It seems to be all about size and massive buildouts of compute and storage grids these days... both commercially (see Google, Microsoft, Amazon, Yahoo, Sun, IBM, HP, Oracle, etc) and in academia.  The interesting thing is that the technology used is good for both distributed and centralized (tight clusters of distributed nodes) computing. Processing and storage can be pushed to the edges, or gathered centrally... it's up to you... the mechanisms are there...  it's all converging.

The world is becoming a massive digital fabric.

I'm just fascinated by the scale of the data centers, operations, and services that are being deployed.

Article from NY Times last summer (June '06):

"The best guess is that Google now has more than 450,000 servers spread over at least 25 locations around the world. The company has major operations in Ireland, and a big computing center has recently been completed in Atlanta. Connecting these centers is a high-capacity fiber optic network that the company has assembled over the last few years.

Google has found that for search engines, every millisecond longer it takes to give users their results leads to lower satisfaction. So the speed of light ends up being a constraint, and the company wants to put significant processing power close to all of its users."

Wow. Now do we understand why our systems must scale?
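
A quick back-of-the-envelope calculation shows why the speed of light matters here. The sketch below assumes light travels through optical fiber at roughly 200,000 km/s (about two thirds of its speed in a vacuum) and ignores routing, queuing, and processing time, so the numbers are lower bounds. The distances are arbitrary examples, not figures from the article.

```python
SPEED_IN_FIBER_KM_PER_S = 200000.0  # rough figure, about 2/3 of c in a vacuum


def round_trip_ms(distance_km):
    """Minimum round-trip propagation delay over fiber, in milliseconds."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_PER_S * 1000


for distance_km in (100, 1000, 10000):
    print('%6d km -> %.1f ms' % (distance_km, round_trip_ms(distance_km)))
```

Under these assumptions a user 10,000 km from the data center pays about 100 ms per round trip before any work is done, which is the argument for putting processing power close to users.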


Related:
http://www.tbray.org/ongoing/When/200x/2006/05/24/On-Grids
http://www.globus.org/toolkit/
http://www.sun.com/service/grid/
http://www.amazon.com/gp/browse.html?node=16427261
http://www.amazon.com/gp/browse.html?node=201590011

#    Comments [0] |
 Monday, April 02, 2007

I Need Better Web Hosting

My website and blog were down most of today, after getting pounded with traffic from Reddit Programming.

The day started great... I already had 600 visitors today when I woke up for work at 7AM.  Then one of my posts started floating near the top of Reddit.  My server couldn't handle the traffic and soon fell over.  It didn't come back online until just now.

Granted, I am using ultra cheap shared hosting, so this shouldn't come as a huge surprise.  However, I am now looking for some hosting that is slightly more reliable.  Aside from the heavy traffic today, my site goes up and down intermittently all the time anyway.

Can anybody recommend some good cheap web hosting?  Basically I am looking for about 1 gig of storage and at least 3 gigs of transfer per month.  I understand that reliability and availability are something one must pay for (and are usually not what shared hosting delivers).  So.. I would give up some availability for a lower price, as long as availability and reliability were still decent.

I need both Windows (with ASP.NET 2.0) and Linux (with Python/Perl) hosting. These can be from a single provider, or with 2 different providers.  I have used lots of shared hosting services over the years and all of them generally suck.

.. any good hosting recommendations?

#    Comments [1] |
 Sunday, April 01, 2007

Massive Concurrency with PyPy Stackless

(via)
PyPy had its 1.0 release recently.

Now, this looks *really* interesting:

PyPy Stackless

PyPy can expose to its user language features similar to the ones present in Stackless Python: no recursion depth limit, and the ability to write code in a massively concurrent style. It actually exposes three different paradigms to choose from:
  • Tasklets and Channels
  • Greenlets
  • Plain Coroutines
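
To give a feel for the "massively concurrent style" mentioned above, here is a small sketch that uses ordinary Python generators as a stand-in for coroutines, driven by a round-robin scheduler. It does not use the PyPy Stackless API (no tasklets, channels, or greenlets), and the function names are made up for this example.

```python
from collections import deque


def worker(name, steps, log):
    """A tiny task that does `steps` units of work, yielding after each one."""
    for step in range(steps):
        log.append('%s step %d' % (name, step))
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    """Run generator-based tasks in turn until all of them have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished; drop it
        queue.append(task)


log = []
run_round_robin([worker('a', 2, log), worker('b', 3, log)])
print(log)
```

Each task is cheap to create and gives up control voluntarily, which is the property that lets this style scale to very large numbers of tasks.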
#    Comments [0] |

One Laptop Per Child - More Prototype Pics and Info

I posted some pics of the latest OLPC prototypes a few weeks ago.  Well... I got to see them 2 weeks in a row; so here are some more pics of the machine up close.

... Seems the whole "hand crank" idea is gone.  There is now a pull cord on the external power supply with a 10:1 ratio (1 minute of pulling = 10 mins of computing) for manually recharging power... The keyboard is tiny and soft feeling.  The screen is small but is very viewable in direct light without backlighting (which is probably the #1 power drain on laptops).

OLPC rocks!

Me geeking out:

Old school meets new school...
Gerald J. Sussman (yes, the MIT Scheme guy) playing with the latest OLPC prototype:

Closeups:


.. these machines run a scaled down version of Fedora Linux that is loaded with Python applications.

-Corey

#    Comments [3] |
 Saturday, March 31, 2007

Digital Ethnography

(I can't even tell you how many times I've watched this video since it came out a few months ago)

For posterity...

Professor Michael Wesch:

teaching the machine.
the machine is us.

we'll need to rethink a few things...
copyright
authorship
identity
ethics
aesthetics
rhetorics
governance
privacy
commerce
love
family
ourselves

- The Machine is Us/ing Us

#    Comments [0] |
 Thursday, March 29, 2007

Python - Remove Duplicate Items From a Sequence

Say you have a sequence like:

[1, 1, 2, 2, 2, 3, 4, 4, 4]

... and you want a sequence containing all the unique items (remove duplicates) like:

[1, 2, 3, 4]


Here is a function to do it:

def remove_dups(seq):
    x = {}
    for y in seq:
        x[y] = 1
    u = x.keys()
    return u


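or a one-liner (it leans on the hidden '_[1]' name that older CPython 2.x releases use internally while building a list comprehension, so treat it as a curiosity rather than portable code):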

u = [x for x in seq if x not in locals()['_[1]']]



update: in the comments below, some other ways were suggested..

with 'set'.. like this:

u = list(set(seq))

or with a dictionary.. like this:

u = dict.fromkeys(seq).keys()
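
One caveat: the set and dictionary versions do not promise to keep the items in their original order (on the Python 2.x this post was written against, dictionary key order is arbitrary). If order matters, a minimal sketch of an order-preserving variant looks like this. It assumes the items are hashable, and the function name is made up for this example.

```python
def remove_dups_ordered(seq):
    """Return a list of the unique items in seq, keeping first-seen order."""
    seen = set()
    result = []
    for item in seq:
        if item not in seen:  # first time we meet this item
            seen.add(item)
            result.append(item)
    return result


print(remove_dups_ordered([3, 1, 3, 2, 1, 2]))
```

The set gives constant-time membership checks, so this stays linear in the length of the sequence.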
#    Comments [4] |
 Wednesday, March 28, 2007

Microsoft IIS - Welcome to Last Decade (Performant CGI)

Wow..
CGI will run well on IIS.
Rails will run well on IIS.

Rob Conery on running Ruby on Rails (or other CGI based platforms) on IIS:

"Rails works using CGI - basically an executable that gets run each time a request comes into a web site. Most of the frameworks out there do NOT support multi-threading, so each time a request comes in that requires anything dynamic, CGI is "instanced" and executed. If you have a lot of requests at once, this isn't really a good thing. Now some servers are built to mitigate this (Apache, Lighttpd, etc); IIS is not.

... I would imagine that in the next 6 months we'll see a great addition to IIS 6 and 7 for all the CGI-enabled platforms out there."


hmm.. good to hear. (seriously)
but damn... weren't we doing this 10 years ago with Perl/Apache? :)

#    Comments [2] |
 Sunday, March 25, 2007

Real World Web Scalability

(via reddit programming)

Very lengthy overview of performance and scalability issues for web systems by Ask Bjørn Hansen.  This presentation covers a vast range of information:

Real World Web Scalability  (warning large PDF)

The takeaway?
Create horizontally scalable distributed systems..  always.

#    Comments [0] |

Scalability Comparison of Virtualization Tools

A report about scalability of virtualization techniques:

SCALABILITY COMPARISON OF 4 HOST VIRTUALIZATION TOOLS (QUETIER B / NERI V / CAPPELLO F)

"Virtualization tools are becoming popular in the context of Grid Computing because they allow running multiple operating systems on a single host and provide a confined execution environment.  In several Grid projects, virtualization tools are envisioned to run many virtual machines per host.  This immediately raises the issue of virtualization scalability."

4 types of virtualization tools are discussed in the context of scalability:

  • Processor Virtualization
  • Kernel Replication
  • Operating System Virtualization
  • Resource Virtualization
#    Comments [0] |

Operating System Genealogy - Timelines

Sweet...
The history of entirely too many operating systems in way too high resolution.
... but great fun for OS geeks.

Operating System Genealogy:

#    Comments [0] |
 Saturday, March 24, 2007

Free Software Foundation - 2007 Associate Member Meeting

The Free Software Foundation's annual Associate Members Meeting is always an inspiring event for me.  It serves as a sort of State of The Free Software Union; where members gather to discuss ideas and listen to speakers.  Most of the FSF Board of Directors were there to speak.

I attended the meeting today (Saturday 03/24/2007) for the 4th time in the past 5 years.

It was held at MIT (Cambridge, Massachusetts):

 

I arrived during Joshua Ginsberg's (FSF Senior System Administrator) speech on “FSF Systems Administration”.  He gave an overview of some of the systems and internal work going on at the FSF offices. Some highlights:

  • FSF now runs LinuxBIOS on new Tyan servers for FSF and GNU Project resources.  They will be contributing documentation and information to help others install a Free BIOS.
  • New and much improved FSF network infrastructure and connectivity for FSF/GNU hosted resources.
  • FSF is switching from Zope to Django (both Python powered!) for web application development...  Lots of new stuff coming soon, including contributions back to the Django community.

Next up was Brett Smith, the new GPL Compliance Engineer at the Compliance Lab.  One thing Brett mentioned was that GPL license violations are pretty much kept secret and not disclosed to the community.  FSF prefers to negotiate with violators and talk them into compliance behind closed doors.  I'm not sure I agree with this practice.  I asked Richard Stallman about this during his Q&A Session... stating that I thought this information should be released to the public.  I don't see it as an overly aggressive move and I think publicly outing companies that are GPL violators would be a good way to give exposure to Free Software and help curb future violations.  RMS doesn't quite agree with my standpoint, but he asked some FSF staff to explore generically publicizing more types of violations.

Next was Gerald Jay Sussman, speaking about "Robust Design". Gerry was the author of my first Computer Science book, the venerable Wizard Book (SICP), and one of the authors of Scheme (a programming language dialect of LISP).  I was able to thank him for the pain and enlightenment his texts brought me during my CS studies.

Gerry is a complete madman when he gives presentations.  Forget the powerpoints and fancy presentation gear... he just slings around old school projector slides at blazing speed.  Admittedly, the stuff he talks about is far over my head.  I'm just a lowly computer programmer.  This guy has been at MIT since 1964 studying the cutting edge of computer science, mechanics, and electrical engineering. Watching him ease through functional programming and Scheme code is a little intimidating, but the entertainment value alone is worth it.

OK.. now the person most people came to see speak... the GNU Project founder, FSF President, former MIT AI Lab hacker, Emacs/GCC/GDB author, Chief GNUisance, and St. Gnucius himself... Richard Stallman:

RMS was in a surprisingly jovial mood. He is usually sorta moody and prone to outbursts.  I saw him shout at, and absolutely berate Larry Lessig a few years ago in front of a large audience at an FSF meeting.  However, today he was in fine form and gave his speech "Free Software and Software Patents".  He delivered well and really punched home the point about the absurdity of patents when applied to software.

After RMS was Eben Moglen, FSF General Counsel, Columbia Law Professor, and founder of the Software Freedom Law Center.  Eben is my favorite speaker.. bar none.  He speaks with passion and insight that is truly inspiring to watch.  He gave his "After GPLv3" speech.  It was an update on the current state of the GPL revision process.  Stallman and Moglen are leading the massive effort to complete GPLv3.  I am very thankful that people like Eben Moglen are on the front lines protecting our freedom.

Eben Moglen:

Bruce Perens was in attendance: 

He seems to have taken a very strong interest in the GPLv3 recently.

... and of course there were the obligatory FSF activist signs:

RMS listening to Moglen's speech:


Now... everyone... go join the FSF and become an Associate Member.
... or at least continue your Free Software hacking and advocacy.


Goldberg... out!

#    Comments [0] |
 Friday, March 23, 2007

Python - Creating Bar Graphs with Matplotlib

Matplotlib is an open source 2D plotting library for Python.  It is very impressive and robust, but the API and documentation are maddeningly difficult to follow.

Here I have provided a function that will create a bar graph [as a png image] from a Python dictionary using the Matplotlib API.

It will auto-size the bars and auto-adjust the axis labels for you. All you need to pass into it is a dictionary data structure (and optionally a graph title and output name).


We start with a Python dictionary like this:

{'A': 70, 'B': 290, 'C': 130}


... and the function will use Matplotlib to create a graph like this:


Here is a sample script that uses my function:


#!/usr/bin/env python

from pylab import *

def main():  
    my_dict = {'A': 70, 'B': 290, 'C': 130}
    bar_graph(my_dict, graph_title='ABC')


def bar_graph(name_value_dict, graph_title='', output_name='bargraph.png'):
    figure(figsize=(4, 2)) # image dimensions  
    title(graph_title, size='x-small')
   
    # add bars
    for i, key in enumerate(name_value_dict):
        bar(i + 0.25, name_value_dict[key], color='red')
   
    # axis setup
    xticks(arange(0.65, len(name_value_dict)),
        [('%s: %d' % (name, value)) for name, value in
        name_value_dict.items()],
        size='xx-small')
    max_value = max(name_value_dict.values())
    tick_range = arange(0, max_value, (max_value / 7))
    yticks(tick_range, size='xx-small')
    formatter = FixedFormatter([str(x) for x in tick_range])
    gca().yaxis.set_major_formatter(formatter)
    gca().yaxis.grid(which='major')
   
    savefig(output_name)


if __name__ == "__main__":
    main()


enjoy.

-Corey

#    Comments [6] |
 Thursday, March 22, 2007

Python - Convert Date/Time to Epoch

I'm not sure why, but this took me forever to figure out; so I'm posting it here for others...

Let's say you have a string representing a date and a time and you want to convert it to epoch time (# secs since the epoch).

First you will need to create a pattern for your time format, using time format directives.

For example, the pattern for:

'2007-02-05 16:15:18'

Would be:

'%Y-%m-%d %H:%M:%S'

You can then convert it to epoch like this:

int(time.mktime(time.strptime('2007-02-05 16:15:18', '%Y-%m-%d %H:%M:%S')))


Now in a script:

#!/usr/bin/env python

import time

date_time = '2007-02-05 16:15:18'
pattern = '%Y-%m-%d %H:%M:%S'
epoch = int(time.mktime(time.strptime(date_time, pattern)))
print(epoch)
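
One thing to watch: time.mktime() interprets the parsed time as local time, so the result depends on the time zone of the machine running the script. If your date string is actually in UTC, calendar.timegm() from the standard library is the counterpart to use. A minimal sketch, assuming the same input string represents a UTC time:

```python
import calendar
import time

date_time = '2007-02-05 16:15:18'
pattern = '%Y-%m-%d %H:%M:%S'

parsed = time.strptime(date_time, pattern)

# Treat the parsed time as UTC instead of local time.
epoch_utc = calendar.timegm(parsed)
print(epoch_utc)

# Round trip: turn the epoch value back into a formatted UTC string.
print(time.strftime(pattern, time.gmtime(epoch_utc)))
```

The round trip at the end is a cheap sanity check that the pattern and the conversion agree with each other.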
#    Comments [0] |