
AS OF MAY 2008, THIS BLOG IS NO LONGER BEING UPDATED.
Visit the new blog at: http://coreygoldberg.blogspot.com



 Friday, February 16, 2007

Google - All Your Search Traffic Are Belong To Us

(Yes, the title is intentionally ungrammatical.)

I run a personal website: www.goldb.org. This is where I host my blog as well as content pages mostly dealing with computer programming. I was just looking over my traffic/visitor stats for the past month and noticed something interesting.

Basically, all of my search traffic comes from Google (I am indexed in every major search engine). I keep reading about search volume comparisons and how Google is slightly leading, and how more parity in the search market now exists.

Obviously my website visitors are skewed towards technical types, and the search terms they use to find my site are all technical/programming/software terms. The takeaway, at least for my audience, is that nearly all technical users search with Google instead of the other popular search engines.


Here is a breakdown of some stats from the last 30 days:

Where did my traffic come from?

  • 14.8% came directly
  • 70.4% from searches
  • 14.8% from other sites


Search Engine - # Visitors

  • Google - 1729
  • Yahoo - 20
  • Microsoft Live - 17
  • Technorati - 4
  • Del.icio.us - 2
  • AOL Search - 1



97.52% of the visitors who reached my site via search in the past 30 days came from Google.
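
As a quick sanity check on that figure, here is a small Python 3 sketch that recomputes each engine's share from the visitor counts listed above. The variable names are my own, nothing here comes from the stats package.

```python
# Visitor counts per search engine, copied from the list above.
visitors = {
    "Google": 1729,
    "Yahoo": 20,
    "Microsoft Live": 17,
    "Technorati": 4,
    "Del.icio.us": 2,
    "AOL Search": 1,
}

# Total number of visitors that arrived via any search engine.
total = sum(visitors.values())

# Share of search visitors per engine, as a percentage rounded to 2 places.
shares = {engine: round(100.0 * count / total, 2)
          for engine, count in visitors.items()}

# Print the engines from largest share to smallest.
for engine, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print("%-15s %6.2f%%" % (engine, share))
```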

#    Comments [0] |
 Thursday, February 15, 2007

Perl - File Slurping

A common idiom in Perl 5 is "slurping".  Slurping is the process of reading a file into an array, split by line breaks.  You can then iterate over the array and perform an operation on each line.  This is the basic input mechanism I use to process all sorts of data/text files.


The basic slurp goes like this...

Open a file in read mode and assign it a file handle:

open(FILE, 'foo.txt') or die $!;

Read (slurp) the file into an array of lines (splitting the file on newlines):

@file = <FILE>;


You can then process the array in a foreach loop and "Un-slurp" (De-slurp?) it back to the file system like this...

Now we have an array which we can iterate through and do whatever we want with each line:

foreach (@file) {
    # do something here
}

Re-open the file in overwrite mode:

open(FILE, '>foo.txt') or die $!;

Print the contents of the array back to the file:

print FILE @file;


The following script shows some slurping in action. This script will read a file named "foo.txt" and replace all instances of "foo" with "bar":

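#!/usr/bin/perl

replace('foo.txt', 'foo', 'bar');

sub replace {
    ($filename, $original, $substituted) = @_;

    # slurp the file into an array of lines
    open(FILE, $filename) or die $!;
    @file = <FILE>;

    foreach (@file) {
        s/$original/$substituted/g;
    }

    # re-open the same file in overwrite mode and write the lines back
    open(FILE, ">$filename") or die $!;
    print FILE @file;
    close(FILE);
}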
#    Comments [0] |
 Tuesday, February 13, 2007

Trampolining With Generators - Roll Your Own Scheduler?

Even the subject sounds confusing huh?

I was reading Neil Mix's post, Threading in JavaScript 1.7, and was really fascinated by the concept he discusses: trampolining.

Basically, trampolining is a way to achieve concurrency by using Generators to create a coroutine scheduler.

In JavaScript 1.7 (which Firefox 2 supports), you can already do concurrent programming with this technique.


Neil Mix:

"The way trampolining works is that a scheduler object (written in JavaScript) manages the execution of a series of generators, cobbling together a stack-like execution. Here’s how it works: The scheduler sets the starting generator as the base “frame” in the call stack. The scheduler then calls next() on the generator to obtain a yield value. If the yielded value is itself a generator, the scheduler pushes this new generator on the stack and calls next() on it, again obtaining a yield value. This continues until the top generator yields a non-generator value. This value could be a special directive to the scheduler (for example, a SUSPEND value that tells the scheduler to freeze execution of the “stack” of generators we’ve piled up). If not, the scheduler treats it as a return value. The scheduler then pops and closes the now complete generator and sends the return value back into the next generator in the stack."

pretty sick, huh?  ... definitely a twisted idea :)

The interesting takeaway is that this technique could be used to implement concurrency in any language that supports Generators.  It looks like Python has a similar capability.  This is described in detail in: PEP 342 - Coroutines via Enhanced Generators.

Generator-based state machines sound really interesting.  Hopefully I'll find some time to play with them [in python] as an alternate to threading.
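
To make the quoted description a bit more concrete, here is a minimal sketch of that kind of scheduler in Python 3. It is only my reading of Neil's description, not his code: the names `trampoline`, `add`, and `main` are made up for illustration, and the SUSPEND directive he mentions is left out entirely.

```python
import types


def trampoline(start):
    """Run a generator, treating any yielded generator as a nested call."""
    stack = [start]   # the "call stack" of generators
    value = None      # value to send into the generator on top of the stack
    while stack:
        try:
            yielded = stack[-1].send(value)
        except StopIteration:
            # The top generator finished without yielding a result.
            stack.pop()
            value = None
            continue
        if isinstance(yielded, types.GeneratorType):
            # A yielded generator is a "call": push it and start it.
            stack.append(yielded)
            value = None
        else:
            # A non-generator value is a "return": pop and close the finished
            # frame, then send the value to the caller below it.
            stack.pop().close()
            value = yielded
    return value


def add(a, b):
    yield a + b


def main():
    x = yield add(1, 2)    # "calls" add and receives its result
    y = yield add(x, 10)
    yield y * 2            # a plain value acts as the return value


result = trampoline(main())
print(result)
```

Note that `main` never calls `add` directly. It hands the generator to the scheduler and gets the result sent back in, which is what leaves the scheduler free to interleave several such stacks.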

#    Comments [0] |

Screen Scraping in Python

Mads Kristensen just posted an article: Screen scraping in C#, where he shows several ways to make HTTP requests in C# that can be used for screen scraping.

from Mads:

"Some say that screen scraping is a lost art because it is no longer an advanced discipline. That may be right, but there are different ways of doing it. Here are some different ways that all are perfectly acceptable, but can be used for various different purposes."


Not to be outdone... here are 2 examples of how to do the same thing in Python:

using httplib:

import httplib

conn = httplib.HTTPConnection("www.python.org")
conn.request("GET", '/')
print conn.getresponse().read()


using urllib:

import urllib

f = urllib.urlopen('http://www.python.org/')
print f.read()


#    Comments [2] |

Solaris Zero-Day Exploit - TELNET Insanity

A few days ago there was a zero-day exploit announced for Solaris that allowed people to use TELNET to gain root access to your machine.  This seems pretty bad and a lot of people on the inter-web are freaking over it.

However, the question is not:  "How did this bug go undiscovered?"

The question should be:  "What were you smoking when you enabled the TELNET daemon facing the public Internet??"  (and can I have some?)

#    Comments [0] |
 Monday, February 12, 2007

Microsoft Performance Testing Guidance

Scott Barber just posted some info about: Patterns & Practices: Performance Testing Guidance, a new guide to Performance Testing over at Microsoft's Codeplex.

from Scott:

"I am involved in Microsoft's Patterns & Practices Performance Testing Guidance project. We have reached a critical mass with regards to our "mostly final" content and have made that content publicly available"

"We're tackling various flavors of performance testing (stress, load, capacity) as well as how to bake performance testing into your life cycle."

from the Patterns & Practices site:

"The purpose of this project is to build some insightful and practical guidance around doing performance testing and using Visual Studio 2005. It's a collaborative effort between industry experts, Microsoft ACE, patterns & practices, Premier, and VSTS team members."

I have always had ambivalent feelings towards MS, but it is great to see all of the effort they are putting into the Performance field. Performance has always been a sort of niche domain. It straddles the disciplines of development, testing, and operations, while also involving the integration of software, hardware, and networks. The past few years have provided much more thought leadership that is pushing the state of Performance (load, stress, scalability, capacity, availability) into more mature territory.

For those that don't know Scott Barber, he certainly gets my vote for the most prolific writer in Performance over the past several years. His body of writing is second to none: http://www.perftestplus.com/pubs.htm

#    Comments [0] |
 Sunday, February 11, 2007

Squeezebox - Hackable Device For Streaming My Tunes

(* I am not affiliated with Slim Devices.. this is just a fanboy post.)

I wanted to hook up something that would integrate my home PC (jacked full of glorious DRM-free MP3's) and my home audio system. I didn't want to go the full HTPC route; I am more of an audio guy and my immediate need is for an audio-only solution.

After browsing the various Media Servers, Sound Cards, External Components, and other music playing gizmos, I finally figured out the type of unit I was looking for and what it should do...

Here are my requirements:

  • Must be able to play the MP3's stored on my computer
  • Must have a cross platform client application (I use both Linux and Windows at home)
  • Must have high quality analog and digital output that can connect to my receiver/amp
  • Must have a remote with basic controls (volume, tuning, select, etc)
  • Must be able to control it from my computer (change songs, playlists, etc)
  • Must be able to stream Internet Radio (of some sort)
  • No wires connected to my PC

Obviously some of these requirements became apparent when I read about the Squeezebox from Slim Devices.  It seems to do everything I need (and more), and isn't outrageously expensive ($299 retail).



So I ended up ordering one of these bad boys to play around with.  (of course I got the all black version)

OK.. but now the real reason I chose the Squeezebox... It is Open Source and has a developer community!

Well... it is sort of Open Source.  Its SlimServer software (the audio server software) is GPL licensed, so the Source Code is available to modify, hack, and contribute to.  But.. the device's firmware is proprietary and closed (boo).

The real kicker came when I realized that the SlimServer is written in Perl. It is web based, and uses templated HTML/CSS.  Hmm.. I've written a Perl program or 2 [thousand] in my day... this could get interesting.  I already have SlimServer running from Source and I am making little tweaks to the interface to customize it for myself.  Hopefully someday I'll do something worth contributing back...  I love that I have the option.

... More to come once I actually get the Squeezebox and hook it up..  I just ordered it last night.

#    Comments [0] |
 Friday, February 09, 2007

Python - use Psyco (x86 JIT-like compiler) for a speed boost

Psyco is a Python extension module which can speed up the execution of any Python code.

from the Psyco site:

"Think of Psyco as a kind of just-in-time (JIT) compiler, a little bit like what exists for other languages, that emit machine code on the fly instead of interpreting your Python program step by step. The difference with the traditional approach to JIT compilers is that Psyco writes several version of the same blocks (a block is a bit of a function), which are optimized by being specialized to some kinds of variables (a "kind" can mean a type, but it is more general). The result is that your unmodified Python programs run faster"


I have been working on some Python projects recently where Psyco has given me a really substantial performance increase. The type of work I am doing mostly involves statistical analysis of large numerical data sets.. array math.. percentiles.. time-series.. etc, etc.

To use it, all I do is copy Psyco to my system (to Python's Lib/site-packages), and add the following to the top of my python source file:

import psyco
psyco.full()


That's all ...

... or even better; wrap it in a try/except so your program still runs on systems without Psyco installed.

try:
    import psyco
    psyco.full()
except ImportError:
    pass
#    Comments [0] |

Live Brain-Surgery With Python

Another example of improved productivity with dynamic languages...

Gojko Adzic on prototyping in Python:

"writing the prototype in Python allowed us to start a web server and open an interactive console to re-wire it and perform live brain-surgery while the server is running. I cannot imagine doing that in Java or C#. In a month, we wrote the functional equivalent of at least 4-5 months of C# code."

#    Comments [0] |
 Thursday, February 08, 2007

Are We In The Matrix?

When I first saw the Matrix in '99, it instantly became one of my favorite movies.  To this day, the concept this movie delves into is pretty disturbing.  Ever since seeing it, I had the realization that there is no proof we are not in a Matrix-like simulated reality... which I can't even get my head around.

The 2002 Edge Question asked was: "Is the universe a quantum computer?"

Seth Lloyd validated my thoughts in his answer:

"The universe is quantum mechanical, and its dynamics can be simulated precisely and efficiently using quantum information processing. The amount of quantum computation required to perform this simulation is finite and has been calculated. Consequently, there is no obvious way to distinguish the universe from a very large quantum logic circuit."

.. that really freaks me out

[insert Keanu Reeves joke here]

#    Comments [0] |
 Tuesday, February 06, 2007

Anders Hejlsberg on LINQ and Functional Programming


This is a good video of an interview with Anders Hejlsberg on LINQ and Functional Programming.  He is the designer of C# and talks about some of the upcoming features in Orcas (the next Visual Studio with C# 3.0).

I think it is very interesting (and good) that Microsoft (and many other modern language designers) are adding functional programming features.  If functional programming, lambda expressions, list comprehensions, and set processing are your bag.. watch this.

#    Comments [0] |

Improving Regular Expression Performance

Alex from Dojo just linked to a fascinating article by Russ Cox about Regular Expressions:  Regular Expression Matching Can Be Simple And Fast (but is slow in Java, Perl, PHP, Python, Ruby, ...)

from the article:  
"This article reviews the good theory: regular expressions, finite automata, and a regular expression search algorithm invented by Ken Thompson in the mid-1960s. It also puts the theory into practice, describing a simple implementation of Thompson's algorithm. That implementation, less than 400 lines of C, is the one that went head to head with Perl above. It outperforms the more complex real-world implementations used by Perl, Python, PCRE, and others. The article concludes with a discussion of how theory might yet be converted into practice in the real-world implementations."
so.. there is a 40 year old technique that improves performance of regexes dramatically?

The following graph plots the time required to check whether a?^n a^n (n copies of "a?" followed by n copies of "a") matches a^n (a string of n a's):



wow... so awk and grep use the Thompson NFA implementation of regexes, while most programming languages don't.  

... and here I thought Perl was the regex king.
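
To get a feel for the difference, here is a rough Python 3 sketch. It is not Thompson's full construction and not Russ Cox's C code; it only illustrates the core idea (track every reachable position in the pattern at once instead of backtracking), specialized to the single test pattern from the article. The function names `pathological` and `state_set_match` are my own.

```python
import re


def pathological(n):
    """Return (pattern, text) for the a?^n a^n versus a^n test case."""
    return "a?" * n + "a" * n, "a" * n


def state_set_match(n, text):
    """Match text against a?^n a^n by tracking all reachable positions."""
    optional = [True] * n + [False] * n   # one flag per pattern item
    end = len(optional)                   # position meaning "whole pattern used"

    def closure(states):
        # From a position sitting on an optional item we may also skip it.
        reachable = set()
        for s in states:
            reachable.add(s)
            while s < end and optional[s]:
                s += 1
                reachable.add(s)
        return reachable

    states = closure({0})
    for ch in text:
        if ch == "a":
            # Every live position consumes the character and moves one step.
            states = closure({s + 1 for s in states if s < end})
        else:
            states = set()
    return end in states


# The backtracking engine in Python's re module is fine for small n...
pattern, text = pathological(10)
print(bool(re.fullmatch(pattern, text)))

# ...while the state-set version does a bounded amount of work per character,
# so larger n stays cheap.
print(state_set_match(30, "a" * 30))
```

The set of positions can never be larger than the pattern itself, so the work per input character is bounded. That is why the curve stays flat where the backtracking engines blow up.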

#    Comments [0] |
 Sunday, February 04, 2007

Perl - Building Web Clients

The following is a short tutorial on web programming in Perl I wrote several years ago.  This type of programming was my first foray into the guts of the web.  Writing tools at the protocol level forced me to gain a deep understanding of HTTP and Web Architecture, which has been extremely helpful to me since.


These examples show how to use Perl's 'LWP' (libwww-perl) modules to make requests to a web server. The libwww-perl collection is a set of Perl modules which provides a simple and consistent application programming interface to the web.


Using 'LWP' to do an HTTP GET Request:

This will request the main Google page and store the entire contents of the response in the '$response' object.
#!/usr/bin/perl

use LWP;

$useragent = LWP::UserAgent->new;
$request = new HTTP::Request('GET',"http://www.google.com");
$response = $useragent->simple_request($request);

print $response->as_string();

(*use "$useragent->request" instead of "$useragent->simple_request" to follow server redirects)


Working With Cookies:

Here is the http header returned by the initial http request to Google:
(first part of 'print $response->as_string();' output in the previous example)
Date: Mon, 14 Apr 2003 18:38:28 GMT
Server: GWS/2.0
Content-Length: 2691
Content-Type: text/html
Content-Type: text/html; charset=ISO-8859-1
Client-Date: Mon, 14 Apr 2003 18:38:29 GMT
Client-Peer: 216.239.57.99:80
Client-Response-Num: 1
Connection: Close
Set-Cookie: PREF=ID=48fd767576ebd920:TM=1050345508:LM=1050345508:S=qLA8i5XyvLX37lG6;
expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com
Title: Google

Notice the "Set-Cookie:" line in the header. This is what tells your web browser that a cookie needs to be set and returned as part of the http header in subsequent http requests to this server. In this case the cookie doesn't do much, but for a site that requires a login, this is how the server knows who you are to maintain a session.
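
To show what is going on at the protocol level, here is a small Python 3 sketch that takes the Set-Cookie value from the header above apart by hand and builds the "Cookie:" header a client would send back. The `parse_set_cookie` function is a hypothetical helper of my own and deliberately naive; real clients should lean on a cookie library rather than string splitting.

```python
def parse_set_cookie(header_value):
    """Split a Set-Cookie header value into (name, value, attributes)."""
    parts = [p.strip() for p in header_value.split(";") if p.strip()]
    # Only the first "=" separates name from value; this value contains more.
    name, _, value = parts[0].partition("=")
    attributes = {}
    for part in parts[1:]:
        key, _, val = part.partition("=")
        attributes[key.strip().lower()] = val.strip()
    return name, value, attributes


# The Set-Cookie value from the response header shown above.
set_cookie = (
    "PREF=ID=48fd767576ebd920:TM=1050345508:LM=1050345508:S=qLA8i5XyvLX37lG6; "
    "expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com"
)

name, value, attributes = parse_set_cookie(set_cookie)

# Only the name=value pair goes back to the server; the attributes
# (expires, path, domain) just tell the client when to send it.
cookie_header = "Cookie: %s=%s" % (name, value)

print(cookie_header)
print(attributes["domain"], attributes["path"], attributes["expires"])
```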

In Perl, cookies can be handled for you by using the HTTP::Cookies module.

You first need to construct the object to contain your cookies:
$cookie_jar = HTTP::Cookies->new;

After an http request is sent, you can then extract the cookie from the response header:
$cookie_jar->extract_cookies($response);

Once you have the cookie stored in your cookie_jar, it needs to be sent back to the server in the header of every subsequent http request. This is done by adding the following command after you format each request:
$cookie_jar->add_cookie_header($request);


Now for the whole thing in a script:


The following script will make a request to the main Google page and store the cookie it receives. It will then make a request to Google to change the default language (user preference) to Spanish. A new cookie will be returned, which we will store and use to make another request to the main Google page. Google will recognize the information stored in our cookie and return the page in Spanish.
#!/usr/bin/perl

use LWP;
use HTTP::Cookies;

# construct objects
$useragent = LWP::UserAgent->new;
$cookie_jar = HTTP::Cookies->new;

# send request for main Google page
$request = new HTTP::Request('GET',"http://www.google.com");
$response = $useragent->simple_request($request);

# extract cookie from response header
$cookie_jar->extract_cookies($response);

# set user preference on Google to Spanish language
$request = new HTTP::Request('GET',"http://www.google.com/setprefs?"
               . "submit2=Save+Preferences+&hl=es&lt=all&safe=images&num=10"
               . "&q=&prev=http%3A%2F%2Fwww.google.com%2F&ie=UTF-8&oe=UTF-8");
$cookie_jar->add_cookie_header($request);
$response = $useragent->simple_request($request);

# extract new cookie from response header
$cookie_jar->extract_cookies($response);

# send request for main Google page (will return Spanish Google page)    
$request = new HTTP::Request('GET',"http://www.google.com");
$cookie_jar->add_cookie_header($request);
$response = $useragent->simple_request($request);

print $response->as_string; # print the full response to verify cookies work (some text now in Spanish)

#    Comments [0] |
 Sunday, January 28, 2007

Roll Your Own Performance Tools.. Real-time Graphing and Round Robin Data Storage

(Moving some of my older blog posts to this site for permanent archiving.. This entry was originally posted at testingReflections on 04/25/2006)

I have spent a lot of time playing around with graphics libraries and toolkits for integrating real-time graphs within my own performance testing and monitoring tools. It seems like there are many open source tools available in the world of performance testing and system monitoring. And lots of people roll their own tools in whatever programming language they are into... but many lack graphics capabilities.

Two of the toolkits/libraries I end up using often for my own homebrew test tools are: RRDTool, and JRobin.

from the RRDTool site:
"RRD is the Acronym for Round Robin Database. RRD is a system to store and display time-series data (i.e. network bandwidth, machine-room temperature, server load average). It stores the data in a very compact way that will not expand over time, and it can create beautiful graphs. It can be used via simple shell scripts or as a perl module."

So...
RRDTool is a really good back-end for storing time-series data, which is mostly what we care about in performance testing.  It has bindings for various scripting languages, or can be invoked from the command line. If you are developing tools that need a data repository and graphing capabilities, this provides you both. You create an RRD and then you begin inserting data values at regular intervals. You then call the graphing API to have a graph displayed. The cool thing about this data storage is its “round robin” nature. You define various time spans, and the granularity at which you want them stored. A fixed-size binary file is created, and this never grows in size over time. As you insert more data, it is inserted into each span. As results are collected, they are averaged and rolled into successive time spans. It makes a much more efficient system than using your own complex object structures, or a relational database, or file system storage.
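
The round robin idea is easy to sketch in a few lines of Python 3. To be clear, this is not RRDTool's file format or API, just a toy illustration of fixed-size archives with averaging; the class name `RoundRobinArchive` and its parameters are invented for the example.

```python
from collections import deque


class RoundRobinArchive:
    """Fixed-size archive: averages N raw values into one stored row."""

    def __init__(self, values_per_row, max_rows):
        self.values_per_row = values_per_row
        # deque(maxlen=...) silently drops the oldest row when full,
        # so the archive never grows past max_rows.
        self.rows = deque(maxlen=max_rows)
        self._pending = []

    def update(self, value):
        self._pending.append(value)
        if len(self._pending) == self.values_per_row:
            # Consolidate the pending values into a single averaged row.
            self.rows.append(sum(self._pending) / len(self._pending))
            self._pending = []


# One fine-grained archive (every value, last 5 kept) and one coarse
# archive (average of every 3 values, last 4 kept).
fine = RoundRobinArchive(values_per_row=1, max_rows=5)
coarse = RoundRobinArchive(values_per_row=3, max_rows=4)

for sample in range(1, 13):   # pretend these are 12 latency readings
    fine.update(sample)
    coarse.update(sample)

print(list(fine.rows))
print(list(coarse.rows))
```

Each sample goes into every archive, recent data is kept at full detail, older data survives only as averages, and the storage size is fixed up front.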

You will probably recognize the graphs it creates, as RRDTool is integrated in many popular monitoring tools (it is Free/Open Source, GPL License). I have built many tools around RRDTool, and it is really a nice system.

If you are in the Java world, there is a very cool project named JRobin. JRobin is a clone of RRDTool in pure Java. So you can create RRDs directly from your Java code.. and all in memory if you want to!

Some days I pretend to be a Java programmer, so I had to build a tool using JRobin. As a proof of concept, I wrote a small network latency monitoring tool. It shows off some of JRobin's capabilities. It pings a host at a given interval and records the latency. A graph of the network latency is rendered in real-time onto a Swing panel.

Here is my network latency monitoring tool: NetPlot (includes Java source code, GPL Licensed)


The tool itself is just a trivial example, and really isn't the point. But you could easily adapt this code or create your own to develop real-time graphs of your own time-series data.

#    Comments [0] |
 Friday, January 26, 2007

Python - Sort A Nested Sequence With DSU

The DSU (Decorate, Sort, Undecorate) idiom originates from Lisp.  I first learned it in Perl, where it is called the Schwartzian Transform (coolest name ever?), named after longtime Perl hacker Randal L. Schwartz.

I find myself using this same DSU idiom in Python when I need to sort a nested sequence (single level sequence of sequences).

Let's say I have the following list of lists:

seq = [
    ['a', 1, 5],
    ['b', 3, 4],
    ['c', 2, 2],
    ['d', 4, 3],
    ['e', 5, 1],
]

... and I want the outer list to contain the inner lists sorted by their last column (in this case, index 2).

How would I do this?

Here is an implementation of the DSU (Decorate, Sort, Undecorate) idiom in a Python function:

def dsu_sort(idx, seq):
    for i, e in enumerate(seq):
        seq[i] = (e[idx], e)
    seq.sort()
    for i, e in enumerate(seq):
        seq[i] = e[1]
    return seq
   
(Keep in mind that lists in Python are mutable and this will transform your original sequence.)


So applying this to the sequence above like this:

dsu_sort(2, seq)

gives us:

[['e', 5, 1], ['c', 2, 2], ['d', 4, 3], ['b', 3, 4], ['a', 1, 5]]

which is the original sequence, transformed so it is sorted by the last column (index 2).
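
For comparison, the built-in sorted() accepts a key function that does the decorating and undecorating for you. A minimal sketch (shown in Python 3 syntax; the variable name `by_last` is just for illustration):

```python
from operator import itemgetter

seq = [
    ['a', 1, 5],
    ['b', 3, 4],
    ['c', 2, 2],
    ['d', 4, 3],
    ['e', 5, 1],
]

# itemgetter(2) pulls out index 2 of each inner list to use as the sort key.
# sorted() returns a new list and leaves the original sequence untouched.
by_last = sorted(seq, key=itemgetter(2))

print(by_last)
```

Unlike dsu_sort above, this never compares the inner lists themselves when two keys are equal, and it does not modify seq in place.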



Randal's original implementation in Perl from 1994:
#!/usr/bin/perl
 print
     map { $_->[0] }
     sort { $a->[1] cmp $b->[1] }
     map { [$_, /(\S+)$/] }
     <>;

#    Comments [3] |
 Thursday, January 25, 2007

Performance Acronym: RASP

RASP:

Reliability. Availability. Scalability. Performance.

System performance has been one of my main interests since I got into computing.  My first memories of this go back to my DOS 5.0 days ('92?).  I would obsessively optimize my beige-box x86 desktop... staying up all night tweaking my config.sys and running fractals.  (anyone remember all the fiddling with device drivers and Terminate and Stay Resident programs to load them into Upper Memory Blocks, so you can free up conventional memory?)  That obsession grew as I gained more knowledge and eventually led me into testing/tuning large distributed systems, which I have essentially focused on professionally for the past 9 years  (hey, my job title even includes "Performance Engineer").

anyway,

I just wanted to re-introduce the acronym RASP.  In the performance world, there is a lack of standard vocabulary.  There is enough shared terminology to have intelligent conversations and get engineering problems solved, but the terms used vary pretty widely from company to company and region to region.. much more so than pure development or admin language.  So sharing vocabulary is a good thing, to further push performance as its own discipline.  

I first heard the acronym "RASP" from Goranka Bjedov, a performance engineer at Google (shown here in her excellent tech-talk Using Open Source Tools for Performance Testing), during discussions at WOPR6.  She said it was an old Bell Labs term used in telco.  

RASP encapsulates all things related to system performance into a nice logical taxonomy.

Reliability
Availability
Scalability
Performance

For some reason RASP has really stuck in my head... just passing the word.

#    Comments [2] |
 Tuesday, January 23, 2007

SQL Server - How To Tell If There Is A Trace Running

Server-side tracing is the process of having your SQL Server machine save events to a physical file on that machine without using the Profiler client tool.  Server-side tracing is enabled and controlled by using SQL Server system-supplied stored procedures and functions. With these system-supplied processes, you can identify what to trace, when to start and stop tracing, what traces are running, and view trace information stored in the trace file.


Here is how you view the number of traces currently running:

SELECT count(*) FROM :: fn_trace_getinfo(default) WHERE property = 5 and value = 1


Here is how you can find more detail about the running traces:

SELECT * FROM :: fn_trace_getinfo(default)


You can terminate a trace with the 'sp_trace_setstatus' stored procedure using the traceid:

EXEC sp_trace_setstatus 1, @status = 0
EXEC sp_trace_setstatus 1, @status = 2

setting the status to 0 stops the trace
setting the status to 2 closes the trace and deletes its definition from the server

#    Comments [0] |

Joel on The Big Picture

Joel posted a review of Dreaming in Code (book by Scott Rosenberg).


Somehow Joel gave an extremely interesting description of human vision:

"You can only see at a high-resolution in a fairly small area, and even that has a big fat blind spot right exactly in the middle, but you still walk around thinking you have a ultra-high resolution panoramic view of everything. Why? Because your eyes move really fast, and, under ordinary circumstances, they are happy to jump instantly to wherever you need them to jump to. And your mind provides this really complete abstraction, providing you with the illusion of complete vision when all you really have is a very small area of high res vision, a large area of extremely low-res vision, and the ability to page-fault-in anything you want to see—so quickly that you walk around all day thinking you have the whole picture projected internally in a little theatre in your brain."

... and a classic quote about design by committee:

"What kills me is the teams who get into the bad habit of holding meetings every time they need to figure out how something is going to work. Did you ever try to write poetry in a committee meeting? It’s like a bunch of fat construction guys trying to write an opera while sitting on the couch watching Baywatch. The more fat construction guys you add to the couch, the less likely you are to get opera out of it."


#    Comments [0] |
 Monday, January 22, 2007

StockQuote Google Gadget

I use Google Personalized Homepage a lot.  I wanted a gadget for getting stock quotes ("gadget" is Google's lingo for a module or widget), but I couldn't find one I particularly liked.  So I created my own:



Behind the gadget is a remote .NET/C# service I created which scrapes stock quotes and charts from Google Finance.

You can see it and play with the demo: cgoldberg.googlepages.com

- Add my gadget to your Google Personalized Homepage
- Add my gadget to your own web page

Caveat:  The service runs off my own server so I can't guarantee I'll host this thing forever.

#    Comments [2] |

C# Simple Multithreading Example

Here is a simple example of multithreading in C#

using System;
using System.Threading;

public class Test
{
    static void Main()
    {
        ThreadStart job = new ThreadStart(ThreadJob);
        Thread thread = new Thread(job);
        thread.Start();

        for (int i=0; i < 5; i++)
        {
            Console.WriteLine ("Main thread: {0}", i);
            Thread.Sleep(1000);
        }
    }

    static void ThreadJob()
    {
        for (int i=0; i < 10; i++)
        {
            Console.WriteLine ("Spawned thread: {0}", i);
            Thread.Sleep(500);
        }
    }
}

#    Comments [0] |

Calling A Command Line Program From C#

I often need to call external command line programs from within my C# code.  To do this, I use a Process object.  Here is some example code I use for calling a Python program:

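// requires: using System.Diagnostics;
private void Execute()
{
    Process proc = new Process();
    
    proc.StartInfo.WorkingDirectory = @"C:\scripts";
    proc.StartInfo.FileName = "python.exe";
    proc.StartInfo.Arguments = "foo.py";
    proc.StartInfo.UseShellExecute = false;
    proc.StartInfo.RedirectStandardOutput = false;
    proc.StartInfo.RedirectStandardError = true;
    proc.Start();
    // drain the redirected stream before waiting, so a chatty child can't fill the pipe and hang
    string errors = proc.StandardError.ReadToEnd();
    proc.WaitForExit();
    proc.Close();
}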


#    Comments [0] |
 Sunday, January 21, 2007

Perl Bottles

Guess what this is?

    ''=~(        '(?{'        .('`'        |'%')        .('['        ^'-')
    .('`'        |'!')        .('`'        |',')        .'"'.        '\\$'
    .'=='        .('['        ^'+')        .('`'        |'/')        .('['
    ^'+')        .'||'        .(';'        &'=')        .(';'        &'=')
    .';-'        .'-'.        '\\$'        .'=;'        .('['        ^'(')
    .('['        ^'.')        .('`'        |'"')        .('!'        ^'+')
   .'_\\{'      .'(\\$'      .';=('.      '\\$=|'      ."\|".(      '`'^'.'
  ).(('`')|    '/').').'    .'\\"'.+(    '{'^'[').    ('`'|'"')    .('`'|'/'
 ).('['^'/')  .('['^'/').  ('`'|',').(  '`'|('%')).  '\\".\\"'.(  '['^('(')).
 '\\"'.('['^  '#').'!!--'  .'\\$=.\\"'  .('{'^'[').  ('`'|'/').(  '`'|"\&").(
 '{'^"\[").(  '`'|"\"").(  '`'|"\%").(  '`'|"\%").(  '['^(')')).  '\\").\\"'.
 ('{'^'[').(  '`'|"\/").(  '`'|"\.").(  '{'^"\[").(  '['^"\/").(  '`'|"\(").(
 '`'|"\%").(  '{'^"\[").(  '['^"\,").(  '`'|"\!").(  '`'|"\,").(  '`'|(',')).
 '\\"\\}'.+(  '['^"\+").(  '['^"\)").(  '`'|"\)").(  '`'|"\.").(  '['^('/')).
 '+_,\\",'.(  '{'^('[')).  ('\\$;!').(  '!'^"\+").(  '{'^"\/").(  '`'|"\!").(
 '`'|"\+").(  '`'|"\%").(  '{'^"\[").(  '`'|"\/").(  '`'|"\.").(  '`'|"\%").(
 '{'^"\[").(  '`'|"\$").(  '`'|"\/").(  '['^"\,").(  '`'|('.')).  ','.(('{')^
 '[').("\["^  '+').("\`"|  '!').("\["^  '(').("\["^  '(').("\{"^  '[').("\`"|
 ')').("\["^  '/').("\{"^  '[').("\`"|  '!').("\["^  ')').("\`"|  '/').("\["^
 '.').("\`"|  '.').("\`"|  '$')."\,".(  '!'^('+')).  '\\",_,\\"'  .'!'.("\!"^
 '+').("\!"^  '+').'\\"'.  ('['^',').(  '`'|"\(").(  '`'|"\)").(  '`'|"\,").(
 '`'|('%')).  '++\\$="})'  );$:=('.')^  '~';$~='@'|  '(';$^=')'^  '[';$/='`';


It is Perl 5 source code.  When executed, it prints the "99 Bottles of Beer" song.  Like this:

99 bottles of beer on the wall, 99 bottles of beer!
Take one down, pass it around,
98 bottles of beer on the wall!

98 bottles of beer on the wall, 98 bottles of beer!
Take one down, pass it around,
97 bottles of beer on the wall!

97 bottles of beer on the wall, 97 bottles of beer!
Take one down, pass it around,
96 bottles of beer on the wall!

etc...


Pretty insane.
Who said Perl can be hard to read?

(Lots of implementations of the song generator in various languages are available; but none as cool as this one.)

#    Comments [0] |
 Friday, January 19, 2007

LDTP and UI Test Tools for GNU/Linux

There currently aren't many commercial UI test tools for GNU/Linux applications.  GNU/Linux has come a long way towards becoming more popular on the desktop, but it is still somewhat niche in the business world.  There is a large contingent of Windows software testers and QA engineers who make their living using commercial UI test tools (WinRunner, QTP, SilkTest, Robot, etc.) from the big tool vendors (HP/Mercury, IBM/Rational, Borland/Segue, Compuware, etc.).  I am not talking about small test utilities; I am talking about large UI-layer test suites that people build extensive customized test frameworks on top of.  These are used most often in large business applications for automating functional and regression tests.

Good test tooling is a prerequisite for any large deployment of a business application.  As GNU/Linux becomes more popular on the desktop, I think this will become a more important factor and tool vendors will begin to beef up their GNU/Linux UI test tool offerings.  It would be great if there were viable open source tools as an alternative.  On Windows, this never happened.  There are currently no high-quality open source UI test tools available.


I just took a look at the GNU/Linux Desktop Testing Project (GNU/LDTP):

"GNU/Linux Desktop Testing Project (GNU/LDTP) is aimed at producing high quality test automation framework and cutting-edge tools that can be used to test GNU/Linux Desktop and improve it."


Wow.. I had never heard of it until now.

The description looks good.. it is a UI layer test tool that works in both GNOME and KDE environments.

.. and it is Free/Open Source.

.. and it is written in Python (which completely rules)

I will be keeping an eye on this and any other open source test tools in that space.

#    Comments [0] |
 Tuesday, January 16, 2007

Python - Merge a Sequence of Lists Into a Single List

the function:

def merge(seq):
    merged = []
    for s in seq:
        for x in s:
            merged.append(x)
    return merged


sample usage:

foo = [['a', 'b'],['c'],['d', 'e', 'f']]
print(merge(foo))

>>>['a', 'b', 'c', 'd', 'e', 'f']
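
The same flattening can also be done with the standard library. This is a sketch of an alternative, not a replacement for the loop above; the name `merge_chain` is made up for illustration, and it assumes a Python version that ships `itertools.chain.from_iterable`.

```python
from itertools import chain


def merge_chain(seq):
    """Flatten one level of nesting: a sequence of lists becomes one list."""
    # chain.from_iterable walks each inner list in turn, preserving order
    return list(chain.from_iterable(seq))


foo = [['a', 'b'], ['c'], ['d', 'e', 'f']]
print(merge_chain(foo))
```

Like the original, this only flattens one level; nested lists inside the inner lists are left alone.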

Update:
Here is another implementation that uses a Python dictionary. This version merges the lists and only keeps unique entries.

def merge(seq):
    d = {}
    for s in seq:
        for x in s:
            d[x] = 1  # dictionary keys are unique, so duplicates collapse
    return list(d.keys())
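
One thing to watch with the dictionary version: depending on the Python version, the keys may not come back in the order they were first seen. If order matters, a sketch like the following keeps the first occurrence of each item in place. The name `merge_unique` is hypothetical, and it assumes the items are hashable (the dictionary version needs that too).

```python
def merge_unique(seq):
    """Merge a sequence of lists, keeping only the first copy of each item."""
    seen = set()
    merged = []
    for s in seq:
        for x in s:
            if x not in seen:  # skip anything already collected
                seen.add(x)
                merged.append(x)
    return merged


foo = [['a', 'b'], ['b', 'c'], ['a', 'd']]
print(merge_unique(foo))
```
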
#    Comments [0] |
 Wednesday, January 10, 2007

VBScript - Creating a Microsoft Web Archive (*.mht) File Programmatically

Here is a little VBScript for generating a Microsoft Web Archive (*.mht) file.  Web archives are a convenient way to pack a bunch of web files (HTML/CSS/JavaScript) into a single file that is viewable in your browser.  The downside is MHT files are only viewable in MS Internet Explorer (lame).

Normally you would create an MHT by using the "Save As..." option in IE.  This script allows you to create one programmatically.

Sample Usage:

for a remote html file:

>cscript mht_converter.vbs http://www.example.com/temp/foo.html foo.mht


for a local html file:

>cscript mht_converter.vbs file:/temp/foo.html foo.mht



... And now the code:




'mht_converter.vbs

Const adSaveCreateNotExist = 1
Const adSaveCreateOverWrite = 2
Const adTypeBinary = 1
Const adTypeText = 2

Set args = WScript.Arguments

'Both the source URL and the output filename are required
if args.Count < 2 then
    WScript.Echo "Usage: [CScript | WScript] mht_converter.vbs <html file> <mht filename>"
    WScript.Quit 1
end if

Set objMessage = CreateObject("CDO.Message")
objMessage.CreateMHTMLBody args.Item(0)
SaveToFile objMessage, args.Item(1)


Sub SaveToFile(Msg, Fn)
    Dim Strm, Dsk
    Set Strm = CreateObject("ADODB.Stream")
    Strm.Type = adTypeText
    Strm.Charset = "US-ASCII"
    Strm.Open
    Set Dsk = Msg.DataSource
    Dsk.SaveToObject Strm, "_Stream"
    Strm.SaveToFile Fn, adSaveCreateOverWrite
End Sub




Caveat:  I am not a VB programmer... don't pretend to be... and never wanna be.  This was just something I needed to do and this was the only way I could quickly figure out how to do it.
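
For anyone who would rather avoid VBScript, here is a rough Python sketch of the same idea using the standard `email` package. An MHT file is essentially a MIME `multipart/related` message, so this wraps one HTML string in that kind of container. It is an assumption-heavy sketch: the function name `build_mht` and the placeholder subject are made up, it does not fetch the page or any linked images/CSS the way CDO does, and I have not tested the output in Internet Explorer.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_mht(html, location):
    """Wrap a single HTML string in a multipart/related (MHTML-style) message."""
    msg = MIMEMultipart("related", type="text/html")
    msg["Subject"] = "Saved page"        # placeholder subject, purely illustrative
    msg["Content-Location"] = location
    part = MIMEText(html, "html", "utf-8")
    part["Content-Location"] = location  # ties the part back to its source URL
    msg.attach(part)
    return msg.as_string()


if __name__ == "__main__":
    # Hypothetical usage: the HTML string here stands in for a real page
    text = build_mht("hello", "http://www.example.com/temp/foo.html")
    with open("foo.mht", "w") as f:
        f.write(text)
```

Linked resources would each need to be downloaded and attached as additional parts with their own `Content-Location` headers, which is the work the CDO call does for you.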

#    Comments [2] |