goldb.org home

AS OF MAY 2008, THIS BLOG IS NO LONGER BEING UPDATED.
Visit the new blog at: http://coreygoldberg.blogspot.com



 Friday, January 26, 2007

Python - Sort A Nested Sequence With DSU

The DSU (Decorate, Sort, Undecorate) idiom originates from Lisp.  I first learned it in Perl, where it is called the Schwartzian Transform (coolest name ever?), named after longtime Perl hacker Randal L. Schwartz.

I find myself using this same DSU idiom in Python when I need to sort a nested sequence (single level sequence of sequences).

Let's say I have the following list of lists:

seq = [
    ['a', 1, 5],
    ['b', 3, 4],
    ['c', 2, 2],
    ['d', 4, 3],
    ['e', 5, 1],
]

... and I want the outer list to contain the inner lists sorted by their last column (in this case, index 2).

How would I do this?

Here is an implementation of the DSU (Decorate, Sort, Undecorate) idiom in a Python function:

def dsu_sort(idx, seq):
    for i, e in enumerate(seq):
        seq[i] = (e[idx], e)
    seq.sort()
    for i, e in enumerate(seq):
        seq[i] = e[1]
    return seq
   
(Keep in mind that lists in Python are mutable and this will transform your original sequence.)


So applying this to the sequence above like this:

dsu_sort(2, seq)

gives us:

[['e', 5, 1], ['c', 2, 2], ['d', 4, 3], ['b', 3, 4], ['a', 1, 5]]

which is the original sequence, transformed so it is sorted by the last column (index 2).
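
For comparison, here is a minimal sketch of the same sort using the key argument that sorted() and list.sort() accept (Python 2.4 and later).  The names by_last_column and in_place are just illustrative.  One difference worth noting: with key= only the keys are compared, so rows with equal keys keep their original order, whereas the tuple-based DSU above falls back to comparing the rows themselves on a tie.

```python
from operator import itemgetter

seq = [
    ['a', 1, 5],
    ['b', 3, 4],
    ['c', 2, 2],
    ['d', 4, 3],
    ['e', 5, 1],
]

# sorted() returns a new list and leaves seq untouched;
# itemgetter(2) pulls index 2 out of each inner list to use as the sort key.
by_last_column = sorted(seq, key=itemgetter(2))

# list.sort() takes the same key argument and sorts in place, like dsu_sort().
in_place = list(seq)
in_place.sort(key=itemgetter(2))
```

Use sorted() when you want to keep the original sequence as it was, and list.sort() when transforming it in place is fine.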



Randal's original implementation in Perl from 1994:
#!/usr/bin/perl
 print
     map { $_->[0] }
     sort { $a->[1] cmp $b->[1] }
     map { [$_, /(\S+)$/] }
     <>;

#    Comments [3] |
 Thursday, January 25, 2007

Performance Acronym: RASP

RASP:

Reliability. Availability. Scalability. Performance.

System performance has been one of my main interests since I got into computing.  My first memories of this go back to my DOS 5.0 days ('92?).  I would obsessively optimize my beige-box x86 desktop... staying up all night tweaking my config.sys and running fractals.  (anyone remember all the fiddling with device drivers and Terminate and Stay Resident programs to load them into Upper Memory Blocks, so you can free up conventional memory?)  That obsession grew as I gained more knowledge and eventually led me into testing/tuning large distributed systems, which I have essentially focused on professionally for the past 9 years  (hey, my job title even includes "Performance Engineer").

anyway,

I just wanted to re-introduce the acronym RASP.  In the performance world, there is a lack of standard vocabulary.  There is enough shared terminology to have intelligent conversations and get engineering problems solved, but the terms used vary pretty widely from company to company and region to region.. much more so than pure development or admin language.  So sharing vocabulary is a good thing, to further push performance as its own discipline.  

I first heard the acronym "RASP" from Goranka Bjedov, a performance engineer at Google (shown here in her excellent tech-talk Using Open Source Tools for Performance Testing), during discussions at WOPR6.  She said it was an old Bell Labs term used in telco.  

RASP encapsulates all things related to system performance into a nice logical taxonomy.

Reliability
Availability
Scalability
Performance

For some reason RASP has really stuck in my head... just passing the word.

#    Comments [2] |
 Tuesday, January 23, 2007

SQL Server - How To Tell If There Is A Trace Running

Server-side tracing is the process of having your SQL Server machine save events to a physical file on that machine without using the Profiler client tool.  Server-side tracing is enabled and controlled by using SQL Server system-supplied stored procedures and functions. With these system-supplied processes, you can identify what to trace, when to start and stop tracing, what traces are running, and view trace information stored in the trace file.


Here is how you view the number of traces currently running:

SELECT count(*) FROM :: fn_trace_getinfo(default) WHERE property = 5 and value = 1


Here is how you can find more detail about the running traces:

SELECT * FROM :: fn_trace_getinfo(default)


You can terminate a trace with the 'sp_trace_setstatus' stored procedure using the traceid:

EXEC sp_trace_setstatus 1, @status = 0
EXEC sp_trace_setstatus 1, @status = 2

setting the status to 0 stops the trace
setting the status to 2 closes the trace and deletes its definition from the server

#    Comments [0] |

Joel on The Big Picture

Joel posted a review of Dreaming in Code (book by Scott Rosenberg).


Somehow Joel gave an extremely interesting description of human vision:

"You can only see at a high-resolution in a fairly small area, and even that has a big fat blind spot right exactly in the middle, but you still walk around thinking you have a ultra-high resolution panoramic view of everything. Why? Because your eyes move really fast, and, under ordinary circumstances, they are happy to jump instantly to wherever you need them to jump to. And your mind provides this really complete abstraction, providing you with the illusion of complete vision when all you really have is a very small area of high res vision, a large area of extremely low-res vision, and the ability to page-fault-in anything you want to see—so quickly that you walk around all day thinking you have the whole picture projected internally in a little theatre in your brain."

... and a classic quote about design by committee:

"What kills me is the teams who get into the bad habit of holding meetings every time they need to figure out how something is going to work. Did you ever try to write poetry in a committee meeting? It’s like a bunch of fat construction guys trying to write an opera while sitting on the couch watching Baywatch. The more fat construction guys you add to the couch, the less likely you are to get opera out of it."


#    Comments [0] |
 Monday, January 22, 2007

StockQuote Google Gadget

I use Google Personalized Homepage a lot.  I wanted a gadget for getting stock quotes ("gadget" is Google's lingo for a module or widget), but I couldn't find one I particularly liked.  So I created my own:



Behind the gadget is a remote .NET/C# service I created which scrapes stock quotes and charts from Google Finance.

You can see it and play with the demo: cgoldberg.googlepages.com

- Add my gadget to your Google Personalized Homepage
- Add my gadget to your own web page

Caveat:  The service runs off my own server so I can't guarantee I'll host this thing forever.

#    Comments [2] |

C# Simple Multithreading Example

Here is a simple example of multithreading in C#:

using System;
using System.Threading;

public class Test
{
    static void Main()
    {
        ThreadStart job = new ThreadStart(ThreadJob);
        Thread thread = new Thread(job);
        thread.Start();

        for (int i=0; i < 5; i++)
        {
            Console.WriteLine ("Main thread: {0}", i);
            Thread.Sleep(1000);
        }
    }

    static void ThreadJob()
    {
        for (int i=0; i < 10; i++)
        {
            Console.WriteLine ("Spawned thread: {0}", i);
            Thread.Sleep(500);
        }
    }
}

#    Comments [0] |

Calling A Command Line Program From C#

I often need to call external command line programs from within my C# code.  To do this, I use a Process object.  Here is some example code I use for calling a Python program:

// requires: using System.Diagnostics;
private void Execute()
{
    Process proc = new Process();

    proc.StartInfo.WorkingDirectory = @"C:\scripts";
    proc.StartInfo.FileName = "python.exe";
    proc.StartInfo.Arguments = "foo.py";
    proc.StartInfo.UseShellExecute = false;
    proc.StartInfo.RedirectStandardOutput = false;
    proc.StartInfo.RedirectStandardError = true;
    proc.Start();
    // drain stderr before waiting so a chatty child cannot block on a full pipe
    string errors = proc.StandardError.ReadToEnd();
    proc.WaitForExit();
    proc.Close();
}


#    Comments [0] |
 Sunday, January 21, 2007

Perl Bottles

Guess what this is?

    ''=~(        '(?{'        .('`'        |'%')        .('['        ^'-')
    .('`'        |'!')        .('`'        |',')        .'"'.        '\\$'
    .'=='        .('['        ^'+')        .('`'        |'/')        .('['
    ^'+')        .'||'        .(';'        &'=')        .(';'        &'=')
    .';-'        .'-'.        '\\$'        .'=;'        .('['        ^'(')
    .('['        ^'.')        .('`'        |'"')        .('!'        ^'+')
   .'_\\{'      .'(\\$'      .';=('.      '\\$=|'      ."\|".(      '`'^'.'
  ).(('`')|    '/').').'    .'\\"'.+(    '{'^'[').    ('`'|'"')    .('`'|'/'
 ).('['^'/')  .('['^'/').  ('`'|',').(  '`'|('%')).  '\\".\\"'.(  '['^('(')).
 '\\"'.('['^  '#').'!!--'  .'\\$=.\\"'  .('{'^'[').  ('`'|'/').(  '`'|"\&").(
 '{'^"\[").(  '`'|"\"").(  '`'|"\%").(  '`'|"\%").(  '['^(')')).  '\\").\\"'.
 ('{'^'[').(  '`'|"\/").(  '`'|"\.").(  '{'^"\[").(  '['^"\/").(  '`'|"\(").(
 '`'|"\%").(  '{'^"\[").(  '['^"\,").(  '`'|"\!").(  '`'|"\,").(  '`'|(',')).
 '\\"\\}'.+(  '['^"\+").(  '['^"\)").(  '`'|"\)").(  '`'|"\.").(  '['^('/')).
 '+_,\\",'.(  '{'^('[')).  ('\\$;!').(  '!'^"\+").(  '{'^"\/").(  '`'|"\!").(
 '`'|"\+").(  '`'|"\%").(  '{'^"\[").(  '`'|"\/").(  '`'|"\.").(  '`'|"\%").(
 '{'^"\[").(  '`'|"\$").(  '`'|"\/").(  '['^"\,").(  '`'|('.')).  ','.(('{')^
 '[').("\["^  '+').("\`"|  '!').("\["^  '(').("\["^  '(').("\{"^  '[').("\`"|
 ')').("\["^  '/').("\{"^  '[').("\`"|  '!').("\["^  ')').("\`"|  '/').("\["^
 '.').("\`"|  '.').("\`"|  '$')."\,".(  '!'^('+')).  '\\",_,\\"'  .'!'.("\!"^
 '+').("\!"^  '+').'\\"'.  ('['^',').(  '`'|"\(").(  '`'|"\)").(  '`'|"\,").(
 '`'|('%')).  '++\\$="})'  );$:=('.')^  '~';$~='@'|  '(';$^=')'^  '[';$/='`';


It is Perl 5 source code.  When executed, it prints the "99 Bottles of Beer" song.  Like this:

99 bottles of beer on the wall, 99 bottles of beer!
Take one down, pass it around,
98 bottles of beer on the wall!

98 bottles of beer on the wall, 98 bottles of beer!
Take one down, pass it around,
97 bottles of beer on the wall!

97 bottles of beer on the wall, 97 bottles of beer!
Take one down, pass it around,
96 bottles of beer on the wall!

etc...


Pretty insane.
Who said Perl can be hard to read?

(Lots of implementations of the song generator in various languages are available; but none as cool as this one.)

#    Comments [0] |
 Friday, January 19, 2007

LDTP and UI Test Tools for GNU/Linux

There currently aren't many commercial UI test tools for GNU/Linux applications.  GNU/Linux has come a long way towards becoming more popular on the desktop, but it is still somewhat niche in the business world.   There is a large contingent of Windows software testers and QA engineers that make their living using commercial UI test tools (WinRunner, QTP, SilkTest, Robot, etc) from the big tool vendors (HP/Mercury, IBM/Rational, Borland/Segue, Compuware, etc).  I am not talking about small test utilities; I am talking about large UI layer test suites that people build extensive customized test frameworks on top of.  These are used most often in large business applications for automating functional and regression tests.

Good test tooling is a prerequisite for any large deployment of a business application.  As GNU/Linux becomes more popular on the desktop, I think this will become a more important factor and tool vendors will begin to beef up their GNU/Linux UI test tool offerings.  It would be great if there were viable open source tools as an alternative.  On Windows, this never happened.  There are currently no high quality open source UI test tools available.


I just took a look at the GNU/Linux Desktop Testing Project (GNU/LDTP):

"GNU/Linux Desktop Testing Project (GNU/LDTP) is aimed at producing high quality test automation framework and cutting-edge tools that can be used to test GNU/Linux Desktop and improve it."


wow.. I had never heard of that until now.

The description looks good.. it is a UI layer test tool that works in both GNOME and KDE environments.

.. and it is Free/Open Source.

.. and it is written in Python (which completely rules)

I will be keeping an eye on this and any other open source test tools in that space.

#    Comments [0] |
 Tuesday, January 16, 2007

Python - Merge a Sequence of Lists Into a Single List

the function:

def merge(seq):
    merged = []
    for s in seq:
        for x in s:
            merged.append(x)
    return merged


sample usage:

foo = [['a', 'b'],['c'],['d', 'e', 'f']]
print(merge(foo))

['a', 'b', 'c', 'd', 'e', 'f']

Update:
Here is another implementation that uses a Python dictionary. This version merges the lists and only keeps unique entries.

def merge(seq):
    d = {}
    for s in seq:
        for x in s:
            d[x] = 1
    # list() so the result is a plain list on any Python version
    return list(d.keys())
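
Another way to write both versions, sketched with itertools.chain.from_iterable (Python 2.6 and later).  The name merge_unique is made up for this example, and the extra ['a', 'c'] list is only there to show the duplicates being dropped.  Unlike a dictionary on older Pythons, this version keeps the items in first-seen order.

```python
from itertools import chain

foo = [['a', 'b'], ['c'], ['d', 'e', 'f'], ['a', 'c']]

# chain.from_iterable walks each inner list in turn, one level deep.
merged = list(chain.from_iterable(foo))


def merge_unique(seq):
    """Merge the inner sequences, keeping the first occurrence of each item."""
    seen = set()
    result = []
    for x in chain.from_iterable(seq):
        if x not in seen:
            seen.add(x)
            result.append(x)
    return result


unique = merge_unique(foo)
```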
#    Comments [0] |
 Wednesday, January 10, 2007

VBScript - Creating a Microsoft Web Archive (*.mht) File Programmatically

Here is a little VBScript for generating a Microsoft Web Archive (*.mht) file.  Web archives are a convenient way to pack a bunch of web files (HTML/CSS/JavaScript) into a single file that is viewable in your browser.  The downside is MHT files are only viewable in MS Internet Explorer (lame).

Normally you would create an MHT by using the "Save As..." option in IE.  This script allows you to create one programmatically.

Sample Usage:

for a remote html file:

>cscript mht_converter.vbs http://www.example.com/temp/foo.html foo.mht


for a local html file:

>cscript mht_converter.vbs file:/temp/foo.html foo.mht



... And now the code:




'mht_converter.vbs

Const adSaveCreateNotExist = 1
Const adSaveCreateOverWrite = 2
Const adTypeBinary = 1
Const adTypeText = 2

Set args = WScript.Arguments

If args.Count < 2 Then
    WScript.Echo "Usage: [CScript | WScript] mht_converter.vbs <html file> <mht filename>"
    WScript.Quit 1
End If

Set objMessage = CreateObject("CDO.Message")
objMessage.CreateMHTMLBody args.Item(0)
SaveToFile objMessage, args.Item(1)


Sub SaveToFile(Msg, Fn)
    Dim Strm, Dsk
    Set Strm = CreateObject("ADODB.Stream")
    Strm.Type = adTypeText
    Strm.Charset = "US-ASCII"
    Strm.Open
    Set Dsk = Msg.DataSource
    Dsk.SaveToObject Strm, "_Stream"
    Strm.SaveToFile Fn, adSaveCreateOverWrite
End Sub




Caveat:  I am not a VB programmer... don't pretend to be... and never wanna be.  This was just something I needed to do and this was the only way I could quickly figure out how to do it.

#    Comments [2] |
 Tuesday, January 09, 2007

Adding A Second Hard Drive To My Dell Laptop

I have a Dell Inspiron 600m laptop and I currently use Windows XP as the base Operating System with Ubuntu (GNU/Linux) running in a Virtual Machine (VMWare) on top of it.  This generally works really well... but I want to natively run Linux (instead of hosting it inside a VM) and I'm not crazy about the idea of partitioning my primary hard disk.

So what to do?

Dell offers a "2nd Hard Drive Module" to overcome this.  It is a replacement for the default CD tray that sits in the side of the laptop.  It fits in the same slot as the CD tray but inside it has an IDE adapter and hard drive mounting brackets.  This lets you use any 2.5" IDE Hard Drive as a second disk in your laptop... very cool.

I went with a Hitachi Travelstar E7K80 drive to put inside it (7200 RPM, 80 gig).



note: the "2nd Hard Drive Module" is horribly expensive from Dell directly, but I picked one up on eBay for under 50 bucks.

#    Comments [0] |

Python - Formatted Dates and Times

I am not sure why, but every time I need to use some formatted dates or times in Python, I end up spending about 20 minutes going through the docs and reading up on the datetime module, which leads to more confusion.

So for my own clarity, here is how we do it using only the time module:

>>> import time
>>> print(time.strftime("%m/%d/%y %H:%M:%S", time.localtime()))
01/09/07 12:17:25

All of the formatting options for strftime() can be found here: http://docs.python.org/lib/module-time.html
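
For the record, the datetime module accepts the same format codes, so here is a small sketch of the equivalent calls.  FORMAT and the variable names are just for illustration, and the fixed date is the timestamp from the example above.

```python
from datetime import datetime

FORMAT = "%m/%d/%y %H:%M:%S"

# Format the current local time (the text changes on every run).
now_text = datetime.now().strftime(FORMAT)

# Format a fixed moment, so the result is predictable.
fixed = datetime(2007, 1, 9, 12, 17, 25)
fixed_text = fixed.strftime(FORMAT)  # '01/09/07 12:17:25'

# strptime() goes the other way: parse the text back into a datetime.
parsed = datetime.strptime(fixed_text, FORMAT)
```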

#    Comments [0] |
 Wednesday, January 03, 2007

Dojo JavaScript Toolkit with ASP.NET

Dojo is a free/open source JavaScript toolkit.  I wanted to add some of its eye candy to one of my ASP.NET 2.0 applications, so I integrated the Fisheye menu (a menu that balloons out, similar to the launcher on OS X).






Here is how I did it:


First I downloaded Dojo and created a 'dojo' directory under my main project directory.  I dropped dojo.js and the entire Dojo 'src' directory here.

Then in my C# codebehind (.aspx.cs), I added this to the Page_Load event:

protected void Page_Load(object sender, EventArgs e)
{
    HtmlGenericControl Include = new HtmlGenericControl("script");
    Include.Attributes.Add("type", "text/javascript");
    Include.Attributes.Add("src", "dojo/dojo.js");
    Page.Header.Controls.Add(Include);

    HtmlGenericControl Include2 = new HtmlGenericControl("script");
    Include2.Attributes.Add("type", "text/javascript");
    Include2.InnerHtml = "dojo.require('dojo.widget.FisheyeList');";
    Page.Header.Controls.Add(Include2);
}

(I do this in the codebehind because I am using a Master Page and I need to access the HTML header from an individual content page.)


Then inside my ASP.NET page (.aspx), I added this div:

<div class="fisheyelist">
    <div dojoType="FisheyeList"
        itemWidth="80" itemHeight="80"
        itemMaxWidth="200" itemMaxHeight="200"
        orientation="horizontal"
        effectUnits="2"
        itemPadding="10"
        attachEdge="center"
        labelEdge="bottom"
        conservativeTrigger="false"
    >
        <div dojoType="FisheyeListItem"
             onclick="window.location = 'item1.aspx';"
             caption="Item 1"
             iconsrc="img/item1.png">
        </div>
        <div dojoType="FisheyeListItem"
             onclick="window.location = 'item2.aspx';"
             caption="Item 2"
             iconsrc="img/item2.png">
        </div>
        <div dojoType="FisheyeListItem"
             onclick="window.location = 'item3.aspx';"
             caption="Item 3"
             iconsrc="img/item3.png">
        </div>
    </div>
</div>



.. and it works.

#    Comments [0] |

Python - Find And Replace A String In Every File In A Directory

The Python Cookbook has a recipe to find and replace a string in every file in a directory.

I needed to do something like this today, so I cleaned up the script a little to make it [hopefully] a little more pythonic:


#!/usr/bin/env python
# replace a string in multiple files

import fileinput
import glob
import sys
import os


# both search_text and replace_text are required; directory is optional
if len(sys.argv) < 3:
    print('usage: %s search_text replace_text [directory]'
          % os.path.basename(sys.argv[0]))
    sys.exit(1)


stext = sys.argv[1]
rtext = sys.argv[2]
if len(sys.argv) == 4:
    path = os.path.join(sys.argv[3], '*')
else:
    path = '*'


print('finding: %s and replacing with: %s' % (stext, rtext))


# fileinput cannot open directories, so keep regular files only
files = [f for f in glob.glob(path) if os.path.isfile(f)]
for line in fileinput.input(files, inplace=1):
    if stext in line:
        line = line.replace(stext, rtext)
    sys.stdout.write(line)
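
If you would rather call this from other code than from the command line, here is a rough sketch of the same idea as a function, without fileinput.  It is written for Python 3.  The name replace_in_dir is made up for this example, and it assumes every file in the directory is plain text in the default encoding.  Like the script above, it only looks at the top level of the directory.

```python
import os
import tempfile


def replace_in_dir(directory, search_text, replace_text):
    """Replace search_text with replace_text in every regular file directly
    inside directory.  Returns the names of the files that were changed."""
    changed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        with open(path) as f:
            original = f.read()
        updated = original.replace(search_text, replace_text)
        if updated != original:
            # only rewrite files that actually contain the search text
            with open(path, "w") as f:
                f.write(updated)
            changed.append(name)
    return changed


# Try it on a throwaway directory so nothing real gets modified.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "a.txt"), "w") as f:
    f.write("hello world\n")
with open(os.path.join(demo_dir, "b.txt"), "w") as f:
    f.write("nothing to see\n")

changed_files = replace_in_dir(demo_dir, "world", "python")
```

Files that do not contain the search text are never rewritten, so their timestamps stay as they were.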


#    Comments [0] |

Tim Bray on Simplicity

Best Tim Bray quote:
"Don Park makes his blog go fast by applying WhirlyCache to the DAO layer, slipping in a transaction layer to reduce database integrity corruption, and using aspect-oriented programming technology via the Spring framework, with the help of Java annotations to mark transactional methods and classes. Yow! My approach is to have Apache serve static data out of the filesystem. Whatever; faster is better."

often simplicity trumps everything, eh?

#    Comments [0] |
 Tuesday, January 02, 2007

Free Books On Technology Subjects

This site has a collection of links to freely downloadable versions of some really good programming and computer science books:

www.techbooksforfree.com

#    Comments [0] |
 Friday, December 22, 2006

Schools of Software Testing

Cem Kaner just posted an incredible piece of writing about the Schools of Software Testing.

Especially intriguing were his explanations of "paradigms" in a generic sense, and references to Thomas Kuhn.

good stuff.


#    Comments [0] |
 Monday, December 18, 2006

Got Master's? Yup

I just finished my Master's Degree at Boston University and will be receiving my Diploma next month!  This is a big relief for me, as I have been working on it part time for the past 3.5 years while working full time as a software engineer.

I now have a Master of Science in Computer Information Systems (CIS).

The coursework included:
  • Advanced Java Programming
  • Architecture and Design of Multi-tiered Systems
  • Database Design and Implementation for Business
  • Data Communications and Computer Networks
  • Grid Computing
  • Human-Computer Interface Design with .NET
  • Information Systems Analysis and Design
  • IT Project Management
  • Operating Systems
  • Web Application Development

Finished with a 3.73 GPA.  w00t!

#    Comments [0] |
 Thursday, December 14, 2006

DevHouse Boston

I was at the first DevHouse Boston last Saturday.  What a great idea.. Have a bunch of talented and inspired hackers show up... serve up free wifi and pizza... let people talk and throw ideas around... break into small groups... and knock out working code the same day.

I didn't get to stay until the end, but I spent time working with a great group working on a cool tool.

I hope DevHouse becomes a regular thing.

Two initial observations gleaned from this group:  Python and Ruby own.. and Macs are everywhere.

#    Comments [0] |
 Tuesday, December 12, 2006

Java In Nuclear Reactors?

This is pretty funny (or scary)

I always thought this excerpt from the Java license was odd:

"not designed or intended for use in the design, construction, operation or maintenance of any nuclear facility."

So it made me do a double-take when I read this:

Sun Solaris Grid Powers Next Generation Nuclear Reactor Design from the Department of Energy

from the article:

"The solution includes more than 230 Sun Solaris Servers powered by AMD  Opteron processors; and, more than 12 Terabytes of Sun StorEdge 6320 storage, the Solaris 9 operating system, Java Enterprise System and Java development software"

I guess the DOE likes to keep things exciting :)

#    Comments [0] |
 Thursday, December 07, 2006

C# - Read Contents Of A Randomly Selected File

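// requires: using System; using System.IO;
string[] fileList = Directory.GetFiles(@"c:\temp");
Random random = new Random();
string fileName = fileList[random.Next(fileList.Length)];
string contents;
using (StreamReader sr = File.OpenText(fileName))
{
    contents = sr.ReadToEnd();  // the reader is closed when the block exits
}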



#    Comments [0] |
 Thursday, November 30, 2006

Feed Reading in Firefox 2

From Ben Goodger (Googler and Lead Firefox Dev) on RSS in Firefox:

"... Firefox philosophy of having enough features, not too many or too few. In general, we felt that RSS reader was a very personal choice to be made by the user, and that we did not want to compete with existing RSS readers, which are very competent in a variety of ways. Rather, we wanted to allow users to easily subscribe to feeds using their favorite reader."

I love this feature.. it lets me quickly subscribe to feeds through Bloglines.

I think Firefox nailed this one.  A feed reader belongs in a plugin if you choose to use your browser for such tasks.  Keep the core simple and extensible.

#    Comments [0] |
 Friday, November 24, 2006

Python, IDEs, and Drones

Python is a very popular programming language with adoption and advocacy from many corporations, and large factions of open source programmers using it extensively.  However, in the world of "corporate drone programming", it is still pretty niche. 

Have a look at this indication of popularity among programming languages:
TIOBE Programming Community Index

One thing I like about Python is the simplicity it strives for.  I find myself writing all my code in SciTE, a simple text editor, rather than a full blown IDE.

I always looked at this as a strong point for dynamic languages.

Over in the cult of corporate drone programmers, static languages (C++, C#, Java) are the norm, and life is spent inside an IDE.


from Robert on comp.lang.python:

"Flat Web/DB programming is one major field where programmer masses are born.  The other big one is RAD-GUI/DB programming. This field is probably still wide open. Best tooled Borland RAD systems are going down meanwhile because of the stiff compiler language. Programmers look around for the next language & toolset. Python is the language - but with Python there is again a similar confusion around IDE's and GUI-libs. There is no really good IDE (but fat ones). And the major gui libs there are not Python, but are fat sickening layers upon layers upon other OO-langs."

Not that I necessarily want Python to become the next default language for drones, but it makes me think about further adoption and mainstreamability of Python and other dynamic languages (which typically aren't as well suited to the features of many IDEs).

#    Comments [0] |
 Thursday, November 23, 2006

SOAP and REST - Conceptually

After reading thousands of articles about SOAP vs. REST, I was more confused about everything than convinced of anything.

Finally, this quote  from Stefan Tilkov made the conceptual difference between SOA(P) and REST very clear to me:
"In REST, you have lots and lots of resources all supporting the same interface; in SOA(P) (at least the wide-spread paradigm), you have few endpoints all supporting different interfaces."

#    Comments [0] |