
AS OF MAY 2008, THIS BLOG IS NO LONGER BEING UPDATED.
Visit the new blog at: http://coreygoldberg.blogspot.com



 Tuesday, November 07, 2006

Python - Removing Duplicates From A Sequence

Sequences (lists and tuples) are common data structures used in Python programming.

Here is a simple function that will remove duplicates from a sequence and return a sorted sequence of the unique items:


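def remove_dups(seq):
    # Dictionary keys are unique, so storing each item as a key drops duplicates.
    x = {}
    for y in seq:
        x[y] = 1
    # list() keeps this working where keys() returns a view rather than a list.
    u = list(x.keys())
    u.sort()
    return u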


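(Caveat: all the sequence elements must be hashable and support equality comparison. The sort also requires that the elements can be ordered against each other; mixing numbers and strings, as in the example below, sorts under Python 2 but raises a TypeError under Python 3.)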


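And another implementation, which tests membership in a list instead of hashing. It does not require hashable elements, but its running time is quadratic in the length of the sequence, so the dictionary version above is the better choice for large inputs: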

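def remove_dups(seq):
    # Build the list of unique items explicitly. An earlier version read the
    # list under construction via locals()['_[1]'], an undocumented interpreter
    # detail that not all Python versions provide.
    u = []
    for x in seq:
        if x not in u:
            u.append(x)
    u.sort()
    return u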
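
The two approaches can also be combined. The sketch below is only an illustration, not code from the original post, and the name remove_dups_any is hypothetical. It tries the dictionary approach first and falls back to the slower list scan only when an element turns out to be unhashable. It still assumes the elements can be sorted against each other.

```python
def remove_dups_any(seq):
    """Return a sorted list of the unique items in seq (hypothetical helper)."""
    try:
        # Fast path: dictionary keys are unique, but every item must be hashable.
        unique = list(dict.fromkeys(seq))
    except TypeError:
        # Slow path: an item was unhashable (for example a list), so compare
        # each item against the ones already kept.
        unique = []
        for item in seq:
            if item not in unique:
                unique.append(item)
    unique.sort()
    return unique
```

With this version a sequence of lists such as [[3], [1, 2], [1, 2]] is handled by the slow path, while plain numbers or strings take the fast path.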



Example using remove_dups from the Python interpreter:

>>> my_seq = [1, 1, 3, 1, 2, 2, 7.75, 'foo', 7.75, 'foo']
>>> print remove_dups(my_seq)
[1, 2, 3, 7.75, 'foo']
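
For comparison, the same result can be written more compactly with the built-in set type and sorted(). This is a minimal sketch rather than the code from this post: the name remove_dups_set is made up for illustration, and it assumes the elements are hashable and can be ordered against each other (so the example uses numbers only).

```python
def remove_dups_set(seq):
    # set() discards duplicates; sorted() returns a new sorted list.
    return sorted(set(seq))


print(remove_dups_set([1, 1, 3, 1, 2, 2, 7.75, 7.75]))
# prints [1, 2, 3, 7.75]
```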




Tim Peters has an excellent recipe in the Python Cookbook that dives into this much further:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52560