goldb.org home

AS OF MAY 2008, THIS BLOG IS NO LONGER BEING UPDATED.
Visit the new blog at: http://coreygoldberg.blogspot.com



 Thursday, March 29, 2007

Python - Remove Duplicate Items From a Sequence

Say you have a sequence like:

[1, 1, 2, 2, 2, 3, 4, 4, 4]

... and you want a sequence with the duplicates removed, so that each item appears only once, like:

[1, 2, 3, 4]


Here is a function to do it:

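def remove_dups(seq):
    # dict keys are unique, so each distinct item is stored only once
    # (the items have to be hashable for this to work)
    x = {}
    for y in seq:
        x[y] = 1
    # list() gives a real list on Python 3, where keys() returns a view
    u = list(x.keys())
    return u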
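
The function above goes through a dict, and before Python 3.7 a dict made no promise about key order, so the result could come back shuffled. If the order matters, here is a minimal sketch that keeps the first occurrence of each item. The name remove_dups_ordered is mine, not from the original post, and it assumes the items are hashable:

```python
def remove_dups_ordered(seq):
    """Return the unique items of seq as a list, in first-seen order."""
    seen = set()    # items already emitted; membership tests are fast
    result = []
    for item in seq:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(remove_dups_ordered([7, 4, 2, 7, 3, 1, 1, 4]))  # [7, 4, 2, 3, 1]
```

The extra set costs a little memory, but it keeps each lookup cheap while the list remembers the order.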


or a one-liner (it works on any interpreter, as long as seq is a list or another sliceable sequence):

u = [x for i, x in enumerate(seq) if x not in seq[:i]]
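
A one-liner like this checks each item against the items kept so far, which gets slow on long sequences (the work grows with the square of the length). It does have one thing going for it: it only needs ==, not hashing, so the same idea works for items that can't be dict keys, such as lists. A rough sketch of that idea as a plain loop, where remove_dups_unhashable is a made-up name for illustration:

```python
def remove_dups_unhashable(seq):
    """Remove duplicates using only equality tests, keeping first-seen order."""
    unique = []
    for item in seq:
        # 'in' on a list compares with ==, so unhashable items are fine
        if item not in unique:
            unique.append(item)
    return unique


pairs = [[1, 2], [3, 4], [1, 2], [5, 6], [3, 4]]
print(remove_dups_unhashable(pairs))  # [[1, 2], [3, 4], [5, 6]]
```

For hashable items the dict or set approaches are the better choice; this is only worth it when hashing is not an option.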



update: in the comments below, some other ways were suggested.

with 'set' (the order of the result is not guaranteed), like this:

u = list(set(seq))

or with a dictionary (on Python 3.7 and later this keeps the first-seen order), like this:

u = list(dict.fromkeys(seq))
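
To make the difference between the two suggestions concrete, here is a small sketch, assuming Python 3.7 or later (where dicts keep insertion order). The variable names via_set and via_dict are just for illustration:

```python
seq = [7, 4, 2, 7, 3, 1, 1, 4]

# A set removes duplicates but makes no promise about order,
# so sort the result before comparing or printing it.
via_set = list(set(seq))
print(sorted(via_set))  # [1, 2, 3, 4, 7]

# dict.fromkeys keeps the first occurrence of each key in insertion order
# (guaranteed from Python 3.7 on); list() turns the keys into a real list.
via_dict = list(dict.fromkeys(seq))
print(via_dict)  # [7, 4, 2, 3, 1]
```

Both need hashable items. If you don't care about order, either one is fine; if you do, the dictionary version is the safer pick on a modern interpreter.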

Comments [4]
Thursday, March 29, 2007 3:07:53 PM (Eastern Standard Time, UTC-05:00)
An easier solution is

uniqseq = list(set(seq))

since sets cannot have duplicates, converting the list to a set and back again removes the duplicates from the sequence.

Scott
Thursday, March 29, 2007 5:22:15 PM (Eastern Standard Time, UTC-05:00)
cool.. thanks scott.. I didn't know sets could be used like that :)
Thursday, March 29, 2007 6:07:47 PM (Eastern Standard Time, UTC-05:00)
The following also preserves the order, but is probably trying too hard.

>>> l = [7, 4, 2, 7, 3, 1, 1, 4]
>>> tmp = {}
>>> [ x for x in l if (tmp.get(x,0) ^ tmp.setdefault(x,1))]
[7, 4, 2, 3, 1]
>>>

- Paddy3118
Friday, March 30, 2007 7:02:55 PM (Eastern Standard Time, UTC-05:00)
and if you really want to use the dict, make the code shorter:

>>> l = [7, 4, 2, 7, 3, 1, 1, 4]
>>> uniqlist = dict.fromkeys(l).keys()
>>> uniqlist
[1, 2, 3, 4, 7]
>>>
fs111