• Sort a dict’s keys by values

    Can we actually sort a Python dict? No, but we can sort a list that contains its keys and values.

    The simplest way to do this is to use the list’s sort method:

    >>> d = {'c': 100, 'a': 0, 'b': 10}
    >>> items = d.items()
    >>> items.sort(lambda x, y: cmp(x[1], y[1]))
    >>> items
    [('a', 0), ('b', 10), ('c', 100)]
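
    The cmp-style call above relies on Python 2, where d.items() returns a list and list.sort accepts a comparison function. A minimal sketch of the same comparison on Python 3, assuming functools.cmp_to_key stands in for the removed cmp argument (by_value is a hypothetical helper name, not something from the original post):

```python
from functools import cmp_to_key

d = {'c': 100, 'a': 0, 'b': 10}


def by_value(x, y):
    # Hypothetical comparator: negative, zero, or positive, like Python 2's cmp().
    return (x[1] > y[1]) - (x[1] < y[1])


# In Python 3, d.items() is a view with no .sort(), so copy it into a list first.
items = list(d.items())
items.sort(key=cmp_to_key(by_value))
print(items)  # [('a', 0), ('b', 10), ('c', 100)]
```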

    We can use the key argument to make the function call a bit shorter:

    >>> items = d.items()
    >>> items.sort(key=lambda i: i[1])
    >>> items
    [('a', 0), ('b', 10), ('c', 100)]

    We can also use the sorted built-in:

    >>> sorted(d.iteritems(), key=lambda i: i[1])
    [('a', 0), ('b', 10), ('c', 100)]
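
    If only the keys are needed, as the title suggests, the pairs can be skipped entirely. A short sketch, assuming Python 3, that sorts the keys directly and uses d.get as the key function:

```python
d = {'c': 100, 'a': 0, 'b': 10}

# Iterating a dict yields its keys; d.get maps each key to its value for comparison.
keys_ascending = sorted(d, key=d.get)
keys_descending = sorted(d, key=d.get, reverse=True)

print(keys_ascending)   # ['a', 'b', 'c']
print(keys_descending)  # ['c', 'b', 'a']
```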

    But what if we care about speed? According to Gregg Lind, the fastest solution uses operator.itemgetter (suggested in PEP 265, “Sorting Dictionaries by Value”) instead of a lambda function:

    >>> from operator import itemgetter
    >>> sorted(d.iteritems(), key=itemgetter(1))
    [('a', 0), ('b', 10), ('c', 100)]

    By his measurements, this version is 10x faster than the first three.
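
    Timings like this depend on the interpreter version and the size of the dict, so it is worth measuring on your own data. A rough sketch using the standard timeit module, assuming Python 3 (hence d.items() instead of d.iteritems()); the dict size and the number of runs are arbitrary choices, not the setup used in the benchmark cited above:

```python
import timeit
from operator import itemgetter

# Arbitrary test data: 10,000 keys with values in reverse order.
d = {str(i): 10000 - i for i in range(10000)}


def with_lambda():
    return sorted(d.items(), key=lambda i: i[1])


def with_itemgetter():
    return sorted(d.items(), key=itemgetter(1))


# Both variants must produce the same ordering before timing means anything.
assert with_lambda() == with_itemgetter()

for func in (with_lambda, with_itemgetter):
    seconds = timeit.timeit(func, number=20)
    print('%s: %.4f s for 20 runs' % (func.__name__, seconds))
```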

    About Python. Written by Pavlo Kapyshin on Sunday, August 29th.
  • pymongo and binary data

    Let’s imagine we need to save a zlib-compressed string.

    >>> import zlib
    >>> s = '...'
    >>> compressed = zlib.compress(s) # type(compressed) == str

    Here’s the problem: an exception is raised if we try to save it to the database as-is. pymongo works with unicode and requires strings to be valid UTF-8, while compressed is a str holding arbitrary bytes, so we can’t just .decode('utf-8') it.

    >>> collection.insert({'c': compressed})
    ---------------------------------------------------------------------------
    InvalidStringData                         Traceback (most recent call last)
    .../env/lib/python2.6/site-packages/pymongo/collection.pyc in insert(self, doc_or_docs, manipulate, safe, check_keys)
        211
        212         self.__database.connection._send_message(
    --> 213             message.insert(self.__full_name, docs, check_keys, safe), safe)
        214
        215         ids = [doc.get("_id", None) for doc in docs]
    InvalidStringData: strings in documents must be valid UTF-8
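
    The reason decoding is not an option can be seen without a database. A small sketch, assuming Python 3 (where zlib.compress takes and returns bytes rather than str); the sample text is made up for illustration:

```python
import zlib

s = 'some text worth compressing ' * 10
compressed = zlib.compress(s.encode('utf-8'))  # bytes, not text

try:
    compressed.decode('utf-8')
    decodable = True
except UnicodeDecodeError:
    # Compressed output is arbitrary binary data, so it is generally not valid UTF-8.
    decodable = False

print(decodable)

# The data itself is intact: decompressing gives back the original text.
assert zlib.decompress(compressed).decode('utf-8') == s
```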

    But we can use Binary from pymongo.binary:

    >>> from pymongo.binary import Binary
    >>> collection.insert({'c': Binary(compressed)})

    Now it works.
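
    Reading the value back is the reverse trip. A hedged sketch, assuming Python 3 and a newer PyMongo in which Binary is imported from bson.binary instead of pymongo.binary; pack and unpack are hypothetical helper names, and the bytes fallback exists only so the snippet runs without PyMongo installed:

```python
import zlib

try:
    from bson.binary import Binary  # assumption: a PyMongo release that ships the bson package
except ImportError:
    Binary = bytes  # stand-in so the sketch still runs without PyMongo


def pack(text):
    """Compress text and wrap it so it can be stored as binary data."""
    return Binary(zlib.compress(text.encode('utf-8')))


def unpack(blob):
    """Reverse of pack(): unwrap, decompress, and decode back to text."""
    return zlib.decompress(bytes(blob)).decode('utf-8')


doc = {'c': pack('hello ' * 100)}  # the dict that would be passed to an insert call
print(unpack(doc['c']) == 'hello ' * 100)  # True
```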

    About Python, NoSQL and MongoDB. Written by Pavlo Kapyshin on Monday, March 1st.

© 2006–2010 Pavlo Kapyshin