Dates as they were meant to be in Python and Common Lisp

I shouldn't find it surprising by now, but I do. Every time I have to do something potentially non-trivial with Python I have this wonderful feeling that this is how things were meant to be.

This afternoon I was trying to figure out where my money goes. I use Python to do that: I wrote a short program that parses the data coming from the bank in CSV format, bins the expenses in categories, and outputs average values separating core and non-essential expenses.

I had a date string field in the data, but up to now I had not bothered to parse it. Today, however, I wanted to filter out entries older than a given number of months. It turned out to be reassuringly simple.

Parsing the date

The dateutil parser might already be there if you are on OS X. If it is not, it is easy to install [1].

Assuming that your dates are in the date_strings array and that you are not American you just have to

from dateutil.parser import parse as parse_date
dates = [parse_date(ds, dayfirst=True).date()
         for ds in date_strings]

This will take care of dates of the form "13-04-2010", and also "23/5/2011". It will also take care of dates the way they should always be written, "2011-03-28". If you are American you might want to remove the dayfirst=True.
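If dateutil is not available, a rough stand-in using only the standard library can cope with the three formats mentioned above. This is only a sketch: it is not how dateutil works internally, and the names parse_date_stdlib and DAY_FIRST_FORMATS are made up for this example. It simply tries a fixed list of strptime formats in order.

```python
from datetime import datetime

# Hypothetical list of accepted layouts: ISO first, then day-first variants.
DAY_FIRST_FORMATS = ("%Y-%m-%d", "%d-%m-%Y", "%d/%m/%Y")


def parse_date_stdlib(date_string, formats=DAY_FIRST_FORMATS):
    """Return a datetime.date, trying each format in turn (assumed helper)."""
    for fmt in formats:
        try:
            return datetime.strptime(date_string.strip(), fmt).date()
        except ValueError:
            continue  # this layout did not match, try the next one
    raise ValueError("unrecognized date: %r" % date_string)


date_strings = ["13-04-2010", "23/5/2011", "2011-03-28"]
dates = [parse_date_stdlib(ds) for ds in date_strings]
```

Unlike dateutil, this accepts nothing beyond the listed layouts, so any new format coming from the bank means adding an entry to the list.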

We've come to take these things for granted, but I think you'll agree with me that this code is a beautiful thing.

First, it's already there or easy to install, so you'll use the same thing for all your programs. I've just checked my Common Lisp projects and I've found three different parse-date functions; they tend to look like

(defun parse-date (date)
  "Date is YYYY-MM-DD, returns universal time."
  (unless (string-equal date "")
    (apply #'encode-universal-time
           (append '(0 0 0)
                   (nreverse (mapcar #'parse-integer
                                     (ppcre:split "-" date)))))))

which is OK, but it requires the (great) regular expression library to be part of your project, and it fails for separators other than "-". And it is one of three. Sure, I could have made a library and always used the same function, but designing good libraries is hard, and I've never bothered for this.

Second, the list comprehension is a very clean solution. It is not side-effect free, but almost: whatever happens in the comprehension expression stays there until the result is bound to dates.
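One side effect this remark may refer to: in Python 2 the loop variable of a list comprehension stays bound after the comprehension finishes, while in Python 3 the comprehension has its own scope. A small check, assuming Python 3 (the variable names are only for illustration):

```python
ds = "untouched"
date_strings = ["13-04-2010", "23/5/2011"]

# The comprehension reuses the name ds for its loop variable.
lengths = [len(ds) for ds in date_strings]

# Python 3: the outer ds is still "untouched".
# Python 2 would have left ds bound to the last element, "23/5/2011".
print(ds, lengths)
```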

Filtering by dates

Say you have stored your dates in the date field of an object, and you want to extract from an array, expenses, those entries whose date is less than months_old months ago. You'd do something like this:

from datetime import date
today = date.today()
days_old = months_old*30
new_expenses = [e for e in expenses
                if (today - e.date).days < days_old]

This is not exact, since it treats every month as 30 days, but it got the job done for my purposes. And it is, you will agree, rather clear and succinct.
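If the approximation ever bothers you, the cutoff can be computed in calendar months with the standard library alone. The following is a sketch under a few assumptions: months_ago, newer_than and the Expense tuple are hypothetical names, and when the target month is too short the day is clamped to its last day (so one month before March 31 is February 28 in 2011).

```python
import calendar
from collections import namedtuple
from datetime import date

# Hypothetical stand-in for whatever object holds an expense.
Expense = namedtuple("Expense", ["date", "amount"])


def months_ago(day, months):
    """Return the date `months` calendar months before `day`."""
    # Count months since year 0 so that divmod handles year boundaries.
    total = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Clamp the day to the length of the target month.
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def newer_than(expenses, months_old, today=None):
    """Keep the expenses dated strictly after the cutoff."""
    cutoff = months_ago(today or date.today(), months_old)
    return [e for e in expenses if e.date > cutoff]


expenses = [Expense(date(2011, 10, 23), 12.5),
            Expense(date(2011, 11, 21), 40.0),
            Expense(date(2011, 7, 1), 7.3)]
new_expenses = newer_than(expenses, 2, today=date(2011, 12, 6))
```

Passing today explicitly makes the result reproducible; leaving it out falls back to date.today(), as in the snippet above.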

In Common Lisp

This would be about as clean in Common Lisp, assuming that you store your dates in universal time (you would store them as decoded times if you cared about time zone). First you'd probably define an output function to test what you are doing:

(defun format-date (ut)
  (multiple-value-bind (s min h d m y) (decode-universal-time ut)
    (declare (ignore s min h))
    (format nil "~A-~10,2,'0,R-~10,2,'0,R" y m d)))

;;; Try it out:
(let ((dates (mapcar #'parse-date '("2011-10-23"
                                    "2011-11-21"
                                    "2011-7-1"))))
  (mapcar #'format-date dates))
;;; => ("2011-10-23" "2011-11-21" "2011-07-01")

Filtering then is just as simple,

(defun filter-dates (dates months)
  (let ((s-in-months (* 3600 24 30 months))
        (today (get-universal-time)))
    (remove-if #'(lambda (d) (> (- today d) s-in-months)) dates)))

(let ((dates (mapcar #'parse-date '("2011-10-23"
                                    "2011-11-21"
                                    "2011-7-1"))))
  (mapcar #'format-date (filter-dates dates 2)))
;;; => ("2011-10-23" "2011-11-21")

Date range

You can also use the built-in min and max functions on your dates, as in

print("From %s to %s" % (min([e.date for e in expenses]),
                          max([e.date for e in expenses])))

which is not very efficient (it traverses the expenses array twice), but its clarity is hard to beat.
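If the double traversal ever matters, a single pass is easy enough to write. A minimal sketch, where date_range is a hypothetical helper name and not something from the standard library:

```python
from datetime import date


def date_range(dates):
    """Return (earliest, latest) from an iterable of dates in one pass."""
    iterator = iter(dates)
    try:
        earliest = latest = next(iterator)
    except StopIteration:
        raise ValueError("date_range() needs at least one date") from None
    for d in iterator:
        if d < earliest:
            earliest = d
        elif d > latest:
            latest = d
    return earliest, latest


first, last = date_range([date(2011, 10, 23),
                          date(2011, 7, 1),
                          date(2011, 11, 21)])
print("From %s to %s" % (first, last))
# prints: From 2011-07-01 to 2011-11-21
```

With the expenses from before it could be called as date_range(e.date for e in expenses). For a handful of bank entries the min and max version is still the one I would write.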

Learning more

There are plenty of good resources available online for free, so you probably don't need to buy anything. The one book I have found useful is the Python Essential Reference [2], by David M. Beazley: it is surprisingly easy to find in it what you are looking for.

If you want to learn Lisp you need two books: ANSI Common Lisp, by Paul Graham, to understand what it is about and enjoy one of the best Computer Science books around (one in which you'll see how to implement a system of object oriented programming on top of Common Lisp in a single short chapter); and Practical Common Lisp, by Peter Seibel, also available online, to understand the actual details of how to go about writing Common Lisp programs in modern systems.

Footnotes:

[1] Thank you to roopeshv, who pointed out to me that dateutil is actually not part of the standard library.

[2] Disclaimer: I do get a cut from your Amazon purchase. Thank you very much for your support.

Juan Reyero Barcelona, 2011-12-06
 
