Data cleaning and anonymizing with GPT-3.5

Note added on 2025-07-28: It’s been two years of programming with and for LLMs. This article sounds so naïve now.

Say you made a website in which customers buy personalized gifts. Each gift comes with a message, written by the customer in whatever language the customer wants. The use of grammar, punctuation and capitalization in the messages is often creative. You would like to be able to offer reasonably normative messages to your customers. You would also like to store a fully anonymized version of the messages: replace all proper names with a [proper_name] placeholder, place names with [place_name], dates with [date], times with [time], and geographical coordinates with [coordinates]. ...

August 6, 2023

The wasted talent in the meeting room

How learning about the brain could improve meetings

Go directly to the proposal for conducting meetings if you are in a hurry.

Meetings are a topic close to my heart. I have spent a large part of my professional career in them, and I must acknowledge that not all of this time has been that productive. I have kept an eye out for other ways to merge people’s ideas ever since I realized the inescapable fact that a large part of my corporate life would be spent sitting in meeting rooms listening to people talk to each other. ...

December 18, 2011

Prototype your way out of uncertainty

Tinkering: prototype your way out of uncertainty

Ten years ago I found myself near Portland talking to Bill, a brilliant colleague who was working at one of HP’s printer divisions. We were both worried about the lack of tools available for the particular set of problems we were working on: figuring out the algorithms that decide which drops to print, and when, in an ink-jet printer. It turned out we had both been thinking along the same lines: we wanted a set of libraries for image processing, written in C++ for speed, and accessible from Python for ease of prototyping. We were both avid users of the available toolkits for manipulating images (the classic pbm, ImageMagick), but there was nothing out there that could meet our rather special needs. ...

November 25, 2011

A graphic explanation of the Bayes Theorem

I enjoyed how section 3.16 of the Stanford Artificial Intelligence class presented Bayes’ theorem. Instead of giving a formula and expecting the students to apply it, they gave us a problem that Bayes’ theorem would solve and expected, I believe, that we would figure it out ourselves. Counting-challenged as I am, it took me a while to find a way of solving it that was simple enough that I could be reasonably sure of my results. It turned out to be a very interesting detour. ...

November 1, 2011

23 visits a day

Note added in 2021. When I wrote this 10 years ago I had no idea that what I then saw as a fascinating side project would become my company, and sustain my family for years. I was wondering, back then, who would want these wonderful maps I was making. The answer ended up surprising me beyond anything I could have imagined.

The sky as seen from South Goa

I’ve been obsessed for the last 24 hours, ever since I put online the version of http://greaterskies.com that I thought was the first one to deserve promotion. I knew that I shouldn’t really care; I knew that, most likely, nothing would happen; I knew that I should expect to be ignored on Hacker News. And I’ve done it because it was fun. I’ve had a good time figuring out how to build the puzzle of Common Lisp, Python and JavaScript that serves the star chart PDFs. And hosting is free. But still. It’s hard not to give in to what I would naturally do at this stage: add features at a frantic pace, assuming that the next thing is the one that will make the difference. ...

October 3, 2011