Weekly Head Voices #117: Dissimilar.

The week of Monday February 13 to Sunday February 19, 2017 might have appeared pretty boring to any interdimensional, and also to any more mundane, onlookers.

(I mention both groups, because I’m almost sure I would have detected the second group watching, whereas the first group, being interdimensional, would probably have been able to escape detection. As far as I know, nobody watched.)

I just went through my orgmode journals. They are filled with a mix of notes on the following mostly very nerdy and quite boring topics.

Warning: If you’re not an emacs, python or machine learning nerd, there is a high probability that you might not enjoy this post. Please feel free to skip to the pretty mountain at the end!

Advanced Emacs configuration

I finally migrated my whole configuration away from Emacs Prelude.

Prelude is a fantastic Emacs “distribution” (it’s a simple git clone away!) that truly upgrades one’s Emacs experience in terms of look and feel, and functionality. It played a central role in my return to the Emacs fold after a decade long hiatus spent with JED, VIM (there was more really weird stuff going on during that time…) and Sublime.

However, it’s a sort of rite of passage constructing one’s own Emacs configuration from scratch, and my time had come.

In parallel with Day Job, I extricated Prelude from my configuration, and filled up the gaps it left with my own constructs. There is something quite addictive about using emacs-lisp to weave together whatever you need in your computing environment.

To celebrate, I decided that it was also time to move my todo system away from todoist (a really great ecosystem) and into Emacs orgmode.

From this… (beautiful multi-platform graphical app)
… to this!! (YOUR LIFE IN PLAINTEXT. DUN DUN DUUUUUN!)

I had sort of settled with todoist for the past few years. However, my yearly subscription is about to end on March 5, and I’ve realised that with the above-mentioned Emacs-lisp weaving and orgmode, there is almost unlimited flexibility also in managing my todo list.

Anyway, I have it set up so that tasks are extracted right from their context in various orgfiles, including my current monthly journal, and shown in a special view. I can add arbitrary metadata, such as attachments and just plain text, and more esoteric tidbits such as live queries into my email database.
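For the non-Emacs nerds, here is a rough idea of what "extracting tasks from their context" amounts to. This is only a minimal Python sketch, not how orgmode's own agenda machinery works: the regular expression, the extract_tasks helper and the two tiny example files are all made up for illustration.

```python
import re

# Matches org headlines such as "** TODO Reply to supplier".
HEADLINE = re.compile(r"^(\*+)\s+(TODO|DONE)\s+(.*?)\s*$")

def extract_tasks(org_files):
    """Collect open TODO headlines from a mapping of file name -> org text."""
    tasks = []
    for name, text in org_files.items():
        for line_no, line in enumerate(text.splitlines(), start=1):
            match = HEADLINE.match(line)
            if match and match.group(2) == "TODO":
                # Remember where the task lives so we can jump back to its context.
                tasks.append((name, line_no, match.group(3)))
    return tasks

# Hypothetical example files, standing in for a monthly journal and a project file.
journal = """* 2017-02 journal
** TODO Reply to supplier
Some notes about the supplier.
** DONE Migrate away from Prelude
"""
project = """* Project X
*** TODO Try dissimilarity experiment
"""

org_files = {"journal.org": journal, "project.org": project}
for name, line_no, title in extract_tasks(org_files):
    print(f"{name}:{line_no}: {title}")
```

The point is that each task stays where it was written, and the "special view" is just a query over plain text.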

The advantage of having the bulk of the tasks in my monthly journal is that I am forced to review all of the remaining tasks at the end of the month before transferring them to the new month’s journal.

We’ll see how this goes!

Jupyter Notebook usage

Due to an interesting machine learning project at work, I had a great excuse to spend some quality time with the Jupyter Notebook (formerly known as IPython Notebook) and the scipy family of packages.

Because Far Too Much Muscle-Memory, I tried interfacing to my notebook server using Emacs IPython Notebook (EIN), which looked like this:

However, the initial exhilaration quickly fizzled out as EIN exhibits some flakiness (primarily broken indentation in cells which makes this hard to interact with), and I had no time to try to fix or work-around, because day job deadlines. (When I have a little more time, I will have to get back to the EIN! Apparently they were planning to call this new fork Zwei. Now that would have been awesome.)

So it was back to the Jupyter Notebook. This time I made an effort to learn all of the new hotkeys. (Things have gone modal since I last used this intensively.)

The Notebook is an awe-inspiringly useful tool.

However, the cell-based execution model definitely has its drawbacks. I often wish to re-execute a single line or a few lines after changing something. With the notebook, I have to split the cell at the very least once to do this, resulting in multiple cells that I now have to manage.

In certain other languages, which I cannot mention anymore because I have utterly exhausted my monthly quota, you can easily re-execute any sub-expression interactively, which makes for a more effective interactive coding experience.

The notebook is a good and practical way to document one’s analytical path. However, I sometimes wonder if there are less linear (graph-oriented?) ways of representing the often branching routes one follows during an analysis session.

Dissimilarity representation

Some years ago, I attended a fantastic talk by Prof. Robert P.W. Duin about the history and future of pattern recognition.

In this talk, he introduced the idea of dissimilarity representation.

In much of pattern recognition, it was pretty much the norm that you had to reduce your training samples (and later unseen samples) to feature vectors. The core idea of building a classifier is constructing hypersurfaces that divide the high-dimensional feature space into classes. An unseen sample can then be positioned in feature space, and its class simply determined by checking on which side of the hypersurface(s) it finds itself.
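To make the "which side of the hypersurface" idea concrete, here is a minimal sketch for the simplest possible case, a single flat hyperplane in a two-dimensional feature space. The weights, the bias and the class names "A" and "B" are hand-picked assumptions for illustration; a real classifier would learn them from training samples.

```python
def classify(sample, weights, bias):
    """Return the class by checking on which side of the hyperplane w.x + b = 0 the sample lies."""
    score = sum(w * x for w, x in zip(weights, sample)) + bias
    return "A" if score >= 0 else "B"

# Hypothetical hyperplane x0 + x1 - 1 = 0, chosen by hand rather than trained.
weights, bias = [1.0, 1.0], -1.0

print(classify([0.9, 0.8], weights, bias))  # score is positive, so class A
print(classify([0.1, 0.2], weights, bias))  # score is negative, so class B
```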

However, for many types of (heterogeneous) data, determining these feature vectors can be prohibitively difficult.

With the dissimilarity representation, one only has to determine a suitable function that can be used to calculate the dissimilarity between any two samples in the population. Especially for heterogeneous data, or data such as geometric shapes for example, this is a much more tractable exercise.
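As a small sketch of the idea: below, samples are plain strings, for which feature vectors are not obvious but a dissimilarity function (edit distance) is. Each sample can then be described by its dissimilarities to a few prototypes, or classified directly by its nearest labelled neighbour. The choice of edit distance, the prototype words and the labels are all my own assumptions for illustration, not anything from the talk.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]

def dissimilarity_vector(sample, prototypes, dissimilarity):
    """Represent a sample by its dissimilarity to each prototype."""
    return [dissimilarity(sample, p) for p in prototypes]

def nearest_label(sample, labelled, dissimilarity):
    """1-nearest-neighbour classification using only the dissimilarity function."""
    return min(labelled, key=lambda item: dissimilarity(sample, item[0]))[1]

# Hypothetical prototypes and labels.
labelled = [("kitten", "noun"), ("sitting", "verb")]
prototypes = [word for word, _ in labelled]

print(dissimilarity_vector("mitten", prototypes, edit_distance))
print(nearest_label("mitten", labelled, edit_distance))
print(nearest_label("fitting", labelled, edit_distance))
```

The dissimilarity vectors can in turn be fed to any ordinary feature-space classifier, which is part of what makes the representation attractive.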

More importantly, it’s often easier to talk with domain experts about similarity than it is to talk about feature spaces.

Due to the machine learning project mentioned above, I had to work with categorical data that will probably later also prove to be of heterogeneous modality. This was of course the best (THE BEST) excuse to get out the old dissimilarity toolbox (in my case, that’s SciPy and friends), and to read a bunch of dissimilarity papers that were still on my list.
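For categorical data, one of the simplest dissimilarity functions is the fraction of attributes on which two records disagree. The sketch below is not the measure I actually ended up using at work; the records and attribute values are invented, and it is written in plain Python so that it stands on its own, although the scipy.spatial.distance module offers ready-made metrics for this sort of thing.

```python
def simple_matching_dissimilarity(a, b):
    """Fraction of categorical attributes on which two records disagree."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def dissimilarity_matrix(records, dissimilarity):
    """Pairwise dissimilarities between all records, as a list of rows."""
    return [[dissimilarity(r, s) for s in records] for r in records]

# Hypothetical records with three categorical attributes each.
records = [
    ("red", "small", "metal"),
    ("red", "large", "metal"),
    ("blue", "large", "wood"),
]

matrix = dissimilarity_matrix(records, simple_matching_dissimilarity)
for row in matrix:
    print(["%.2f" % value for value in row])
```

Once data of another modality shows up, only the dissimilarity function has to change; the matrix and everything built on top of it stay the same.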

Besides the fact that much fun was had by all (me), I am cautiously optimistic, based on first experiments, that this approach might be a good one. I was especially impressed by how much I could put together in a relatively short time with the SciPy ecosystem.

Machine learning peeps in the audience, what is your experience with the dissimilarity representation?

A mountain at the end

By the end of a week filled with nerdery, it was high time to walk up a mountain(ish), and so I did, in the sun and the wind, up a piece of the Kogelberg in Betty’s Bay.

At the top, I made you this panorama of the view:

Click for the 7738 x 2067 full resolution panorama!

At that point, the wind was doing its best to blow me off the mountain, which served as a visceral reminder of my mortality, and thus also kept the big M (for mindfulness) dial turned up to 11.

I was really only planning to go up and down in brisk hike mode due to a whiny knee, but I could not help turning parts of the up and the largest part of the down into an exhilarating lope.

When I grow up, I’m going to be a trail runner.

Have fun (nerdy) kids, I hope to see you soon!

Moving 12 years of email from GMail to FastMail

In 2013, when it became clear, primarily through Edward Snowden’s heroic actions, that the level of snooping by the US and other governments was far greater than any of us would have thought, I moved all of my data out of the US and of course blogged about it (that blog post has been read almost 70000 times; I think for many people this is an important issue).

This included migrating 60000 emails away from my beloved GMail (I got my GMail invite from The Vogon Poet on August 24, 2004. At that time, you could only get GMail by invitation. It was pretty exciting! (I have emails from before 2004, back to ’93 or ’94, but those are in a backup archive somewhere.)) to the little Synology DS213j standing next to my desk at the time. This was all well and good behind the stable Dutch 100 Mbit/s down / 10 Mbit/s up cable connection I had, but when we decided to move back to South Africa, where home internet is a few years behind The Netherlands, I ended up having to pay for a virtual private server in Cape Town (to keep latency between me and my mail server manageable) and having to admin my own dovecot IMAP and postfix SMTP servers.

Initially this was workable, until the Nth time that I had to interrupt my real job (which has nothing to do with mail servers) to apply a security patch or get the VPS booting again after a botched kernel upgrade. Besides that, I had to deal with keeping my server out of over-enthusiastic spam blacklists and whatnot. Also, in spite of mu4e, I did end up missing the fast graphical GMail web interface.

So, it was with a great deal of tail between my legs that on June 10, 2015 (I have a lab journal, remember) I went right back to GMail. My mail setup, although pleasingly decentralised, was costing me too much time and hence actual money.

Fast forward to July 15, 2016 (there’s that lab journal again…) when, after receiving an email from Google asking me to indicate how exactly I would like them to use my data to customise adverts around the web, and after thinking for a bit about what kind of machine learning tricks I would be able to pull on you with 12 years of your email, I decided that I really had to make alternative plans for my little email empire.

Somehow FastMail came up and in one of those impulsive LET’S WASTE SOME TIME manoeuvres, I pressed the big red MIGRATE button!

The rest of this post is my mini-review of the FastMail service after almost 3 weeks of intensive use.

Importing mail from GMail

The main import & export window
IMAP migration configuration dialog

The Settings | Import & Export option in FastMail was easy to set up. It knows how to authenticate with GMail, even when you make use of two-factor authentication, like I do and you probably should.

The import takes place via the GMail IMAP interface. It’s important to remember that via the IMAP client, an email tagged in GMail with both important and info will appear in two different folders. Because of this, I did check the no duplicates checkbox, but still I noticed that my 15 GB FastMail evaluation mailbox was filling up more quickly than I would have expected.

After a support request, which was answered within minutes (bonus), I discovered the Quota Usage screen and could see that the duplicate detection did indeed not seem to work correctly during the import. Based on more tips from the support tech, I used the Mass delete or remove duplicates module (Settings | Folders | Scroll all the way to the bottom of the page) to delete thousands of duplicate emails during the import. The duplicates were indeed caused by emails appearing in multiple IMAP folders due to their GMail tags. Note: Friend and reader stefanvdwalt reported the exact same duplication issue during import, which in his case did push the mailbox over quota, so do keep an eye on this!
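If you would like to check for this kind of duplication yourself, the usual trick is to group messages by their Message-ID header across folders. The following is only a sketch under that assumption: I do not know how FastMail's own duplicate detection works, and the folder names and raw messages below are invented. In real life the raw messages would come from an IMAP client rather than from strings.

```python
from email import message_from_string

def find_duplicates(folders):
    """Map each Message-ID to the folders it appears in, keeping only repeated ones."""
    seen = {}
    for folder, raw_messages in folders.items():
        for raw in raw_messages:
            message_id = message_from_string(raw)["Message-ID"]
            if message_id is None:
                continue  # Without an identifier we cannot tell duplicates apart.
            seen.setdefault(message_id.strip(), []).append(folder)
    return {mid: names for mid, names in seen.items() if len(names) > 1}

# Hypothetical messages: raw_a carries two GMail tags, so IMAP shows it in two folders.
raw_a = "Message-ID: <a@example.com>\nSubject: Invoice\n\nBody"
raw_b = "Message-ID: <b@example.com>\nSubject: Hello\n\nBody"
folders = {"important": [raw_a, raw_b], "info": [raw_a]}

duplicates = find_duplicates(folders)
print(duplicates)
```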

After a day or so (during which I could email more or less normally) I received an import report from FastMail claiming that the import had been successful, except for this error:

Log: Fri Jul 15 17:49:17 2016; cpbotha/imap.gmail.com; Migrating folder Inbox -> Inbox
Log: Fri Jul 15 17:49:17 2016; cpbotha/imap.gmail.com; Creating local folder Inbox
Log: Fri Jul 15 17:49:17 2016; cpbotha/imap.gmail.com; Error migrating remote folder Inbox: Failed to create Inbox. IMAP Command : 'create' failed. Response was : no - Mailbox already exists

The import had managed to figure out that GMail Sent should map to FastMail Sent for example, but Inbox was probably too special to map in the same way. I fixed this by firing up my trusty Thunderbird, and using IMAP to drag and drop emails from my GMail Inbox to my shiny new FastMail Inbox.

In retrospect, I should have selected Create under new sub-folder in the IMAP migration configuration instead of Merge into existing folders. I discovered later that moving thousands of emails to a different folder is near instantaneous in the FastMail web-app.

What I like

Webmail speed

I live more or less at the southern tip of the African continent. My lowest latency connection with the rest of the internet is via undersea optic cable to Europe (about 140ms ping).

The FastMail web servers are in the USA, which is, as the ping flies, much further away. I was not expecting much from the webmail, but colour me surprised when I discovered that this felt subjectively faster than GMail (who have servers everywhere, even down here). Things remained snappy, even with all 50000 of my conversations imported.

As far as I can figure out, it seems that much of this is due to FastMail’s self-designed but open source IMAP-replacement called JMAP. JMAP has been designed for low latency, and for improved battery life. What it does differently is batch requests together, and it also has optimisations specifically for interactive webmail.
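To give an idea of what batching looks like, here is a sketch of a single JMAP request that both queries a mailbox and fetches the matching emails, using a back-reference so that the second call can use the result of the first without another round trip. The shape follows the JMAP specification as it was later published (RFC 8620 and RFC 8621), which may well differ from what the FastMail web-app was speaking at the time of writing; the account and mailbox ids are placeholders, and build_batched_request is my own helper name.

```python
import json

def build_batched_request(account_id, mailbox_id, limit=10):
    """Build one JMAP request that queries a mailbox and fetches the matching emails."""
    return {
        "using": ["urn:ietf:params:jmap:core", "urn:ietf:params:jmap:mail"],
        "methodCalls": [
            # Call 1: ask for the ids of the newest emails in the mailbox.
            ["Email/query", {
                "accountId": account_id,
                "filter": {"inMailbox": mailbox_id},
                "sort": [{"property": "receivedAt", "isAscending": False}],
                "limit": limit,
            }, "c1"],
            # Call 2: fetch details for those ids via a back-reference to call 1.
            ["Email/get", {
                "accountId": account_id,
                "#ids": {"resultOf": "c1", "name": "Email/query", "path": "/ids"},
                "properties": ["subject", "from", "receivedAt"],
            }, "c2"],
        ],
    }

# Placeholder ids; a real client would get these from the server's session object.
request = build_batched_request("account-placeholder", "inbox-placeholder")
print(json.dumps(request, indent=2))
```

With 140ms or more between me and the server, saving even one round trip per screen is quite noticeable.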

The web-app has full support for keyboard shortcuts, which increases the subjective perception of speed.

Webmail search

For my purposes, search in FastMail is on par with that of GMail. I can dig up any of my emails, back to 2004, in seconds.

FastMail advanced search interface

What’s also very useful, is that you can turn any search into a virtual folder.

Tech support

This is one area where Google really can’t hold a candle to FastMail. If something goes wrong with your GMail account (this hardly ever happens, but it’s possible) it’s almost impossible to get hold of any kind of official tech support. Here’s a recent story where a GMail user’s account was summarily terminated. There was probably some kind of ToS infringement, but the user has no idea what or why, and has lost all access to their emails and contacts database.

So far I’ve contacted FastMail tech support twice: Once during my email migration, and once to confirm the absence of the “quote selected text in reply” feature (discussed below). In both cases, I was helped by real humans who responded very quickly and courteously to my support requests.

Email and contacts (and calendar) out of Google’s view

I’m still of the opinion that Google makes fantastic and valuable products. However, with all of their data mining know-how and resources, one has to decide how much of one’s personal information one is willing to trade in for the use of these fantastic products.

With FastMail, I have been able to extricate my significant email archive (2004 to 2016, 50000 conversations) as well as my contacts database. I’m still making use of Google Calendar, because of bunches of sharing going on with family members, but I have the option of moving that out also.

By the way, the FastMail Calendar web interface is more than capable (and pretty enough) to replace Google’s version.

What I don’t like

Missing integrations: Todoist

GMail, being as popular as it is, has tonnes of integrations with other apps. In my case, I will really miss the Todoist for Gmail extension. With this, I had a mini-todoist window inside my GMail, and I could turn any email into a task at the click of a button (or the press of a shortcut).

Because FastMail email URLs seem to be persistent, I use the Todoist Chrome extension’s “Add to Todoist” context menu action to add the URL and email subject as a task. This is not as nice as the GMail-specific extension (the task goes immediately into the todoist inbox, without the possibility to edit metadata such as due date and tags).

Missing feature: Quote selection in reply

In Gmail and in Thunderbird, if you select text in an incoming email and then reply, that selected text is quoted in the reply email. Unfortunately, this feature is not available in the FastMail web-app, and they have no plans to implement it.

I use both the FastMail web-interface as well as Thunderbird, because of its great PGP email encryption and signature support (hey, find me on keybase, send me encrypted email!), so this issue is somewhat ameliorated. Still, it would have been nice.

Android app lag

I do have FastMail’s Android app on my telephone. The app is a Cordova / PhoneGap / CrossWalk style unit with real-time email push and notification via Google Cloud Messaging (this is a relatively energy-efficient way for Android phones to get push notifications and is natively supported by FastMail).

However, there is a lag of a few seconds when I open the inbox, so I prefer using the pro version of AquaMail, a great Android IMAP mail client. I have this set to 15 minute polling for new email, as IMAP IDLE (push, in other words) is not as battery efficient as GCM or Apple’s email push. Opening any folder or email in AquaMail is of course instantaneous, as the emails live on the phone.

That being said, I use the FastMail app for searching, which is just as fast and as effective as the web-app.

THAT being said, FastMail really needs to implement some sort of caching in the Android app for lightning fast folder and email access. (The FastMail app is quite attractive, I would prefer using it more.)

FastMail Android app Calendar screen, from the Google Play page.

Niggle: Creating an email alias / incoming route automatically creates a new sending identity

FastMail can manage the DNS for any of the custom domains that you assign to it, which is super useful if you don’t already have a DNS service.

I already make use of webfaction’s DNS for all of my domains, so I chose to add DNS records to designate fastmail as the official MX for those domains. (All of this is explained clearly in the FastMail help.)

When you do this, you have to create an email alias for each incoming address you would like to receive mail for (you can also create a catchall, but this could result in more spam arriving in your inbox). For each and every alias, FastMail automatically creates an outgoing (from address) identity. While this is usually quite convenient, I have quite a number of incoming addresses, but I only ever send from a subset of these addresses, so the drop-down list with sending identities became quite unwieldy.

I deleted all of the unnecessary identities. What would help, would be if FastMail were to implement most-used-at-the-top sorting for that drop-down.

Other noteworthy points

Domain setup

For my most important domains, I have set FastMail to be the MX. I have also performed the necessary SPF and DKIM setup: FastMail gives super useful feedback in its configuration screens to help you with this. For these domains, I send mail directly via the FastMail SMTP servers, and mail is delivered directly to FastMail servers. Nice and simple.
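If your DNS lives elsewhere, like mine, it can be handy to sanity-check the SPF record you have published. Below is a minimal sketch that checks whether an SPF TXT record contains an include: mechanism for a given domain. The host name spf.mailprovider.example is a made-up placeholder, not FastMail's real value (take that from their configuration screens), and the check is deliberately simplistic: it only recognises a plain or "+"-qualified include and does not follow nested records.

```python
def spf_includes(txt_record, domain):
    """Return True if an SPF TXT record has an include: mechanism for the given domain."""
    parts = txt_record.split()
    if not parts or parts[0].lower() != "v=spf1":
        return False  # Not an SPF version 1 record at all.
    wanted = f"include:{domain.lower()}"
    # A leading "+" is the explicit form of the default "pass" qualifier.
    return any(part.lower().lstrip("+") == wanted for part in parts[1:])

# Hypothetical record and provider domain, for illustration only.
record = "v=spf1 include:spf.mailprovider.example ?all"

print(spf_includes(record, "spf.mailprovider.example"))
print(spf_includes(record, "spf.someoneelse.example"))
```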

Domain setup feedback screen.

For some other email accounts I have with clients, FastMail supports POP fetch from and SMTP send via foreign servers.

iOS Push support

If you use any Apple iOS devices to read your mail, you’ll be pleased to know that FastMail, with help from the big A, fully supports iOS push. This means battery efficient real-time incoming emails to make it even more difficult for you to focus on That One Really Important Thing.

Android contact syncing with CardDAV

With Google Contacts, syncing on Android just works, and it works really well. To sync my contacts with FastMail’s Address Book instead, I bought the pro version of the CardDAV Android app for 24 South African Ront (that’s about EUR 1.5). This works as a sync provider, so once set up, the process is also pretty much transparent.

Final thoughts

So there you have it: A hopefully helpful story, with included mini-review, about my move from GMail to the FastMail service.

So far, my conclusion is that this is a service that is technically more than capable of replacing GMail, even for power users. Furthermore, FastMail’s primary (and in fact only) business model is to charge you money for making sure that you can keep on emailing like a boss. Together, this makes for an offer that I could not refuse.

P.S. Let me know in the comments if you would like me to add anything else to this post.

P.P.S. You can also join the lively Hacker News discussion of this post!

Weekly Head Voices #69: No sugar added.

This time, the head voices are echoing the span of time ending strictly on Sunday, April 27 at 23:59.

I have to break my rule and reach back past the start of that week however. On Wednesday April 16 I had quite a heavy sugar crash. After about 12 cups of coffee, each with a spoon of sugar (as per usual), some chocolates from the Stone Three sweetie jar during lunch, and two giant coconut crunches at about TU Delft sugar fix time (yes children, I do my best to commemorate the sugar fix, even at 11000 km distance from you), my energy levels dropped through the floor and no amount of coffee could get them close to normal again.

That’s when I decided to stop taking sugar.

On Thursday April 17 I went cold turkey. I’m not taking any table sugar at all, no cookies or sweets (ARGH), and I’m even steering clear of breakfast cereals. Pretty boring, I know. After more than a week of completely unscientific N=1 case “study” experience, I can report that:

  • It took some getting used to my coffee without any sugar.
  • NO MORE COOKIES. ARGH ARGH ARGH. COME CLOSER SO I CAN BITE YOU.
  • My perceived energy levels seem significantly more stable, and I remain all energetic until late at night. Sometimes I don’t sleep, because I run around in the neighbourhood making growling noises. Sometimes I wake up, miles away from home, with all kinds of gunk under my finger nails. Oh well.

On the topic of quitting, let’s talk about all of those lists we love so much. You should really go read Noeska’s presentation on Productivity, Project Management and Other Important Stuff in her latest status update blog post. Besides all of the Getting Things Done and Pull Yourself Together tools and systems she presents, I was happy to see her talk about the dangers of productivity tools on slide 23, and especially the “doing the right things vs doing things right” dilemma.

You see, I’ve been thinking much about this lately. Usually when I’m doing the most valuable and important things (designing and building new products, learning new programming languages, coming up with brand new ideas for artefacts to build) my email inbox starts overflowing and my todo system (currently todoist, which I do like) stagnates (my todoist karma is currently ZERO. I’m at KARMA ZERO damnit!!). Conversely, when I’m almost at inbox zero and my todoist is under control, it feels great, but I’m tired because I’ve spent all of that time taking care of a bunch of emails and mostly urgent but almost no important tasks.

Some people I’ve chatted with are hardcore enough to make the classification between important and urgent in their lists. However, when I see that list of tasks, my OCDs take over and I go into 100% reactive mode. NO ROOM FOR CREATIVITY.

I’m still thinking about how to solve this problem. I do think that the lists and the systems are really important, because some things do really need doing at certain points in time. For now, I’m still picking the three (or two, or one) most important things to do per day (see Noeska’s presentation, also see pro tip #2 in this 2011 post of mine). Also, what does work remarkably well for me, is maintaining a daily “done” or “I did it” list. Go read this, you can thank me later.

After all of that, the weekend took us to Vaalvlei, a picturesque wine farm just outside of Stanford:

Vaalvlei wine farm, just outside of Stanford.

Here we were treated to a super-exclusive wine tasting of the Vaalvlei Sauvignon Blanc, 2012 Shiraz Reserve, 2011 Shiraz, Shiraz port, and the top TOP secret Shiraz cognac right from the cask (don’t tell anyone, ok?):

Vaalvlei wine and cognac tasting

I can report that these hand-crafted wines and the cognac were all beautiful, but I trust that my friend De Wijnrecensent (aka the Tall Philosophical Neighbour! all secrets are revealed on this blog.) will have more to say about this in a few months’ time.

Enjoy the rest of the week kids!