Weekly Head Voices #154: It’s full of flowers!

A view from the West Coast National Park on Langebaan with Schaapen Island visible. No, we were never a Dutch colony.

This was the week from Monday September 10 to Sunday September 16.

Nerd stuff

I fought with VTK renderer window reparenting on three different platforms. Suffice it to say that 2018 is probably also not going to be the year of the Linux desktop.

Serendipitously (seems to be a theme) I came across UMAP, a great new technique for dimensionality reduction which functions in the same space (weak math pun, sorry) as t-SNE.

My first impressions are great because UMAP is fast, it can be trained, and I really enjoyed this recording of its introduction at SciPy 2018:

Outdoorsy stuff

The highlight of my week was undoubtedly the weekend visit to the West Coast National Park to go greet the brand new flowers of spring.

During my morning run I was greeted by a herd of Eland antelope.

Although enormous, they are wary of humans, especially ones running across the savannah in their general direction.

In stark contrast, the ostrich male and female I then ran into were quite vicious, running fairly aggressively to and fro across the hiking path before me, huffing and puffing. They probably thought that I was a threat to their young.

These birds are not to be trifled with (see for example this section on Wikipedia), but I had to push on, so we played the waiting and shuffling game for a few minutes before I could continue.

At least I knew for sure that I would have the privilege of taking an entirely different route home.

Sometimes one’s arrival on the west coast is perfectly timed, and other times not at all, just like life. This time, the flowers were out in full.

There were brilliant fields of yellow, orange and purple, up and down the mountain-sides.

As if the flowers were not sufficient, we were treated to stunning views of the Grecian-blue sea, and to sunsets like these:

Weekly Head Voices #153: pH < 7 dreams.

Looking back at the week from Monday September 3 to Sunday September 9, I present to you the following memories and after-effects.

Aphex Twin never left us

I serendipitously ran into T69 Collapse, the brand new track and video by Aphex Twin.

In the grand tradition of WHV intro art, I have embedded the video above.

Whether you’re a fan or not, I think it’s worth sitting through this one, preferably with headphones on and the video in full screen.

Pro-tip: This is not one of those tracks where the whole thing can be more or less predicted by viewing the first minute. There’s a thing at 1:55 and a second thing at 3:14.

I had to wonder whether the 3:14 was intentional. We’re not much into our biblical references over here as you might know, but you have to recall that Aphex Twin is the guy who, already back in 1999, hid his face in the spectrogram of a music track called:


\[\Delta M_i^{-1} = -\alpha \sum\limits_{n=1}^N D_i [n] \left[\sum\limits_{j \in C[i]} F_{ji} [n-1] + Fext_i [n^{-1}]\right]\]

That’s the actual name of the track (#2 on the famous Windowlicker EP), although most people (plebs!) refer to it as just Function or Equation. I got sucked down that rabbit hole last night, but no-one on the internet seems to know the true meaning of the equation. Please ask RDJ if you ever run into him.

Anyways, I have embedded \(\Delta M_i^{-1} = -\alpha \sum\limits_{n=1}^N D_i [n] \left[\sum\limits_{j \in C[i]} F_{ji} [n-1] + Fext_i [n^{-1}]\right]\) below for your listening and viewing pleasure. Aphex Twin’s face appears at 5:30.

APFS encryption vs Samsung hardware encryption effective SSD speed

I ran benchmarks on my external Samsung T3 SSD comparing the speed of encrypted APFS to unencrypted APFS with Samsung’s hardware-based full disk encryption.

I used AmorphousDiskMark, BlackMagic Disk Speed Test and plain old iostat whilst copying 30GB of files to and from the disk.

There will probably soon be a detailed blog post over on vxlabs.com about this, but I’ll give you the skinny here:

  • It’s hard to get benchmarks right. BlackMagic gave wildly varying results depending on how many times I let it run its benchmark for example.
  • APFS’s software encryption looks like it causes a performance hit ranging from 5 to about 10%, with outliers in both directions.
  • Emacs can calculate over columns of data, for example from iostat’s standard out, using a simple M-x calc-grab-rectangle and M-x calc-vector-mean.
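For the non-Emacs readers, the same column-mean trick can be sketched in a few lines of Python. This is only a minimal sketch: the sample text, its column layout and the helper name column_mean are made up for illustration, and the numbers are not the benchmark results from above.

```python
from statistics import mean

# Hypothetical iostat-style sample (KB/t, tps, MB/s per interval).
# These values are invented for illustration, not measured results.
SAMPLE = """\
    KB/t  tps  MB/s
  512.00  400  200.00
  512.00  380  190.00
  512.00  420  210.00
"""


def column_mean(text: str, column: int) -> float:
    """Average one whitespace-separated column, skipping non-numeric rows."""
    values = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) <= column:
            continue  # row is too short to contain this column
        try:
            values.append(float(fields[column]))
        except ValueError:
            continue  # header or other non-numeric row
    return mean(values)


mean_mb_per_s = column_mean(SAMPLE, 2)
print(mean_mb_per_s)
```

Skipping rows that fail to parse as numbers is the lazy equivalent of selecting a rectangle that excludes the header line.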

Brave browser and the Basic Attention Token (BAT): This could be big. Or not. It’s at least interesting.

Brave is a new(ish) browser also based on the Chrome engine.

I knew they were doing something with cryptocurrency, and paying or getting paid for the consumption of content and/or advertising, but I was, as you can see, quite vague on the details.

What I learned last week taking it for a quick spin is the following:

Brave out of the box is massively privacy-focused. Without installing any plugins, it blocks every single advertisement and tracking cookie known to humankind. It also automatically switches to HTTPS wherever that’s possible.

More interestingly, in Brave you can opt in to “Brave Payments”, which looks like it might soon be renamed to Brave Rewards, but don’t quote me on that.

One part of this system is that you as a user contribute a set amount of BAT tokens (these are tokens on the Ethereum chain) per month. At the end of each month, Brave will pay out your tokens to the websites that you visited, based on the amount of time you spend on each site.

In this way, publishers can get recompensed for their content in hard cash, without having to resort to advertising. (It does look like Brave also supports the model where advertisers can pay, in BAT tokens of course, for your eyeball time.)
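To make the time-based payout idea concrete, here is a minimal sketch of a purely proportional split. It is an assumption on my part that the split is simply proportional to time spent; Brave's actual algorithm may well weigh other factors. The function name split_contribution and the site names are hypothetical.

```python
from typing import Dict


def split_contribution(monthly_bat: float,
                       seconds_by_site: Dict[str, float]) -> Dict[str, float]:
    """Split a monthly BAT budget across sites in proportion to time spent.

    Assumption: a plain proportional split. This is NOT Brave's real
    payout algorithm, only an illustration of the idea described above.
    """
    total = sum(seconds_by_site.values())
    if total == 0:
        # No browsing time recorded: nobody gets paid.
        return {site: 0.0 for site in seconds_by_site}
    return {site: monthly_bat * secs / total
            for site, secs in seconds_by_site.items()}


# Hypothetical month: 10 BAT budget, three made-up sites.
payout = split_contribution(
    10.0,
    {"blog.example": 3600, "news.example": 1800, "video.example": 600},
)
print(payout)
```

With these made-up numbers the site that got 60% of the attention gets 60% of the tokens, which is the whole point of the scheme.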

Brave already has 4 million monthly active users (MAU).

If they’re able to grow this user base, and get a significant portion to participate in the payment system, this could be a game changer. Imagine being able to pay your favourite content creators in this seamless way, and being able to switch off ads in the process!

RunAlyze where have you been all my life?

I publish my runs to Strava, as I have a bunch of friends there, and I like the idea of a social network where you have to pay with a bucket of sweat before you’re allowed to say anything.

However, I was also relying on Strava to keep track of my shoe mileage. Recently, it started losing the miles I put on my Xero Genesis sandals (the most unforgiving shoes in the universe), and I was not able to coax the system into correctly tracking those terrible, terrible kilometres.

Because I use HealthFit to push my data to Strava, I took a look at some of its other endpoints and then, again extremely serendipitously, ran into:

RUNALYZE

It’s a site made by two running nerds (and it really shows) from Germany.

It keeps track of my shoes (the goal of this… exercise, bad pun, sorry) but the authors have also implemented a bunch of metrics from academic papers, some metrics of their own, and they show tables of your data sliced and diced in many different ways ON ALL FOUR WALLS of their website.

<Dr Evil voice>It’s breathtaking.</Dr Evil voice>

Anyways, if you’re a running nerd too, you should probably take a peek.

Fin

See you soon brothers and sisters. I am grateful for our time together.

 

Weekly Head Voices #150: The Road not Taken.

Photo of a cotula lineariloba flower, taken by GOU#1, age 12.

This edition of the WHV covers the week from Monday, July 23 up to and including Sunday, July 29.

Running update

Strava says I’ve just passed the 300km threshold in my Luna Mono 2 sandals.

It also says I’ve done 27km in my Xero Genesis sandals, or as I have begun to call them, Xero Tolerance.

You make one mistake, and something will break. You do get to keep all the bloody pieces.

In any case, when I started on this barefoot-style / natural running adventure, I had subconsciously set myself the limit of 200km before evaluating the success of the experiment.

At 200km, the experiment was still unsuccessful (different parts of feet and ankles were taking turns complaining) so I moved the threshold to 300km, with the plan to move it to 400km if required.

I call this The Stubborn Scientific Method(tm): You keep running the experiment (harr harr) until it says what you want it to say.

To be fair, in this specific case an injury would have stopped (and still can stop) the experiment. Most fortunately the muscles, bones and tendons in my feet, ankles and calves, although complaining quite audibly, have held up.

This past Sunday I did a long(ish) run where it felt for the first time like my feet and ankles had finally toughened up enough (and perhaps my form had also improved slightly) to just keep on propelling me forward quietly and efficiently.

Together with the brilliant sunny winter morning conditions, this conspired to reconfigure my face machine into a rather long-lasting grin.

I am carefully optimistic that I might be able to make this specific adventure a more permanent one, and that makes me really happy.

The Emacs Section

NERD-ALERT. SKIP TO THE NEXT SECTION IF YOU ARE NOT INTO TEXT EDITORS!

A friend from work sent me a ZIP file with research data.

I was super surprised that I could easily decompress the ZIP file using Emacs Dired (Dired is of course the file-manager built into Emacs, doh), but that there was no easy way to mark and extract specific files from the archive.

I found an SO answer with a piece of Emacs Lisp code that someone had put together and integrated it with my Emacs.

It worked, but it didn’t default to the opposite Dired file-list pane as all commander-style tools should do, and by default it re-created relative paths, which is the opposite of the default in most two-pane commanders I know.

As is the wont of Emacs users, I reshaped the code ever so slightly to work like I thought it should.

Shaping Emacs Lisp code has a pleasant fluid feeling to it. Code is data, code is configuration, data flows through code.

I’m telling you this story, because it was a nice little reminder of one of the reasons I like this software so much.

You can find my modified version of archive-extract-to-file.el as a GitHub gist.

The Odd Bits of Interesting News Section

  • Differentiable Image Parameterizations, a beautiful machine learning article on Distill that surveys and showcases different techniques for generating beautiful images with deep learning. These networks sort of learn to see in order to solve specific tasks, but you can tickle them in different ways to get them to show you the insides of their visual circuitry, and it’s quite beautiful.
  • The Prophylactic Extraction of Third Molars: A Public Health Hazard is an article which was published all the way back in 2007. It makes the claim that at least two thirds of wisdom tooth extractions are unnecessary. One could say that their only function is to… extract your money. BA DUM TSSSSS! To that I would like to add: WHY DENTISTRY WHY? HAVE YOU NOT HURT US ENOUGH?!
  • A colleague at work emailed this TechCrunch post about a 3D printed neural network that diffracts light going through in order to do its trained inference work on incoming images. Although it’s a retro-futuro-mind-bending idea to do it with a whole neural network, and it smacks of hell-yeah-this-is-what-scifi-promised-me-that-AI-would-look-like, I could not help but recall a certain Very Flat Cat telling us about this sort of passive light-based computation almost 20 years ago.

The Poetry Section

GOU#1 had to select an English poem to recite for class.

From the depths of my memory bubbled up The Road not Taken by Robert Frost.

I had forgotten how much subtlety and recognisable human complexity this poem was able to pack into such a petite little frame. If you have the time, read the analysis linked above after spending some time with the poem itself.

Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;

Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,

And both that morning equally lay
In leaves no step had trodden black.
Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.

Friends, no matter which paths you take this week, I hope that we may meet again.

Weekly Head Voices #149: I forgot to proof-read this.

Part of the Sunday morning trail. Although I really enjoy these, I’m at my happiest running down antelope on the savannah. Antelope strictly-speaking not required, but those wide open plains on the other hand…

This, the one hundred and forty ninth edition of the Weekly Head Voices, covers the week from Monday July 16 to Sunday July 22 of the year 2018.

This week, we have Apple Watch running adventures, deep learning in production (finally), yet another focus tip and finally a YouTube poetry reading.

Enjoy!

The Apple Watch, Vitality and You

On Monday, I became the owner of a brand new Apple Watch 3, FOR FREE(ish).

I feel that two points are worth mentioning:

  1. Having one’s work MacBook unlock automatically as one prepares to put one’s hands on the keyboard, with a sweet little unlock sound emitting from one’s watch, is much more fun than I had expected.
  2. One was looking forward to using third party running apps on the watch, such as iSmoothRun which does real-time reporting of cadence, which can be shown together with a number of other stats on a number of configurable screens a la Garmin. One has had to cancel these plans, because Vitality, the shadowy organisation responsible for the FOR FREE(ish) nature of the watch, only recognises runs submitted by the built-in Workouts app.
    • The September watchOS update will include runtime (haha) cadence, which is great. However, some technical system for the support of third party apps would have been even better. I’ll live.
    • Runs logged with the built-in Workouts app can be easily and automatically submitted to other platforms, such as Strava, where many of my running peeps hang out, and even to one’s own Dropbox in FIT format, with the HealthFit iOS app, a very reasonable once-off purchase.

DeepLearning Inside(tm)

On Friday, we shipped a new version of the most important work project I am currently involved in.

Again I feel that two points are worth mentioning:

  1. We now also have deep learning, albeit a humble example, out in actual production. I was starting to feel a little left out. Anonymous shout-out (because top secret) to the team members who made this happen!
  2. They say one should never deploy or ship on Friday. Because I come from the I-won’t-do-what-you-tell-me generation, I cut the final release on Friday evening after the traditional weekend-starter braai.
    • To be honest, this was only necessary because I had promised our client that we would release, and it was only possible because we have a fairly good test-suite, with end-to-end being most crucial in this specific scenario, and a checklist-style release procedure.

SoBSoDSiT-CIPWOB-FBA

As part of my chaotic but ever-evolving constellation of systems for maintaining work focus, I have renamed the shorter focus blocks approach to the short-but-specially-defined-so-that-completion-is-possible-within-one-block focus blocks approach (SBSDSTCIPWOB-FBA).

This adds the incentive of a small but probable shot of dopamine at the end of the focus block, and sometimes even leads to its unwitting extension by the woefully undersized (not to mention super lazy) rider sometimes sitting atop my mental elephant.

It sometimes feels like I’m slowly reinventing GTD.

(This blog post is an emotional roller coaster ride for me. This is the first time I’m feeling something.)

I used to be a fan of GTD when I still believed that my function in life was to answer emails really quickly, and master multi-tasking.

Since then however, I’ve slowly had to come to the realisation that, at least in my case, the amount of email processed is more or less exactly inversely correlated to the actual value that I produce.

The impotence of proof-reading

The following poetry reading made various subsets of my neurons fire in extremely pleasant ways.

I hope that you experience similar effects. See you next time!

Weekly Head Voices #148: Data stylist.

Ridiculously fun trail in Paarl somewhere. (Photo taken by Trail Friend #1. Trail Friend #2 cropped from picture, because no permission to appear on the internets!)

This post covers the week from Monday July 9 to Sunday July 15.

The business part of my week was unfairly dominated by far too much after-work obsessing over programming languages, with which I seem to have an unhealthy (or perhaps not) obsession.

I will externalise some of these thoughts further down in this post.

I’m starting with a weekend / running update, which should be reasonably safe for non-nerds to read. However, after that, the nerd dial will go up to 11 with stuff about tools and programming languages right up to the end of the post.

I would have wanted to use the adjective “face-melting”, but I’m not sure if any intensity of nerdery could ever reach that level.

We can dream.

Weekend running update

Most fortunately the weekend had other plans and supplied us with at least 2.5 parties, the first of which even culminated in a ridiculously fun trail run in the mountains on the winter morning after.

The winter morning sun was just perfect, the company was great, and I had forgotten all forms of performance tracking devices at home.

Readers with bionic eyes might notice the Lunas on my feet.

I have now run just over 260km in them, but, in a surprise twist to the regular readers of this blog, my biological equipment has still not yet completely adjusted to the new style of locomotion.

The latest victim seems to be one of Tom, Dick and Harry, the tendons running under the medial malleolus of my left foot, also known as that big knob on your inside ankle. Tom (the primary suspect in this case according to Trail Friend #1 who is knowledgeable with regard to these matters, being a running foot surgeon and all), Dick and Harry are also known as the *T*ibialis posterior, flexor *D*igitorum longus and the flexor *H*allucis longus.

They currently have to work extra hard to stabilise my feet while running, because, you know, no shoes.

Because doing this thing was not hard enough already, and because the Lunas are perhaps still a bit too cushiony, and because my friend the Very Flat Cat forgot that I’m very suggestible after 11:00 in the morning when my prefrontal cortex takes the rest of the day off, I am now also the very shy owner of a pair of Xero Genesis running sandals:


The soles are only 5mm thick, and quite hard, being rated for a few thousand miles and all. The upshot of this is that one’s feet have to work even harder than in the Lunas.

My first run in these was amazing: I could feel my feet reacting to every little pebble, and my running style having to adapt even more to the terrain.

However, there was a price to pay for all of that additional terrain feel (and the fact that I took a much longer maiden run than I should have): The next day, the tendons in my feet felt even more (ab)used than usual.

WITH GREAT POWER COMES GREAT RESPONSIBILITY, it seems.

Due to these shoes being so powerful, I have had to resign myself to introducing Xero running far more gradually than I had initially thought.

Vacation-based-thinking-driven tool sharpening aka The WHV 2018 Data Science Toolbox(tm).

During the previously blogged-about Mpumalanga vacation, the lack of alarms, devices, and other work accoutrements resulted in there being ample time for staring-into-space-grade thinking sessions.

During one of these thinking sessions, I realised that I had somehow neglected my data science toolbox for a while.

At some point a few years back, I was so into IPython notebooks (what has now become Jupyter) that I used them as my main work lab notes modality.

However, in the meantime I had fallen slightly out of love with the computational notebook style of data programming, because I had begun to develop doubts about its role in the analysis pipeline.

interlude 1: jupyter notebooks are nice for initial data exploration, and they’re especially useful for remote computation with embedded graphics. However, that initial momentum of discovery risks devolving into an unwieldy monolith of code snippets, data transformations and experiments. There’s a fine line to be walked between flexible experimentation on the one hand, and version-controlled, time-stamped, permutational and scientific rigour on the other.

interlude 2: I have to apologise for using the term “data science” in a non-comedic context. In spite of the inherent humour, it has turned into a usable blanket term for computational data understanding.

Due to my growing doubts in the order of Jupyter, and due to being occupied with less traditionally data sciencey work projects, I had unfortunately let my data science toolbox gather perhaps a bit too much dust.

Slightly more worrying than falling out of love with the Jupyter Notebooks (I still like them, I’m just not that madly in love anymore), was the more specific issue that I’d even let the datavis parts get a bit dusty.

Anyways.

Although I should probably write a more complete post about this, here is the list of ingredients of the official 2018 WHV Data Science Toolbox(tm):

Programming language and library ecosystem: Python.

This language, in spite of its shortcomings, dominates the data science / machine learning world thanks to its STELLAR ecosystem.

numpy, pandas, scipy, scikit-*, tensorflow, pytorch, keras, cython… this snowball has turned into a pretty sizeable planet.

For this reason, it would be hard to justify any other choice for data science.

However, since I’ve been seeing more of Lisp and the rest of the ever-expanding programming language landscape, I can see (Python’s shortcomings as a programming language) clearly now.

In terms of interactive programming, Python beats the majority of practical programming languages, with Common Lisp being one notable exception.

However, it’s not functional enough, which engenders unnecessarily imperative, side-effecting code. More specifically, it’s not expression-oriented.
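A tiny sketch of what "not expression-oriented" means in practice: in Python, if is a statement that yields no value, so the result has to be smuggled out via assignment. The conditional expression is the partial exception. In expression-oriented languages such as rust and f#, an if or a match is itself a value. The function names and the pace threshold below are made up for illustration.

```python
def classify_statement_style(pace_min_per_km: float) -> str:
    # `if` is a statement: it produces no value, so we assign
    # to a variable inside each branch and return it afterwards.
    if pace_min_per_km < 5.0:
        label = "fast"
    else:
        label = "easy"
    return label


def classify_expression_style(pace_min_per_km: float) -> str:
    # The conditional expression does evaluate to a value directly,
    # but it only covers the simplest two-branch case.
    return "fast" if pace_min_per_km < 5.0 else "easy"


print(classify_statement_style(4.5), classify_expression_style(6.0))
```

Both functions compute the same thing, but only the second reads as a single dataflow from input to output.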

More about this slightly further down. Maybe.

Datavis: Anything, as long as it’s Vega or Vega-Lite.

I spent a few years of my life wrangling d3.js, down to INNARD-LEVEL.

Mike Bostock’s idea of data-element-joins is genius, and internalising it was intellectually satisfying.

I thought that these d3 skillz would serve me well for decades (that’s WEEKS in javascript-time), but it turns out that there’s a new, even smarter kid in town.

(if it’s any consolation, the new kid can be considered the grand-child of d3.js.)

vega and vega-lite are so-called visualization grammars, or visualization DSLs (domain specific languages).

The upshot is that one codes up a chart, or a whole set of linked charts and their interactive behaviour, using a language that was designed for this purpose.

This chart code can be easily shared, or converted into interactive visual representations that can be embedded in applications, online or in print quality documents.

Genius!

With Altair, you can even send your pandas dataframes to vega and vega-lite charts all from the comfort of your slightly defective Python armchair.
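To give a feeling for what "chart code" means here, below is a minimal sketch of a vega-lite spec built by hand as plain data, using only the Python standard library so that it runs without Altair or pandas installed. The data rows are invented, and the v5 schema URL is an assumption (a later version than what was current when this was written). Altair produces JSON of this general shape from a dataframe.

```python
import json

# Hypothetical data: weekly running distance, invented for illustration.
rows = [
    {"week": 1, "km": 21.0},
    {"week": 2, "km": 27.5},
    {"week": 3, "km": 30.0},
]

# A vega-lite chart is just data: which mark to draw and how
# fields map onto visual channels.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": rows},
    "mark": "line",
    "encoding": {
        "x": {"field": "week", "type": "ordinal"},
        "y": {"field": "km", "type": "quantitative"},
    },
}

chart_json = json.dumps(spec, indent=2)
print(chart_json)
```

The resulting JSON can be pasted into the online Vega editor, which is what makes these charts so easy to share.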

Development Environment: PyCharm.

You knew it was not going to be Jupyter Notebooks, but you probably expected it to be Emacs.

Well it’s not. Surprise!

The remote interpreter support in PyCharm enables me to connect to a Python virtual environment anywhere on the planet, which I often do.

The JetBrains wizards have optimised the remote communication of code intelligence, so completion, documentation and general code understanding is almost indistinguishable from that on a completely local project.

Being able to step through a remote PyTorch neural network training iteration, or any other remote Python algorithmics, with the PyCharm debugger is insightful.

Two notable drawbacks are visualization and long-running jobs.

For the long-running jobs I do tend to use Jupyter Notebooks or when at all possible mosh, which is amazing. However, because the primary modality is not the notebook, my code is versioned and organised into separate libraries which I can call into from notebook or mosh.

For visualization, I connect to the altair chart server via SSH pipe, dump the chart to the unison-synced project, and/or use a Jupyter Notebook.

The rest.

Of course you use Postgres on an SSD for your data, and of course you know enough SQL to make short work of most of the heavy-weight transformations often required at the start of your data crunching pipeline.
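As a small illustration of letting SQL do the heavy lifting before Python sees the data, here is a minimal sketch of a group-and-aggregate step. To keep it runnable anywhere it uses an in-memory SQLite database from the standard library as a stand-in for Postgres; the table name, columns and rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (shoe TEXT, km REAL)")
conn.executemany(
    "INSERT INTO runs VALUES (?, ?)",
    # Made-up rows, not real run data.
    [("Luna Mono 2", 10.0), ("Luna Mono 2", 12.5), ("Xero Genesis", 5.0)],
)

# Aggregate in the database, so only the summary crosses into Python.
totals = conn.execute(
    "SELECT shoe, SUM(km) AS total_km "
    "FROM runs GROUP BY shoe ORDER BY total_km DESC"
).fetchall()
conn.close()

print(totals)
```

The same GROUP BY query should work unchanged on Postgres; only the connection code would differ.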

For all of my lab notes, reports, books, papers and blog posts, I use Emacs Org mode.

LaTeX math with live preview, live code snippets, SVG graphics, bibtex references, export to anything. This is one of the best ways to document your science.

Programming language addiction update.

I spend far too much time obsessing over programming languages, old and new.

For the past two weeks, I wasted even more precious time than usual reading up about programming languages.

Because I would really like to spend more of my time on other, perhaps more valuable activities, I’ve been trying to better define what it is I’m actually looking for.

Of course there is no single best programming language, but a whole set of good languages that map in intricate ways to different problem domains.

In spite of this, I have been pining for a language with, in order of importance:

  1. A Functional Programming DNA, with which I’m referring to a) expression-orientedness, b) a preference for pure functions, and at a higher level, c) the modelling of reality as more or less explicit dataflows.
  2. Interactive programming, with Common Lisp being the textbook example of this.
  3. Great tooling and IDEs, meaning first-class support by something from JetBrains, Microsoft or Emacs.
  4. Great concurrency and parallelism stories.
  5. A great library ecosystem.
  6. Modest memory use.

Having just explicitly written this down for the first time (!! – it was consuming so much glucose just being kept amorphously swirling around in my brain) I can now mentally map some of my most recent language dalliances to these points.

go

This language is far too simple for my taste, but probably really great for teams.

I did recently take a more serious look when setting up a Telegram bot using tbot, and was amazed at how simple it was building web services like these using goroutines and channels.

Go satisfies points 3 to 6 from the list above. Makes sense that I decided to file this experiment away under “check when you need to put a webservice together REALLY QUICKLY”.

rust

When I saw that rust, surprisingly, is an expression-oriented language, I flew through the O’Reilly Programming Rust book I had bought previously as part of a bundle.

Evaluating rust by the list above, we award it a fractional 1 because expression-oriented, 3 due to the JetBrains plugin amongst others, 4(ish) – great memory safety, but compared to clojure, concurrency and parallelism stories still have much room to grow, a solid 5 thanks to cargo and a very strong 6.

I filed this one away under “re-evaluate whenever you reach for your trusty C++”. (also, actix-web looks amazing for super high performance microservices.)

f#

You didn’t see this one coming, did you?

Very strong 1 to 5 and a solid 6.

WAT?!

I’m currently working my way through Domain Modeling Made Functional by Scott Wlaschin, who is also the author of the brilliant f# for fun and profit website.

In addition to f# hitting all 6 of my 2018 PL-requirements above, I’m slowly starting to see the advantages of having a real type system under the hood.

f# is a member of the ML-family of functional languages, which have their origin in Lisp (some very naughty person removed all of the lovely parentheses I’m afraid…).

I hope that at some point I’ll have the opportunity to use f# in anger, at which point I’ll be able to report more concretely as to its suitability.

The End

Let me know in the comments what you think about any of this, or anything else.

I hope to meet you again in a few days, here or elsewhere.