Weekly Head Voices #148: Data stylist.

Ridiculously fun trail in Paarl somewhere. (Photo taken by Trail Friend #1. Trail Friend #2 cropped from picture, because no permission to appear on the internets!)

This post covers the week from Monday July 9 to Sunday July 15.

The business part of my week was unfairly dominated by far too much after-work obsessing over programming languages, with which I seem to have an unhealthy (or perhaps not) obsession.

I will externalise some of these thoughts further down in this post.

I’m starting with a weekend / running update, which should be reasonably safe for non-nerds to read. However, after that, the nerd dial will go up to 11 with stuff about tools and programming languages right up to the end of the post.

I would have wanted to use the adjective “face-melting”, but I’m not sure if any intensity of nerdery could ever reach that level.

We can dream.

Weekend running update

Most fortunately the weekend had other plans and supplied us with at least 2.5 parties, the first of which even culminated in a ridiculously fun trail run in the mountains on the winter morning after.

The winter morning sun was just perfect, the company was great, and I had forgotten all forms of performance tracking devices at home.

Readers with bionic eyes might notice the Lunas on my feet.

I have now run just over 260km in them, but, in a surprise twist to the regular readers of this blog, my biological equipment has still not yet completely adjusted to the new style of locomotion.

The latest victim seems to be one of Tom, Dick and Harry, the tendons running under the medial malleolus of my left foot, also known as that big knob on your inside ankle. Tom (the primary suspect in this case according to Trail Friend #1, who is knowledgeable with regard to these matters, being a running foot surgeon and all), Dick and Harry are also known as the *T*ibialis posterior, flexor *D*igitorum longus and the flexor *H*allucis longus.

They currently have to work extra hard to stabilise my feet while running, because, you know, no shoes.

Because doing this thing was not hard enough already, and because the Lunas are perhaps still a bit too cushiony, and because my friend the Very Flat Cat forgot that I’m very suggestible after 11:00 in the morning when my prefrontal cortex takes the rest of the day off, I am now also the very shy owner of a pair of Xero Genesis running sandals:

(Image: a pair of Xero Genesis running sandals.)

The soles are only 5mm thick, and quite hard, being rated for a few thousand miles and all. The upshot of this is that one’s feet have to work even harder than in the Lunas.

My first run in these was amazing: I could feel my feet reacting to every little pebble, and my running style having to adapt even more to the terrain.

However, there was a price to pay for all of that additional terrain feel (and the fact that I took a much longer maiden run than I should have): The next day, the tendons in my feet felt even more (ab)used than usual.

WITH GREAT POWER COMES GREAT RESPONSIBILITY, it seems.

Due to these shoes being so powerful, I have had to resign myself to introducing Xero running far more gradually than I had initially thought.

Vacation-based-thinking-driven tool sharpening aka The WHV 2018 Data Science Toolbox(tm).

During the previously blogged-about Mpumalanga vacation, the lack of alarms, devices, and other work accoutrements resulted in there being ample time for staring-into-space-grade thinking sessions.

During one of these thinking sessions, I realised that I had somehow neglected my data science toolbox for a while.

At some point a few years back, I was so into ipython notebooks (what has now become jupyter) that I used them as my main work lab notes modality.

However, in the meantime I had fallen slightly out of love with the computational notebook style of data programming, because I had begun to develop doubts about their role in the analysis pipeline.

interlude 1: jupyter notebooks are nice for initial data exploration, and they’re especially useful for remote computation with embedded graphics. However, that initial momentum of discovery risks devolving into an unwieldy monolith of code snippets, data transformations and experiments. There’s a fine line to be walked between flexible experimentation on the one hand, and version-controlled, time-stamped, permutational and scientific rigour on the other.

interlude 2: I have to apologise for using the term “data science” in a non-comedic context. In spite of the inherent humour, it has turned into a usable blanket term for computational data understanding.

Due to my growing doubts in the order of Jupyter, and due to being occupied with less traditionally data sciencey work projects, I had unfortunately let my data science toolbox gather perhaps a bit too much dust.

Slightly more worrying than falling out of love with the Jupyter Notebooks (I still like them, I’m just not that madly in love anymore), was the more specific issue that I’d even let the datavis parts get a bit dusty.

Anyways.

Although I should probably write a more complete post about this, here is the list of ingredients of the official 2018 WHV Data Science Toolbox(tm):

Programming language and library ecosystem: Python.

This language, in spite of its shortcomings, dominates the data science / machine learning world thanks to its STELLAR ecosystem.

numpy, pandas, scipy, scikit-*, tensorflow, pytorch, keras, cython… this snowball has turned into a pretty sizeable planet.

For this reason, it would be hard to justify any other choice for data science.

However, since I’ve been seeing more of Lisp and the rest of the ever-expanding programming language landscape, I can see (Python’s shortcomings as a programming language) clearly now.

In terms of interactive programming, Python beats the majority of practical programming languages, with Common Lisp being one notable exception.

However, it’s not functional enough, which engenders unnecessarily imperative, side-effecting code.  More specifically, it’s not expression-oriented.
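To make the expression-oriented point a bit more concrete, here is a minimal sketch (the function names and pace thresholds are made up for illustration). In Python, `if` is a statement, so the first version has to bind a name in each branch; the conditional expression in the second version evaluates directly to a value, which is roughly the style that expression-oriented languages make the default.

```python
def describe_pace_imperative(minutes_per_km: float) -> str:
    # Statement style: each branch assigns to `label` as a side effect.
    if minutes_per_km < 5.0:
        label = "fast"
    elif minutes_per_km < 6.5:
        label = "steady"
    else:
        label = "trail shuffle"
    return label


def describe_pace_expression(minutes_per_km: float) -> str:
    # Expression style: the whole conditional evaluates to a value.
    return (
        "fast" if minutes_per_km < 5.0
        else "steady" if minutes_per_km < 6.5
        else "trail shuffle"
    )
```

Both functions return the same result; the difference is only in how the value is produced. Python's conditional expression gets unwieldy quickly, which is arguably part of the complaint.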

More about this slightly further down. Maybe.

Datavis: Anything, as long as it’s Vega or Vega-Lite.

I spent a few years of my life wrangling d3.js, down to INNARD-LEVEL.

Mike Bostock’s idea of data-element-joins is genius, and internalising it was intellectually satisfying.

I thought that these d3 skillz would serve me well for decades (that’s WEEKS in javascript-time), but it turns out that there’s a new, even smarter kid in town.

(if it’s any consolation, the new kid can be considered the grand-child of d3.js.)

vega and vega-lite are so-called visualization grammars, or visualization DSLs (domain specific languages).

The upshot is that one codes up a chart, or a whole set of linked charts and their interactive behaviour, using a language that was designed for this purpose.

This chart code can be easily shared, or converted into interactive visual representations that can be embedded in applications, online or in print quality documents.

Genius!
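To give an idea of what "coding up a chart" in a visualization grammar looks like, here is a minimal sketch that builds a Vega-Lite bar chart specification as a plain Python dictionary and serialises it to JSON, using only the standard library. The data values are hypothetical, and the schema version in the URL is an assumption; use whichever version your renderer expects.

```python
import json

# Hypothetical monthly running distances, purely for illustration.
runs = [
    {"month": "Jan", "km": 40},
    {"month": "Feb", "km": 55},
    {"month": "Mar", "km": 62},
]

# A Vega-Lite specification is plain JSON: data, a mark type, and
# encodings that map data fields to visual channels.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": runs},
    "mark": "bar",
    "encoding": {
        # sort=None serialises to null, which keeps the data order
        # instead of sorting the months alphabetically.
        "x": {"field": "month", "type": "ordinal", "sort": None},
        "y": {"field": "km", "type": "quantitative"},
    },
}

spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The resulting JSON is the whole chart: it can be pasted into any Vega-Lite renderer, checked into version control, or sent to a colleague.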

With Altair, you can even send your pandas dataframes to vega and vega-lite charts all from the comfort of your slightly defective Python armchair.

Development Environment: PyCharm.

You knew it was not going to be Jupyter Notebooks, but you probably expected it to be Emacs.

Well it’s not. Surprise!

The remote interpreter support in PyCharm enables me to connect to a Python virtual environment anywhere on the planet, which I often do.

The JetBrains wizards have optimised the remote communication of code intelligence, so completion, documentation and general code understanding are almost indistinguishable from those on a completely local project.

Being able to step through a remote PyTorch neural network training iteration, or any other remote Python algorithmics, with the PyCharm debugger is insightful.

Two notable drawbacks are visualization and long-running jobs.

For the long-running jobs I do tend to use Jupyter Notebooks or when at all possible mosh, which is amazing. However, because the primary modality is not the notebook, my code is versioned and organised into separate libraries which I can call into from notebook or mosh.

For visualization, it’s either connecting to the altair chart server via SSH pipe, dumping the chart to the unison-synced project, and/or a Jupyter Notebook.

The rest.

Of course you use Postgres on an SSD for your data, and of course you know enough SQL to make short work of most of the heavy-weight transformations often required at the start of your data crunching pipeline.
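As a rough illustration of pushing a transformation into SQL instead of doing it in Python, here is a minimal sketch. It uses the standard library's sqlite3 module with an in-memory database purely as a stand-in so that it runs anywhere; the table, columns and values are made up, and with Postgres the same kind of query would go through a database driver of your choice.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a real Postgres instance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (runner TEXT, km REAL)")
conn.executemany(
    "INSERT INTO runs (runner, km) VALUES (?, ?)",
    [("a", 5.0), ("a", 12.5), ("b", 8.0), ("b", 3.0), ("b", 10.0)],
)

# Let the database do the grouping and aggregation, so that only the
# small summary table crosses over into Python.
totals = conn.execute(
    "SELECT runner, COUNT(*) AS n_runs, SUM(km) AS total_km "
    "FROM runs GROUP BY runner ORDER BY runner"
).fetchall()
conn.close()

print(totals)
```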

For all of my lab notes, reports, books, papers and blog posts, I use Emacs Org mode.

LaTeX math with live preview, live code snippets, SVG graphics, bibtex references, export to anything. This is one of the best ways to document your science.

Programming language addiction update.

I spend far too much time obsessing over programming languages, old and new.

For the past two weeks, I wasted even more precious time than usual reading up about programming languages.

Because I would really like to spend more of my time on other, perhaps more valuable activities, I’ve been trying to better define what it is I’m actually looking for.

Of course there is no single best programming language, but a whole set of good languages that map in intricate ways to different problem domains.

In spite of this, I have been pining for a language with, in order of importance:

  1. A Functional Programming DNA, with which I’m referring to a) expression-orientedness, b) a preference for pure functions, and at a higher level, c) the modelling of reality as more or less explicit dataflows.
  2. Interactive programming, with Common Lisp being the textbook example of this.
  3. Great tooling and IDEs, meaning first-class support by something from JetBrains, Microsoft or Emacs.
  4. Great concurrency and parallelism stories.
  5. A great library ecosystem.
  6. Modest memory use.

Having just explicitly written this down for the first time (!! – it was consuming so much glucose just being kept amorphously swirling around in my brain) I can now mentally map some of my most recent language dalliances to these points.

go

This language is far too simple for my taste, but probably really great for teams.

I did recently take a more serious look when setting up a telegram bot using tbot, and was amazed at how simple it was to build web services like these using goroutines and channels.

Go satisfies points 3 to 6 from the list above. Makes sense that I decided to file this experiment away under “check when you need to put a webservice together REALLY QUICKLY”.

rust

When I saw that rust, surprisingly, is an expression-oriented language, I flew through the O’Reilly Programming Rust book I had bought previously as part of a bundle.

Evaluating rust by the list above, we award it a fractional 1 because expression-oriented, 3 due to jetbrains plugin amongst others, 4(ish) – great memory safety, but compared to clojure, concurrency and parallelism stories still have much room to grow, a solid 5 thanks to cargo and a very strong 6.

I filed this one away under “re-evaluate whenever you reach for your trusty C++”. (also, actix-web looks amazing for super high performance microservices.)

f#

You didn’t see this one coming, did you?

Very strong 1 to 5 and a solid 6.

WAT?!

I’m currently working my way through Domain Modeling Made Functional by Scott Wlaschin, who is also the author of the brilliant f# for fun and profit website.

In addition to f# hitting all 6 of my 2018 PL-requirements above, I’m slowly starting to see the advantages of having a real type system under the hood.

f# is a member of the ML-family of functional languages, which have their origin in Lisp (some very naughty person removed all of the lovely parentheses I’m afraid…).

I hope that at some point I’ll have the opportunity to use f# in anger, at which point I’ll be able to report more concretely as to its suitability.

The End

Let me know in the comments what you think about any of this, or anything else.

I hope to meet you again in a few days, here or elsewhere.

Weekly Head Voices #133: Onder in my Whiskeyglas.

The legendary Koos Kombuis (aka André Letoit) performing with Schalk Joubert on bass and Vernon Swart on percussion in the Helderberg Nature reserve, eponymous mountain visible through the trees on the right. This was a surprisingly amazing end to the week.

What a week.

It was beautiful to see the whole team step up to the plate and engineer at about 110% throughput (software gets complicated quickly, and there’s always one more thing you need to get done before the deliverable is ready), all the while remaining calm and, most importantly, kind.

Pro-tip Special

I was of course the lucky winner of the manual-writing sub-project. I love writing code, but there’s also something quite satisfying about writing documentation for a technical product. Anyways, there are five tiny but hopefully useful lessons I extracted from this exercise which I would like to present here:

  1. I’ve lamented the sorry state of the Windows console before (in 2011 to be exact). In a surprise twist, the Windows console still sucks almost 7 years later. At least it’s reliable. Anyways, cmder is a great console replacement which makes some of the stupid go away, somewhat.
  2. The Windows 10 built-in screenshot facility … wait for it… sucks. When you’re writing documentation you need a tool that fits into your workflow. Keyboard shortcut – window or region – image ends up in a directory of your choice. Greenshot is an open source screenshotting tool that does this with aplomb.
  3. You need to show a CHM (Windows Help) file to the user of your wxPython application when they hit F1. How hard could it be? Well, you could spend a number of hours trying to come up with a wx-y cross-platform solution, or you could use that time for something else worth your while and just use the Python win32 package to call into the official Windows help API. (cross-platform does work, it’s just really ugly)
  4. Sphinx is a much better tool to write technical manuals than is Markdown and related tools. I briefly considered Markdown because I always have to look up reStructuredText syntax, but fortunately ran into enough other places warning against using Markdown for documentation. For the record, I prefer orgmode over all of these puny formats in most other cases, but the documentation story of Sphinx with reStructuredText is admittedly much better.
  5. Start writing the manual as early as possible. It was amazing to see how this helped me to see the software we are designing at a more integrated (user) level. This knowledge was useful in driving more valuable improvements. If you can’t explain the flow of some procedure in a manual, that’s a good sign the procedure might need some refinement.

Humble Book Bundle and Rust

I bought the Humble Bundle of (O’Reilly) Functional Programming Books for a super affordable $15. I was primarily interested in the Programming Rust book by Blandy and Orendorff, but the other titles on Scala, Clojure, Erlang, Elixir, Haskell, Javascript and general functional programming are welcome additions to my library. Speaking of which, I emailed O’Reilly to ask if the books in the bundle could be added to my member library, which they promptly did!

I have avoided Rust up to now due to natural hype suppression circuitry, and because I grew up with C++, but its zero-overhead memory safety and trustworthy concurrency story make it hard to ignore any longer. Even though Andrei Alexandrescu once called Rust the language that skips leg day, it’s certainly interesting seeing the constructs the language designers have come up with to build a really fast compiled language with the lowest number of foot-guns per line of code.

Anyways, when this blog gets published, you should still have about 22 hours to make use of the Humble Bundle deal if you too see something that you like.

Life is continuous practice

I wanted to conclude with something that I’ve been thinking about recently. It has to do with explicitly treating one’s life as continuous practice. As I’ve mentioned before on this blog and people much smarter than me have been pointing out since forever, goals are no good and (lasting) happiness is probably not attainable.

Discarding as many as possible of these sorts of fetters is liberating (you Buddhist), but can seem to leave holes in one’s  life narrative. However, treating your life as a super long practice session is an interesting perspective.

There is also no end point, and no real life goal.

The only point of the whole exercise (yes, I see what I did there) is to try to improve continuously. Every day, we try to become a little better at our jobs, or at running, or at being a good human, or a partner, or a parent.

Practice means that you have good days and bad days. It means that you sometimes look back and think that you were a better person then than you are now. Practice means that when you pick one activity, another will temporarily languish until you can make time for it again.

All of this is ok, because tomorrow you have a whole new day to try again.

Weekly Head Voices #130-2: Direct experience dopamine.

Photogenic and non-camera-shy dragonfly I met in Paarl over the weekend.

As I went through my notes to extract material for this week’s post, I noticed a small discrepancy between the task description for the previous post and the published version: #129 in my notes versus #130 in the published post!

It’s too late now to rename #130, so in this reality I’m just going to have to deal with the fact that WHV #129 will never exist. I have decided to name this edition #130-2 so that eventually (well, in about a week), we will be back to uninflated post numbers. Nobody likes inflation. Except perhaps tyres. And balloons.

Your brain at work part 2: Dopamine and more mindfulness

Ironically, the incorrectly numbered post #130 dealt with the many ways in which our brains fail us every day. (Now that I’ve finally gotten around to installing the WP Anchor Header plugin, we can link directly down to any heading in any post, as demonstrated in the previous sentence.)

At least some clouds do seem to have a silver lining.

Your Brain at Work, the book I mentioned last week, has turned out to be a veritable treasure trove of practical human neuroscience, and I still have about 30% to go. My attempt at meteorological humour above was inspired by part of the book’s treatment of the important role of dopamine in your daily life.

One is supposed to remain mildly optimistic about expected future rewards, but not too much: unmet expectations result in a sharp dopamine drop, whereas exceeded expectations result in a greater increase. For optimal results, one should try to remain in a perpetual state of mildly optimistic expectations, but also in a state of being continually pleasantly surprised when those expectations are slightly exceeded.

More generally, the book deals really well with the intricacies of trying to keep one’s various neural subsystems happy and in balance. Too much stress, and the limbic system starts taking over (you want to run away, more or less), blocking your ability to think and make new connections, which in this modern life could very well be your only ticket out of Stress Town.

To my pleasant surprise (argh, I’ll stop), mindfulness made its appearance at about 40% into the book, shortly after I had published last week’s WHV. In my favourite mindfulness book, Mindfulness: A Practical Guide to Finding Peace in a Frantic World by Mark Williams and Danny Penman, two of the major brain states are called doing, the planning and execution mode we find ourselves in most of the time, also in the middle of the night when we’re worrying about things we can do nothing about at that point, and being, the mode of pure, non-judgemental observation, the activation and cultivation of which is practised in mindfulness.

In David Rock’s book, these two states are described as being actual brain networks, and they have different but complementary names: The narrative network corresponds to the doing mode, and the direct experience network corresponds to the being mode.

The narrative network processes all incoming sensory information through various filters, moulding it to fit into one’s existing mental model of the world. David Rock describes it in the book and in this HuffPost piece as follows:

When you experience the world using this narrative network, you take in information from the outside world, process it through a filter of what everything means, and add your interpretations. Sitting on the dock with your narrative circuit active, a cool breeze isn’t a cool breeze, it’s a sign that summer will be over soon, which starts you thinking about where to go skiing, and whether your ski suit needs a dry clean.

This is certainly useful most of the time, but it can get tiring and increase stress when you least need it.

The much-more attractively named direct experience network is active when you feel all of your senses opening up to the outside world to give you that full HD IMAX(tm) surround sound VR experience. No judging, no mental modelling, just sensory bliss and inner calm. Rock sez:

When this direct experience network is activated, you are not thinking intently about the past or future, other people, or yourself, or considering much at all. Rather, you are experiencing information coming into your senses in real time. Sitting on the jetty, your attention is on the warmth of the sun on your skin, the cool breeze in your hair, and the cold beer in your hand.

Again, these two systems are on opposite sides of a neurophysiological see-saw. When you are worrying and planning, no zen for you! On the other hand, when you’re feeling the breeze flowing through each individual hair on your arms and the sun rays seemingly feeding energy directly into your cells, your stress is soon forgotten.

Fortunately, mindfulness gives us practical tools to distinguish more easily when we’re on which path, and, more importantly, to switch mental modes at will.

I hope you don’t mind me concluding this piece by recursively quoting David Rock quoting John Teasdale, one of the three academic founders of Mindfulness Based Cognitive Therapy (MBCT):

Mindfulness is a habit, it’s something the more one does, the more likely one is to be in that mode with less and less effort… it’s a skill that can be learned. It’s accessing something we already have. Mindfulness isn’t difficult. What’s difficult is to remember to be mindful.

(If the book has any more interesting surprises, I’ll be sure to report on them in future WHV editions.)

Miscellany at the end of week 5 of 2018

  • The rather dire water situation has not changed much, except that due to more citizens putting their backs into the water saving efforts, day zero (when municipal water is to be cut off) has been postponed by 4 days to April 16. We are now officially limited to 50 litres per person per day, for everything. Practically, this means even more buckets of grey water are being carried around in my house every day in order to be re-used.
  • I ran 95km in January, which is nicely on target for my modest 2018 goal. Although January was a long month, and Winter Is Coming (And Then We Run Much Less Often), I am mildly optimistic that I might be able to keep it up.
  • Python type hinting is brilliant. I have started using it much more often, but I only recently discovered how to specify a type which can have a value or None, an often-occurring pattern:
from typing import Optional, Tuple
def get_preview_filename(attachment: Attachment) -> Tuple[Optional[str], Optional[str]]:
    pass
  • On Wednesday, January 31, GOU #3 had her first real (play) school day, that is, without any of us present at least for a while. We’re taking it as gradually as possible, but it must be pretty intense when you’re that young (but old enough to talk, more or less) and all of a sudden you notice that you’re all alone with all those other little human beings, none of which are the family members you’re usually surrounded with.
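To spell out the pattern from the type hinting bullet above: `Optional[str]` is shorthand for `Union[str, None]`, so a type checker will insist that callers handle the None case. The sketch below is self-contained, but the `Attachment` class and the naming logic are hypothetical stand-ins, as the real function body is not shown in this post.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Attachment:
    # Hypothetical stand-in for the real attachment type.
    filename: Optional[str] = None


def get_preview_filename(attachment: Attachment) -> Tuple[Optional[str], Optional[str]]:
    # Return (None, None) when there is nothing to preview.
    if attachment.filename is None:
        return None, None
    stem, _, ext = attachment.filename.rpartition(".")
    if not stem:
        # No usable extension found: keep the name, report no extension.
        return attachment.filename, None
    return stem + "_preview", ext
```

Either element of the returned tuple can independently be a string or None, which is exactly what the nested `Optional` inside `Tuple` expresses.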

The End

Thank you dear reader for coming to visit me over here, I really do enjoy it when you do!

I hope to see you again next week, same time, same place.

Weekly Head Voices #128: Water water everywhere, but not a drop to drink.

Hey friends, welcome back!

We have to talk about the water situation, seeing that Cape Town is now in the international news as being on track to be the first major city EVAR to run out of water.

In short, if it doesn’t rain in substantial amounts during the coming three months (which history and projections say it won’t), the municipal water supply will be shut off on April 21, a date festively referred to as Day Zero.

This means when we try to open any tap, no water will come out. This situation might continue for quite a while, which is pretty intense.

On that day, we will be celebrating by dressing up as Kevin Costner and running around barefoot shouting “NOTHING’S FREE IN WATERWORLD!”. Those who are not big fans of Kevin are allowed to dress up as Imperator Furiosa.

At my house, we stopped watering our garden with municipal water months ago. We installed a grey water recovery system: Shower and bath water ends up in the only remaining green corner of the garden.

We also installed a rain water recovery system three months ago, which has fortunately enabled us to collect a few thousand litres of rain water via the rerouted gutters and pipework from the roof. This water we will probably use after Day Zero to be able to wash and to flush a toilet now and then.

(Flushing frequency has necessarily decreased significantly. Around these parts we now have the saying: “If it’s yellow, let it mellow. If it’s br***, flush it down.” Please excuse the mental graphics.)

We have been managing to keep our use of municipal water under the requested 87 litres per person per day. Starting on February 1, we will have to stay consistently under 50 litres per person per day, including drinking, cooking and washing. I guess 2 minute showers were wasting too much of my time in any case.

I have to do more research and corroboration (fingers are being pointed in all directions), but it seems the fundamental issue is not so much the current drought alone, but to a large extent mismanagement by both local and national government. It’s complicated, and politics is involved, so read at least this (otherwise good piece, but author is a DA / local government apologist), this (DA / local government IS to blame) and this (a longer, more balanced piece) to start with.

That being said, I am happy that a large part of the populace has become much more water efficient. If we get through this, in spite of “this” being called “the new normal”, I hope that we retain our mad Dune-grade water saving skills.

With that out of the way, it would be sort of anti-climactic for me to talk extensively about what-I-did-last-week, so I’m going to limit it to a REAL bullet list (ping me in the comments if something interests you):

  • pipenv is the bee’s knees, I have switched my non-miniconda projects.
  • convincingly but fortunately only temporarily locked myself out of my one laptop due to TCG-Opal hardware encryption, UEFI32, UEFI64 and legacy boot incompatibilities. I’m getting old, I used to NOT lock me out of my laptop in my sleep.
  • A compulsive twitch made me fix years of old-style broken youtube shortcodes using the wordpress regex plugin. The regexp you are looking for is /\[youtube\](.*)\[\/youtube\]/ which you can replace with \1.
  • People dislike really smart leaders. See water crisis above for one possible reason why this is a bad thing.
  • In spite of having invested a significant amount of time in deciding on the Office UI Fabric React components for my most major side-project (#38465 if you’ll recall), I switched to Semantic UI React (which was also in the running, together with Palantir’s blueprint, HP’s grommet, Alibaba’s Ant Design of React and more) at the last minute. I am happier now.
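For the curious, the shortcode regexp from the list above can be tried out in Python as follows. This is only a sketch: the URLs are placeholders, and the non-greedy variant is my own variation rather than what was used on the blog. It matters only when two shortcodes share a line, in which case the greedy `(.*)` swallows everything between the first opening tag and the last closing tag.

```python
import re

# The pattern from the bullet above, written as a Python raw string.
GREEDY = re.compile(r"\[youtube\](.*)\[/youtube\]")
# Variation (an assumption, not from the original): a non-greedy group,
# so that two shortcodes on the same line are handled separately.
NON_GREEDY = re.compile(r"\[youtube\](.*?)\[/youtube\]")


def strip_shortcodes(text, pattern=NON_GREEDY):
    # Replace each shortcode with just the URL it wraps.
    return pattern.sub(r"\1", text)


one = "[youtube]https://example.com/a[/youtube]"
two = one + " and [youtube]https://example.com/b[/youtube]"

print(strip_shortcodes(one, GREEDY))
print(strip_shortcodes(two))
```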

That’s it from me for now. Have fun this week kids, I hope to see you soon!

Weekly Head Voices #102: High on life.

The week of Monday January 11 to Sunday January 17, 2016 got off to a brilliant start with a business lunch at Bodega, a restaurant that finds itself on the Dornier Wine Estate. The view looked something like this:

(Image: the view from Bodega.)

… and the company was suitably awesome. (This is not the first time that Bodega makes its appearance on this blog, or in the blog-free suburbs of my social calendar. The company might be different every time, but so far its level of awesomeness has been quite consistent.)

The rest of the (work) week was consumed by extreme nerdery, which is of course the way I love it. Besides more GPU shader fine-tuning (at least once I exclaimed on the Stone Three HipChat, hopefully soon the Stone Three Mattermost,  WITNESS THE POWER OF MATH!, before showing a rendering that was marginally better than the one where the POWER OF MATH had not yet been invoked a sufficient number of times), there was Javascript, d3.js (d3 is another fantastic example of what you can do with vectorised thinking and computation) and Python.

In break time I finally took a closer look at C++14 and beyond and came away super impressed. There’s a blog post in the pipeline on generic lambda expressions, because I think they’re brilliant. I don’t know why I love different programming languages so much, but I do.

On Saturday,  I got really high with one of my besties, a superb gentleman who also goes by the name of A Very Flat Cat. We reached this altered state by the old-fashioned but extremely reliable (and cheap!) method of physically increasing our altitude via ambulation up the west peak of the Helderberg. The walk (a few hours in 35 degrees Celsius…) was exhilarating, and the view from the top awe-inspiring. Check it (click for high-res):

(Image: panorama from the top of the west peak of the Helderberg.)

I’ve often wondered about the effect of one’s surroundings on one’s mindfulness. This was one of those cases where mother nature, without asking for permission or anything like that, simply brute-forced the being switch with her astonishing beauty. Very grateful I was.

Have a great week friends, see you on the other side!