Müller & Guido, Introduction to Machine Learning with Python (O’Reilly, 2016)

From O’Reilly and others, there’s been a profusion of data science books in the past few years. Given that many of these books are intended to introduce readers to data science methods and tools, it’s perhaps unsurprising that many of these books overlap at various points: you’ve got to introduce the reader to NumPy, pandas, matplotlib and the rest somehow, after all.

Müller & Guido’s Introduction to Machine Learning with Python is distinct from many of these other works in both its stated aims and its execution. In contrast to the more introductory books on data science, Müller & Guido give readers with a serious interest in the practice of machine learning a thorough introduction to scikit-learn. That is to say, their Introduction largely eschews the broader toolkit treated in introductory data science texts (though they briefly note the other tools they draw upon in Chapter 1). At the same time, because their book focuses on practice and on scikit-learn, they neither discuss the mathematical underpinnings of machine learning nor cover writing algorithms from scratch.

What is here is a comprehensive overview of things already implemented in scikit-learn (which is a considerable amount, as they show). More precisely, they focus on classification and regression in supervised learning, and clustering and signal decomposition in unsupervised learning. If your interest falls in those areas (particularly the former), their coverage is quite good. Chapters 2 and 3 discuss the algorithms for supervised and unsupervised learning respectively, and in considerable detail. That said– and though it’s somewhat less thorough– I might turn to the discussion of some of the same algorithms in Chapter 5 of VanderPlas’ Python Data Science Handbook before Müller & Guido’s; VanderPlas’ treatment is more conversational and less dry. (Note, however, that Müller & Guido do cover more territory.) Similarly, I was left wanting more from Chapter 7’s coverage of working with text.
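To give a sense of the workflow the book is organized around, here is a minimal supervised-learning sketch in scikit-learn; the dataset and classifier choices here are mine, for illustration, not examples drawn from the book:

```python
# Minimal scikit-learn workflow: split data, fit a classifier, evaluate.
# (Illustrative sketch; the dataset and model choices are mine, not the book's.)
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)
print(f"Test accuracy: {clf.score(X_test, y_test):.2f}")
```

The estimator interface shown here (fit, predict, score) is the through-line of the whole book.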

Müller & Guido’s book really shines, though, when it discusses all of the other things that go into machine learning, beyond their march through the algorithms themselves. Chapter 4 discusses ways to numerically model categorical variables, also (briefly) covering ANOVA and other techniques of feature selection; Chapter 5 covers cross-validation and techniques for carefully tuning model parameters; Chapter 6 compellingly explains the importance of using the Pipeline class to prevent data leakage (during preprocessing, for example); and Chapter 8 discusses where scikit-learn and Python fit within the wider horizons of machine learning. The strongest parts of the book, then– and the parts where it’s the most fun to read– are where Müller & Guido discuss the practical details of machine learning. (One wonders if they felt a bit hamstrung by avoiding the mathematics of the algorithms they discuss.) There are points where the book is less engaging than other introductory data science books, but then it’s not really in the same category; rather than an introductory overview of the entire landscape, Müller & Guido provide a clear, comprehensive, detailed guidebook to one particular part of the map.
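To illustrate the point about data leakage (the code below is my own sketch, not the book’s): putting preprocessing inside a Pipeline ensures that, during cross-validation, a scaler is fit only on each training fold, so statistics from the held-out fold never leak into preprocessing.

```python
# Preprocessing inside a Pipeline: the scaler is refit on each training
# fold during cross-validation, so held-out data never influences it.
# (My own sketch of the data-leakage point, not code from the book.)
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
pipe = Pipeline([("scale", StandardScaler()), ("svm", SVC())])
scores = cross_val_score(pipe, X, y, cv=5)
print(f"Mean cross-validated accuracy: {scores.mean():.2f}")
```

Fitting the scaler on the full dataset before cross-validating, by contrast, would quietly let test-fold statistics influence preprocessing.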


The Meh-ness of DA:I

What is wrong with Dragon Age: Inquisition? DA:I appears to have many of the right ingredients to make a satisfying game: a broad open world with different regions, characters, factions, some decent dialogue choices… And yet, much of the experience of playing DA:I resembles watching paint dry. What’s wrong with this game?

A few things, at least.

First, Bioware went with the most generic, least interesting overarching plot they could have chosen to tie together an open world game: demons are spilling into the world from portals scattered across the map! Please, explore the countryside, interact with the often nonplussed locals, and close these portals whenever you might feel the inclination! It’s a pretty shameless rehash of Oblivion, made still more generic by the fact that none of these demons are interesting in the least. DA:O had demons, but it did a nice job of characterizing them: they had personalities, they sometimes gave people what they wanted in order to get their way, and they illustrated a somewhat sophisticated, nuanced approach to a demon-haunted world. DA:I (as far as I can tell, almost 20 hours in) lacks all of these niceties.

Second, the “open world” is not much of an open world, by modern gaming standards. First, the regions themselves are pretty weak: Oh, here’s a vaguely French region! It rains a lot in this other part of the world! Wowee, forests and wild animals! But more than that, because of the way that gameplay and fighting work (more on that shortly), you’re always tied to your camps, the generic Inquisition, and the dry-as-dust plot. You can’t just stock up on potions, head for the hills, and go exploring: healing potions have to be restocked in your camps, you can’t rough it by resting out in the wilderness, and regions have to be unlocked by fulfilling the missions of the Inquisition. Even unexplored, difficult bits of territory are blocked by impassable landscape and near-unbeatable enemies. Skyrim is far from perfect, but it occasionally gives you the experience of coming upon extremely difficult enemies, seeing them before they see you, and then backing away very, very slowly (or getting seen and swiftly killed). For a nominally open world game, DA:I is annoyingly railroaded.

And finally, the gameplay and combat. The other sins of DA:I could be overlooked if it weren’t for the sheer boredom of its gameplay and combat. Tanks get some tweaking: they now gain armor when they successfully taunt enemies. As little sense as that makes from any logical perspective (and wargaming grognards are surely spinning in their graves), it does make combat with a tank a bit less annoying. That said, it’s the only tweak to a system that has little to recommend it. In DA:O, the choice of which companions to bring often presented interesting tactical options: bring Wynne along to play defense, or go all-in on offense with AoE heavy hitters? Furthermore, many of the characters in DA:O have personalities and tactical roles when you meet them. The “characters” in DA:I largely lack any such niceties beyond their basic classes, which means Vivienne could be either an offensive badass or a defensive force; which role she might prefer or be better at, I have absolutely no idea. If the game is going to give you utterly generic NPCs, why stop at four? Why not allow the flexibility that comes with full, D&D-style parties?

Carefully defined tactical roles are unlikely to be a concern, however, because the difficulty level of the game is broken: most fights are entirely too easy on normal difficulty. The next difficulty level up does not change the character of combat much, but it does mean that you’re going to burn through your (limited) potions and maybe see an NPC or two get knocked out in the fight because of their idiotic AI. This seems like a minor flaw, but a core feature of mainstream RPGs has always been taking a character from impotent rat killer to demigod (or god– cf. Morrowind): how did they miss that?

Furthermore, DA:I eschews the dragon shouts and perks that add flavor to the experience of Skyrim and the modern Fallouts. And the broken difficulty level means that any time spent thinking through skills and specializations is wasted: who cares if your tank can tank incrementally better? Who cares if your mage has a slightly beefier attack?

If the tactical system weren’t hugely tedious to use, you might want to manage combat a bit more carefully. Overall, though, it’s best to just turn the difficulty level down and get through the combat as quickly as possible.

My feelings for this game don’t rise to the level of hatred; like a lot of Bioware games, it pretty clearly represents an effort to split the baby, with some pandering to the market, a handful of things for Bioware diehards, and some experimentation around the edges. It is surely disappointing, however, especially in a franchise that showed so much potential at the beginning.

Why Twitter has lost; Or, against brevity

In the wake of the rumors that Twitter was going to up its character limit, I started spiffing up my Twitter profiles: I added a few photos, started adding people to my various lists, and even started using it a bit more. Then, of course, it seems that those rumors provoked such a backlash within the hardcore Twitter community that Jack Dorsey was forced to shelve any modifications to the format. Here we have, in a nutshell, the reason that Twitter has lost: it’s utterly unwilling to make any modifications to its established product that might make it attractive or useful to those who aren’t already committed users.

For better or worse, for example, I’m friends with a lot of colleagues on Facebook. This is annoying– sometimes I just want to post something silly or random, and it’s awkward to have so many professional colleagues mixed up in my FB. (And yes, I know there are ways to tweak that, but who has the time?)

In a way that’s only rivaled by a very few high quality email listservs, Facebook is the place I go to hear what people in my field are talking about and working on (and from a usability standpoint, it’s actually easier to skim and follow discussions on Facebook than in my Gmail). My colleagues make comments about work they’ve been doing, share fellowship, grant, and job postings, pose questions, and generally take advantage of the fact that we’re all working on our computers N hours a day. My colleagues from grad school have a really phenomenal little group that often contains very specialized questions: requests for bibliography, questions about translations, and the like. It would be nice, in some ways, if some of these discussions were on Twitter: we could draw on the breadth of Twitter’s userbase, have discussions in real time to a greater extent, and get away from some of the ickiness that attaches to FB (and perhaps bring in people who stay away from FB because of said ickiness).

Just as a for instance, I spent this morning looking at the character length of these discussions. These aren’t Tolstoyan ruminations or Herodotean digressions: most of these discussions are sparked by a brief, sometimes humorous comment someone has made about their work or something they’ve found in their research. Perhaps unsurprisingly, almost all of them run over 140 characters. Even just setting up the necessary context for many of these comments takes more than 140 characters. The only comments by my colleagues that fall under the limit are quick, humorous, and usually related to popular culture (and so don’t have anything to do with professional communication at all).
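The check itself is trivial; something like the following (with invented sample posts standing in for my colleagues’ actual text) is all it takes:

```python
# Check which posts would fit Twitter's 140-character limit.
# (The sample posts are invented stand-ins, not anyone's actual writing.)
posts = [
    "Working through a tricky batch of inscriptions today -- the abbreviations "
    "alone could be a dissertation topic. Does anyone have a good reference "
    "for late Hellenistic epigraphic conventions?",
    "Coffee is the real research assistant.",
]
for post in posts:
    verdict = "fits" if len(post) <= 140 else "too long"
    print(f"{len(post):4d} chars -- {verdict}")
```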

Let’s be clear here: these are professional writers, and ones who’ve had a lot of success, too. These are people who write books and articles, who communicate for a living, and who– as their posts make clear– are constantly engaged in the process of moving their ideas from insights to well-crafted arguments, for a variety of audiences. The argument that these people are incapable of concision and brevity strikes me as completely off base. (I make no such claims about my own capacity for brevity, however.)

The reality, I think, is that Twitter works great for subjects where everyone knows what you’re talking about: if you’re just railing about the latest idiotic or offensive thing that Trump has said, or some piece of celebrity gossip (and we all do), you don’t need any context. If you have something to say on subjects that require context or nuance, forget it.

But it’s not just that: I’m frequently astonished at how often Twitter falls down at its core functions. Many, many times, the most salient or compelling quotation on the news just won’t fit into 140 characters. A while ago, for example, I found a great analysis of some of the religious freedom legislation that’s been going through legislatures around the country, but the best quotations from the article and its core insights just wouldn’t fit, and so I never ended up posting it.

The result is that other services are eating Twitter’s lunch. Facebook, as I said, is pretty standard for a lot of scholars in my field. People in visually or design-oriented fields make a lot more use of Instagram than they do of Twitter. But it’s more than the fact that people self-select into platforms tailored to what they do: these platforms are constructed in ways that allow for novel kinds of use, and for meaningful discussions beyond (and perhaps in spite of) the intentions of their designers. I was surprised by how much substantive discussion there was on Instagram, for example, after the Freddie Gray murder and the unrest in Baltimore, and in a way that totally changed my feelings about the platform. Facebook allows for (even if it does not always brilliantly facilitate) real moments of connection: a friend going through medical difficulties, contact after a long period of disconnection, political debate that (sometimes) goes beyond kneejerk reaction. (And these are just things that have happened to me in the past week or so.)

Every time I’ve tried to recommit to Twitter, I’ve had the opposite experience: a lack of users beyond a narrow band of journalists, technology writers, and bots; friends with accounts who never post anything (and whose tweets get lost pretty quickly in the maelstrom); and, above all, the utter lack of any meaningful contact or communication through the platform, and the sheer disinterest of the company in fostering it. Twitter’s decision to stick with the current design of its broken platform may keep its users in the short term, but it will do little to win anyone else over.

The Baldur’s Gate series

So I’m finally playing through Baldur’s Gate 2 in earnest, after several started (and aborted) attempts. It has me reflecting on what I think of the series as a whole, especially given the series’ hallowed status among diehard CRPG players.

To begin with, there are really a lot of things I like about the Baldur’s Gate series. I think the Infinity Engine is nicely polished and allows for a lot of flexibility in the way the game is played. I enjoy the flexibility of the magic system, although it feels a bit clunky compared to later Bioware magic implementations (I’m thinking of DA:O in particular). And one of the clearest strengths of the series, I think, is the way combat manages to feel both tactical and satisfying while giving you a sense of the underlying mechanics– in DA:O, for example, the combat system always feels a bit overly hectic and chancy; in BG and BG2, by contrast, combat seems like a puzzle that can be cracked with the right application of spells and tactical deployment.

For me, though, the BG games seem less successful in comparison to later Bioware offerings. For whatever reason, the characters seem more one dimensional, the plots a bit flimsier, the ideas a bit less compelling. By contrast, DA:O takes one of the laziest tropes in CRPGs and executes it with memorable characters, compelling places, and satisfying fights (despite their tactical limitations). (I’ve frequently been impressed at the way that the DA games manage to make the lore of the DA world memorable and meaningful.) The BG games shoot high, but never quite get there. Maybe some of that is precisely because of the tactical depth of the game– there’s an ever-present temptation to minmax a little bit more, to tweak your party composition, to rethink how your spellcasters are kitted out.

At the end of the day, I think the BG games are ultimately a close CRPG relation to those classic D&D modules you’ve got in a box somewhere in your attic. When they work, they give you a nicely packaged adventure with a memorable place and some good fights (and maybe some sweet, sweet loot, too). But the contrast here is telling, too, because the strength of tabletop games is that they can be played in a whole host of ways: you can come up with novel strategies to defeat an enemy or solve a quest, or the game can even drift substantially from the author’s original conception. Bored of combat, but love the color text? Your DM can adjust. Hate the color text and want to skip to the combat? Your DM can adjust to that, too. (At one point, I had a memorable DM who would just say “color text, color text” whenever running D&D modules.) But the BG games lack this flexibility: their plots are on rails and their problems often have one solution. They’re an achievement, to be sure, but one that often seems limited in hindsight.

Cranky about OS X

So I needed a laptop for work, and, perhaps against my better judgment, I decided to go with a Macbook Pro. I’ve had mixed experiences with PC laptops, where I think build quality varies wildly and it’s hard to consistently get a good piece of hardware (though I am somewhat curious about the Dell XPS ultrabook line, if unenthusiastic about its weak keyboard).

Beyond that, I’ve always liked the fact that OS X is built on top of Unix; not a variety of Unix I know well, to be frank, but still. What do a few cosmetic changes matter compared to a shared POSIX heritage?

Welp. I am now regretting my decision, and to a not inconsiderable degree. Where to begin? The lack of a decent package manager? The fact that in El Capitan Apple has made the Disk Utility completely worthless for partitioning? The fact that it’s impossible to write an .iso to a flash drive without mucking around with the dd command?

My naive belief was that OS X is an operating system that hides some of the complexity of its internals from the end user, but that makes up for that with a polished user experience; my hope was that, beneath the hood, OS X still had a lot of good functional tools for the power user. I’m finding myself somewhat disappointed in the former respect, and deeply frustrated in the latter. The iOS-ification of OS X is much, much worse than I thought, and has really made me miss a lot of the functionality of Linux. Hell, I’d settle, in some respects, for the advances that Windows has made in 7 and 10.

Dragon Age: Origins, cold turkey edition

In general, I am a better starter than a finisher, and I always find that when I have a big project to finish up, I have to stop doing the things I am normally doing so that I can concentrate. At the moment, I have had to stop playing DA:O while I finish a project that needs to get done in the next week. Tomorrow or so, it will be out of my system and I’ll be able to have a clear head about the task at hand. Today, though, I still have that residual twinge of wanting to fire up DA:O and have a nice tactical battle to blow off some steam.

There are a lot of things I like about DA:O. The combat is about as difficult as it should be: tough, but not as brutal as some of the Infinity Engine games were at points. Much of the dialogue is well written, and there are some really thoughtful, interesting encounters along the way (the characterization of the various demons is particularly nice). I think it’s easier to appreciate the dialogue in DA:O than it is in the Infinity Engine games. For one, it’s always tempting to skim, but I also think the Bioware games may be better written now than they used to be. On occasion, the Bioware games achieve a level of polish that most other games gesture towards but never quite reach.

Four character slots seemed too few at first, but I think one strength of DA:O is that most NPC combinations are actually quite viable– even the characters that are less powerful add interesting play dynamics.

It’s not perfect, of course. Rogues are a particular weakness, I think. They’re much less fun than in a lot of other RPGs– they have less to do, have fewer interesting tactical options, and the stealth experience is somewhat underwhelming. There’s less tactical depth to playing mages than in the Infinity Engine games, but that’s made up for by the visceral fun of spellcasting. And finally, Bioware’s decision to go with a threat system is disappointing. I get the appeal– aggro-based combat systems make for good action sequences, as when an opponent turns on the mage who’s just smacked them– but it makes for combat that’s too removed from the wargaming roots of RPGs for my taste.

Playing DA:O has brought me to a realization, though, which is that I’m just generally weak at coming up with good characters. I would like to link it to what Auden says about poets (I wanted to be one, once!) in his letter to Lord Byron: “His sense of other people’s very hazy.” I could make some claim that some of my male characters are Don Draper figures, ciphers that allow for the thinking-through of ideas of masculinity. That’s probably bullshit, honestly– I think they’re really just recycled noir/Hemingway/Western characters, because I’m kind of “unobservant, immature, and lazy” by temperament. (Though I hope that there’s some nuance there.)

I think it’s particularly difficult to play a CRPG with a character concept, though. For one, pragmatic concerns often deform my character concepts: I frequently want to hit as much content as possible in a single playthrough, and so my morally dubious mage has a spate of pious behavior, or my paladin takes a chaotic neutral turn halfway through. On the other hand, it’s hard to form compelling, independent narratives within a Bioware-style CRPG narrative, and maybe in CRPGs in general. You can have a character trajectory in mind, but follow-through becomes difficult without opportunities for your character to move in that direction. You can’t throw in with a particular faction unless the game has specifically allowed for that possibility, for example, and there’s little occasion for your character to show her animus against the goblins who killed her brother if it never comes up (or if you just kill them like everything else). At best, you can make your character into a broad archetype: paladin, mercenary, asshole. I appreciate that DA:O compels you to make role-playing choices, but you’re still left with a broad, fluid character at the end of the day.

Klemens, 21st Century C (2nd ed., O’Reilly, 2014)

As everyone knows, Mark Twain defined a classic as a book that everyone wants to have read and no one wants to read. Everybody who does some programming knows that K&R is a classic, by any standard– it’s the Rosetta stone of modern C programming, but it also helps to clarify the design principles (and of course the syntax) of many of the modern programming languages that are derived from C. Even beyond languages with a clear C lineage, it’s easy to see the way that a whole host of other modern programming languages have been written to simplify things that are tedious or risky in C. At the same time– and like a lot of other people, I’d imagine– K&R sits on my shelf and stares at me most of the time. When I open it, sometimes I think “What an impressively written book! What concision and clarity!” Most of the time, though, I think, “Wow, how gnarly– thank God for Python!”; or I wonder whether any of the details K&R fuss over are relevant today.

Ben Klemens’ 21st Century C is intended to resolve some of this shock, and to serve as an introduction to modern ways of working in C. The first part of the book presents tools and best practices for this (including debugging, testing, and version control), while the second half discusses how to write modern C– C in a world where, among other things, the rigid memory constraints taken for granted in K&R no longer apply. Some parts of the book (like the discussion of pointers) are clearly meant as an introduction or refresher for readers who aren’t comfortable in C, and the book includes a handy appendix on the basics of C. Other parts live up to the book’s billing as an explanation of what has changed in the world of C. The discussion of new structures in modern C, for example, is a highlight of the book. Klemens’ discussion of string handling in Chapter 9 was also interesting, though briefer than it might have been. (Perhaps with good reason: as someone who works almost exclusively with strings– and even though Unicode in modern languages isn’t always fun, either– I remain unconvinced that working with strings in C is something I want to do on a regular basis.)

As my comments above suggest, I am not an experienced C programmer (despite the occasional stab at the exercises in K&R), and am thus rather unqualified to pass judgment on the soundness of any of Klemens’ code. I can only assume that the infelicities and problems mentioned in reviews of the first edition of the work have been resolved. As a C tyro, though, I felt that Klemens effectively explains the ways that different practices– and the C standards, as well– have evolved over the years. It would be tempting, I think, for the book to remain at the level of vague generalities, but the book strikes a nice balance between high-level discussion of the way C programming has changed over the years and detailed discussion of what’s going on under the hood. It helps immensely, I think, that Klemens has a light, humorous touch– he notes that the manual memory model “is why Jesus weeps when he has to code in C”– and the humorous asides help to leaven some of the necessarily technical passages of the book.

Klemens’ book has the unenviable task of competing with K&R, and there are parts where 21st Century C suffers for the comparison. I still prefer K&R’s discussion of pointers; and I felt that there were a handful of sections that add little to what’s already in K&R. Klemens is fond of comparing C to punk rock, and upon reflection, I believe the comparison is an apt one. To push the metaphor further, there are ways in which K&R is, like a classic punk album, indelible in its simplicity and directness. To my mind, Klemens’ book is a worthy attempt to take that simplicity and directness and make it speak to a changed world. Klemens’ book isn’t perfect; if we’re honest with ourselves, though, even the hardiest classics aren’t always, either.