<?xml version="1.0" encoding="utf-8"?>

<feed xmlns="http://www.w3.org/2005/Atom">
<title>David Crawshaw</title>
<subtitle>Blog</subtitle>
<link href="https://crawshaw.io/atom.xml" rel="self" />
<link href="https://crawshaw.io" />
<id>urn:uuid:6055e6b0-8eb1-4046-999e-2c021e87824e</id>
<updated>2024-02-06T19:28:03Z</updated>

	<entry>
	<title>jsonfile: a quick hack for tinkering</title>
	<link href="https://crawshaw.io/blog/jsonfile" />
	<id>https://crawshaw.io/blog/jsonfile</id>
	<updated>2024-02-06T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>jsonfile: a quick hack for tinkering</h1>

<p><em>2024-02-06</em></p>

<p>The year is 2024.
I am on vacation and dream up a couple of toy programs I would like to build.
It has been a few years since I built a standalone toy; I have <a href="https://crawshaw.io/blog/remembering-the-lan">been busy</a>.
So instead of actually building any of the toys I think of, I spend my time researching if anything has changed since the last time I did it.
Should I pick up new tools or techniques?</p>

<p>It turns out lots of things have changed!
There’s some great stuff out there, including decent quorum-write regional cloud databases now.
Oh and the ability to have a fascinating hour-long novel conversation with transistors.
But things are still awkward for small fast tinkering.</p>

<p>Going back in time, I struggled constantly rewriting the database for the prototype for Tailscale, so I ended up writing my in-memory objects out as <a href="https://tailscale.com/blog/an-unlikely-database-migration">a JSON file</a>.
It went far further than I planned.
Somewhere in the intervening years I convinced myself it must have been a bad idea even for toys, given all the pain migrating away from it caused.
But now that I find myself in an empty text editor wanting to write a little web server, I am not so sure.
The migration was painful, and a lot of that pain was borne by others (which is unfortunate; I find handing a mess to someone else deeply unpleasant).
Much of that pain came from the brittle design of the caching layers on top (also my doing), which came from not moving to an SQL system soon enough.</p>

<p>I suspect, considering the process in retrospect, a great deal of that pain can be avoided by committing to migrating directly to an SQL system the moment you need an index.
You can pay down a lot of exploratory design work in a prototype before you need an index: while n is small, full scans are fine.
But you don’t make it very far into production before one of your values of n crosses something around a thousand and you long for an index.</p>

<p>With a clear exit strategy for avoiding big messes, that means the JSON file as database is still a valid technique for prototyping.
And having spent a couple of days remembering what a misery it is to write a unit test for software that uses postgresql (mocks? docker?? for a database program I first ran on a computer with less power than my 2024 wrist watch?) and struggling to figure out how to make my cgo sqlite cross-compile to Windows, I’m firmly back to thinking a JSON file can be a perfectly adequate database for a 200-line toy.</p>

<h1>Consider your requirements!</h1>

<p>Before you jump into this and discover it won’t work, or just as bad, dismiss the small and unscaling as always a bad idea, consider the requirements of your software.
Using a JSON file as a database means your software:</p>

<ol>
<li>Doesn’t have a lot of data. Keep it to a few megabytes.</li>
<li>The data structure is boring enough not to require indexes.</li>
<li>You don’t need something interesting like full-text search.</li>
<li>You do plenty of reads, but writes are infrequent. Ideally no more than one every few seconds.</li>
</ol>

<p>Programming is the art of tradeoffs.
You have to decide what matters and what does not.
Some of those decisions need to be made early, usually with imperfect information.
You may very well need a powerful SQL DBMS from the moment you start programming, depending on the kind of program you’re writing!</p>

<h1>A reference implementation</h1>

<p>An implementation of jsonfile (which Brad called JSONMutexDB, which is cooler because it has an x in it, but requires more typing) can fit in about 70 lines of Go.
But there are a couple of lessons we ran into in the early days of Tailscale that can be paid down relatively easily, growing the implementation to 85 lines.
(More with comments!)
I think it’s worth describing the interesting things we ran into, both in code and here.</p>

<p>You can find the implementation of jsonfile here: <a href="https://github.com/crawshaw/jsonfile/blob/main/jsonfile.go">https://github.com/crawshaw/jsonfile/blob/main/jsonfile.go</a>. The interface is:</p>

<pre><code class="language-go">type JSONFile[Data any] struct { … }

func New[Data any](path string) (*JSONFile[Data], error)
func Load[Data any](path string) (*JSONFile[Data], error)
func (p *JSONFile[Data]) Read(fn func(data *Data))
func (p *JSONFile[Data]) Write(fn func(*Data) error) error
</code></pre>

<p>There is some experience behind this design.
In no particular order:</p>

<h2>Transactions</h2>

<p>One of the early pain points in the transition was figuring out the equivalent of when to <code>BEGIN</code>, <code>COMMIT</code>, and <code>ROLLBACK</code>.
The first version exposed the mutex directly (which was later converted into a RWMutex).</p>

<p>There is no advantage to paying this transition cost later.
It is easy to box up read/write transactions with a callback.
This API does that, and provides a great point to include other safety mechanisms.</p>
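
<p>A minimal sketch of what boxing up transactions with a callback can look like, assuming an in-memory value and no file I/O at all. The names <code>Store</code>, <code>NewStore</code>, and <code>Counter</code> are made up for illustration and are not the jsonfile API.</p>

<pre><code class="language-go">package main

import (
	"fmt"
	"sync"
)

// Store is a hypothetical, in-memory stand-in for a jsonfile-style type.
// The data is reachable only through the Read and Write callbacks.
type Store[Data any] struct {
	mu   sync.RWMutex
	data *Data
}

// NewStore wraps an initial value.
func NewStore[Data any](initial *Data) *Store[Data] {
	s := new(Store[Data])
	s.data = initial
	return s
}

// Read is the read-only transaction: many readers may run at once.
func (s *Store[Data]) Read(fn func(data *Data)) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	fn(s.data)
}

// Write is the read/write transaction: it excludes readers and other
// writers for the duration of fn. This sketch does not roll back a failed fn.
func (s *Store[Data]) Write(fn func(data *Data) error) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	return fn(s.data)
}

// Counter is an example schema.
type Counter struct {
	N int
}

func main() {
	s := NewStore(new(Counter))
	err := s.Write(func(c *Counter) error {
		c.N++
		return nil
	})
	if err != nil {
		fmt.Println("write failed:", err)
		return
	}
	s.Read(func(c *Counter) {
		fmt.Println("N =", c.N) // prints "N = 1"
	})
}
</code></pre>

<p>Because callers never touch the mutex, the locking strategy can change later without editing any call site.</p>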

<h2>Database corruption through partial writes</h2>

<p>There are two forms of this. The first is if the write fn fails half-way through, having edited the db object in some way. To avoid this, the implementation first creates an entirely new copy of the DB before applying the edit, so the entire change set can be thrown away on error. Yes, this is inefficient. No, it doesn’t matter. Inefficiency in this design is dominated by the I/O required to write the entire database on every edit. If you are concerned about the duplicate-on-write cost, you are not modeling I/O cost appropriately (which is important, because if I/O matters, switch to SQL).</p>
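
<p>A rough sketch of the copy-then-edit idea, assuming the copy is made by a JSON round trip. The helpers <code>cloneViaJSON</code> and <code>applyEdit</code> and the <code>DB</code> type are hypothetical names for this example, not functions from jsonfile.</p>

<pre><code class="language-go">package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// DB is an example schema.
type DB struct {
	Names []string
}

// cloneViaJSON deep-copies src by marshalling it and unmarshalling the
// bytes into a fresh value.
func cloneViaJSON[Data any](src *Data) (*Data, error) {
	b, err := json.Marshal(src)
	if err != nil {
		return nil, err
	}
	dst := new(Data)
	if err := json.Unmarshal(b, dst); err != nil {
		return nil, err
	}
	return dst, nil
}

// applyEdit runs fn against a copy of cur. On success it returns the edited
// copy. On failure it returns cur, untouched, along with the error.
func applyEdit[Data any](cur *Data, fn func(*Data) error) (*Data, error) {
	next, err := cloneViaJSON(cur)
	if err != nil {
		return cur, err
	}
	if err := fn(next); err != nil {
		return cur, err // the half-edited copy is simply dropped
	}
	return next, nil
}

func main() {
	db := new(DB)
	db.Names = []string{"a"}

	db, err := applyEdit(db, func(d *DB) error {
		d.Names = append(d.Names, "b")         // the edit happens...
		return errors.New("validation failed") // ...and then fn fails
	})
	fmt.Println(db.Names, err) // prints "[a] validation failed"
}
</code></pre>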

<p>The second is from a full disk. The easy way to write a file in Go is to call <code>os.WriteFile</code>, which the first implementation did. But that means:</p>

<ol>
<li>Truncating the database file</li>
<li>Making multiple system calls to <code>write(2)</code>.</li>
<li>Calling <code>close(2)</code>.</li>
</ol>

<p>A failure can occur in any of those system calls, resulting in a corrupt DB.
So this implementation creates a new file, loads the DB into it, and when that has all succeeded, uses <code>rename(2)</code>.
It is not a panacea; our operating systems do not make all the promises we wish they would about rename.
But it is much better than the default.</p>
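
<p>The write-then-rename sequence can be sketched roughly as follows. <code>writeFileAtomic</code> is a hypothetical helper written for this example, and the <code>Sync</code> call before the rename is my own assumption about what is prudent, not something the text above promises.</p>

<pre><code class="language-go">package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic is a hypothetical helper, not part of jsonfile.
// It encodes v as JSON into a temporary file next to path and renames the
// temporary file over path only after every write has succeeded.
func writeFileAtomic(path string, v any) (err error) {
	b, err := json.Marshal(v)
	if err != nil {
		return err
	}
	// Same directory as the target, so the rename stays on one filesystem.
	f, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp*")
	if err != nil {
		return err
	}
	defer func() {
		if err != nil {
			f.Close()           // harmless if already closed
			os.Remove(f.Name()) // do not leave a partial file behind
		}
	}()
	if _, err = f.Write(b); err != nil {
		return err
	}
	if err = f.Sync(); err != nil { // assumption: flush before renaming
		return err
	}
	if err = f.Close(); err != nil {
		return err
	}
	return os.Rename(f.Name(), path)
}

func main() {
	dir, err := os.MkdirTemp("", "jsonfile-demo")
	if err != nil {
		fmt.Println("mkdir:", err)
		return
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "db.json")
	if err := writeFileAtomic(path, map[string]int{"n": 1}); err != nil {
		fmt.Println("write:", err)
		return
	}
	b, err := os.ReadFile(path)
	if err != nil {
		fmt.Println("read:", err)
		return
	}
	fmt.Println(string(b)) // prints {"n":1}
}
</code></pre>

<p>If any step before the rename fails, the old database file is still sitting at <code>path</code>, whole.</p>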

<h2>Memory aliasing</h2>

<p>A nasty issue I have run into twice is aliasing memory. This involves doing something like:</p>

<pre><code class="language-go">list := []int{1, 2, 3}
db.Write(func(data *Data) error { data.List = list; return nil })
list[0] = 10 // editing the database!
</code></pre>
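
<p>One way to sidestep this, sketched under the assumption that the stored value is a flat slice: copy on the way in. <code>storeAliased</code> and <code>storeCopied</code> are hypothetical helpers for the comparison, and <code>slices.Clone</code> needs Go 1.21 or later.</p>

<pre><code class="language-go">package main

import (
	"fmt"
	"slices"
)

// DB is an example schema.
type DB struct {
	List []int
}

// storeAliased keeps the caller's slice, so the caller can still edit the
// database after the fact.
func storeAliased(db *DB, list []int) {
	db.List = list
}

// storeCopied keeps a private copy. slices.Clone is shallow, which is
// enough for []int but not for slices of pointers.
func storeCopied(db *DB, list []int) {
	db.List = slices.Clone(list)
}

func main() {
	list := []int{1, 2, 3}
	aliased, copied := new(DB), new(DB)
	storeAliased(aliased, list)
	storeCopied(copied, list)

	list[0] = 10 // only the aliased database sees this
	fmt.Println(aliased.List[0], copied.List[0]) // prints "10 1"
}
</code></pre>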

<h2>Some changes you may want to consider</h2>

<p><strong>Backups.</strong> An intermediate version of this code kept the previous database file on write.
But there’s an easier and even more robust strategy: never rename the file back to the original.
Always create a new file, <code>mydb.json.&lt;timestamp&gt;</code>.
On starting, load the most recent file.
Then when your data is worth backing up (if ever), have a separate program prune down the number of files and send them somewhere robust.</p>
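
<p>A small sketch of the timestamped-file scheme, assuming a fixed-width UTC timestamp suffix so that sorting names also sorts by time. <code>snapshotPath</code> and <code>latestSnapshot</code> are hypothetical names. A real version would still want to guard against a half-written newest file, for example by writing under a temporary name first.</p>

<pre><code class="language-go">package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
	"strings"
	"time"
)

// snapshotPath returns base plus a UTC timestamp suffix. The layout is
// fixed-width, so lexical order matches time order.
func snapshotPath(base string, t time.Time) string {
	return base + "." + t.UTC().Format("20060102T150405.000000000")
}

// latestSnapshot returns the newest snapshot of base, or "" if none exist.
func latestSnapshot(base string) (string, error) {
	dir := filepath.Dir(base)
	entries, err := os.ReadDir(dir)
	if err != nil {
		return "", err
	}
	prefix := filepath.Base(base) + "."
	var names []string
	for _, e := range entries {
		if strings.HasPrefix(e.Name(), prefix) {
			names = append(names, e.Name())
		}
	}
	if len(names) == 0 {
		return "", nil
	}
	sort.Strings(names)
	return filepath.Join(dir, names[len(names)-1]), nil
}

func main() {
	dir, err := os.MkdirTemp("", "snapshots")
	if err != nil {
		fmt.Println("mkdir:", err)
		return
	}
	defer os.RemoveAll(dir)

	base := filepath.Join(dir, "mydb.json")
	t0 := time.Date(2024, 2, 6, 12, 0, 0, 0, time.UTC)
	for i, body := range []string{`{"n":1}`, `{"n":2}`} {
		p := snapshotPath(base, t0.Add(time.Duration(i)*time.Second))
		if err := os.WriteFile(p, []byte(body), 0o600); err != nil {
			fmt.Println("write:", err)
			return
		}
	}

	latest, err := latestSnapshot(base)
	if err != nil {
		fmt.Println("scan:", err)
		return
	}
	b, err := os.ReadFile(latest)
	if err != nil {
		fmt.Println("read:", err)
		return
	}
	fmt.Println(string(b)) // prints {"n":2}
}
</code></pre>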

<p><strong>Constant memory.</strong> Something this implementation does not do, but that you may want to consider, is removing the risk of a Read function editing memory. You can do that with View* types generated by the <a href="https://github.com/tailscale/tailscale/blob/main/cmd/viewer/viewer.go">viewer</a> tool. It’s neat, but more than quadruples the complexity of JSONFileDB, complicates the build system, and initially isn’t very important in the sorts of programs I write. I have found several memory aliasing bugs in all the code I’ve written on top of a JSON file, but have yet to accidentally write when reading. Still, for large code bases Views are quite pleasant and well worth considering at about the point when a project should move to a real SQL database.</p>

<p>There is some room for performance improvements too (using cloner instead of unmarshalling a fresh copy of the data for writing), though I must point out again that needing more performance is a good sign it is time to move on to SQLite, or something bigger.</p>

<p>It’s a tiny library.
Copy and edit as needed.
It is an all-new implementation so I will be fixing bugs as I find them.</p>

<p>(As a bonus: this was my first time using a Go generic! 👴 It went fine. Parametric polymorphism is ok.)</p>

<h1>A final thought</h1>

<p>Why go out of my way to devise an inadequate replacement for a database?</p>

<p>Most projects fail before they start.
They fail because the
<a href="https://en.wikipedia.org/wiki/Activation_energy">activation energy</a>
is too high.
Our dreams are big and usually too much, as dreams should be.</p>

<p>But software is not building a house or traveling the world.
You can realize a dream with the tools you have on you now, in a few spare hours.
This is the great joy of it, you are free from physical and economic constraint.</p>

<p>If you start. Be willing to compromise almost everything to start.</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>new year, same plan</title>
	<link href="https://crawshaw.io/blog/new-year" />
	<id>https://crawshaw.io/blog/new-year</id>
	<updated>2022-12-31T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>new year, same plan</h1>

<p><em>2022-12-31</em></p>

<p>Some months ago, the bill from GCE for hosting this blog jumped from nearly nothing to far too much for what it is, so I moved provider and needed to write a blog post to test it all.</p>

<p>I could have figured out why my current provider hiked the price. Presumably I was Holding It Wrong and with just a few grip adjustments I could get the price dropped. But if someone mysteriously starts charging you more money, and there are other people who offer the same service, why would you stay?</p>

<p>This has not been a particularly easy year, for a variety of reasons. But here I am at the end of it, and beyond a few painful mistakes that in retrospect I did not have enough information to get right, I made mostly the same decisions I would again. There were a handful of truly wonderful moments.</p>

<p>So the plan for 2023 is the same: keep the kids intact, and make programming more fun.</p>

<p>There is also the question of Twitter. It took me a few years to develop the skin to handle the generally unpleasant environment. (I can certainly see why almost no old Twitter employees used their product.) The experience has recently degraded: there are still plenty of funny tweets, but far fewer moments of interesting content. Here is a recent exception, but it is notable because it&apos;s the first time in weeks I learned anything from Twitter: <a href="https://twitter.com/lrocket/status/1608883621980704768">https://twitter.com/lrocket/status/1608883621980704768</a>. I now find more new ideas hiding in HN comments than on Twitter.</p>

<p>Many people I know have sort-of moved to Mastodon, but it has a pretty horrible UX that is just enough work that I, on the whole, don&apos;t enjoy it much. And the fascinating insights don&apos;t seem to be there yet, but I&apos;m still reading and waiting. On the writing side, it might be a good idea to lower the standards (and length) of my blog posts to replace writing tweets. But maybe there isn&apos;t much value in me writing short notes anyway, are my contributions as fascinating as the ones I used to sift through Twitter to read? Not really.</p>

<p>So maybe the answer is to give up the format entirely. That might be something new for 2023.</p>

<p>Here is something to think about for the new year:</p>

<blockquote>
<p>DAVID BRANCACCIO: There&apos;s a little sweet moment, I&apos;ve got to say, in a very intense book&#8211; your latest&#8211; in which you&apos;re heading out the door and your wife says what are you doing? I think you say&#8211; I&apos;m getting&#8211; I&apos;m going to buy an envelope.</p>

<p>KURT VONNEGUT: Yeah.</p>

<p>DAVID BRANCACCIO: What happens then?</p>

<p>KURT VONNEGUT: Oh, she says well, you&apos;re not a poor man. You know, why don&apos;t you go online and buy a hundred envelopes and put them in the closet? And so I pretend not to hear her. And go out to get an envelope because I&apos;m going to have a hell of a good time in the process of buying one envelope.
I meet a lot of people. And, see some great looking babes. And a fire engine goes by. And I give them the thumbs up. And, and ask a woman what kind of dog that is. And, and I don&apos;t know. The moral of the story is, is we&apos;re here on Earth to fart around.
And, of course, the computers will do us out of that. And, what the computer people don&apos;t realize, or they don&apos;t care, is we&apos;re dancing animals. You know, we love to move around. And, we&apos;re not supposed to dance at all anymore.</p>
</blockquote>

<p><a href="http://www.shoppbs.pbs.org/now/transcript/transcriptNOW140_full.html">http://www.shoppbs.pbs.org/now/transcript/transcriptNOW140_full.html</a></p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>log4j: between a rock and a hard place</title>
	<link href="https://crawshaw.io/blog/log4j" />
	<id>https://crawshaw.io/blog/log4j</id>
	<updated>2021-12-11T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>log4j: between a rock and a hard place</h1>

<p><em>2021-12-11</em></p>

<p>There is more than enough written on the mechanics of and mitigations for the recent <a href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-44228">severe RCE in log4j</a>. On prevention, this is the most interesting widely-reshared <a href="https://twitter.com/yazicivo/status/1469349956880408583?s=21">insight</a> I have seen:</p>

<blockquote>
<p>Log4j maintainers have been working sleeplessly on mitigation measures; fixes, docs, CVE, replies to inquiries, etc. Yet nothing is stopping people to bash us, for work we aren&apos;t paid for, for a feature we all dislike yet needed to keep due to backward compatibility concerns.</p>
</blockquote>

<p>This is making the rounds because highly-profitable companies are using infrastructure they do not pay for. That is a worthy topic, but not the most interesting thing in this particular case because it would not clearly have contributed to preventing this bug. It is the second statement in this tweet that is worthy of attention: the <em>maintainers of log4j would have loved to remove this bad feature</em> long ago, but could not because of the backwards compatibility promises they are held to.</p>

<p>I am often heard to say that I love backwards compatibility, and that it is underrated. But what exactly do I mean? I don&apos;t mean that whenever I upgrade a dependency, I expect zero side effects. If a library function gets two times faster in an upgrade, that is a change in behavior that might break my software! But obviously the exact timings of functions can change between versions. In some extreme cases I need libraries to promise the algorithmic complexity of run time or memory usage, where I am providing extremely large inputs, or need constant-time algorithms to avoid timing attacks. But I don&apos;t need that from a logging library. So let me back up and describe what is important.</p>

<h1>What does backwards compatibility mean to me?</h1>

<h2>I want to not spend much time upgrading a dependency</h2>

<p>The ideal version of this is I run my package manager&apos;s upgrade command, execute the tests, commit the output, and not think about it any more. This means the API/ABI stays similar enough that the compiler won&apos;t break, the behavior of the library routines is similar enough the tests will pass, and no other constraints, such as total binary size limits, are exceeded.</p>

<p>This is impossible in the general case. The only way to achieve it is to not make any changes at all. When we write down a promise, we leave lots of definitional holes in the promise. E.g. take the (generally excellent) <a href="https://go.dev/doc/go1compat">Go compatibility promise</a>:</p>

<blockquote>
<p>It is intended that programs written to the Go 1 specification will continue to compile and run correctly, unchanged, over the lifetime of that specification.</p>
</blockquote>

<p>Here &quot;correctly&quot; means according to the Go language specification and the API documentation. The spec and the docs do not cover run time, memory use, or binary size. The next version of Go can be 10x slower and be compatible! But I can assure you if that were the case I would fail my goal of not spending much time upgrading a dependency.</p>

<p>But the Go team know this, and work to the spirit of their promise. Very <em>very</em> occasionally they break things, for security reasons, and when they do I have to spend time upgrading a dependency for a really good reason: my program needs it.</p>

<h2>I want any problems caused by the upgrade to be caught early, not in production.</h2>

<p>If I want my program to work correctly I should write tests for all the behaviors I care about. But like all programmers, I am short on hours in the day to do all that needs doing, and never have enough tests. So whenever a change in behavior happens in an upstream library that my tests don&apos;t catch but makes it into production, my instinct is to blame upstream. This is of course unfair, the burden for releasing good programs is borne by the person pressing the release button. But it is an expression of a programming social contract that has taken hold: a good software project tries to break downstream as little as possible, and when we do break downstream, we should do our best to make the breakage obvious and easy to fix.</p>

<p>No compatibility promise I have seen covers the spirit of minimizing breakage and moving it to the right part of the process. As far as I can tell, engineers aren&apos;t taught this in school, and many have never heard the concept articulated. So much of best practice in releasing libraries is learned on the job and not well communicated (yet). Good upstream dependencies are maintained by people who have figured this out the hard way and do their best by their users. As a user, it is extremely hard to know what kind of library you are getting when you first consider a dependency, unless it is a very old and well established project.</p>

<h2>I want to be able to build knowledge of the library over a long time, to hone my craft</h2>

<p>This is where software goes wrong the most for me. I want, year after year, to come back to a tool and be able to apply the knowledge I acquired the last time I used it, to new things I learn, and build on it. I want to hone my craft by growing a deep understanding of the tools I use.</p>

<p>Some new features are additive. If I buy a new <a href="https://en.wikipedia.org/wiki/Speed_square">speed square</a> for framing, and it has a notch on it my old one didn&apos;t that I can use as a shortcut in marking up a beam, its presence does not invalidate my old knowledge. If the new interior notch replaces a marking that was on the outside of the square, then when I go to find my trusty marking I remember from years ago, and it&apos;s missing, I need to stop and figure out a new way to solve this old problem. Maybe I will notice the new feature, or, more likely, I&apos;ll pull out the tape measure I know how to use and find my mark that (slower) way. If someone who knew what they were doing saw me they could correct me! But like programming, I&apos;m usually making a mess with wood alone in a few spare hours on a Saturday.</p>

<p>When software &quot;upgrades&quot; invalidate my old knowledge, it makes me a worse programmer. I can spend time getting back to where I was, but that&apos;s time I am not spending improving on where I was. To give a concrete example: I will never be an expert at developing for macOS or iOS. I bounce into and out of projects for Apple devices, spending no more than 10% of my hours on their platform. Their APIs change constantly. The buttons in Xcode move so quickly I sometimes wonder if it&apos;s happening before my eyes. Try looking up some Swift syntax on stack overflow and you&apos;ll find the answers are constantly edited for the latest version of Swift. At this point, I assume every time I come back to macOS/iOS, that I know nothing and I am learning the platform for the first time.</p>

<p>Compare the shifting sands of Swift with the stability of awk. I have spent not a tenth of the time learning awk that I have spent relearning Swift, and yet I am about as capable in each language. An awk one-liner I learned 20 years ago still works today! When I see someone use awk to solve a problem, I&apos;m enthusiastic to learn how they did it, because I know that 20 years from now the trick will work.</p>

<h1>Backwards compatibility should not have forced log4j to keep LDAP/JNDI URLs</h1>

<p>By what backwards compatibility means to me, a project like log4j will break fewer people by removing a feature like the JNDI URLs than by marking an old API method with some mechanical deprecation notice that causes a build process&apos;s equivalent of <code>-Wall</code> to fail and moving it to a new name. They will, in practice, break fewer people by removing this feature than they would by slowing down a critical path by 10%, which is the sort of thing that can trivially slip into a release unnoticed.</p>

<p>But the spirit of compatibility promises appears to be poorly understood across our industry (as software updates demonstrate to me every week), and so we lean on the pseudo-legalistic wording of project documentation to write strongly worded emails or snarky tweets any time a project makes work for us (because most projects don&apos;t get it, so surely every example of a breakage must be a project that doesn&apos;t get it, not a good reason), and upstream maintainers become defensive and overly conservative. The result is now everyone&apos;s Java software is broken!</p>

<p>We as a profession misunderstand and misuse the concept of backwards compatibility, both upstream and downstream, by focusing on narrow legalistic definitions instead of outcomes.</p>

<h1>The other side of compatibility: being cautious adding features</h1>

<p>This is a harder, longer topic that maybe I&apos;ll find enough clarity to write properly about one day. It should be easy to hack up code and share it! We should also be cautious about adding burdensome features. This particular bug feels impossibly strange to me, because my idea of a logging API is file descriptor number 2 with the <em>write</em> system call. None of the bells and whistles are necessary and we should be conservative about our core libraries. Indeed libraries like these are why I have been growing ever-more skeptical of using any dependencies, and now force myself to read a big chunk of any library before adding it to a project.</p>

<p>But I have also written my share of misfeatures, as much as I would like to forget them. I am thankful my code I don&apos;t like has never achieved the success or wide use of log4j, and I cannot fault diligent (and unpaid!) maintainers doing their best under those circumstances.</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Software I’m thankful for</title>
	<link href="https://crawshaw.io/blog/thankful-for-technology" />
	<id>https://crawshaw.io/blog/thankful-for-technology</id>
	<updated>2021-11-25T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>Software I’m thankful for</h1>

<p><em>2021-11-25</em></p>

<p>A few of the things that come to mind, this thanksgiving.</p>

<h2>open/read/write/close</h2>

<p>Most Unix-ish APIs, from files to sockets are a bit of a mess today. Endless poorly documented sockopts, unexpected changes in write semantics across FSs and OSes, good luck trying to figure out <a href="https://apenwarr.ca/log/20181113">mtimes</a>. But despite the mess, I can generally wrap my head around open/read/write/close. I can strace a binary and figure out the sequence and decipher what’s going on. Sprinkle in some printfs and state is quickly debuggable. Stack traces are useful!</p>

<p>Enormous effort has been spent on many projects to replace this style of I/O programming, for efficiency or aesthetics, often with an asynchronous bent. I am thankful for this old reliable standby of synchronous open/read/write/close, and hope to see it revived and reinvented throughout my career to be cleaner and simpler.</p>

<h2>goroutines</h2>

<p>Goroutines are coroutines with compiler/runtime optimized yielding, to make them behave like threads. This breathes new life into the previous technology I’m thankful for: simple blocking I/O. With goroutines it becomes cheap to write large-scale blocking servers without running out of OS resources (like heavy threads, on OSes where they’re heavy, or FDs). It also makes it possible to use blocking interfaces between “threads” within a process without paying the ever-growing price of a context switch in the post-<a href="https://en.wikipedia.org/wiki/Spectre_(security_vulnerability)">spectre</a> world.</p>
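
<p>To make that concrete, here is a minimal sketch of a blocking server with one goroutine per connection. It assumes a loopback TCP listener is available, and <code>serveEcho</code> is a name invented for this example.</p>

<pre><code class="language-go">package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
)

// serveEcho accepts connections until ln is closed and handles each one on
// its own goroutine using plain blocking reads and writes.
func serveEcho(ln net.Listener) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return // listener closed
		}
		go func() {
			defer conn.Close()
			io.Copy(conn, conn) // blocks this goroutine only
		}()
	}
}

func main() {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		fmt.Println("listen:", err)
		return
	}
	defer ln.Close()
	go serveEcho(ln)

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		fmt.Println("dial:", err)
		return
	}
	defer conn.Close()

	fmt.Fprintln(conn, "hello")
	line, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		fmt.Println("read:", err)
		return
	}
	fmt.Print(line) // prints "hello"
}
</code></pre>

<p>There is no event loop and no callback in sight: each handler reads and writes as if it were the only thing running.</p>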

<h2>Tailscale</h2>

<p>This is the first year where the team working on Tailscale has outgrown and eclipsed me to the point where I can be thankful for Tailscale without feeling like I’m thanking myself. Many of the wonderful new features that let me easily wire machines together wherever they are, like userspace networking or MagicDNS, are not my doing. I’m thankful for the product, and the opportunity to work with the best engineering team I’ve ever had the privilege of being part of.</p>

<h2>SQLite</h2>

<p>Much like open/read/write/close, SQLite is an island of stability in a constantly changing technical landscape. Techniques I learned 10 or 15 years ago using SQLite work today. As a bonus, it does so much more than it did then: WAL mode for highly-concurrent servers, advanced SQL like window functions, excellent ATTACH semantics. It has done all of this while keeping the number of, in the project’s own language, “goofy design” decisions to a minimum and holding true to its mission of being “lite”. I aspire to write such wonderful software.</p>

<h2>JSON</h2>

<p>JSON is the worst form of encoding — except for all the others that have been tried. It’s complicated, but not too complicated. It’s not easily read by humans, but it can be read by humans. It is possible to extend it in intuitive ways. When it gets printed onto your terminal, you can figure out what’s going on without going and finding the magic decoder ring of the week. It makes some things that are extremely hard with XML or INI easy, without introducing accidental Turing completeness or turning <a href="https://medium.com/@lefloh/lessons-learned-about-yaml-and-norway-13ba26df680">country codes into booleans</a>. Writing software is better for it, and shows the immense effect carefully describing something can do for programming. JSON was everywhere in our JavaScript before the term was defined, the definition let us see it and use it elsewhere.</p>

<h2>WireGuard</h2>

<p>WireGuard is a great demonstration of why the total complexity of the implementation ends up affecting the UX of the product. In theory I could have been making tunnels between my devices for years with IPSec or TLS, in practice I’d completely given it up until something came along that made it easier. It didn’t make it easier by putting a slick UI over complex technology, it made the underlying technology simpler, so even I could (eventually) figure out the configuration. Most importantly, by not eating my entire complexity budget with its own internals, I could suddenly see it as a building block in larger projects. Complexity makes more things possible, and fewer things possible, simultaneously. WireGuard is a beautiful example of simplicity and I’m thankful for it.</p>

<h2>The speed of the Go compiler</h2>

<p>Before Go became popular, the fast programming language compilers of the 90s had mostly fallen by the wayside, to be replaced with a bimodal world of interpreters/JITs on one side and creaky slow compilers attempting to produce extremely optimal code on the other. The main Go toolchain found, or rediscovered, a new optimal point in the plane of tradeoffs for programming languages to sit: ahead of time compiled, but with a fast less-than-optimal compiler. It has managed to continue to hold that interesting, unstable equilibrium for a decade now, which is incredibly impressive. (E.g. I personally would love to improve its inliner, but know that it’s nearly impossible to get too far into that project without sacrificing a lot of the compiler’s speed.)</p>

<h2>GCC</h2>

<p>I’ve always been cranky about GCC: I find its codebase nearly impossible to modify, it’s slow, the associated ducks I need to line up to make it useful (binutils, libc, etc) blow out the complexity budget on any project I try to start before I get far, and it is associated with GNU, which I used to view as an oddity and now view as a millstone around the neck of an otherwise excellent software project.</p>

<p>But these are all the sorts of complaints you only make when using something truly invaluable. GCC is invaluable. I would never have learned to program if a free C compiler hadn’t been available in the 90s, so I owe it my career. To this day, it vies neck-and-neck with LLVM for best performing object code. Without the competition between them, compiler technology would stagnate. And while LLVM now benefits from $10s or $100s of millions a year in Silicon Valley salaries working on it, GCC does it all with far less investment. I’m thankful it keeps on going.</p>

<h2>vim</h2>

<p>I keep trying to quit vim. I keep ending up inside a terminal, inside vim, writing code. Like SQLite, vim is an island of stability over my career. While I wish IDEs were better, I am extremely thankful for tools that work and respect the effort I have taken to learn them, decade after decade.</p>

<h2>ssh</h2>

<p>SSH gets me from here to there, and has done since ~1999. There is a lot about ssh that needs reinventing, but I am thankful for stable, reliable tools. It takes a lot of work to keep something like ssh working and secure, and if the maintainers are ever looking for someone to buy them a round they know where to find me.</p>

<h2>The public web and search engines</h2>

<p>How would I get anything done without all the wonderful information on the public web and search engines to find it? What an amazing achievement.</p>

<p>Thanks everyone, for making computers so great.</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Remembering the LAN</title>
	<link href="https://crawshaw.io/blog/remembering-the-lan" />
	<id>https://crawshaw.io/blog/remembering-the-lan</id>
	<updated>2020-01-28T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>Remembering the LAN</h1>

<p><em>2020-01-28</em></p>

<p>A memory and a dream.</p>

<h1>How it was</h1>

<p>I started programming in the 1990s living above my parent&apos;s medical practice.
We had 15 PCs for the business, and one for me.
The standard OS was MS-DOS.
The network started off using IPX over coax to a Novell Netware server,
the fanciest software we ever owned.
IPX was so much easier than TCP/IP.
No DHCP and address allocation, it just worked.</p>

<p>Eventually the PCs would run Windows, and a Windows NT server took
over file sharing over TCP/IP.
The business software survived this transition unchanged,
though there was more operational overhead.
We assigned IPs manually.</p>

<p>Home was a small town in Northern Australia.
The internet was far off for me at this point,
and would remain so longer than it did elsewhere.
Eventually we would be able to make long-distance phone calls 2000
miles to try it out for a few minutes here and there.
(At this point Americans had AOL.)</p>

<p>Before we had internet there were some lackluster local BBSs,
and at one point a local university account my father acquired
(somehow or other, none of us were students or university employees)
that we could dial into and try out my first Unix on a Sun box.
It was a limited experience even though technically it was on an
internet link.
My distance from university culture meant I wouldn&apos;t see Linux until
the mid-90s, when we picked up a copy of Slackware on a trip to
Hong Kong.
I didn&apos;t really get Unix until I used OpenBSD, which put enough of the
pieces together for me for Unix to finally make sense.</p>

<p>I wouldn&apos;t see root on a Sun box for more than a decade, now the early
2000s, when I bought half a dozen UltraSPARC servers as a lot second
hand in Berkeley (for around $100, a lot of money for me at the time).
I assembled a working machine from the carcasses and used it to write
a sparc64 C compiler backend. Even though more time has passed since
than between these times, it is hard for me to hold both lives in my
head simultaneously.
They were different worlds.</p>

<h2>The childhood magic</h2>

<p>The LAN was a magical place to learn about computers.
Besides the physical aspect of assembling and disassembling machines,
I could safely do things unthinkable on the modern internet:
permission-less file sharing, experimental servers with no security,
shared software where any one machine could easily bring down the
network by typing in an innocuous command. Even when I did bring
down the network the impact never left the building.
I knew who I had to apologise to.</p>

<p>(Two decades later when I took down a borgmaster with a misconfigured
MapReduce as an engineer at Google, I could not figure out who should
get an apology email.)</p>

<p>With our LAN easy things were easy, and some hard things were possible.
There were high-level interpreted languages where UIs were
straightforward, and scary languages which could make the computer
really shine.</p>

<p>A 200MHz Pentium Pro felt blazing fast and 32MB of RAM could do anything.
By the time I had OpenBSD I could recompile Apache httpd in a few
minutes with my own bad ideas.
My wrist watch could compile it faster today, as long as I stuck to
GCC 2.95.</p>

<p>Later I would carry a PC to houses of my friends where we would build
ephemeral LANs and play games like Starcraft.
(Cathode-ray monitors were heavy.)
The LAN was an education and a lifestyle.</p>

<h2>The small business magic</h2>

<p>My father, a general practitioner, used this infrastructure of cheap
286s, 386s, and 486s (with three expensive laser printers) to write
the medical record software for the business.
It was used by a dozen doctors, a nurse, and a receptionist.
You can do a lot with file-based database software (in this case,
Clipper) and a mouse-less curses interface.</p>

<p>There are several astonishing facts about this.
As an engineer, it is astonishing that Netware file locking, then SMB
file locking, worked well enough to implement a database used by ~15
concurrent users.
I suspect most career programmers today have never used file locking,
let alone seen it work correctly.</p>

<p>The business story is even more astonishing.
Here is a non-programming professional, who was able to build the
software to run their small business in between shifts at their day
job using skills learned from a book.</p>

<p>Today a professional could surely pick up the skills to build a CRUD
app, but they would be hard pressed to tune their software so
relentlessly to minimize the keystrokes a receptionist needs to use
to check in a patient, or support a magnetic card reader, or teach
laser printers to precisely print onto specialized prescription paper
(the printers spoke PostScript, but the MS-DOS programming language
had an easier instruction layer over PS that made this possible).</p>

<p>The result in the 90s was that the business needed fewer staff than everyone
thought a medical practice of that size required, and doctors&apos; time
was used more efficiently than any other software allowed, so
productivity increased.</p>

<p>My father made more money as a part-time programmer optimizing his
small business than he did as a doctor seeing patients.</p>

<h1>How it is</h1>

<p>If my 90s childhood were transported to today, so many new things
would be possible.
I could draw high-quality graphics easily with JavaScript.
It is not clear that would be more compelling than the pixelated
gorillas and bananas I played with in BASIC. I could develop apps
for my phone.
In theory at least. In practice, I wasn&apos;t particularly patient with
slow compilers as a kid, and as an adult I still have trouble
stomaching the development environment for apps, so that&apos;s off the table.</p>

<p>I wouldn&apos;t build a toy website to put school stuff on,
because I would have Facebook for that.</p>

<p>Games would be easier to play with friends.
We wouldn&apos;t have to lug heavy boxes or learn to debug our TCP/IP
configuration or actually see each other in person to play.
I guess some people would see that as an improvement: more candy,
less content.</p>

<p>All the technology is better. The resources to learn are better.
But it is not clear to me I would program at all today.
Learning how to store passwords or add OAuth2 to your toy web
site is not fun.
So much of programming today is busywork, or playing defense
against a raging internet.
You can do so much more, but the activation energy required to
start writing fun collaborative software is so much higher you
end up using some half-baked SaaS instead.</p>

<p>What about my father?</p>

<p>Could a part-time programmer like my father write small-business
software today?
Could he make it as safe and productive as our LAN was? Maybe.
If he was canny, and stuck to old-fashioned desktops of the 90s and
physically isolated the machines from the internet.
But there is no chance you could get the records onto a modern phone
safely (or even legally under HIPAA) with the hours my father gave
the project.</p>

<p>If confronted with the build v. buy decision today, I strongly
suspect he would buy. Or even more likely, subscribe.
The practice would be less productive for it.</p>

<p>The programmers of the world have built this fantastic internet,
full of magic.
Free inter-continental video calls.
&quot;Micro&quot; VMs available for free from Cloud providers with more
processing power and memory than anything I could have bought when
I started programming.</p>

<p>For all our mastery, something has been lost.
If programming a LAN in the 1990s was the care-free tending to a
garden in the countryside, then programming on the internet today
is tending a planter box on Madison Avenue in midtown.
Anyone can experience your work. You will also have your tilling
judged by thousands of passersby, any of whom may ruin your work
because the dog they&apos;re walking hasn&apos;t been city trained.</p>

<h1>A dream: How it will be</h1>

<p>In some moments the right threads of change meet and create something
special.
Many of these moments are short-lived and will not repeat, destined
to be, at best, remembered.</p>

<p>The magic moment of small trusted networks and care-free programs
does not need to be relegated to memory.
With enough work, we can bend technology to recreate the magic.</p>

<p>We can have the LAN-like experience of the 90s back again,
and we can add the best parts of the 21st century internet.
A safe small space of people we trust, where we can program away
from the prying eyes of the multi-billion-person internet.
Where the outright villainous will be kept at bay by good identity
services and good crypto.</p>

<p>The broader concept of virtualizing networks has existed forever:
the Virtual Private Network.
New protocols make VPNs better than before:
<a href="https://wireguard.com">WireGuard</a> is pioneering easy and efficient
tunneling between peers.
Marry the VPN to identity, and make it work anywhere, and you can have
a virtual 90s-style LAN made up of all your 21st century devices.
Let the internet be the dumb pipe, let your endpoints determine who
they will talk to based on the person at the other end.</p>

<p>The result is a system with properties that work with today&apos;s
internet to give us the pleasant, simple programming environment
of the &apos;90s LAN:</p>

<ul>
<li>Use the global internet identity system of your choice for
authentication, and do cryptographic authorization at the IP level.</li>
<li>Keys are generated and rotated for you automatically.</li>
<li>People map directly to unspoofable IP addresses.</li>
<li>Run custom servers on your network and access is limited to only
those people on the network.</li>
<li>Your data is protected by the simple yet powerful social dynamics
of small groups.</li>
</ul>

<p>We can build this.</p>

<p>First, we have to prove that the user experience creates the
environment we want for simpler programming.
That means getting the product in the hands of customers and
making them happy.
This is our current focus at <a href="https://tailscale.com">Tailscale</a>:
build a great product and make customers happy.</p>

<p>Second, we need to stabilize and publish the protocols used to
build this mesh overlay network so that it can be used anywhere.</p>

<p>Third, I need to help new programmers who never got to experience
simple, pleasurable programming in a safe environment understand
that programming can be fun.
You can set up your environment so you can focus on being creative.
Writing a web service for use by your friends should not be a form
of combat, where you spend your days worrying about XSS attacks
or buffer overflows.
You should be focused on creating something new and wonderful
in a place without bad people hounding you.</p>

<p>We are going to rebuild the LANs (and BBSs and MUDs) of the 90s
as a world of mesh networks on top of today&apos;s internet.</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>The asymmetry of Internet identity</title>
	<link href="https://crawshaw.io/blog/identity-stack" />
	<id>https://crawshaw.io/blog/identity-stack</id>
	<updated>2019-09-29T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>The asymmetry of Internet identity</h1>

<p><em>2019-09-29</em></p>

<p>Identity on the internet is messy.
The result is some things that should be easy are hard.</p>

<h2>The Identity Stack</h2>

<p>This is an attempt to document how we define <em>a person</em> on the modern
Internet. It is analogous to an
<a href="https://en.wikipedia.org/wiki/OSI_model">OSI model</a> for identity.</p>

<h3>Layer 1: IP addresses</h3>

<p>The story so far: In the beginning the IP address was created.
This has made a lot of people very angry and been
<a href="https://en.wikipedia.org/wiki/The_Restaurant_at_the_End_of_the_Universe">widely regarded</a>
as a bad move.</p>

<p>IP addresses give us everything, and yet surprisingly little.
Scribble one on a packet, send it out, and maybe it will get somewhere.
Where is anybody&apos;s guess. After a while some packets come back with
that IP address in the sender field.
Maybe it&apos;s from them, maybe it makes sense, maybe some got lost on
the way.</p>

<h3>Layer 2: Brands</h3>

<p>Next comes the true foundation of the modern internet: brands.
A brand is a domain name that you recognize.
These used to be organizations like Universities and Military Labs
and other Very Serious relics of ages past, but these days it is
Facebook and Google and Disney.</p>

<p>In theory you resolve brands using DNS, which maps names to IP
addresses. But that part of the internet is almost trivial and it
doesn&apos;t really work. The heavy lifting is done by TLS certificates:
thanks to a cast of about
<a href="https://en.wikipedia.org/wiki/Certificate_authority#Providers">180 questionable characters</a>,
you can be assured that the packets coming from somewhere, which were
resolved by someone to be something, actually belong to a particular Brand.</p>

<h3>Layer 3: Registering personhood</h3>

<p>Now that a person has established they are talking to the Brand,
the Brand must work out they are talking to a person.</p>

<p>This is wonderfully vague.</p>

<p>It could be as simplistic as a land rush for a particular email
address and defining a password, or uploading a photo of a government
ID and a selfie, bouncing a text through a somewhat-registered phone
system, or performing a financial transaction.
All that matters is it is done to the satisfaction of the Brand, and
that is determined by whatever metrics are in the software the
Brand&apos;s fraud team has deployed.</p>

<p>By Layer 3, a person can talk to a brand and a brand can talk to people.</p>

<h3>Layer 4: People</h3>

<p>Brands define people.</p>

<p>If Alice wants to talk to Bob, she needs a Brand to facilitate it.
Typically this is done entirely within the confines of the domain
name and software of the brand, no data ever leaves their databases
and apps.</p>

<h3>Layer 5: Inter-brand Identity protocols</h3>

<p>Some Brands outsource their identity to other Brands.
&quot;Login with Google&quot; or &quot;Login with Twitter&quot; abound.
The key protocols here are
<a href="https://en.wikipedia.org/wiki/OAuth">OAuth2</a> and
<a href="https://en.wikipedia.org/wiki/Security_Assertion_Markup_Language">SAML</a>.
Due to the inherent vagueness of Layer 3, this typically requires
bouncing the person through the other Brand&apos;s domain.</p>

<h2>What is missing and what does not work well</h2>

<p>People can talk to Brands.</p>

<p>People can talk to other people via Brands.</p>

<p>That means if Alice wants to write some software and use it to talk
to Bob, and doesn&apos;t want to delegate to a megacorp, then she must
first define a brand.
Buy a domain name, get a
<a href="https://letsencrypt.org/">TLS certificate</a>, give the Brand name to Bob.
Bob can identify himself with Alice&apos;s Brand using OAuth2, which works,
but is always more work than it should be.</p>

<p>All this effort is perfectly fine if Alice spent three weeks writing
sophisticated software. The extra day of work to do all of this is a
tiny fraction of the whole project. But what if Alice spent 30 minutes
writing a nice statistical analysis in Mathematica and wants to invite
Bob to try it? Or she spent 60 minutes writing a text adventure in PHP?
These are worthy pursuits and making a world where sharing them is
easy is important.
She can shrink the hours wrestling with OAuth2 down to minutes by
hardcoding in a password and sending it to Bob.
But <strong>the remaining work of setting up the domain name and launching
the service will be more effort than the original programming project</strong>.</p>
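<p>As a rough illustration of the hardcoded-password shortcut, here is a
minimal Go sketch. Every name in it (<code>sharedPassword</code>,
<code>passwordOK</code>, <code>guard</code>) is hypothetical, and it assumes
HTTP basic auth is an acceptable way for Bob to present the password.
It covers only the minutes of work, not the domain name or the launch.</p>

<pre>
```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// sharedPassword is a hypothetical secret Alice sends to Bob out of band.
const sharedPassword = "correct-horse-battery-staple"

// passwordOK compares in constant time so response timing does not
// reveal how much of a guess was correct.
func passwordOK(got string) bool {
	return subtle.ConstantTimeCompare([]byte(got), []byte(sharedPassword)) == 1
}

// guard wraps a handler with HTTP basic auth. The username is ignored.
func guard(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		_, pass, ok := r.BasicAuth()
		if !ok || !passwordOK(pass) {
			w.Header().Set("WWW-Authenticate", `Basic realm="toy"`)
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	hello := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello, Bob")
	})
	app := guard(hello)

	// Exercise the handler in-process instead of opening a port.
	req := httptest.NewRequest("GET", "/", nil)
	rec := httptest.NewRecorder()
	app.ServeHTTP(rec, req)
	fmt.Println("without password:", rec.Code)

	req.SetBasicAuth("bob", sharedPassword)
	rec = httptest.NewRecorder()
	app.ServeHTTP(rec, req)
	fmt.Println("with password:", rec.Code)
}
```
</pre>

<p>Without TLS in front of it, basic auth sends the password in the clear,
which is part of why the domain name and certificate work cannot be skipped.</p>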

<p>This is the fundamental asymmetry of the internet&apos;s identity stack.
Brands are few and easily identified by people.
People are many, and cannot be identified without Brands.</p>

<p>Additionally, how does Bob know what Alice&apos;s Brand is? She told him.
Probably in a text message or email, that is, via some other Brand&apos;s
communication system.
<strong>There is no good way for a person to identify another person without
first mutually agreeing on Brand identities.</strong></p>

<p>We should be able to do better.
We have big existing Brands with open(-ish) identity systems.
We should be able to write quick and dirty programs, whether they are
running on your laptop, a Raspberry Pi in a tree in your yard, or the
Cloud, and share them with people.</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Zero Trust Networks</title>
	<link href="https://crawshaw.io/blog/zero-trust" />
	<id>https://crawshaw.io/blog/zero-trust</id>
	<updated>2019-09-10T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>Zero Trust Networks</h1>

<p><em>2019-09-10</em></p>

<p>I am leery of jargon. I am as guilty of using it as the next engineer, but there comes a point where there are just too many precise, narrowly-understood terms polluting your vocabulary. The circle of people you can talk to shrinks until going to the store to buy milk feels like an exercise in speaking a foreign language you took one intro course to in college. Less jargon is better.</p>

<p>Thus the first few times I heard the terms <em>zero trust network</em> and <em>microsegments</em> I ignored them. The conversation went on even though I was a bit confused. Eventually I heard these enough that I had to figure out what these words mean. Turns out they are useful!</p>

<p>So what are they?</p>

<h2>Zero Trust Networking</h2>

<p>The term <em>zero trust</em> <a href="https://www.ndm.net/firewall/pdf/palo_alto/Forrester-No-More-Chewy-Centers.pdf">originated in 2010</a> with John Kindervag. It came as a fully-formed concept: we need to give up on the idea of trusted networks.</p>

<p>Traditional production and corporate networks have a notion of <em>perimeter security</em>: the big bad world is outside, and inside is a safer space with lax rules.</p>

<p><img src="network-traditional.svg" alt="" /></p>

<p><svg version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" x="0" y="0" width="600" height="750" viewBox="0, 0, 400, 500">
  <defs>
    <linearGradient id="Gradient_1" gradientUnits="userSpaceOnUse" x1="246.675" y1="480.789" x2="226.325" y2="283.211">
      <stop offset="0" stop-color="#FFFFFF"/>
      <stop offset="1" stop-color="#D9D9D9"/>
    </linearGradient>
    <linearGradient id="Gradient_2" gradientUnits="userSpaceOnUse" x1="214.447" y1="8.447" x2="258.553" y2="260.886">
      <stop offset="0" stop-color="#FFFFFF"/>
      <stop offset="0.51" stop-color="#F1F1F1"/>
      <stop offset="1" stop-color="#D3D3D3"/>
    </linearGradient>
  </defs>
  <g id="Layer_1">
    <g>
      <path d="M236.5,481.167 C172.021,481.167 119.75,436.768 119.75,382 C119.75,327.232 172.021,282.833 236.5,282.833 C300.979,282.833 353.25,327.232 353.25,382 C353.25,436.768 300.979,481.167 236.5,481.167 z" fill="url(#Gradient_1)"/>
      <path d="M236.5,481.167 C172.021,481.167 119.75,436.768 119.75,382 C119.75,327.232 172.021,282.833 236.5,282.833 C300.979,282.833 353.25,327.232 353.25,382 C353.25,436.768 300.979,481.167 236.5,481.167 z" fill-opacity="0" stroke="#483D32" stroke-width="2"/>
    </g>
    <g>
      <path d="M236.5,262.833 C166.36,262.833 109.5,205.451 109.5,134.667 C109.5,63.882 166.36,6.5 236.5,6.5 C306.64,6.5 363.5,63.882 363.5,134.667 C363.5,205.451 306.64,262.833 236.5,262.833 z" fill="url(#Gradient_2)"/>
      <path d="M236.5,262.833 C166.36,262.833 109.5,205.451 109.5,134.667 C109.5,63.882 166.36,6.5 236.5,6.5 C306.64,6.5 363.5,63.882 363.5,134.667 C363.5,205.451 306.64,262.833 236.5,262.833 z" fill-opacity="0" stroke="#000000" stroke-width="2"/>
    </g>
    <g>
      <g>
        <path d="M153.833,69.5 L212.5,69.5 L212.5,114.5 L153.833,114.5 L153.833,69.5 z" fill="#FFFFFF"/>
        <path d="M153.833,69.5 L212.5,69.5 L212.5,114.5 L153.833,114.5 L153.833,69.5 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <text transform="matrix(1, 0, 0, 1, 182.667, 91.5)">
        <tspan x="-22.153" y="3.438" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Frontend</tspan>
      </text>
    </g>
    <g>
      <g>
        <path d="M243.833,41.5 L302.5,41.5 L302.5,86.5 L243.833,86.5 L243.833,41.5 z" fill="#FFFFFF"/>
        <path d="M243.833,41.5 L302.5,41.5 L302.5,86.5 L243.833,86.5 L243.833,41.5 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <text transform="matrix(1, 0, 0, 1, 273, 63.5)">
        <tspan x="-22.153" y="3.438" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Database</tspan>
      </text>
    </g>
    <g>
      <g>
        <path d="M243.833,109.833 L302.5,109.833 L302.5,154.833 L243.833,154.833 L243.833,109.833 z" fill="#FFFFFF"/>
        <path d="M243.833,109.833 L302.5,109.833 L302.5,154.833 L243.833,154.833 L243.833,109.833 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <text transform="matrix(1, 0, 0, 1, 273, 131.833)">
        <tspan x="-19.82" y="-2.188" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Internal</tspan>
        <tspan x="-19.82" y="9.812" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Service</tspan>
      </text>
    </g>
    <g>
      <g>
        <path d="M153.833,135 L212.5,135 L212.5,180 L153.833,180 L153.833,135 z" fill="#FFFFFF"/>
        <path d="M153.833,135 L212.5,135 L212.5,180 L153.833,180 L153.833,135 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <text transform="matrix(1, 0, 0, 1, 183, 157)">
        <tspan x="-22.153" y="-2.188" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Load </tspan>
        <tspan x="-22.153" y="9.812" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Balancer</tspan>
      </text>
    </g>
    <g>
      <g>
        <path d="M243.833,176.833 L302.5,176.833 L302.5,221.833 L243.833,221.833 L243.833,176.833 z" fill="#FFFFFF"/>
        <path d="M243.833,176.833 L302.5,176.833 L302.5,221.833 L243.833,221.833 L243.833,176.833 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <text transform="matrix(1, 0, 0, 1, 273, 198.833)">
        <tspan x="-22.153" y="-2.188" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">VPN</tspan>
        <tspan x="-22.153" y="9.812" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Gateway</tspan>
      </text>
    </g>
    <text transform="matrix(1, 0, 0, 1, 239.417, 29)">
      <tspan x="-32.979" y="3" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Prod Network</tspan>
    </text>
    <g>
      <g>
        <path d="M215.417,296.167 L274.083,296.167 L274.083,341.167 L215.417,341.167 L215.417,296.167 z" fill="#FFFFFF"/>
        <path d="M215.417,296.167 L274.083,296.167 L274.083,341.167 L215.417,341.167 L215.417,296.167 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <text transform="matrix(1, 0, 0, 1, 244.583, 318.167)">
        <tspan x="-22.153" y="-2.188" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">VPN</tspan>
        <tspan x="-22.153" y="9.812" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Gateway</tspan>
      </text>
    </g>
    <g>
      <g>
        <path d="M167.75,354.5 L226.417,354.5 L226.417,399.5 L167.75,399.5 L167.75,354.5 z" fill="#FFFFFF"/>
        <path d="M167.75,354.5 L226.417,354.5 L226.417,399.5 L167.75,399.5 L167.75,354.5 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <text transform="matrix(1, 0, 0, 1, 196.583, 375.979)">
        <tspan x="-22.153" y="3" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Windows</tspan>
      </text>
    </g>
    <g>
      <g>
        <path d="M270.917,362.5 L329.583,362.5 L329.583,407.5 L270.917,407.5 L270.917,362.5 z" fill="#FFFFFF"/>
        <path d="M270.917,362.5 L329.583,362.5 L329.583,407.5 L270.917,407.5 L270.917,362.5 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <g>
        <path d="M265.083,369.667 L323.75,369.667 L323.75,414.667 L265.083,414.667 L265.083,369.667 z" fill="#FFFFFF"/>
        <path d="M265.083,369.667 L323.75,369.667 L323.75,414.667 L265.083,414.667 L265.083,369.667 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <g>
        <g>
          <path d="M259.417,377 L318.084,377 L318.084,422 L259.417,422 L259.417,377 z" fill="#FFFFFF"/>
          <path d="M259.417,377 L318.084,377 L318.084,422 L259.417,422 L259.417,377 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
        </g>
        <text transform="matrix(1, 0, 0, 1, 288.251, 398.479)">
          <tspan x="-22.153" y="3" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Laptops</tspan>
        </text>
      </g>
    </g>
    <g>
      <g>
        <path d="M167.75,409.5 L226.417,409.5 L226.417,454.5 L167.75,454.5 L167.75,409.5 z" fill="#FFFFFF"/>
        <path d="M167.75,409.5 L226.417,409.5 L226.417,454.5 L167.75,454.5 L167.75,409.5 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <text transform="matrix(1, 0, 0, 1, 196.583, 430.979)">
        <tspan x="-22.153" y="3" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">macOS</tspan>
      </text>
    </g>
    <path d="M244.083,341.167 L244.083,433.5" fill-opacity="0" stroke="#000000" stroke-width="1"/>
    <path d="M226.417,433.5 L244.083,433.5" fill-opacity="0" stroke="#000000" stroke-width="1"/>
    <path d="M244.083,377 L226.417,377" fill-opacity="0" stroke="#000000" stroke-width="1"/>
    <path d="M244.083,399.5 L259.417,399.5" fill-opacity="0" stroke="#000000" stroke-width="1"/>
    <path d="M183.5,135 L183.5,114.5" fill-opacity="0" stroke="#000000" stroke-width="1"/>
    <path d="M212.5,92 L243.833,64.5" fill-opacity="0" stroke="#000000" stroke-width="1"/>
    <path d="M273,109.833 L273,86.5" fill-opacity="0" stroke="#000000" stroke-width="1"/>
    <path d="M273,176.833 L273,154.833" fill-opacity="0" stroke="#000000" stroke-width="1"/>
    <g>
      <path d="M244.083,296.167 L269.643,230.225" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      <path d="M272.441,231.309 L272.535,222.766 L266.846,229.141 z" fill="#000000" fill-opacity="1" stroke="#000000" stroke-width="1" stroke-opacity="1"/>
    </g>
    <text transform="matrix(1, 0, 0, 1, 240.718, 466.5)">
      <tspan x="-32.135" y="3" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Corp Network</tspan>
    </text>
    <text transform="matrix(1, 0, 0, 1, 49.715, 285.333)">
      <tspan x="-27.875" y="3" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">The Internet</tspan>
    </text>
    <g>
      <g>
        <path d="M20.882,221.833 L79.548,221.833 L79.548,266.833 L20.882,266.833 L20.882,221.833 z" fill="#FFFFFF"/>
        <path d="M20.882,221.833 L79.548,221.833 L79.548,266.833 L20.882,266.833 L20.882,221.833 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <text transform="matrix(1, 0, 0, 1, 49.715, 246.49)">
        <tspan x="-22.87" y="-3.677" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Untrusted</tspan>
        <tspan x="-10.745" y="8.323" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">User</tspan>
      </text>
    </g>
    <g>
      <path d="M79.548,243 L175.803,184.665" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      <path d="M177.358,187.23 L182.645,180.518 L174.248,182.099 z" fill="#000000" fill-opacity="1" stroke="#000000" stroke-width="1" stroke-opacity="1"/>
    </g>
    <text transform="matrix(1, 0, 0, 1, 49.715, 48)">
      <tspan x="-38.715" y="3" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Crunchy Outside</tspan>
    </text>
    <g>
      <path d="M93,48 L121.675,53.735" fill-opacity="0" stroke="#000000" stroke-width="1" stroke-dasharray="3,2"/>
      <path d="M121.086,56.677 L129.519,55.304 L122.263,50.793 z" fill-opacity="0" stroke="#000000" stroke-width="1" stroke-opacity="1"/>
    </g>
    <text transform="matrix(1, 0, 0, 1, 53.605, 81.167)">
      <tspan x="-42.605" y="3" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Soft Chewy center</tspan>
    </text>
    <g>
      <path d="M100,83.5 L128.736,104.52" fill-opacity="0" stroke="#000000" stroke-width="1" stroke-dasharray="3,2"/>
      <path d="M126.965,106.941 L135.193,109.243 L130.507,102.098 z" fill-opacity="0" stroke="#000000" stroke-width="1" stroke-opacity="1"/>
    </g>
  </g>
</svg></p>

<p>Perimeter security does not work. Eventually, someone will find their way in. Usually through a forgotten service hiding in the corner of your network. Once they are in, the lax rules and default trust of the internal network make your adversary&apos;s job easy: they jump from your forgotten tiny service to the critical, valuable services.</p>

<p>Zero Trust networking means treating the internal network just like an external network: authenticate every connection, encrypt all traffic, log everything. Plan as if every machine (virtual or otherwise) is sitting on a public IP address.</p>

<p><svg version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" x="0" y="0" width="600" height="750" viewBox="0, 0, 400, 500">
  <g id="Layer_1">
    <g>
      <g>
        <path d="M153.833,69.5 L212.5,69.5 L212.5,114.5 L153.833,114.5 L153.833,69.5 z" fill="#FFFFFF"/>
        <path d="M153.833,69.5 L212.5,69.5 L212.5,114.5 L153.833,114.5 L153.833,69.5 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <text transform="matrix(1, 0, 0, 1, 182.667, 91.5)">
        <tspan x="-22.153" y="3.438" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Frontend</tspan>
      </text>
    </g>
    <g>
      <g>
        <path d="M243.833,41.5 L302.5,41.5 L302.5,86.5 L243.833,86.5 L243.833,41.5 z" fill="#FFFFFF"/>
        <path d="M243.833,41.5 L302.5,41.5 L302.5,86.5 L243.833,86.5 L243.833,41.5 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <text transform="matrix(1, 0, 0, 1, 273, 63.5)">
        <tspan x="-22.153" y="3.438" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Database</tspan>
      </text>
    </g>
    <g>
      <g>
        <path d="M243.833,109.833 L302.5,109.833 L302.5,154.833 L243.833,154.833 L243.833,109.833 z" fill="#FFFFFF"/>
        <path d="M243.833,109.833 L302.5,109.833 L302.5,154.833 L243.833,154.833 L243.833,109.833 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <text transform="matrix(1, 0, 0, 1, 273, 131.833)">
        <tspan x="-19.82" y="-2.188" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Internal</tspan>
        <tspan x="-19.82" y="9.812" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Service</tspan>
      </text>
    </g>
    <g>
      <g>
        <path d="M153.833,135 L212.5,135 L212.5,180 L153.833,180 L153.833,135 z" fill="#FFFFFF"/>
        <path d="M153.833,135 L212.5,135 L212.5,180 L153.833,180 L153.833,135 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <text transform="matrix(1, 0, 0, 1, 183, 157)">
        <tspan x="-22.153" y="-2.188" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Load </tspan>
        <tspan x="-22.153" y="9.812" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Balancer</tspan>
      </text>
    </g>
    <g>
      <path d="M172.25,221.833 L230.917,221.833 L230.917,266.833 L172.25,266.833 L172.25,221.833 z" fill="#FFFFFF"/>
      <path d="M172.25,221.833 L230.917,221.833 L230.917,266.833 L172.25,266.833 L172.25,221.833 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
    </g>
    <text transform="matrix(1, 0, 0, 1, 201.083, 243.833)">
      <tspan x="-22.153" y="-3" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Corp</tspan>
      <tspan x="-22.153" y="9" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Windows</tspan>
    </text>
    <g>
      <g>
        <path d="M284.417,262.333 L343.083,262.333 L343.083,307.333 L284.417,307.333 L284.417,262.333 z" fill="#FFFFFF"/>
        <path d="M284.417,262.333 L343.083,262.333 L343.083,307.333 L284.417,307.333 L284.417,262.333 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <g>
        <path d="M278.583,269.5 L337.25,269.5 L337.25,314.5 L278.583,314.5 L278.583,269.5 z" fill="#FFFFFF"/>
        <path d="M278.583,269.5 L337.25,269.5 L337.25,314.5 L278.583,314.5 L278.583,269.5 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <g>
        <path d="M272.917,276.833 L331.584,276.833 L331.584,321.833 L272.917,321.833 L272.917,276.833 z" fill="#FFFFFF"/>
        <path d="M272.917,276.833 L331.584,276.833 L331.584,321.833 L272.917,321.833 L272.917,276.833 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <text transform="matrix(1, 0, 0, 1, 300.751, 302.333)">
        <tspan x="-22.153" y="-3" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Corp</tspan>
        <tspan x="-22.153" y="9" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Laptops</tspan>
      </text>
    </g>
    <g>
      <g>
        <path d="M172.25,284.833 L230.917,284.833 L230.917,329.833 L172.25,329.833 L172.25,284.833 z" fill="#FFFFFF"/>
        <path d="M172.25,284.833 L230.917,284.833 L230.917,329.833 L172.25,329.833 L172.25,284.833 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <text transform="matrix(1, 0, 0, 1, 201.083, 309.833)">
        <tspan x="-22.153" y="-3" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Corp</tspan>
        <tspan x="-22.153" y="9" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">macOS</tspan>
      </text>
    </g>
    <text transform="matrix(1, 0, 0, 1, 49.715, 285.333)">
      <tspan x="-27.875" y="3" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">The Internet</tspan>
    </text>
    <g>
      <g>
        <path d="M20.882,221.833 L79.548,221.833 L79.548,266.833 L20.882,266.833 L20.882,221.833 z" fill="#FFFFFF"/>
        <path d="M20.882,221.833 L79.548,221.833 L79.548,266.833 L20.882,266.833 L20.882,221.833 z" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      </g>
      <text transform="matrix(1, 0, 0, 1, 49.715, 246.49)">
        <tspan x="-22.87" y="-3.677" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">Untrusted</tspan>
        <tspan x="-10.745" y="8.323" font-family="HelveticaNeue-Medium" font-size="10" fill="#252222">User</tspan>
      </text>
    </g>
    <g>
      <path d="M79.548,243 L175.803,184.665" fill-opacity="0" stroke="#000000" stroke-width="1"/>
      <path d="M177.358,187.23 L182.645,180.518 L174.248,182.099 z" fill="#000000" fill-opacity="1" stroke="#000000" stroke-width="1" stroke-opacity="1"/>
    </g>
    <g>
      <path d="M183.5,135 L183.5,123.5" fill-opacity="0" stroke="#0008FF" stroke-width="1"/>
      <path d="M186.5,123.5 L183.5,115.5 L180.5,123.5 z" fill="#0008FF" fill-opacity="1" stroke="#0008FF" stroke-width="1" stroke-opacity="1"/>
    </g>
    <g>
      <path d="M212.5,92.5 L237.281,69.169" fill-opacity="0" stroke="#1B00FF" stroke-width="1"/>
      <path d="M239.337,71.354 L243.105,63.685 L235.224,66.985 z" fill="#1B00FF" fill-opacity="1" stroke="#1B00FF" stroke-width="1" stroke-opacity="1"/>
    </g>
    <g>
      <path d="M272.917,109.833 L272.917,95.5" fill-opacity="0" stroke="#1B00FF" stroke-width="1"/>
      <path d="M275.917,95.5 L272.917,87.5 L269.917,95.5 z" fill="#1B00FF" fill-opacity="1" stroke="#1B00FF" stroke-width="1" stroke-opacity="1"/>
    </g>
    <g>
      <path d="M197.5,221.833 L246.79,161.79" fill-opacity="0" stroke="#1B00FF" stroke-width="1"/>
      <path d="M249.108,163.693 L251.866,155.606 L244.471,159.886 z" fill="#1B00FF" fill-opacity="1" stroke="#1B00FF" stroke-width="1" stroke-opacity="1"/>
    </g>
    <g>
      <path d="M230.917,297 L259.137,163.638" fill-opacity="0" stroke="#1B00FF" stroke-width="1"/>
      <path d="M262.072,164.259 L260.793,155.812 L256.202,163.017 z" fill="#1B00FF" fill-opacity="1" stroke="#1B00FF" stroke-width="1" stroke-opacity="1"/>
    </g>
    <g>
      <path d="M311,262.333 L272.275,163.216" fill-opacity="0" stroke="#1B00FF" stroke-width="1"/>
      <path d="M275.069,162.125 L269.364,155.765 L269.481,164.308 z" fill="#1B00FF" fill-opacity="1" stroke="#1B00FF" stroke-width="1" stroke-opacity="1"/>
    </g>
    <g>
      <path d="M203.5,69.5 L203.5,64.5" fill-opacity="0" stroke="#FF0000" stroke-width="1"/>
      <path d="M203.5,64.5 L206.5,60.5 L203.5,56.5 L200.5,60.5 z" fill-opacity="0" stroke="#FF0000" stroke-width="1" stroke-opacity="1"/>
    </g>
    <g>
      <path d="M203.5,135 L203.5,130" fill-opacity="0" stroke="#FF0000" stroke-width="1"/>
      <path d="M203.5,130 L206.5,126 L203.5,122 L200.5,126 z" fill-opacity="0" stroke="#FF0000" stroke-width="1" stroke-opacity="1"/>
    </g>
    <g>
      <path d="M296.5,109.833 L296.5,104.833" fill-opacity="0" stroke="#FF0000" stroke-width="1"/>
      <path d="M296.5,104.833 L299.5,100.833 L296.5,96.833 L293.5,100.833 z" fill-opacity="0" stroke="#FF0000" stroke-width="1" stroke-opacity="1"/>
    </g>
    <g>
      <path d="M296.5,41.5 L296.5,36.5" fill-opacity="0" stroke="#FF0000" stroke-width="1"/>
      <path d="M296.5,36.5 L299.5,32.5 L296.5,28.5 L293.5,32.5 z" fill-opacity="0" stroke="#FF0000" stroke-width="1" stroke-opacity="1"/>
    </g>
    <g>
      <path d="M225,221.833 L225,216.833" fill-opacity="0" stroke="#FF0000" stroke-width="1"/>
      <path d="M225,216.833 L228,212.833 L225,208.833 L222,212.833 z" fill-opacity="0" stroke="#FF0000" stroke-width="1" stroke-opacity="1"/>
    </g>
    <g>
      <path d="M225,284.833 L225,279.833" fill-opacity="0" stroke="#FF0000" stroke-width="1"/>
      <path d="M225,279.833 L228,275.833 L225,271.833 L222,275.833 z" fill-opacity="0" stroke="#FF0000" stroke-width="1" stroke-opacity="1"/>
    </g>
    <g>
      <path d="M335,262.333 L335,257.333" fill-opacity="0" stroke="#FF0000" stroke-width="1"/>
      <path d="M335,257.333 L338,253.333 L335,249.333 L332,253.333 z" fill-opacity="0" stroke="#FF0000" stroke-width="1" stroke-opacity="1"/>
    </g>
    <g>
      <path d="M296.5,369.5 L296.5,364.5" fill-opacity="0" stroke="#FF0000" stroke-width="1"/>
      <path d="M296.5,364.5 L299.5,360.5 L296.5,356.5 L293.5,360.5 z" fill-opacity="0" stroke="#FF0000" stroke-width="1" stroke-opacity="1"/>
    </g>
    <g>
      <path d="M243.833,369.5 L302.5,369.5 L302.5,414.5 L243.833,414.5 L243.833,369.5 z" fill="#FFFFFF"/>
      <path d="M243.833,369.5 L302.5,369.5 L302.5,414.5 L243.833,414.5 L243.833,369.5 z" fill-opacity="0" stroke="#FF0000" stroke-width="2"/>
    </g>
    <text transform="matrix(1, 0, 0, 1, 273, 392.312)">
      <tspan x="-19.82" y="-3" font-family="HelveticaNeue-Medium" font-size="10" fill="#FF0000">Control</tspan>
      <tspan x="-19.82" y="9" font-family="HelveticaNeue-Medium" font-size="10" fill="#FF0000">Plane</tspan>
    </text>
    <text transform="matrix(1, 0, 0, 1, 112.52, 371)">
      <tspan x="-84.48" y="-3" font-family="HelveticaNeue-Medium" font-size="10" fill="#0200FF">Blue links are authenticated</tspan>
      <tspan x="-84.48" y="9" font-family="HelveticaNeue-Medium" font-size="10" fill="#0200FF">(via the control plane) and encrypted</tspan>
    </text>
    <text transform="matrix(1, 0, 0, 1, 112.52, 410.5)">
      <tspan x="-84.48" y="-3" font-family="HelveticaNeue-Medium" font-size="10" fill="#FF0000">Red links to the control plane </tspan>
      <tspan x="-84.48" y="9" font-family="HelveticaNeue-Medium" font-size="10" fill="#FF0000">service establishes trust zone</tspan>
    </text>
  </g>
</svg></p>

<p>Coincidentally, I learned about these concepts in a parallel universe at around the time the Zero Trust term was coined, in the network infrastructure of Google. The same ideas were developed both for <a href="https://cloud.google.com/security/encryption-in-transit/application-layer-transport-security/resources/alts-whitepaper.pdf">prod network security</a> and the corp network. The latter got the cute project name <a href="https://cloud.google.com/beyondcorp/">BeyondCorp</a>, which has made its way into public awareness.</p>

<h2>Microsegmentation</h2>

<p>This one is a little trickier.</p>

<p>Microsegmentation is a technique for transitioning from a classic chewy-center trusting network to a Zero Trust network.</p>

<p>The process: take a traditional network. You have one segment. Now find a set of machines with a small surface area and cut them off from the larger network. Use access control rules to designate precisely how the rest of the network is allowed to communicate with the machines you have cut off. Now you have two segments.</p>

<p>Repeat the process, segmenting your traditional network and your new segments, until the segments are so small that each contains only a tiny number of machines. That is microsegmenting.</p>

<p>When each microsegment contains only one machine, congratulations, you have a Zero Trust network.</p>

<p>This process is entirely possible today with the tools we have known for years: routers, firewalls, VPNs. But the process is daunting. Segmenting part of a network takes months of archeology, calling retired employees, finding software engineers to modify services everyone had forgotten about (well, everyone but the handful of people who use them to make a large part of your company&apos;s revenue).</p>

<p>Microsegmentation is extremely difficult with today&apos;s tools.</p>

<h2>Why am I talking about these concepts?</h2>

<p>It turns out we (<a href="https://tailscale.io">Tailscale</a>) are building a new Zero Trust networking product, designed specifically to make microsegmentation much easier.</p>

<p>Funnily enough, we did not realize that was the name for what we were doing until very recently. I knew of the principles behind BeyondCorp: authenticate everyone, encrypt every packet, log everything (Zero Trust), and we decided that companies need help reaching that goal incrementally (microsegmentation).</p>

<p>Sure, you could rewrite everything to be Zero Trust from day one, but almost no-one can afford the massive costs of such a multi-year project. Indeed, it is nearly impossible to develop an estimate for how expensive the process would be in a major company: turn over a rock and you will find a new server.</p>

<p>I find this particular problem space very interesting, because solving it well is not just about making existing software work well, it is about reclaiming a way of easy, cheap programming that has been made unsafe by the growing threats from the internet to the traditional trusted network model.</p>

<p>If you are going through a migration away from perimeter defense and want to talk about it, please send me an email.</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Go 1.13: xerrors</title>
	<link href="https://crawshaw.io/blog/xerrors" />
	<id>https://crawshaw.io/blog/xerrors</id>
	<updated>2019-04-28T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>Go 1.13: xerrors</h1>

<p><em>2019-04-28</em></p>

<p>Part of the Go 2 series of language changes is a new
<a href="https://go.googlesource.com/proposal/+/master/design/29934-error-values.md">error inspection proposal</a>.</p>

<p>The error inspection proposal adds several features to errors that
have been tried elsewhere
(in packages such as <a href="https://github.com/pkg/errors">github.com/pkg/errors</a>),
with some new implementation tricks.
The proposal has been implemented in tip as preparation for Go 1.13.
You can try it out today by working with Go from tip, or by using
the package <a href="https://golang.org/x/xerrors">golang.org/x/xerrors</a> with
Go 1.12.</p>

<p>The extra features are entirely library-based, no changes to the
compiler or runtime are involved.
One big new feature is error wrapping.</p>

<h2>A worked example: wrapping &quot;key not found&quot;</h2>

<p>A product we are building for <a href="https://tailscale.io">Tailscale</a>
includes a simple key-value-store called taildb.
As with many simple KV-stores, you can read key-values.
Nothing fancy:</p>

<pre><code>// Get fetches and unmarshals the JSON blob for the key k into v.
// If the key is not found, Get reports a &quot;key not found&quot; error.
func (tx *Tx) Get(k string, v interface{}) (err error)
</code></pre>

<p>Let&apos;s talk about &quot;key not found.&quot;</p>

<h3>Version 1</h3>

<p>The very first API version defined the &quot;key not found&quot; error as:</p>

<pre><code>var ErrNotFound = errors.New(&quot;taildb: key not found&quot;)
</code></pre>

<p>Code that used taildb could use it easily:</p>

<pre><code>var val Value
if err := tx.Get(&quot;my-key&quot;, &amp;val); err == taildb.ErrNotFound {
	// no such key
} else if err != nil {
	// something went very wrong
} else {
	// use val
}
</code></pre>

<p>This was fine until I was doing some debugging and ran across a log
entry that boiled down to:</p>

<pre><code>my_http_handler: taildb: key not found
</code></pre>

<p>…which is not a very informative error message.</p>

<h3>Version 2</h3>

<p>Given that the <code>Get</code> method has the key name, it would be nice to
include it in the error message.</p>

<p>So I followed a common strategy in Go of introducing an error type
into the taildb package:</p>

<pre><code>type KeyNotFoundError struct {
	Name string
}

func (e KeyNotFoundError) Error() string {
	return fmt.Sprintf(&quot;taildb: key %q not found&quot;, e.Name)
}
</code></pre>

<p>This works well!
The code that checks for this specific error is a tiny bit messier, but it works:</p>

<pre><code>var val Value
err := tx.Get(&quot;my-key&quot;, &amp;val)
if err != nil {
	if _, isNotFound := err.(taildb.KeyNotFoundError); isNotFound {
		// no such key
	} else {
		// something went very wrong
	}
} else {
	// use val
}
</code></pre>

<p>But this style of direct matching has a flaw.
If any intermediate code adds information to the error we can no longer check
the type of the error.
Consider a function like:</p>

<pre><code>func accessCheck(tx *taildb.Tx, key string) error {
	var val Value
	if err := tx.Get(key, &amp;val); err != nil {
		return fmt.Errorf(&quot;access check: %v&quot;, err)
	}
	if !val.AccessGranted {
		return errAccessDenied
	}
	return nil
}
</code></pre>

<p>Here we are implementing logic on top of the database, checking if
the user has some sort of access.
Reporting a nil error grants access, otherwise access is denied.
The reason for denying access might be <code>!AccessGranted</code> or
some underlying database error.
All the textual information about the error is preserved, but
the use of <code>fmt.Errorf</code> means that we can no longer type-check to see
if the access error was a <code>KeyNotFoundError</code>.</p>

<h3>Version 3</h3>

<p>The new xerrors library fixes this by providing a version
of Errorf that preserves the underlying error object inside the new
error:</p>

<pre><code>	if err := tx.Get(key, &amp;val); err != nil {
		return xerrors.Errorf(&quot;access check: %w&quot;, err)
	}
</code></pre>

<p>%w for wrap.</p>

<p>On the surface this implementation of Errorf works exactly as the one
in fmt does.
Under the hood, the preserved type means we can now check the
cause chain for our KeyNotFoundError:</p>

<pre><code>var val Value
if err := accessCheck(tx, &quot;my-key&quot;); err != nil {
	var notFoundErr taildb.KeyNotFoundError
	if xerrors.As(err, &amp;notFoundErr) {
		// no such key
	} else {
		// something went very wrong
	}
} else {
	// use val
}
</code></pre>

<p>Great!</p>
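
<p>To see the difference between <code>%v</code> and <code>%w</code> in one runnable file, here is a minimal sketch. It assumes a Go 1.13 or later toolchain, where the standard <code>errors</code> and <code>fmt</code> packages provide <code>As</code> and the <code>%w</code> verb, so it runs without the xerrors import. The <code>get</code> and <code>missingKey</code> helpers and the <code>wrap</code> flag are hypothetical names made up for illustration; they are not taildb&apos;s API.</p>

```go
package main

import (
	"errors"
	"fmt"
)

// KeyNotFoundError mirrors the Version 2 error type.
type KeyNotFoundError struct {
	Name string
}

func (e KeyNotFoundError) Error() string {
	return fmt.Sprintf("taildb: key %q not found", e.Name)
}

// get is a hypothetical stand-in for Tx.Get that never finds the key.
func get(key string) error {
	return KeyNotFoundError{Name: key}
}

// accessCheck adds context to the error from get. With wrap set it uses %w,
// which keeps the original error reachable. Otherwise it uses %v, which
// flattens the error into text.
func accessCheck(key string, wrap bool) error {
	err := get(key)
	if wrap {
		return fmt.Errorf("access check: %w", err)
	}
	return fmt.Errorf("access check: %v", err)
}

// missingKey walks the error chain looking for a KeyNotFoundError and
// reports the key it names.
func missingKey(err error) (string, bool) {
	target := new(KeyNotFoundError)
	if errors.As(err, target) {
		return target.Name, true
	}
	return "", false
}

func main() {
	for _, wrap := range []bool{false, true} {
		err := accessCheck("my-key", wrap)
		name, found := missingKey(err)
		fmt.Printf("wrap=%v msg=%q key=%q found=%v\n", wrap, err, name, found)
	}
}
```

<p>Both calls produce the same message text. Only the <code>%w</code> version lets <code>missingKey</code> recover the key name.</p>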

<h3>Version 4</h3>

<p>We can do even better.
The only reason we introduced the exported KeyNotFoundError type was so
we could put a little extra text in the error message while making
the type testable.
The new xerrors gives us an easier way to do that.</p>

<p>So let&apos;s return to the very first definition:</p>

<pre><code>var ErrNotFound = errors.New(&quot;key not found&quot;)
</code></pre>

<p>Inside taildb we can write:</p>

<pre><code>func (tx *Tx) Get(k string, v interface{}) (err error) {
	// ...
	if noSuchKey {
		return xerrors.Errorf(&quot;taildb: %q: %w&quot;, k, ErrNotFound)
	}
}
</code></pre>

<p>All the information we want is here.
When we print the error to a log we see <code>taildb: &quot;my-key&quot;: key not found</code>.
To check the returned error from <code>accessCheck</code> we can write:</p>

<pre><code>var val Value
if err := accessCheck(tx, &quot;my-key&quot;); xerrors.Is(err, taildb.ErrNotFound) {
	// no such key
} else if err != nil {
	// something went very wrong
} else {
	// use val
}
</code></pre>

<p>Easy!</p>

<h2>Go 1.13</h2>

<p>The new xerrors is due to be promoted into the standard library&apos;s errors
package in Go 1.13.</p>

<p>Instead of xerrors.Errorf, the chaining is being built directly into the
<a href="https://tip.golang.org/pkg/fmt/#Errorf">fmt.Errorf</a> function we use today:</p>

<pre><code>If the last argument is an error and the format string ends with &quot;: %w&quot;,
the returned error implements errors.Wrapper with an Unwrap method returning it.
</code></pre>

<p>Certainly this looks nice.
However, Go 1.13 is only three months away!
After that, all of these new changes (and this post only covers one)
will be frozen forever in the standard library under the
<a href="https://golang.org/doc/go1compat">Go 1 compatibility promise</a>.
For such a high standard, this package is
<a href="https://godoc.org/golang.org/x/xerrors?importers">woefully under-tested</a>.</p>

<p>I would encourage you to start using golang.org/x/xerrors today,
or even better, start developing directly against Go tip by
<a href="https://golang.org/doc/install/source">installing from source</a>.
More people need to give this a go.</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Fast compilers for fast programs</title>
	<link href="https://crawshaw.io/blog/fast-compilers" />
	<id>https://crawshaw.io/blog/fast-compilers</id>
	<updated>2019-04-14T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>Fast compilers for fast programs</h1>

<p><em>2019-04-14</em></p>

<p>Compiler authors face a tradeoff between compiler speed and executable speed.
Take longer to build a binary and you can build a better binary.</p>

<p>There is a counter-intuitive condition where this tradeoff breaks down.
There are programs where making the compiler faster (by building a
worse binary) makes the program faster!
Under the right conditions you can also make the compiler slower
(by building a better binary) and make the program slower.</p>

<p>When you have a small team that write tests and a compiler that is on
the boundary of interactive speed, compiler speed dominates performance.</p>

<h2>An example</h2>

<p>Here is how it works. I am writing some code and I run the tests:</p>

<pre><code>$ go test tailscale.io/taildb
ok      tailscale.io/taildb     0.024s
</code></pre>

<p>Test execution took 0.024s and total compile + test time was under 200ms.</p>

<p>Now I make a change,</p>

<pre><code>$ go test tailscale.io/taildb
ok      tailscale.io/taildb     0.324s
</code></pre>

<p>Test execution now takes 0.324s and total compile + test time was about
half a second.</p>

<p>This is not a jump in test execution time I am likely to notice
skimming stdout as I am developing.
This is ordinary code, it needs to run at an adequate speed but I am
not explicitly working on performance.</p>

<p>But I feel it.</p>

<p>Suddenly the command that finished as soon as I pressed enter is
stuttering.
Is my computer doing something else? What&apos;s wrong?
Oh, I wrote something slow.</p>

<p>The delta between the old fast code and the new slow code is in my
head because I am working on it right now.
The stuttering command line is a subtle UX poke, hey, you just
ruined a nice program.
I can spend a minute to see if there is any obvious mistake,
and fix it right now.
If need be I can ignore it and plow on.</p>

<p>It is here, at the edge of interactive programming, that compiler
speed is vital to program performance.
If a new compiler release adds enough compile time that I no
longer can feel when I break a program, then I will start missing
these moments.
My code will get slower.</p>

<p>Similarly if a new compiler release makes the compiler faster,
I will start noticing my own bad code more often.</p>

<h2>Compiler author incentives</h2>

<p>Different programming languages are used to write different kinds
of programs.
(A great deal of programmer time is spent discussing this topic,
I suspect the strongest forces that affect this are path-dependence
and aesthetics, so I stay out of it.)
Programming languages that are optimized for big projects tend to
have slower compilers that produce higher-quality executables.</p>

<p>When compiling C or C++ with gcc or llvm, programs quickly reach a
point where project compile time is non-interactive.
This is fine, because the projects where C/C++ programmers congregate
typically take 10s of minutes to hours to build.</p>

<p>Large teams invest in tooling to get performance.
That test execution number I skimmed earlier, 24ms, is logged by
software that records all compilations, and is tracked against its
historical execution time.
When it jumps to 324ms, something lights up in red on a dashboard,
and a release engineer sends an email.
The commit is found, and someone goes and fixes it.</p>

<p>This works!
Chromium has a huge team and shockingly-slow compile times, and yet
the product is beautifully fast.
Large organized teams can make these investments.</p>

<p>Indeed, the slow compiler is exactly what these teams want.
If you can add a minute to Chromium&apos;s build time with a sophisticated
compiler analysis pass that shaves 1ms off the typical frame
rendering time of Chrome, everyone will cheer.</p>

<h2>Aside: languages get caught in the middle</h2>

<p>I have heard the argument that we should simply make the sophisticated
tooling of Chromium easy for small teams to adopt, but this does not work.
If I get a dashboard notification tomorrow that I made the code in my
example slower, I will ignore it.
I am not in the moment, thinking about the exact problem, and it is
almost never worth revisiting it for performance.
There was exactly one moment when I was willing to consider the
performance of that code, and it was the moment my shell command
stuttered.</p>

<p>So small teams and large teams need different compilers.</p>

<p>Our industry has aligned compilers with programming languages.
That means in practice it is best if small teams and large teams use
different programming languages.
This is unfortunate, both because I have to listen to people tell me
that some languages are slow and others are fast (when the language
is being conflated with its compiler), and more importantly: sometimes
small teams with small programs become big teams with big programs.
Now they are using the wrong compiler.
They need a slow sophisticated compiler, but none is available for
their language.</p>

<p>I don&apos;t know how to solve this.
Fortunately it is an uncommon problem.
There is, however, another related problem that is extremely common and
deserves more attention.</p>

<h2>The &quot;medium-sized project&quot; moment</h2>

<p>No matter how fast the compiler, there comes a point at which a project
has grown such that compile times are no longer interactive.
To pick on a particular language again (because I spent some time in
its compiler and will make fewer mistakes discussing it specifically),
I have experienced this several times when writing Go programs because
the Go compiler is on the knife-edge of interactive performance.
Every release the compiler gets faster, or slower!, and the line moves.</p>

<p>I believe this point should be measured.</p>

<p>A lot of engineering effort goes into making the Go compiler fast, but
the measurements don&apos;t reflect the user experience.
Compiler performance is measured as percentage changes of the time it
takes to build a significant body of code.</p>

<p>It does not matter at all if a project that takes two minutes to build
is 5% faster this release.
It takes two minutes!
Make it one minute or three minutes, it does not matter.
On the other hand, it is a really big deal if a project that takes one
second to build now takes 500ms.</p>

<p>If a compiler team want to focus on interactive compiler experience then they should:</p>

<ol>
<li>Determine via UX study what compiler speeds matter for interactive development. I really want to know where this line is: is it 200ms compile times, or 2s? Right now I can only guess.</li>
<li>Find a series of example projects that currently fall just on either side of the interactive speed line.</li>
<li>Set up automated compiler speed testing on these projects.</li>
<li>Dedicate engineering time to moving projects from the non-interactive to the interactive side of the dividing line.</li>
</ol>

<p>Interactive compilation is wonderful UX.</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>UTF-7: a ghost from the time before UTF-8</title>
	<link href="https://crawshaw.io/blog/utf7" />
	<id>https://crawshaw.io/blog/utf7</id>
	<updated>2018-10-31T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>UTF-7: a ghost from the time before UTF-8</h1>

<p><em>2018-10-31</em></p>

<p>On Halloween this year I learned two scary things.
The first is that a young toddler can go trick-or-treating in your
apartment building and acquire a huge amount of candy.
When they are this young they have no interest in the candy itself,
so you are left having to eat it all yourself.</p>

<p>The second scary thing is that in the heart of the ubiquitous
IMAP protocol lingers a ghost of the time before UTF-8.
Its name is Modified UTF-7.</p>

<h2>UTF-7</h2>

<p>UTF-7 is described in <a href="https://tools.ietf.org/html/rfc2152">RFC 2152</a>.
It lets you encode all of Unicode, much like the other UTF encoding
schemes, though it adds a neat property: it only uses printable ASCII
characters to do it.
Unfortunately you pay a price: it is complicated and inefficient.</p>

<p>First, most ASCII characters are represented by themselves.
The important exception is the shift character <code>+</code>.
Instead of <code>+</code> we now write <code>+-</code>.</p>

<p>Any sequence of non-ASCII characters (or disallowed ASCII characters
such as <code>~</code>) is first converted to UTF-16BE,
then encoded as base64, and placed between a <code>+</code> and a <code>-</code>.</p>

<p><em>(Even though this is 2018, occasionally someone will try to claim in
conversation with me that UTF-16 is better than UTF-8.
The obvious response is to point to the surrogate pairs mess,
but many people defending UTF-16 don&apos;t realize those are necessary.
I have found I can skip over the long explanation of surrogates
by simply asking: &quot;do you mean UTF-16LE or UTF-16BE?&quot;)</em></p>

<p>There is something immediately appealing about this definition of
UTF-7.
You can describe it in three sentences, it is built on the popular
encoding scheme base64, and it is ASCII printable.</p>

<p>An example:</p>

<pre><code>&quot;Hello, 世界&quot;           (UTF-8)
&quot;Hello, \u4E16\u754C&quot;  (ASCII with unicode hex literals)
&quot;Hello, +ThZ1TA-&quot;      (UTF-7)
</code></pre>

<p>UTF-7 is not a particularly appealing wire format.
In the example above UTF-7 uses 8 bytes to represent what UTF-8 does
in 6 bytes.
It becomes even less efficient if ASCII is regularly mixed in
with non-ASCII code points as we need to constantly add escape
characters.
And while it is ASCII printable, the printing is inscrutable.
Relating <code>ThZ1TA</code> back to anything is beyond my mind, so I may
as well use something non-printable.</p>

<p>To make matters worse, this is not base64. It is <strong>modified base64</strong>.
The base64 padding character <code>=</code> cannot appear in UTF-7.
To avoid it the RFC tells us to pad the UTF-16BE with zero bits
until you reach a length that can be base64 encoded without padding:</p>

<pre><code>      Next, the octet stream is encoded by applying the Base64 content
      transfer encoding algorithm as defined in RFC 2045, modified to
      omit the &quot;=&quot; pad character. Instead, when encoding, zero bits are
      added to pad to a Base64 character boundary. When decoding, any
      bits at the end of the Modified Base64 sequence that do not
      constitute a complete 16-bit Unicode character are discarded.
</code></pre>

<p>That sounds fishy.</p>

<p>Base64 encodes every block of 3 bytes to 4 bytes.
If what you are encoding is not divisible by 3 then what you
have is encoded and the base64 string padded so it is divisible
by four using <code>=</code>.
This means you may get up to two <code>=</code> characters at the end of
a base64 string.
If we are going to pad the input as the RFC suggests so that we
never use <code>=</code>, we may have to pad up to two bytes of input with
zeros.
That would form a valid UTF-16 NULL!</p>

<p>So how do we handle this padding?</p>

<p>I looked inside three UTF-7 encoders and found they don&apos;t follow
the RFC at all on this.
Instead, they encode the UTF-16 to modified base64 without any
zero bit padding, and then remove any base64 <code>=</code> padding from
the result.</p>

<p>This works, and it produces shorter results than the RFC process,
with no ambiguous NULL.
But it sure would be nice if someone had documented it.</p>

<p>To explain with an example, the initial base64 output for the string
<code>世界</code> is <code>ThZ1TA==</code>.
We removed the trailing <code>==</code> to produce UTF-7.</p>

<h2>Modified UTF-7</h2>

<p>UTF-7 is no more. It has long since
been replaced in SMTP, and in MIME headers, where many encodings
can be used, people choose other things.
However a modified version is still used in IMAP.
<a href="https://tools.ietf.org/html/rfc3501#section-5.1.3">RFC 3501</a>
describes it:</p>

<ul>
<li><p>Modified base64 is modified further, now in the encoded alphabet
<code>/</code> is replaced by <code>,</code>.
This is neither the standard nor URL base64 encoding scheme you
have seen before.</p></li>

<li><p>The ASCII characters <code>'\'</code> and <code>~</code> no longer need to be encoded.
In fact, they MUST not be encoded.</p></li>

<li><p>The escape character is now <code>&amp;</code> instead of <code>+</code>.</p></li>
</ul>

<p>So now we have modified-modified-base64 and our example above reads:</p>

<pre><code>&quot;Hello, &amp;ThZ1TA-&quot;      (Modified UTF-7)
</code></pre>

<h2>A simpler future</h2>

<p>IMAP is a living protocol with many RFCs adding extensions.
One of those is <a href="https://tools.ietf.org/html/rfc6855">RFC 6855</a>
which lets a server and client negotiate <code>UTF8=ACCEPT</code> capability
and drop all the UTF-7.</p>

<p>It even includes a negotiation mode for the future where servers
can announce <code>UTF8=ONLY</code> and refuse to talk any UTF-7 with clients.
Hopefully we can get there.</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>One process programming notes (with Go and SQLite)</title>
	<link href="https://crawshaw.io/blog/one-process-programming-notes" />
	<id>https://crawshaw.io/blog/one-process-programming-notes</id>
	<updated>2018-07-30T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>One process programming notes (with Go and SQLite)</h1>

<p><em>2018 July 30</em></p>

<p><em>Blog-ified version of a talk I gave at <a href="https://gonorthwest.io">Go Northwest</a>.</em></p>

<p>This content covers my recent exploration of writing
internet services, iOS apps, and macOS programs as an
indie developer.</p>

<p>There are several topics here that should each have their own blog
post. But as I have a lot of programming to do I am going to put
these notes up as is and split the material out some time later.</p>

<p>My focus has been on how to adapt the lessons I have learned working in
teams at Google to a single programmer building small business work.
There are many great engineering practices in Silicon Valley&apos;s big
companies and well-capitalized VC firms, but one person does not have
enough bandwidth to use them all and write software.
The exercise for me is: what to keep and what must go.</p>

<p>If I have been doing it right, the technology and techniques described
here will sound easy. I have to fit it all in my head while having enough
capacity left over to write software people want.
Every extra thing has great cost,
especially rarely touched software that comes back to bite
in the middle of the night six months later.</p>

<p>Two key technologies I have decided to use are Go and SQLite.</p>

<h2>A brief introduction to SQLite</h2>

<p>SQLite is an implementation of SQL.
Unlike traditional database implementations like PostgreSQL or MySQL,
SQLite is a self-contained C library designed to be embedded into programs.
It has been built by D. Richard Hipp since its release in 2000,
and in the past 18 years other open source contributors have helped.
At this point it has been around most of the time I have been programming
and is a core part of my programming toolbox.</p>

<h3>Hands-on with the SQLite command line tool</h3>

<p>Rather than talk through SQLite in the abstract, let me show it to you.</p>

<p>A kind person on Kaggle has
<a href="https://www.kaggle.com/kingburrito666/shakespeare-plays">provided a CSV file</a>
of the plays of Shakespeare.
Let&apos;s build an SQLite database out of it.</p>

<pre><code>$ head shakespeare_data.csv
&quot;Dataline&quot;,&quot;Play&quot;,&quot;PlayerLinenumber&quot;,&quot;ActSceneLine&quot;,&quot;Player&quot;,&quot;PlayerLine&quot;
&quot;1&quot;,&quot;Henry IV&quot;,,,,&quot;ACT I&quot;
&quot;2&quot;,&quot;Henry IV&quot;,,,,&quot;SCENE I. London. The palace.&quot;
&quot;3&quot;,&quot;Henry IV&quot;,,,,&quot;Enter KING HENRY, LORD JOHN OF LANCASTER, the EARL of WESTMORELAND, SIR WALTER BLUNT, and others&quot;
&quot;4&quot;,&quot;Henry IV&quot;,&quot;1&quot;,&quot;1.1.1&quot;,&quot;KING HENRY IV&quot;,&quot;So shaken as we are, so wan with care,&quot;
&quot;5&quot;,&quot;Henry IV&quot;,&quot;1&quot;,&quot;1.1.2&quot;,&quot;KING HENRY IV&quot;,&quot;Find we a time for frighted peace to pant,&quot;
&quot;6&quot;,&quot;Henry IV&quot;,&quot;1&quot;,&quot;1.1.3&quot;,&quot;KING HENRY IV&quot;,&quot;And breathe short-winded accents of new broils&quot;
&quot;7&quot;,&quot;Henry IV&quot;,&quot;1&quot;,&quot;1.1.4&quot;,&quot;KING HENRY IV&quot;,&quot;To be commenced in strands afar remote.&quot;
&quot;8&quot;,&quot;Henry IV&quot;,&quot;1&quot;,&quot;1.1.5&quot;,&quot;KING HENRY IV&quot;,&quot;No more the thirsty entrance of this soil&quot;
&quot;9&quot;,&quot;Henry IV&quot;,&quot;1&quot;,&quot;1.1.6&quot;,&quot;KING HENRY IV&quot;,&quot;Shall daub her lips with her own children's blood,&quot;
</code></pre>

<p>First, let&apos;s use the sqlite command line tool to create a new
database and import the CSV.</p>

<pre><code>$ sqlite3 shakespeare.db
sqlite&gt; .mode csv
sqlite&gt; .import shakespeare_data.csv import
</code></pre>

<p>Done! A couple of SELECTs will let us quickly see if it worked.</p>

<pre><code>sqlite&gt; SELECT count(*) FROM import;
111396
sqlite&gt; SELECT * FROM import LIMIT 10;
1,&quot;Henry IV&quot;,&quot;&quot;,&quot;&quot;,&quot;&quot;,&quot;ACT I&quot;
2,&quot;Henry IV&quot;,&quot;&quot;,&quot;&quot;,&quot;&quot;,&quot;SCENE I. London. The palace.&quot;
3,&quot;Henry IV&quot;,&quot;&quot;,&quot;&quot;,&quot;&quot;,&quot;Enter KING HENRY, LORD JOHN OF LANCASTER, the EARL of WESTMORELAND, SIR WALTER BLUNT, and others&quot;
4,&quot;Henry IV&quot;,1,1.1.1,&quot;KING HENRY IV&quot;,&quot;So shaken as we are, so wan with care,&quot;
5,&quot;Henry IV&quot;,1,1.1.2,&quot;KING HENRY IV&quot;,&quot;Find we a time for frighted peace to pant,&quot;
6,&quot;Henry IV&quot;,1,1.1.3,&quot;KING HENRY IV&quot;,&quot;And breathe short-winded accents of new broils&quot;
7,&quot;Henry IV&quot;,1,1.1.4,&quot;KING HENRY IV&quot;,&quot;To be commenced in strands afar remote.&quot;
8,&quot;Henry IV&quot;,1,1.1.5,&quot;KING HENRY IV&quot;,&quot;No more the thirsty entrance of this soil&quot;
9,&quot;Henry IV&quot;,1,1.1.6,&quot;KING HENRY IV&quot;,&quot;Shall daub her lips with her own children's blood,&quot;
</code></pre>

<p>Looks good!
Now we can do a little cleanup.
The original CSV contains a column called ActSceneLine that uses dots
to encode Act number, Scene number, and Line number.
Those would look much nicer as their own columns.</p>

<pre><code>sqlite&gt; CREATE TABLE plays (rowid INTEGER PRIMARY KEY, play, linenumber, act, scene, line, player, text);
sqlite&gt; .schema
CREATE TABLE import (rowid primary key, play, playerlinenumber, actsceneline, player, playerline);
CREATE TABLE plays (rowid INTEGER PRIMARY KEY, play, linenumber, act, scene, line, player, text);
sqlite&gt; INSERT INTO plays SELECT
	rowid,
	play,
	playerlinenumber AS linenumber,
	substr(actsceneline, 1, 1) AS act,
	substr(actsceneline, 3, 1) AS scene,
	substr(actsceneline, 5, 5) AS line,
	player,
	playerline AS text
	FROM import;
</code></pre>

<p>(The <code>substr</code> above can be improved by using <code>instr</code> to find the &apos;.&apos; characters.
Exercise left for the reader.)</p>

<p>Here we used the <code>INSERT ... SELECT</code> syntax to build a table out of another table.
The <code>ActSceneLine</code> column was split apart using the builtin SQLite function <code>substr</code>,
which slices strings.</p>

<p>The result:</p>

<pre><code>sqlite&gt; SELECT * FROM plays LIMIT 10;
1,&quot;Henry IV&quot;,&quot;&quot;,&quot;&quot;,&quot;&quot;,&quot;&quot;,&quot;&quot;,&quot;ACT I&quot;
2,&quot;Henry IV&quot;,&quot;&quot;,&quot;&quot;,&quot;&quot;,&quot;&quot;,&quot;&quot;,&quot;SCENE I. London. The palace.&quot;
3,&quot;Henry IV&quot;,&quot;&quot;,&quot;&quot;,&quot;&quot;,&quot;&quot;,&quot;&quot;,&quot;Enter KING HENRY, LORD JOHN OF LANCASTER, the EARL of WESTMORELAND, SIR WALTER BLUNT, and others&quot;
4,&quot;Henry IV&quot;,1,1,1,1,&quot;KING HENRY IV&quot;,&quot;So shaken as we are, so wan with care,&quot;
5,&quot;Henry IV&quot;,1,1,1,2,&quot;KING HENRY IV&quot;,&quot;Find we a time for frighted peace to pant,&quot;
6,&quot;Henry IV&quot;,1,1,1,3,&quot;KING HENRY IV&quot;,&quot;And breathe short-winded accents of new broils&quot;
7,&quot;Henry IV&quot;,1,1,1,4,&quot;KING HENRY IV&quot;,&quot;To be commenced in strands afar remote.&quot;
8,&quot;Henry IV&quot;,1,1,1,5,&quot;KING HENRY IV&quot;,&quot;No more the thirsty entrance of this soil&quot;
9,&quot;Henry IV&quot;,1,1,1,6,&quot;KING HENRY IV&quot;,&quot;Shall daub her lips with her own children's blood,&quot;
</code></pre>

<p>Now we have our data, let us search for something:</p>

<pre><code>sqlite&gt; SELECT * FROM plays WHERE text LIKE &quot;whether tis nobler%&quot;;
sqlite&gt;
</code></pre>

<p>That did not work.
Hamlet definitely says that, but perhaps the text formatting is slightly off.
SQLite to the rescue.
It ships with a Full Text Search extension compiled in.
Let us index all of Shakespeare with FTS5:</p>

<pre><code>sqlite&gt; CREATE VIRTUAL TABLE playsearch USING fts5(playsrowid, text);
sqlite&gt; INSERT INTO playsearch SELECT rowid, text FROM plays;
</code></pre>

<p>Now we can search for our soliloquy:</p>

<pre><code>sqlite&gt; SELECT rowid, text FROM playsearch WHERE text MATCH &quot;whether tis nobler&quot;;
34232|Whether 'tis nobler in the mind to suffer
</code></pre>

<p>Success! The act and scene can be acquired by joining with our original table.</p>

<pre><code>sqlite&gt; SELECT play, act, scene, line, player, plays.text
	FROM playsearch
	INNER JOIN plays ON playsearch.playsrowid = plays.rowid
	WHERE playsearch.text MATCH &quot;whether tis nobler&quot;;
Hamlet|3|1|65|HAMLET|Whether 'tis nobler in the mind to suffer
</code></pre>

<p>Let&apos;s clean up.</p>

<pre><code>sqlite&gt; DROP TABLE import;
sqlite&gt; VACUUM;
</code></pre>

<p>Finally, what does all of this look like on the file system?</p>

<pre><code>$ ls -l
-rwxr-xr-x@ 1 crawshaw  staff  10188854 Apr 27  2017 shakespeare_data.csv
-rw-r--r--  1 crawshaw  staff  22286336 Jul 25 22:05 shakespeare.db
</code></pre>

<p>There you have it.
The SQLite database contains two full copies of the plays of Shakespeare,
one with a full text search index, and stores both of them in about twice the
space it takes the original CSV file to store one.
Not bad.</p>

<p>That should give you a feel for the i-t-e of SQLite.</p>

<p>And scene.</p>

<h2>Using SQLite from Go</h2>

<h3>The standard <code>database/sql</code></h3>

<p>There are a number of cgo-based <a href="https://golang.org/pkg/database/sql">database/sql</a>
drivers available for SQLite.
The most popular one appears to be
<a href="https://github.com/mattn/go-sqlite3">github.com/mattn/go-sqlite3</a>.
It gets the job done and is probably what you want.</p>

<p>Using the <code>database/sql</code> package it is straightforward to open an SQLite database
and execute SQL statements on it.
For example, we can run the FTS query from earlier using this Go code:</p>

<pre><code>package main

import (
	&quot;database/sql&quot;
	&quot;fmt&quot;
	&quot;log&quot;

	_ &quot;github.com/mattn/go-sqlite3&quot;
)

func main() {
	db, err := sql.Open(&quot;sqlite3&quot;, &quot;shakespeare.db&quot;)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	stmt, err := db.Prepare(`
		SELECT play, act, scene, plays.text
		FROM playsearch
		INNER JOIN plays ON playsearch.playsrowid = plays.rowid
		WHERE playsearch.text MATCH ?;`)
	if err != nil {
		log.Fatal(err)
	}
	defer stmt.Close()
	var play, text string
	var act, scene int
	err = stmt.QueryRow(&quot;whether tis nobler&quot;).Scan(&amp;play, &amp;act, &amp;scene, &amp;text)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf(&quot;%s %d:%d: %q\n&quot;, play, act, scene, text)
}
</code></pre>

<p>Executing it yields:</p>

<pre><code>Hamlet 3:1: &quot;Whether 'tis nobler in the mind to suffer&quot;
</code></pre>

<h3>A low-level wrapper: <code>crawshaw.io/sqlite</code></h3>

<p>Just as SQLite steps beyond the basics of <code>SELECT, INSERT, UPDATE, DELETE</code> with
full-text search, it has several other interesting features and extensions that
cannot be accessed by SQL statements alone.
These need specialized interfaces, and many of the interfaces are not supported
by any of the existing drivers.</p>

<p>So I wrote my own.
You can get it from <a href="https://crawshaw.io/sqlite">crawshaw.io/sqlite</a>.
In particular, it supports the streaming blob interface,
the <a href="https://www.sqlite.org/sessionintro.html">session extension</a>,
and implements the necessary <code>sqlite3_unlock_notify</code> machinery to make
good use of the <a href="https://www.sqlite.org/sharedcache.html">shared cache</a>
for connection pools.
I am going to cover these features through two use case studies: the
client and the cloud.</p>

<h3>cgo</h3>

<p>All of these approaches rely on cgo for integrating C into Go.
This is straightforward to do, but adds some operational complexity.
Building a Go program using SQLite requires a C compiler for the target.</p>

<p>In practice, this means if you develop on macOS you need to install
a cross-compiler for linux.</p>

<p>Typical concerns about the impact on software quality of adding C code
to Go do not apply to SQLite as it has an extraordinary degree of testing.
The quality of the code is exceptional.</p>

<h2>Go and SQLite for the client</h2>

<p>I am building an <a href="https://www.posticulous.com">iOS app</a>, with almost all
the code written in Go and the UI provided by a web view.
This app has a full copy of the user data; it is not a thin view onto an
internet server. This means storing a large amount of local, structured
data, on-device full text searching, background tasks working on the
database in a way that does not disrupt the UI, and syncing DB changes
to a backup in the cloud.</p>

<p>That is a lot of moving parts for a client.
More than I want to write in JavaScript, and more than I want to write
in Swift and then have to promptly rewrite if I ever manage to build an
Android app.
More importantly, the server is in Go, and I am one independent developer.
It is absolutely vital I reduce the number of moving pieces in my
development environment to the smallest possible number.
Hence the effort to build (the big bits) of a client using the exact
same technology as my server.</p>

<h3>The Session extension</h3>

<p>The session extension lets you start a session on an SQLite connection.
All changes made to the database through that connection are bundled into
a changeset blob.
The extension also provides a method for applying the generated changeset
to a database.</p>

<pre><code>func (conn *Conn) CreateSession(db string) (*Session, error)

func (s *Session) Changeset(w io.Writer) error

func (conn *Conn) ChangesetApply(
	r          io.Reader,
	filterFn   func(tableName string) bool,
	conflictFn func(ConflictType, ChangesetIter) ConflictAction,
) error
</code></pre>

<p>This can be used to build a very simple client-sync system.
Collect the changes made in a client, periodically bundle them
up into a changeset and upload it to the server where it is applied
to a backup copy of the database.
If another client changes the database then the server advertises
it to the client, who downloads a changeset and applies it.</p>

<p>This requires a bit of care in the database design.
The reason I kept the FTS table separate in the Shakespeare
example is I keep my FTS tables in a separate attached database
(which in SQLite, means a different file).
The cloud backup database never generates the FTS tables;
the client is free to generate the tables in a background thread
and they can lag behind data backups.</p>

<p>Another point of care is minimizing conflicts.
The biggest one is AUTOINCREMENT keys.
By default the primary key of a <code>rowid</code> table is incremented,
which means if you have multiple clients generating rowids you
will see lots of conflicts.</p>

<p>I have been trialing two different solutions.
The first is having each client register a rowid range with
the server and only allocate from its own range.
It works.
The second is randomly generating int64 values, and relying
on the low collision rate.
So far it works too.
Both strategies have risks, and I haven&apos;t decided which is better.</p>
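
<p>For a feel of what the second strategy relies on, here is a small Go sketch.
Both function names are hypothetical, and the collision figure is the standard birthday
approximation over 2^63 possible keys, not a measurement from my service.
Under that approximation a million random keys collide with a probability of roughly
5 in 100 million, while a billion keys get to roughly 5 percent.</p>

<pre><code>package main

import (
	&quot;crypto/rand&quot;
	&quot;encoding/binary&quot;
	&quot;fmt&quot;
	&quot;math&quot;
)

// randomRowID returns a random key in [1, 2^63-1].
// This sketch keeps keys positive and nonzero purely for readability.
func randomRowID() (int64, error) {
	var b [8]byte
	if _, err := rand.Read(b[:]); err != nil {
		return 0, err
	}
	id := int64(binary.BigEndian.Uint64(b[:]) &amp;^ (1 &lt;&lt; 63)) // clear the sign bit
	if id == 0 {
		id = 1
	}
	return id, nil
}

// collisionProbability estimates the chance that n random keys contain
// at least one duplicate, using 1 - exp(-n(n-1) / (2 * 2^63)).
func collisionProbability(n float64) float64 {
	return -math.Expm1(-n * (n - 1) / (2 * math.Exp2(63)))
}

func main() {
	id, err := randomRowID()
	if err != nil {
		panic(err)
	}
	fmt.Println(&quot;rowid:&quot;, id)
	fmt.Println(&quot;1e6 rows:&quot;, collisionProbability(1e6))
	fmt.Println(&quot;1e9 rows:&quot;, collisionProbability(1e9))
}
</code></pre>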

<p>In practice, I have found I have to limit DB updates to a single
connection to keep changeset quality high.
(A changeset does not see changes made on other connections.)
To do this I maintain a read-only pool of connections and a single
guarded read-write connection in a pool of 1.
The code only grabs the read-write connection when it needs it,
and the read-only connections are enforced by the read-only
bit on the SQLite connection.</p>
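
<p>The shape of that arrangement can be sketched with buffered channels.
This is an illustration only: <code>Conn</code>, <code>Pool</code>, and <code>DB</code> are
hypothetical stand-ins, and a real version would open SQLite connections with the
read-only flag set instead of flipping a struct field.</p>

<pre><code>package main

import &quot;fmt&quot;

// Conn is a hypothetical stand-in for a real SQLite connection.
type Conn struct {
	Name     string
	ReadOnly bool
}

// Pool hands out connections. Get blocks until one is free.
type Pool struct{ free chan *Conn }

func NewPool(conns ...*Conn) *Pool {
	p := &amp;Pool{free: make(chan *Conn, len(conns))}
	for _, c := range conns {
		p.free &lt;- c
	}
	return p
}

func (p *Pool) Get() *Conn  { return &lt;-p.free }
func (p *Pool) Put(c *Conn) { p.free &lt;- c }

// DB pairs many read-only connections with a single writer.
type DB struct {
	Readers *Pool
	Writer  *Pool // a pool of 1: holding the conn is holding the write lock
}

func NewDB(readers int) *DB {
	rs := make([]*Conn, readers)
	for i := range rs {
		rs[i] = &amp;Conn{Name: fmt.Sprintf(&quot;ro%d&quot;, i), ReadOnly: true}
	}
	return &amp;DB{
		Readers: NewPool(rs...),
		Writer:  NewPool(&amp;Conn{Name: &quot;rw&quot;}),
	}
}

func main() {
	db := NewDB(4)

	r := db.Readers.Get()
	fmt.Println(r.Name, r.ReadOnly)
	db.Readers.Put(r)

	w := db.Writer.Get() // a second Get would block until Put
	fmt.Println(w.Name, w.ReadOnly)
	db.Writer.Put(w)
}
</code></pre>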

<h3>Nested Transactions</h3>

<p>The <code>database/sql</code> driver encourages the use of SQL transactions with
its <code>Tx</code> type, but this does not appear to play well with nested
transactions.
This is a concept implemented by <code>SAVEPOINT / RELEASE</code> in SQL, and
it makes for surprisingly composable code.</p>

<p>If a function needs to make multiple statements in a transaction,
it can open with a <code>SAVEPOINT</code>, then defer a call that issues <code>RELEASE</code>
if the function returns no Go error, or otherwise issues
<code>ROLLBACK</code> and returns the error.</p>

<pre><code>func f(conn *sqlite.Conn) (err error) {
	conn...SAVEPOINT
	defer func() {
		if err == nil {
			conn...RELEASE
		} else {
			conn...ROLLBACK
		}
	}()
}
</code></pre>

<p>Now if this transactional function <code>f</code> needs to call another transactional
function <code>g</code>, then <code>g</code> can use exactly the same strategy and <code>f</code> can call
it in a very traditional Go way:</p>

<pre><code>if err := g(conn); err != nil {
	return err // all changes in f will be rolled back by the defer
}
</code></pre>

<p>The function <code>g</code> is also perfectly safe to use in its own right,
as it has its own transaction.</p>

<p>I have been using this <code>SAVEPOINT + defer RELEASE or return an error</code>
semantics for several months now and find it invaluable.
It makes it easy to safely wrap code in SQL transactions.</p>

<p>The example above however is a bit bulky, and there are some edge cases
that need to be handled.
(For example, if the RELEASE fails, then an error needs to be returned.)
So I have wrapped this up in a utility:</p>

<pre><code>func f(conn *sqlite.Conn) (err error) {
	defer sqlitex.Save(conn)(&amp;err)

	// Code is transactional and can be stacked
	// with other functions that call sqlitex.Save.
}
</code></pre>

<p>The first time you see <code>sqlitex.Save</code> in action it can be a little
off-putting, at least it was for me when I first created it.
But I quickly got used to it, and it does a lot of heavy lifting.
The first call to <code>sqlitex.Save</code> opens a <code>SAVEPOINT</code> on the <code>conn</code>
and returns a closure that either <code>RELEASE</code>s or <code>ROLLBACK</code>s depending
on the value of err, and sets err if necessary.</p>
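
<p>To make the mechanism concrete, here is a minimal sketch of how a helper like this could be built.
It is not the actual <code>sqlitex.Save</code> source: the <code>Execer</code> interface, the
<code>logConn</code> fake, and the explicit savepoint name are assumptions for illustration,
and the sketch does not handle panics.
The fake connection only records statements, so the output shows the order in which the nested
savepoints are opened and unwound when <code>g</code> fails.</p>

<pre><code>package main

import (
	&quot;errors&quot;
	&quot;fmt&quot;
	&quot;strings&quot;
)

// Execer is a hypothetical stand-in for a database connection.
type Execer interface {
	Exec(sql string) error
}

// Save opens a savepoint immediately and returns a function to defer.
// The name must be a fixed identifier, never user input.
func Save(conn Execer, name string) func(errp *error) {
	if err := conn.Exec(&quot;SAVEPOINT &quot; + name + &quot;;&quot;); err != nil {
		// No savepoint was opened. Report that unless the body
		// (which still runs in this sketch) produced its own error.
		return func(errp *error) {
			if *errp == nil {
				*errp = err
			}
		}
	}
	return func(errp *error) {
		if *errp == nil {
			// Success: keep the changes. A failed RELEASE becomes the error.
			*errp = conn.Exec(&quot;RELEASE &quot; + name + &quot;;&quot;)
			return
		}
		// Failure: undo the changes, then pop the savepoint.
		// The original error is kept; errors here are ignored in this sketch.
		conn.Exec(&quot;ROLLBACK TO &quot; + name + &quot;;&quot;)
		conn.Exec(&quot;RELEASE &quot; + name + &quot;;&quot;)
	}
}

// logConn records statements instead of running them.
type logConn struct{ stmts []string }

func (c *logConn) Exec(sql string) error {
	c.stmts = append(c.stmts, sql)
	return nil
}

func g(conn Execer) (err error) {
	defer Save(conn, &quot;g&quot;)(&amp;err)
	return errors.New(&quot;g failed&quot;)
}

func f(conn Execer) (err error) {
	defer Save(conn, &quot;f&quot;)(&amp;err)
	if err := conn.Exec(&quot;INSERT INTO t VALUES (1);&quot;); err != nil {
		return err
	}
	return g(conn) // g&apos;s error rolls back f&apos;s work too
}

func main() {
	conn := &amp;logConn{}
	fmt.Println(&quot;error:&quot;, f(conn))
	fmt.Println(strings.Join(conn.stmts, &quot;\n&quot;))
}
</code></pre>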

<h2>Go and SQLite in the cloud</h2>

<p>I have spent several months now redesigning services I have
encountered before and designing services for problems I would like to
work on going forward.
The process has led me to a general design that works for many problems
and I quite enjoy building.</p>

<p>It can be summarized as 1 VM, 1 Zone, <strong>1 process programming</strong>.</p>

<p>If this sounds ridiculously simplistic to you, I think that&apos;s good!
It is simple.
It does not meet all sorts of requirements that we would like our
modern fancy cloud services to meet.
It is not &quot;serverless&quot;, which means when a service is extremely
small it does not run for free, and when a service grows it does not
automatically scale.
Indeed, there is an explicit scaling limit.
Right now the best server you can get from Amazon is roughly:</p>

<ul>
<li>128 CPU threads at ~4GHz</li>
<li>4TB RAM</li>
<li>25 Gbit ethernet</li>
<li>10 Gbps NAS</li>
<li>hours of yearly downtime</li>
</ul>

<p>That is a huge potential downside of one process programming.
However, I claim that is a livable limit.</p>

<p><em>I claim typical services do not hit this scaling limit.</em></p>

<p>If you are building a small business, most products can grow and
become profitable well under this limit for years.
When you see the limit approaching in the next year or two,
you have a business with revenue to hire more than one engineer,
and the new team can, in the face of radically changing business
requirements, rewrite the service.</p>

<p>Reaching this limit is a good problem to have because when it
comes you will have plenty of time to deal with it and the human
resources you need to solve it well.</p>

<p>Early in the life of a small business you don&apos;t, and every hour
you spend trying to work beyond this scaling limit is an hour
that would have been better spent talking to your customers
about their needs.</p>

<p>The principle at work here is:</p>

<p><strong>Don&apos;t use N computers when 1 will do.</strong></p>

<p>To go into a bit more technical detail:</p>

<p>I run a single VM on AWS, in a single availability zone.
The VM has three EBS volumes (this is Amazon&apos;s name for NAS).
The first holds the OS, logs, temporary files,
and any ephemeral SQLite databases that are generated from
the main databases, e.g. FTS tables.
The second holds the primary SQLite database for the main service.
The third holds the customer sync SQLite databases.</p>

<p>The system is configured to periodically snapshot the system EBS
volume and the customer EBS volumes to S3, the Amazon geo-redundant
blob store. This is a relatively cheap operation that can be scripted,
because only blocks that change are copied.</p>

<p>The main EBS volume is backed up to S3 very regularly, by custom
code that flushes the WAL cache. I&apos;ll explain that in a bit.</p>

<p>The service is a single Go binary running on this VM.
The machine has plenty of extra RAM that is used by linux&apos;s disk cache.
(And that can be used by a second copy of the service spinning up
for low down-time replacement.)</p>

<p>The result of this is a service that has at most tens of hours of
downtime a year, about as much chance of suffering block loss as
a physical computer with a RAID5 array, and active offsite backups
being made every few minutes to a distributed system that is built
and maintained by a large team.</p>

<p>This system is astonishingly simple.
I shell into one machine.
It is a linux machine.
I have a deploy script for the service that is ten lines long.
Almost all of my performance work is done with pprof.</p>

<p>On a medium sized VM I can clock 5-6 thousand concurrent requests
with only a few hours of performance tuning.
On the largest machine AWS has, tens of thousands.</p>

<p>Now to talk a little more about the particulars of the stack:</p>

<h3>Shared cache and WAL</h3>

<p>To make the server extremely concurrent there are two important
SQLite features I use.
The first is the shared cache, which lets me allocate one large
pool of memory to the database page cache and many concurrent
connections can use it simultaneously.
This requires some support in the driver for <code>sqlite3_unlock_notify</code>
so user code doesn&apos;t need to deal with locking events, but that
is transparent to end user code.</p>

<p>The second is the Write Ahead Log.
This is a mode SQLite can be knocked into at the beginning of
a connection, which changes the way it writes transactions to disk.
Instead of locking the database and making modifications along
with a rollback journal, it appends the new change to a separate
file. This allows readers to work concurrently with the writer.
The WAL has to be flushed periodically by SQLite, which involves
locking the database and writing the changes from it.
There are default settings for doing this.</p>

<p>I override these and execute WAL flushes manually from a package
that, when it is done, also triggers an S3 snapshot.
This package is called <code>reallyfsync</code>, and if I can work out how
to test it properly I will make it open source.</p>

<h3>Incremental Blob API</h3>

<p>Another smaller feature, but one important to my particular server,
is SQLite&apos;s
<a href="https://www.sqlite.org/c3ref/blob_open.html">incremental blob API</a>.
This allows a field of bytes to be read and written in the DB
without storing all the bytes in memory simultaneously, which
matters when it is possible for each request to be working with
hundreds of megabytes, but you want tens of thousands of
potential concurrent requests.</p>

<p>This is one of the places where the driver deviates from being
a close-to-cgo wrapper to be more
<a href="https://godoc.org/crawshaw.io/sqlite#Blob">Go-like</a>:</p>

<pre><code>type Blob
    func (blob *Blob) Close() error
    func (blob *Blob) Read(p []byte) (n int, err error)
    func (blob *Blob) ReadAt(p []byte, off int64) (n int, err error)
    func (blob *Blob) Seek(offset int64, whence int) (int64, error)
    func (blob *Blob) Size() int64
    func (blob *Blob) Write(p []byte) (n int, err error)
    func (blob *Blob) WriteAt(p []byte, off int64) (n int, err error)
</code></pre>

<p>This looks a lot like a file, and indeed can be used like a file,
with one caveat: the size of a blob is set when it is created.
(As such, I still find temporary files to be useful.)</p>

<h2>Designing with one process programming</h2>

<p>I start with: <strong>Do you really need N computers?</strong></p>

<p>Some problems really do.
For example, you cannot build a low-latency index of
the public internet with only 4TB of RAM.
You need a lot more.
These problems are great fun,
and we like to talk a lot about them,
but they are a relatively small amount of all the code written.
So far all the projects I have been developing post-Google
fit on 1 computer.</p>

<p>There are also more common sub-problems that are hard to solve
with one computer.
If you have a global customer base and need low-latency to
your server, the speed of light gets in the way.
But many of these problems can be solved with relatively
straightforward CDN products.</p>

<p>Another great solution to the speed of light is geo-sharding.
Have complete and independent copies of your service in multiple
datacenters, move your user&apos;s data to the service near them.
This can be as easy as having one small global redirect database
(maybe SQLite on geo-redundant NFS!) redirecting the user to a
specific DNS name like {us-east, us-west}.mservice.com.</p>
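
<p>A minimal sketch of such a redirector in Go, under loud assumptions:
the <code>userRegion</code> map stands in for the small global redirect database,
the <code>user</code> query parameter stands in for real authentication,
and the handler name is invented for this example.</p>

<pre><code>package main

import (
	&quot;fmt&quot;
	&quot;net/http&quot;
	&quot;net/http/httptest&quot;
)

// userRegion is a hypothetical stand-in for the global redirect database.
var userRegion = map[string]string{
	&quot;alice&quot;: &quot;us-east&quot;,
	&quot;bob&quot;:   &quot;us-west&quot;,
}

// redirectHandler sends a user to the copy of the service holding their data.
func redirectHandler(w http.ResponseWriter, r *http.Request) {
	region, ok := userRegion[r.URL.Query().Get(&quot;user&quot;)]
	if !ok {
		http.Error(w, &quot;unknown user&quot;, http.StatusNotFound)
		return
	}
	target := &quot;https://&quot; + region + &quot;.mservice.com&quot; + r.URL.Path
	http.Redirect(w, r, target, http.StatusTemporaryRedirect)
}

func main() {
	// Exercise the handler in-process instead of opening a socket.
	req := httptest.NewRequest(&quot;GET&quot;, &quot;/inbox?user=alice&quot;, nil)
	rec := httptest.NewRecorder()
	redirectHandler(rec, req)
	fmt.Println(rec.Code, rec.Header().Get(&quot;Location&quot;))
}
</code></pre>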

<p>Most problems do fit in one computer, up to a point.
Spend some time determining where that point is.
If it is years away there is a good chance one computer will do.</p>

<h2>Indie dev techniques for the corporate programmer</h2>

<p>Even if you do not write code in this particular technology stack and
you are not an independent developer, there is value here.
Use the <em>one big VM, one zone, one process Go, SQLite, and snapshot backup</em> stack
as a hypothetical tool to test your designs.</p>

<p>So add a hypothetical step to your design process:
If you solved your problem on this stack with one computer, how
far could you get?
How many customers could you support?
At what size would you need to rewrite your software?</p>

<p>If this indie mini stack would last your business years,
you might want to consider delaying the adoption of modern
cloud software.</p>

<p>If you are a programmer at a well-capitalized company, you may also
want to consider what development looks like for small internal or
experimental projects.
Do your coworkers have to use large complex distributed systems
for policy reasons?
Many of these projects will never need to scale beyond one computer,
or if they do they will need a rewrite to deal with shifting
requirements.
In which case, find a way to make an indie stack, linux VMs
with a file system, available for prototyping and experimentation.</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Reasoning with Regret</title>
	<link href="https://crawshaw.io/blog/reasoning-with-regret" />
	<id>https://crawshaw.io/blog/reasoning-with-regret</id>
	<updated>2018-07-16T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>Reasoning with Regret</h1>

<p><em>2018-07-16</em></p>

<p>I avoid looking to biographies for advice, most of all for anyone in the
business world who is focused on spinning an origin story.
There is one however which I read by accident years ago that I keep
returning to:</p>

<blockquote>
<p>I was working at a financial firm in New York City with a bunch of very smart people, and I had a brilliant boss that I much admired. I went to my boss and told him I wanted to start a company selling books on the Internet. He took me on a long walk in Central Park, listened carefully to me, and finally said, “That sounds like a really good idea, but it would be an even better idea for someone who didn’t already have a good job.” That logic made some sense to me, and he convinced me to think about it for 48 hours before making a final decision. Seen in that light, it really was a difficult choice, but ultimately, I decided I had to give it a shot. I didn’t think I’d regret trying and failing. And I suspected I would always be haunted by a decision to not try at all. After much consideration, I took the less safe path to follow my passion, and I’m proud of that choice.</p>
</blockquote>

<p>…</p>

<blockquote>
<p>I will hazard a prediction. When you are 80 years old, and in a quiet moment of reflection narrating for only yourself the most personal version of your life story, the telling that will be most compact and meaningful will be the series of choices you have made.</p>
</blockquote>

<p>From the <a href="https://www.princeton.edu/news/2010/05/30/2010-baccalaureate-remarks">Princeton 2010 Baccalaureate Remarks</a>.</p>

<p>I find this a powerful tool for reasoning.
Project yourself into far-future you and ask,
what would I do differently?</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Searching the Creative Internet</title>
	<link href="https://crawshaw.io/blog/searching-the-creative-internet" />
	<id>https://crawshaw.io/blog/searching-the-creative-internet</id>
	<updated>2018-05-18T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>Searching the Creative Internet</h1>

<p>I had of late been lamenting the loss of the internet of the 1990s.
A place where everything was obscure or new.
If you saw something mainstream like a Disney princess, it was because
someone had taken the time to handcraft an ASCII art portrait.
Today if you type <code>[disney]</code> into the
<a href="https://google.com/search?q=disney">ubiquitous search engine</a>,
every link on the page is either crafted by the Disney corporation
or is the product of a major news outlet.</p>

<p>This was inevitable.</p>

<p>The internet is no longer a place where a relatively-small fraction of
the human population go to find something different than everyday life.
It <em>is</em> everyday life for billions of people.
As long as life is megacorps and information gatekeepers, so too is the
global ubiquitous internet.
Mission accomplished.</p>

<p>Usually when this kind of nostalgia or cynicism preoccupies me I
quickly realize how bone-headed I am being and figure out how
wrong I am.
It took a little longer than usual but I got there.</p>

<p>The internet of the 90s is still with us, hiding in plain sight.</p>

<h2>Page 7</h2>

<p>Dig a little bit.
If you click through all 14 pages of results Google returns for
<code>[disney]</code>, nothing I could conceive of as interesting appears.
Corporate website this,
chewing-gum news article that.
But if you refine it a little and search for <code>[disney blog]</code>,
then by result Page 7 things start to get interesting.
Half way down page 7 is a link to
<a href="http://www.rejectedprincesses.com">Rejected Princesses</a>,
a site filled with excellent original stories based on
historical figures.
Some Disney executive should buy them.</p>

<p>Now this example may not convince you.
Rejected Princesses is reasonably popular, and better than
almost all the original visual content you would find on
the internet in the 90s.
How could there be 14 pages of disney results without a single
original interesting work?
Is the old internet really there?</p>

<p>It is, you just can&apos;t find it using a search engine designed
for the modern internet.</p>

<h2>Page 2</h2>

<p>Take a far more obscure search term:
<code>[modern nuclear propulsion research]</code>.
We are really in the weeds now and have made our intentions clear.
This is a question about the state of the art in an obscure field of
human study.</p>

<p>The first link is gizmodo.</p>

<p>The second link is a NASA press release.
(Why does NASA even have those?)</p>

<p>The rest of the page is links to popular science, click-bait nonsense,
or one tangentially-related reddit thread made popular by linking
the topic to someone who garners a lot of media attention.
No research or original content creator in sight.</p>

<p>By page 2, things get interesting.
There you will find a link to the excellent
<a href="https://beyondnerva.wordpress.com">Beyond NERVA</a> blog.
This is the wild internet where people
make things and think things.</p>

<h2>How do we make the creative internet easy to navigate?</h2>

<p>What I miss about my &quot;90s internet&quot; isn&apos;t the thing itself,
with its slow data links, tiny JPEGs, buffering RealPlayer,
or the <code>&lt;blink&gt;</code> tag.
It did not have the tiniest fraction of the wonderful
content the internet has today.</p>

<p>What I miss is that I could &quot;go on the internet&quot; and be in
a creative corner of the human experience.
Today if you &quot;go on the internet&quot;, that means you pulled your
phone out of your pocket, dismissed some notification spam,
and started reading click-bait shared by people you have met
on social media.</p>

<p>Today you have to choke your way through the money-making
miasma to find the joy.</p>

<p>I wish the internet of creative people and their works
had a front page and a search engine.
Something that made the blog about the
<a href="http://www.findplanetnine.com/2017/09/planet-nine-where-are-you-part-1.html">search for planet 9</a>
easy to find, and the New Yorker article on it hard to find.
A place where Wikipedia articles came first,
where all the interesting technical stuff you might find
in <a href="http://twitter.com/whitequark">whitequark&apos;s feed</a>
was what you got instead of sidebar ads,
not buried away behind the popular and the profitable.
Where a D&amp;D podcast made by
<a href="http://www.maximumfun.org/shows/adventure-zone">three brothers and their dad in West Virginia</a> was as
easy to find as the podcasts produced by NPR&apos;s $200m/year machine.</p>

<p>There is enough interest in the creative web to pay for its tools.
Wikipedia raises <a href="https://en.wikipedia.org/wiki/Wikimedia_Foundation#Financial_summary">$80m a year</a>
from donations!
(What they spend it on does not seem at all effective to me,
but it&apos;s not my money.
Your software does cost more when you have to spend time
<a href="https://www.mediawiki.org/wiki/Page_Previews/2017-18_A/B_Tests#Pageviews_and_Page_Interactions,_effects_on_fundraising">making sure it doesn&apos;t hurt your fundraising</a>.)</p>

<p>What is clear to me is that it is time for separate tools.
A search engine designed to be used by billions of people
every day to do daily tasks is not one that will be
appropriate for weekend meanderings through obscure topics.
A content-sharing site like Reddit that encourages links
to the New York Times will not generate thoughtful discussion.</p>

<p>What is not clear to me yet is how those tools should work.
How do we build a search engine that penalizes media
outlets and promotes blogs and podcasts?
How do we distinguish a research paper or an
article written by someone about their daily life aboard the ISS
on nasa.gov from the useless press releases there?</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Service Throughput Tradeoffs</title>
	<link href="https://crawshaw.io/blog/service-throughput" />
	<id>https://crawshaw.io/blog/service-throughput</id>
	<updated>2018-04-13T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>Service Throughput Tradeoffs</h1>

<p><em>2018-04-13, David Crawshaw</em></p>

<p>I am currently writing a service that lets users upload files.
The typical file size is about 25KB, the maximum is 100MB,
with file sizes following a power-law distribution.</p>

<p>The nature of the service is that I need to buffer the contents
of the file until it is complete before processing it.
As the files can be 100MB, my very first version used temporary
files, which meant the service could handle a large number of
concurrent requests without significant memory pressure.</p>

<h2>Load testing for throughput</h2>

<p>Then I wrote a simple load test, blasting files as quickly as
I could at the service from a dozen threads.
The throughput after a little tuning was abysmal, on the order
of hundreds of QPS.</p>

<p>A quick look at
<a href="https://blog.golang.org/profiling-go-programs">pprof</a>
showed that the service spent almost all of its time writing and
then reading back temporary files.</p>

<p>A very easy way to make the load test faster is to store the
temporary files in RAM.
At 100MB per file, even a reasonably small slice of a modern server can
serve hundreds of simultaneous connections.
Now the service can handle thousands of QPS, using less CPU per
request.</p>
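
<p>As a rough illustration of the two buffering strategies, here is a
minimal sketch, not the code of the service itself.
The names <code>bufferOnDisk</code>, <code>bufferInRAM</code> and
<code>roundTrip</code> are hypothetical.</p>

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"strings"
)

// bufferOnDisk spools an upload to a temporary file and rewinds it.
// Memory use is a small copy buffer, but every byte is written and read back.
func bufferOnDisk(upload io.Reader) (*os.File, error) {
	f, err := os.CreateTemp("", "upload-")
	if err != nil {
		return nil, err
	}
	if _, err := io.Copy(f, upload); err != nil {
		f.Close()
		os.Remove(f.Name())
		return nil, err
	}
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		f.Close()
		os.Remove(f.Name())
		return nil, err
	}
	return f, nil
}

// bufferInRAM holds the whole upload in memory until it is complete.
func bufferInRAM(upload io.Reader) (*bytes.Buffer, error) {
	buf := new(bytes.Buffer)
	if _, err := io.Copy(buf, upload); err != nil {
		return nil, err
	}
	return buf, nil
}

// roundTrip buffers s both ways and returns what each buffer holds.
func roundTrip(s string) (fromDisk, fromRAM string, err error) {
	f, err := bufferOnDisk(strings.NewReader(s))
	if err != nil {
		return "", "", err
	}
	defer os.Remove(f.Name()) // runs after the Close below
	defer f.Close()
	b, err := io.ReadAll(f)
	if err != nil {
		return "", "", err
	}
	buf, err := bufferInRAM(strings.NewReader(s))
	if err != nil {
		return "", "", err
	}
	return string(b), buf.String(), nil
}

func main() {
	disk, ram, err := roundTrip("file contents")
	fmt.Println(disk == ram, err)
}
```

<p>Both functions hand back the complete contents.
The difference is only where the bytes wait while the upload is in flight.</p>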

<p>So far this is fairly typical of the sorts of tradeoffs in resources
(RAM, CPU) and features (throughput) you see when designing services.
It gets more interesting.</p>

<h2>Is the load test representative of real load?</h2>

<p>Consider: what if the typical user connection is a slow link?
Transferring large files may not take the ~1 second we see in the
load test, but rather 5 minutes of trickling packets.</p>

<p>In this case, the RAM-heavy version of the service hurts.
It has limited the maximum concurrent uploads from tens of thousands
to hundreds, in the name of making those transfers more CPU-time
and wall-time efficient.
With low-bandwidth clients, we will always have plenty of CPU,
and with the RAM-heavy version we have significantly reduced QPS.</p>

<p>For a large number of slow clients, the first version is better.</p>
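
<p>The arithmetic behind this can be sketched in a few lines.
The 32GB memory budget is an assumption picked for illustration,
and the throughput estimate is Little&apos;s law
(throughput is concurrency divided by time per request).</p>

```go
package main

import "fmt"

// maxConcurrent returns how many worst-case uploads fit in a memory budget.
func maxConcurrent(budgetMB, maxFileMB int) int {
	return budgetMB / maxFileMB
}

// steadyQPS applies Little's law: throughput = concurrency / seconds per upload.
func steadyQPS(concurrent int, secondsPerUpload float64) float64 {
	return float64(concurrent) / secondsPerUpload
}

func main() {
	// Assumed: 32GB of RAM set aside for buffering, 100MB maximum file size.
	n := maxConcurrent(32_000, 100)
	fmt.Println("worst-case concurrent uploads:", n)
	fmt.Println("QPS at 1 second per upload:", steadyQPS(n, 1))
	fmt.Println("QPS at 5 minutes per upload:", steadyQPS(n, 300))
}
```

<p>With uploads pinned in RAM, the ceiling on concurrency is fixed by memory,
so slower clients translate directly into lower throughput.</p>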

<h2>The distracting unreality of synthetic load tests</h2>

<p>A load test needs to reflect real loads, so do not write one if you
do not have real traffic to use as a baseline.
Just as with micro-benchmarks, attempting to construct one by reasoning
from a blank sheet of paper can mislead and confuse.</p>

<p>So rather than inventing more elaborate load tests, I am going
to spend my time writing more elaborate logs, with upload timing
information.</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Sharp-Edged Finalizers in Go</title>
	<link href="https://crawshaw.io/blog/sharp-edged-finalizers" />
	<id>https://crawshaw.io/blog/sharp-edged-finalizers</id>
	<updated>2018-04-05T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>Sharp-Edged Finalizers in Go</h1>

<p><em>2018-04-05, David Crawshaw</em></p>

<p><em>For background, see my last post on why in general
<a href="/blog/tragedy-of-finalizers">finalizers do not work</a>.</em></p>

<p>We cannot use an object finalizer for resource management, because
finalizers are called at some unpredictable distant time long
after resources need to be reclaimed.</p>

<p>However, a finalizer does provide us with a bound on when
a managed resource needs to have been released.
If we reach an object finalizer, and a manually managed resource
has not been freed, then there is a bug in the program.</p>

<p>So we can use finalizers to detect resource leaks.</p>

<pre><code>package db

import &quot;runtime&quot;

func Open(path string, flags OpenFlags) (*Conn, error) {
	// ...

	// If conn is garbage collected while this finalizer is still set,
	// Close was never called.
	runtime.SetFinalizer(conn, func(conn *Conn) {
		panic(&quot;open db connection never closed&quot;)
	})
	return conn, nil
}

func (c *Conn) Close() {
	// ...
	runtime.SetFinalizer(c, nil) // clear finalizer
}
</code></pre>

<p>This is a <em>sharp-edged</em> finalizer.
Misuse the resource and it will cut your program short.</p>

<p>I suspect this kind of aggressive finalizer is off-putting to many,
who view resource management as something nice to have.
But there are many programs for which correct resource management
is vital.
Leaking a resource can lead to unexpected crashes, or data loss.
If you are in a similar situation, you may want to consider a
panicking finalizer.</p>

<h2>Debugging</h2>

<p>One big problem with the code above is the error message is rubbish.
You leaked something. OK, great. Got any details?</p>

<p>Ideally the error would point at exactly where we need to release
the resource, but this is a hard problem.
One cheap and easy alternative is to point to where the resource was
originally acquired, which is straightforward:</p>

<pre><code>_, file, line, _ := runtime.Caller(1)
runtime.SetFinalizer(conn, func(conn *Conn) {
	panic(fmt.Sprintf(&quot;%s:%d: db conn not closed&quot;, file, line))
})
</code></pre>

<p>This prints the file name and line number of where the connection was
created, which is often useful in tracing a leaked resource.</p>
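
<p>Putting the pieces together, here is a self-contained sketch that can be run
outside the db package.
The <code>Conn</code> type, the <code>onLeak</code> hook and
<code>waitForLeaks</code> are hypothetical.
The hook counts and prints instead of panicking, only so the program can
run to completion.</p>

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// Conn stands in for a manually managed resource (hypothetical type).
type Conn struct {
	openedAt string // file:line of the Open call site
}

// leaks counts finalizers that found a Conn that was never closed.
var leaks atomic.Int64

// onLeak is what the finalizer does on a leak. The post panics here;
// this sketch counts and prints instead.
var onLeak = func(c *Conn) {
	leaks.Add(1)
	fmt.Println(c.openedAt + ": conn not closed")
}

// Open records its caller and arms the leak-detecting finalizer.
func Open() *Conn {
	_, file, line, _ := runtime.Caller(1)
	c := new(Conn)
	c.openedAt = fmt.Sprintf("%s:%d", file, line)
	runtime.SetFinalizer(c, func(c *Conn) { onLeak(c) })
	return c
}

// Close releases the resource and clears the finalizer.
func (c *Conn) Close() {
	runtime.SetFinalizer(c, nil)
}

// leakOne opens a Conn and drops it without closing it.
func leakOne() { Open() }

// closeOne opens a Conn and closes it properly.
func closeOne() { Open().Close() }

// waitForLeaks forces collections until n leaks are seen or the timeout passes.
func waitForLeaks(n int64, timeoutMS int) bool {
	deadline := time.Now().Add(time.Duration(timeoutMS) * time.Millisecond)
	for time.Now().Before(deadline) {
		runtime.GC()
		if leaks.Load() >= n {
			return true
		}
		time.Sleep(10 * time.Millisecond)
	}
	return leaks.Load() >= n
}

func main() {
	closeOne()
	leakOne()
	fmt.Println("leak detected:", waitForLeaks(1, 2000))
}
```

<p>The explicit <code>runtime.GC</code> calls are there only to make the demo finish
quickly. In a real program the finalizer fires whenever the collector gets
around to it.</p>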

<h2>Why not just log?</h2>

<p>Because I have found myself ignoring logs, time and again.</p>

<p>While working on an sqlite wrapper package, most of the leaked
resources I encountered were in the tests of programs using the
package.
A log line in the middle of <code>go test</code> will never be seen.
A panic will.</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>The Tragedy of Finalizers</title>
	<link href="https://crawshaw.io/blog/tragedy-of-finalizers" />
	<id>https://crawshaw.io/blog/tragedy-of-finalizers</id>
	<updated>2018-04-04T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>The Tragedy of Finalizers</h1>

<p><em>2018-04-04, David Crawshaw</em></p>

<p>Like many garbage collected languages, <a href="https://golang.org">Go</a> lets
you register a
<a href="https://golang.org/pkg/runtime/#SetFinalizer">finalizer</a> on an object.
The finalizer is a function that the language runtime calls when the object
is garbage collected.</p>

<p>Finalizers are deeply unsatisfying. They are almost impossible to use well.</p>

<p>The obvious use of finalizers is to tie resource management to object
lifetime.
This is such an obvious use that you can even find an unfortunate example
of it in the Go standard library. In <a href="https://golang.org/src/os/file_unix.go#L132">package os</a>:</p>

<pre><code>runtime.SetFinalizer(f.file, (*file).close)
</code></pre>

<p>The idea here is an <code>*os.File</code> has a resource from the operating system,
a file descriptor.
At some point those OS resources need to be cleaned up.
Some short-lived small programs can ignore the problem because the OS
will clean up the descriptors when the process exits.
But for long-lived programs, or programs that want to open lots of
descriptors, the program eventually has to call the <code>Close</code> method.</p>

<p>Tracking when precisely a file should be closed is usually
straightforward, but occasionally it involves a great deal of busywork.
So a common instinct we all have is to say &quot;gee, it sure would be nice
if the file descriptor were closed when the <code>File</code> is garbage collected,
let&apos;s use a finalizer&quot;. This does not work.</p>

<p>Here is an example where the finalizer fails:</p>

<pre><code>package main

import (
	&quot;fmt&quot;
	&quot;io/ioutil&quot;
	&quot;os&quot;
	&quot;path/filepath&quot;
)

func fatal(err error) {
	fmt.Fprintf(os.Stderr, &quot;%v\n&quot;, err)
	os.Exit(2)
}

func main() {
	dir, err := ioutil.TempDir(&quot;&quot;, &quot;finalizers-&quot;)
	if err != nil {
		fatal(err)
	}
	defer os.RemoveAll(dir)
	fmt.Printf(&quot;temp directory: %s\n&quot;, dir)
	for i := 0; i &lt; 2000; i++ {
		path := filepath.Join(dir, fmt.Sprintf(&quot;tmp-%d&quot;, i))
		f, err := os.Create(path)
		if err != nil {
			fatal(err)
		}
		fmt.Fprintf(f, &quot;temp file %d\n&quot;, i)

		// f is no longer live, can be GCed
	}
}
</code></pre>

<p>On macOS the result is:</p>

<pre><code>$ go run junk.go 
temp directory: /tmp/finalizers-802262722
open /tmp/finalizers-802262722/tmp-252: too many open files
exit status 2
</code></pre>

<p>Oops.</p>

<h2>Garbage collectors keep their own hours</h2>

<p>The garbage collection contract says nothing about when
collection occurs.
So while there are GC algorithms that free memory the moment it is
no longer used (for example,
<a href="https://en.wikipedia.org/wiki/Reference_counting">reference counting</a>),
it is typical to batch process memory to reduce CPU cycles spent
collecting garbage.</p>

<p>If there is not a lot of demand for new heap space, the Go GC may take
a very long time to free memory, and thus get around to calling the
finalizer.
The cleanup time of objects is open ended.
<strong>It is a bug</strong> to depend on the GC to run before your resources are
exhausted.</p>
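
<p>For comparison, here is a sketch of the same loop with the descriptor
closed by hand on every iteration.
The function name <code>writeTempFiles</code> is hypothetical.</p>

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeTempFiles creates n small files in a fresh temporary directory,
// closing each descriptor explicitly instead of leaving it to a finalizer.
func writeTempFiles(n int) error {
	dir, err := os.MkdirTemp("", "finalizers-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	for i := 0; i != n; i++ {
		path := filepath.Join(dir, fmt.Sprintf("tmp-%d", i))
		f, err := os.Create(path)
		if err != nil {
			return err
		}
		_, werr := fmt.Fprintf(f, "temp file %d\n", i)
		// Close on every iteration, whether or not the write succeeded.
		cerr := f.Close()
		if werr != nil {
			return werr
		}
		if cerr != nil {
			return cerr
		}
	}
	return nil
}

func main() {
	if err := writeTempFiles(2000); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Println("wrote and closed 2000 files")
}
```

<p>At most one descriptor is open at a time, so the loop no longer depends on
when the collector runs.</p>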

<h2>What are finalizers good for?</h2>

<p>Nothing generally.</p>

<p>If you are deeply familiar with the GC algorithm being used in your
runtime, you may be able to use them to manage a resource whose use
is tied closely to heap use.
Maybe C heap.</p>

<p>If you are reasonably sure the algorithm will eventually process all
objects (which is not required by the GC contract!), you could manage
effectively-inexhaustible resources with finalizers.
Maybe very small C objects.</p>

<p>Conceptually you could use finalizers for statistics and book-keeping,
but in practice they cost too many CPU cycles.</p>

<h2>A hypothetical Resource Collector</h2>

<p>I can imagine a programming language where finalizers are useful.</p>

<p>The language&apos;s runtime would, in addition to tracking statistics
about heap use, allow the programmer to specify custom resource
trackers.
An object that registers a finalizer would also tell the runtime
how much pressure the new object applies to a resource.
If the resource is near exhaustion, the runtime finds objects
that are both unreachable in memory and use the particular
resource, and collects them.</p>

<p>If you build such a language, I would like to try it.
You could even do it as an extension to an existing language like
Go, though you would have to fork the runtime.</p>

<p>Otherwise, I believe I have one very small use for finalizers as
they exist today, which I will talk about in a followup blog post.</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Go and SQLite: when database/sql chafes</title>
	<link href="https://crawshaw.io/blog/go-and-sqlite" />
	<id>https://crawshaw.io/blog/go-and-sqlite</id>
	<updated>2018-04-02T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>Go and SQLite: when database/sql chafes</h1>

<p><em>2018-04-02, David Crawshaw</em></p>

<p>The Go standard library includes
<a href="https://golang.org/pkg/database/sql">database/sql</a>,
a generic SQL interface.
It does a good job of doing exactly what it says it does,
providing a generic interface to various SQL database servers.
Sometimes that is what you want. Sometimes it is not.</p>

<p>Generic and simple usually means lowest-common-denominator.
Fancy database features, or even relatively common but
not widely used features, like nested transactions, are not
well supported.
And if your SQL database is conceptually different from the
norm, it can get awkward.</p>

<p>Using SQLite extensively, I am finding it awkward.
SQLite is an unusual SQL database because it lives in process.
There is no network protocol so some kinds of errors are not
possible, and when errors are possible, they fall into
better-understood categories (which means they sometimes should
be Go panics, not errors).
SQLite is often used as a
<a href="https://www.sqlite.org/fileformat.html">file format</a>,
so streaming access to blobs is very useful.</p>

<p>So I wrote my own SQLite interface: <a href="https://crawshaw.io/sqlite">https://crawshaw.io/sqlite</a>.</p>

<h2>Connection Oriented</h2>

<p>The most awkward part of database/sql when using SQLite is the
implicit connection pool.
An <code>*sql.DB</code> is many connections, and when you call Prepare the
<code>*sql.Stmt</code> you get back is a statement that will execute on some
connection.
That&apos;s usually fine except for transactions, which <code>database/sql</code>
handles specially.
Unfortunately the interesting features of SQLite are far more
connection-oriented than most client-server databases, so working with
queries without holding the connection does not work well with SQLite.
The database/sql package does now expose an <code>*sql.Conn</code> object to help
with this, but this makes Stmt tracking difficult.</p>

<h2>Statement caching</h2>

<p>On the long-lived, frequently-executing paths through a program you
want any SQL that is executed to have already been parsed and planned
by the database engine.
Part of this is for the direct CPU savings from avoiding parsing.
Part is also to offer the database engine a chance to do more analysis
of the query or to change
<a href="https://www.sqlite.org/c3ref/c_prepare_persistent.html#sqlitepreparepersistent">allocation strategies</a>.
Almost all compilers have to trade off compilation time and execution time.</p>

<p>With <code>database/sql</code>, this is typically done by calling Prepare on a
connection pool and producing an <code>*sql.Stmt</code>.
You can also call Prepare on an <code>*sql.Conn</code>, and get a statement object
specific to that connection.
It is then up to you to keep track of this object.</p>

<p>This has always irked me because it means defining a variable in some
long-lived object somewhere.
The name of that variable is never long enough to be useful, and never
short enough to stay out of the way.
To avoid this I tend to interpose an object that stores a mapping of
query strings to <code>*sql.Stmt</code> objects, so I can use the query string
itself inline in the hot path as the name of the statement.
Experience with this suggests it works, so I have made it the
foundation of the sqlite package:</p>

<pre><code>func doWork(ctx context.Context, dbpool *sqlite.Pool, id int64) {
	conn := dbpool.Get(ctx)
	defer dbpool.Put(conn)

	stmt := conn.Prep(&quot;SELECT Name FROM People WHERE ID = $id;&quot;)
	stmt.SetInt64(&quot;$id&quot;, id)
	if hasRow, err := stmt.Step(); err != nil {
		// ... handle err
	} else if !hasRow {
		// ... handle missing user
	}
	name := stmt.GetText(&quot;Name&quot;)
	// ...
}
</code></pre>

<p>If the connection has never seen this query before, the Prep method
builds a prepared statement, parsing the SQL.
In the process it adds the statement to a map keyed by the query string
and returns it on subsequent calls to Prep.
Thus after a handful of calls to <code>doWork</code> cycling through all the
connections in the pool, calls to Prep are simple map lookups.</p>
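
<p>The caching idea can be sketched without SQLite at all.
The <code>conn</code> and <code>stmt</code> types below are hypothetical
stand-ins, not the types from the sqlite package, and the counter exists
only to show how often the slow path runs.</p>

```go
package main

import "fmt"

// stmt and conn are hypothetical stand-ins, not the sqlite package types.
type stmt struct{ query string }

type conn struct {
	stmts    map[string]*stmt // prepared statements keyed by query string
	prepares int              // how many times the slow path ran
}

func newConn() *conn {
	c := new(conn)
	c.stmts = make(map[string]*stmt)
	return c
}

// Prep returns the cached statement for query, preparing it on first use.
func (c *conn) Prep(query string) *stmt {
	if s, ok := c.stmts[query]; ok {
		return s // fast path: a map lookup
	}
	c.prepares++ // slow path: a real engine would parse and plan here
	s := new(stmt)
	s.query = query
	c.stmts[query] = s
	return s
}

func main() {
	c := newConn()
	const q = "SELECT Name FROM People WHERE ID = $id;"
	first := c.Prep(q)
	second := c.Prep(q)
	fmt.Println(first == second, c.prepares) // prints "true 1"
}
```

<p>Because the cache lives on the connection, each connection in a pool pays
the preparation cost once per distinct query string.</p>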

<h2>Parameter names</h2>

<p>For relatively simple queries with only a handful of parameters,
lining up a few question marks with the positional arguments in the
Query method is straightforward and quite readable.
The same is possible here using the sqlitex.Exec function.</p>

<p>For complex queries with a dozen parameters, the sea of parameters
can be quite confusing.
Here instead we take advantage of SQLite&apos;s parameter names:</p>

<pre><code>stmt := conn.Prep(`SELECT Name FROM People
	WHERE Country = $country
	AND CustomerType = $customerType`)

stmt.SetText(&quot;$country&quot;, country)
stmt.SetInt(&quot;$customerType&quot;, customerType)
</code></pre>

<p>Any errors that occur setting a field are reported when <code>Step</code> is called.</p>

<p>Similarly column names can be used to read values from the result:</p>

<pre><code>stmt.GetText(&quot;Name&quot;)
</code></pre>

<h2>Errors or bugs</h2>

<p>Everything can produce an error in database/sql.
This is the correct design given its requirements: databases are
separate processes with communication going over the network.
The connection to the database can disappear at any moment, and a
process needs to handle that.</p>

<p>SQLite is different.
The database engine is in-process and not going anywhere.
This means we should treat its errors differently.</p>

<p>In Go, errors that are part of the standard operation of a program
are returned as values.
Programs are expected to handle errors.</p>

<p>Program bugs that cannot be handled should not be returned as errors.
Doing so leads to unnecessarily passing around of useless error objects
and makes it easy to introduce more bugs (in particular, losing track of
where in the program the bug happened).</p>

<p>Here is a program bug that no-one can usefully handle:</p>

<pre><code>conn.Prep(&quot;SELET * FRO t;&quot;) // panics
</code></pre>

<p>Almost all programs making SQL queries define the text of those
queries statically.
(The only obvious exception is if you are writing an SQL REPL.)
Doing otherwise is a security risk.
It does not make sense to try and handle the error from an SQL typo
at run time.
So the standard way to prepare a statement, the Prep method,
does not return an error.
Instead it panics if the SQL fails to compile.</p>

<p>The behavior of the Prep method is spiritually similar to regexp.MustCompile,
which is designed to be used with regular expression string literals.
As a side effect this means slightly fewer lines of code are
required to execute a query, but most importantly, it means the bug
is treated correctly.</p>

<h2>Savepoints</h2>

<p>One of the concepts I find hardest to use well in <code>database/sql</code>
is the <a href="https://golang.org/pkg/database/sql/#Tx">Tx</a> object.
It represents a transaction, that is, statements executed via
it are wrapped in <code>BEGIN;</code> and <code>COMMIT;</code>/<code>ROLLBACK;</code>.
This sqlite package has no equivalent object.
Instead, it encourages you to exclusively use savepoints.</p>

<p>For those not familiar with the concept, the SQL
<code>SAVEPOINT foo; ... RELEASE foo;</code> is semantically the same as
<code>BEGIN DEFERRED; ... COMMIT;</code>.
What distinguishes savepoints is they can be nested.
(Hence the user-defined names, so you can specify from whence to
commit or rollback.)</p>

<p>If you can spare a few microseconds, savepoints provide an easy form
of transaction support in Go that can integrate well with error and
panic handling.</p>

<p>The fundamental principle is: for a function doing serial database
work, pass it a single connection, create a savepoint on function
entry, and defer the savepoint release.</p>

<p>Functions that follow this principle compose.</p>

<p>For example, using the helper function sqlitex.Save:</p>

<pre><code>func doWork(conn *sqlite.Conn) (err error) {
	defer sqlitex.Save(conn)(&amp;err)
	// ...
	if err := doOtherWork(conn); err != nil {
		return err
	}
	// ...
	return nil
}

func doOtherWork(conn *sqlite.Conn) (err error) {
	defer sqlitex.Save(conn)(&amp;err)
	// ...
	return nil
}
</code></pre>

<p>In this example, if doOtherWork returns an error, the doWork
savepoint will unwind.
Elsewhere in the program doOtherWork can be safely called as an
independent, fully functional piece of code that is already wrapped
in a database transaction.</p>

<p>This ties committing or rolling back a database transaction to
whether or not a function returns an error.
The bookkeeping is a little easier and that makes it much easier to
move code around.</p>
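
<p>To make the nesting concrete, here is a sketch that records the SQL
such a helper could issue.
It is a simplified stand-in, not the sqlitex implementation:
<code>fakeConn</code>, <code>savepoint</code> and <code>run</code> are
hypothetical, and the release function takes the error by value rather
than by pointer.</p>

```go
package main

import (
	"errors"
	"fmt"
)

// fakeConn records the SQL it is asked to run (a stand-in for a real connection).
type fakeConn struct{ log []string }

func (c *fakeConn) exec(sql string) { c.log = append(c.log, sql) }

// savepoint opens a named savepoint and returns a function that ends it:
// RELEASE on success, ROLLBACK TO followed by RELEASE on error.
func savepoint(c *fakeConn, name string) func(err error) {
	c.exec("SAVEPOINT " + name + ";")
	return func(err error) {
		if err != nil {
			c.exec("ROLLBACK TO " + name + ";")
		}
		c.exec("RELEASE " + name + ";")
	}
}

var errInner = errors.New("inner work failed")

func doOtherWork(c *fakeConn, fail bool) (err error) {
	release := savepoint(c, "doOtherWork")
	defer func() { release(err) }() // sees the final value of err
	if fail {
		return errInner
	}
	return nil
}

func doWork(c *fakeConn, fail bool) (err error) {
	release := savepoint(c, "doWork")
	defer func() { release(err) }()
	if err := doOtherWork(c, fail); err != nil {
		return err
	}
	return nil
}

// run returns the SQL issued by one call to doWork.
func run(fail bool) []string {
	c := new(fakeConn)
	doWork(c, fail)
	return c.log
}

func main() {
	for _, sql := range run(true) {
		fmt.Println(sql)
	}
}
```

<p>When the inner function fails, both savepoints roll back, innermost first.
When it succeeds, both are released and the outermost release commits.</p>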

<h2>Contexts and SetInterrupt</h2>

<p>The context package was retrofitted onto <code>database/sql</code> and it shows.
The exported API surface is much larger than it should be because
the retrofit happened after Go 1.0 was released and the API could
not be broken.</p>

<p>In <code>database/sql</code> just about everything has to take a context object,
and in some cases more than one context may be in play, as every
single function call is potentially a network event communicating
with a database server. As SQLite does not have that, the context
story can be simpler. No need for PrepareContext or PingContext.</p>

<p>Instead, a context can be associated with an SQLite connection:</p>

<pre><code>conn.SetInterrupt(ctx.Done())
</code></pre>

<p>Now ctx is in charge of interrupting calls to that connection
until <code>SetInterrupt</code> is called again to swap the context out.
There is also a shortcut for associating a context when using
a connection pool:</p>

<pre><code>conn := dbpool.Get(ctx.Done())
defer dbpool.Put(conn)
</code></pre>

<h2>Concurrency</h2>

<p>A <code>database/sql</code> driver for SQLite should be fully capable
of taking advantage of threads for multiple readers (and in
some cases, effectively multiple writers).
However when I tried out the Go SQLite drivers, I found a few
limits.
What I saw was SQLite used in thread serialization mode (slow!),
not using the in-process
<a href="https://www.sqlite.org/sharedcache.html">shared cache</a>
by default, not using the
<a href="https://www.sqlite.org/wal.html">write-ahead log</a> by default,
and no built-in handling for
<a href="https://www.sqlite.org/unlock_notify.html">unlock_notify</a>.</p>

<p>This package does these things by default, to maximize the
concurrency a program can get out of a connection pool.</p>

<h2>What&apos;s next</h2>

<p>Does the world really need another Go sqlite package? Maybe!</p>

<p>It is a lot of fun rethinking a general interface for a specific case.
You get to simultaneously throw a lot of things away and add new things.</p>

<p>There are a few things I would like to look at for improving error
reporting. For example, if you call <code>conn.OpenBlob</code> inside a savepoint,
then try to open a nested savepoint, SQLite will produce an error,
<code>SQLITE_BUSY</code>.
It won&apos;t tell you what it is that&apos;s open.
If a connection is tracking its own blobs, that would give us a good
chance to report what is open or in-progress.</p>

<p>This package will keep evolving, that is, I will keep breaking its
API, for the next few months.
If anyone wants to consider a <code>database/sql</code> overhaul as part of Go 2,
maybe there are some useful ideas in here.
Perhaps this package can serve as an experience report.</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Experimentation Adrift</title>
	<link href="https://crawshaw.io/blog/experimentation-adrift" />
	<id>https://crawshaw.io/blog/experimentation-adrift</id>
	<updated>2018-03-30T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>Experimentation Adrift</h1>

<p><em>2018-03-30</em></p>

<p>You can learn numerous lessons from failure, almost all of them bogus.
This one I have thought about enough over the years that I think it is
worth writing down.</p>

<p>I started working on Google+ a few weeks before it publicly launched,
and was there about a year.
The first two months were frantic, but after that I managed to devote
a few hours a week to a project that really interested me:
bringing experiments to G+.</p>

<p>Google is <a href="https://research.google.com/pubs/pub36500.html">very fond of experimentation</a>
and I had spent time working in other parts of the company seeing it in action.
It really works.
You can take a product and make it better through careful application
of the scientific method.</p>

<p>It did not really work for G+ when I worked there.
(It has been five years, so I could not say how they use experiments today.)
Nothing went terribly wrong, but I did not manage to build what I wanted.</p>

<p>Experiments were run, metrics collected and studied.
But the culture of scientific feedback I had seen in other parts of
the company did not appear.
Features were designed, implemented, and final decisions made using
processes that had nothing to do with what was learned from prior
experiments.</p>

<p>Effective scientists hypothesize, design and run an experiment to test
their hypothesis, and then use what they learn for their next hypothesis.</p>

<p>There is a lot of hand waving in this process.
We conveniently minimize discussion of how exactly we generate hypotheses.
Vague words like intuition or heuristic get used to describe it.
One thing that is clear is there is a cycle.
You have to learn from your experiments.
Your next hypothesis (or specifically, software feature) needs to
change when the experiment does not go the way you expected it to go.</p>

<p>I think this is a common problem in software A/B testing in the industry.
Two possibilities are tried. The winner at metrics of your choice
(engagement, revenue, clicks, likes, etc) is rolled out.
For the next feature, another A/B test.
A and B are two choices the designer liked, or management liked, or
maybe even something a programmer liked.
Either way, they are not two choices made after integrating what was
learned from prior A/B experiments.
And this experiment is not designed to maximize what you can learn
about your software features.
When this technique is applied for long enough, the result is at best
some kind of stuttering hill climb, in steps of some unknown length
up a hill you could not name.
At worst, it is a kind of long-term P-hacking, where most of your A/B
results are a wash, but sometimes it looks like experiments really
got you something.</p>

<p>In retrospect, the parts of the company that make very effective use
of experiments got lucky.
A handful of excellent engineers in a past life did a long science
PhD in a lab with enough other well-trained scientists that the
necessary skills were ingrained:
the brutal introspection and honesty required to deal with the hard
fact that an experiment has shown that a truth you cherished is wrong.</p>

<p>These effective experimenters had another thing going for them:
regular releases and a long-term pace.
When a team is in &quot;start-up&quot; mode, rushing to release features
required by a large paying customer or to match a competitor&apos;s product,
you can&apos;t take a week to refine a hypothesis and three more to
design an experiment to test it.
There are things to be done.</p>

<p>Finding both of these, training and time, is extremely difficult
in a technology company.
But there is no way around it.
Scientific training takes years.
Doing science takes years.
So for now I am left to conclude that scientific culture is largely
incompatible with Silicon Valley culture.
There are a few places where it thrives and has enormous impact.
In general however, when I hear of scientific process being applied
to software I am extremely skeptical.
What a pity.</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Leaving Google</title>
	<link href="https://crawshaw.io/blog/leaving-google" />
	<id>https://crawshaw.io/blog/leaving-google</id>
	<updated>2018-03-27T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<h1>Leaving Google</h1>

<p><em>2018-03-27</em></p>

<p>Today is my last day at Google.
It has been a wonderful place to work filled with excellent coworkers.</p>

<p>I am sad to be leaving Fuchsia before 1.0.
As an operating system it gets so many things right.
No doubt I will get a chance to be a user.</p>

<p>For the foreseeable future I&apos;m going to be doing childcare and
starting a software business.
Details to follow, first I have some prototyping to do.</p>

<p>Inside Google, there is a domain-locked version of Google+.
(It is a service available to all business customers, I believe.)
For the past five years it has served almost all of my social media needs.
Its quality peak was probably 2-3 years ago, but it still has the
best SNR of any fun-yet-educational distraction I have found so far.
Now that it&apos;s gone I will have to see what else there is.</p>

<p>As part of that, I should blog more.
Not sure if that will happen, but I&apos;ll give it a shot.</p>

	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Less cgo overhead in Go 1.8</title>
	<link href="https://crawshaw.io/blog/go1.8-cgo" />
	<id>https://crawshaw.io/blog/go1.8-cgo</id>
	<updated>2017-02-15T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	One spare afternoon a few months ago I read through the cgo calling code in the Go runtime. There were two defers that could be merged into one, so I did.<br/><br/>Combined with Austin's improvements to defer's overall performance (CL 29656), the 146ns call to an empty C function in Go 1.7 is now a 56ns call in Go 1.8:<br/><pre>
name       old time/op  new time/op  delta
CgoNoop-8   146ns ± 1%    56ns ± 6%  -61.57%  (p=0.000 n=25+30)
</pre>
This is a good reminder to me of the importance of making time for idle rummaging around in code. For the past six months I have done very little of it, as the piles of well-defined work I have seemed too important. But looking back over what came of that work, this idle afternoon is probably the most impactful thing I have done recently.<a class="entrylink" href="https://golang.org/cl/30080">golang.org/cl/30080</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>BBR</title>
	<link href="https://crawshaw.io/blog/2017-01-07" />
	<id>https://crawshaw.io/blog/2017-01-07</id>
	<updated>2017-01-07T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	The linked paper describes a new TCP congestion control algorithm. The key insight is that an old assumption about networks is no longer true. It used to be the case that networks only dropped packets when they were unusable or when they were congested. Thus TCP stacks could work out how rapidly to send packets by ramping up until they saw packet loss.<br/><br/>In modern networks, packet loss is common long before a TCP connection reaches capacity. Our old congestion control algorithms were unnecessarily slowing connections.<br/><br/>The most important part of the paper is the use of field experiments. The authors went beyond the usual prototype implementation: they deployed the algorithm to a large network and measured the benefits.<br/><br/>More computer science needs to happen in the field.<br/><a class="entrylink" href="http://queue.acm.org/detail.cfm?id=3022184">queue.acm.org/detail.cfm?id=3022184</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Compiler Bomb</title>
	<link href="https://crawshaw.io/blog/2016-10-14" />
	<id>https://crawshaw.io/blog/2016-10-14</id>
	<updated>2016-10-14T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	A tiny C program that compiles to a 16GB executable. Builds in a cool 27 minutes:<br/><pre>
main[-1u]={1};
</pre>
<a class="entrylink" href="https://codegolf.stackexchange.com/questions/69189/build-a-compiler-bomb/69193#69193">codegolf.stackexchange.com/questions/69189/build-a-compiler-bomb/69193#69193</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>On receiving the News</title>
	<link href="https://crawshaw.io/blog/2016-09-24" />
	<id>https://crawshaw.io/blog/2016-09-24</id>
	<updated>2016-09-24T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"HEISENBERG: Well that's not quite right. I would say that I was absolutely convinced of the possibility of our making a uranium engine but I never thought that we would make a bomb and at the bottom of my heart I was really glad that it was to be an engine and not a bomb. I must admit that."</blockquote><a class="entrylink" href="http://germanhistorydocs.ghi-dc.org/pdf/eng/English101.pdf">germanhistorydocs.ghi-dc.org/pdf/eng/English101.pdf</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Buried by the media</title>
	<link href="https://crawshaw.io/blog/2016-09-05" />
	<id>https://crawshaw.io/blog/2016-09-05</id>
	<updated>2016-09-05T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"No one else would follow; even the minister failed to show. Shaking his head ever so slightly, Jerry Flemmons of the Fort Worth Star-Telegram turned to me and said, 'Cochran, if we're gonna write a story about the burial of Lee Harvey Oswald, we're gonna have to bury the son of a bitch ourselves.'"</blockquote><a class="entrylink" href="http://www.sltrib.com/sltrib/world/57162874-68/oswald-fort-lee-worth.html.csp">www.sltrib.com/sltrib/world/57162874-68/oswald-fort-lee-worth.html.csp</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Smaller Go 1.7 binaries</title>
	<link href="https://crawshaw.io/blog/2016-08-19" />
	<id>https://crawshaw.io/blog/2016-08-19</id>
	<updated>2016-08-19T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	A post I wrote over on the Go blog.<br/><a class="entrylink" href="https://blog.golang.org/go1.7-binary-size">blog.golang.org/go1.7-binary-size</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Good business</title>
	<link href="https://crawshaw.io/blog/2016-07-27" />
	<id>https://crawshaw.io/blog/2016-07-27</id>
	<updated>2016-07-27T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"Bill and I were actually talking about what kind of investments GV was looking to make. He stressed that GV was looking to invest in businesses that were actually good businesses. As a counterexample, he brought up Twitter, which at the time he considered to be a 'good investment' (said with a grin and a wink) but not a 'good business'. I had one of those feelings that you get when somebody really smart just shared with you The Truth."</blockquote><a class="entrylink" href="https://news.ycombinator.com/item?id=12174072">news.ycombinator.com/item?id=12174072</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Everyone a writer</title>
	<link href="https://crawshaw.io/blog/2016-07-08" />
	<id>https://crawshaw.io/blog/2016-07-08</id>
	<updated>2016-07-08T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"The irresistible proliferation of graphomania shows me that everyone without exception bears a potential writer within him, so that the entire human species has good reason to go down into the streets and shout: we are all writers! For everyone is pained by the thought of disappearing, unheard and unseen, into an indifferent universe, and because of that everyone wants, while there is still time, to turn himself into a universe of words. One morning (and it will be soon), when everyone wakes up as a writer, the age of universal deafness and incomprehension will have arrived.<br/>...<br/>Let us define our terms. A woman who writes her lover four letters a day is not a graphomaniac, she is simply a woman in love. But my friend who xeroxes his love letters so he can publish them someday--my friend is a graphomaniac. Graphomania is not a desire to write letters, diaries, or family chronicles (to write for oneself or one's immediate family); it is a desire to write books (to have a public of unknown readers). In this sense the taxi driver and Goethe share the same passion. What distinguishes Goethe from the taxi driver is the result of the passion, not the passion itself.<br/><br/>Graphomania (an obsession with writing books) takes on the proportions of a mass epidemic whenever a society develops to the point where it can provide three basic conditions: 1. A high degree of general well-being to enable people to devote their energies to useless activities; 2. An advanced state of social atomization and the resultant general feeling of the isolation of the individual; 3. A radical absence of significant social change in the internal development of the nation.<br/><br/>But the effect transmits a kind of flashback to the cause. If general isolation causes graphomania, mass graphomania itself reinforces and aggravates the feeling of general isolation. The invention of printing originally promoted mutual understanding. 
In the era of graphomania the writing of books has the opposite effect: everyone surrounds himself with his own writings as with a wall of mirrors cutting off all voices from without."</blockquote><br/>— The Book of Laughter and Forgetting, Kundera<br/>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2016-06-29</title>
	<link href="https://crawshaw.io/blog/2016-06-29" />
	<id>https://crawshaw.io/blog/2016-06-29</id>
	<updated>2016-06-29T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"moving the font engine out of the kernel in Windows (which Microsoft has done starting with Windows 10)."</blockquote><br/>Good News.<br/><a class="entrylink" href="https://googleprojectzero.blogspot.com/2016/06/a-year-of-windows-kernel-font-fuzzing-1_27.html">googleprojectzero.blogspot.com/2016/06/a-year-of-windows-kernel-font-fuzzing-1_27.html</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Transaction oriented collector</title>
	<link href="https://crawshaw.io/blog/2016-06-24" />
	<id>https://crawshaw.io/blog/2016-06-24</id>
	<updated>2016-06-24T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	Rick and Austin just published their design doc for their new Go garbage collector. It is an attempt to get the advantages of isolated heaps via lightweight dynamic analysis, without making the programming model harder to use.<br/><br/>A very interesting project that I hope succeeds.<a class="entrylink" href="https://golang.org/s/gctoc">golang.org/s/gctoc</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Machining under a microscope</title>
	<link href="https://crawshaw.io/blog/2016-06-20" />
	<id>https://crawshaw.io/blog/2016-06-20</id>
	<updated>2016-06-20T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"The average width of such a channel is 0.002 inch. For this kind of work, the shop often uses spade-type end mills as small as 0.001 inch wide, although even smaller tools have been used occasionally."</blockquote><a class="entrylink" href="http://www.mmsonline.com/articles/cutting-with-a-0001-inch-end-mill">www.mmsonline.com/articles/cutting-with-a-0001-inch-end-mill</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Limits of Superintelligence</title>
	<link href="https://crawshaw.io/blog/2016-05-07" />
	<id>https://crawshaw.io/blog/2016-05-07</id>
	<updated>2016-05-07T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	In 2012 Dean et al. published a paper on unsupervised learning using 16000 CPU cores. Seeing machine learning successfully scaled across a modern data center convinced me that human-equivalent and better-than-human AIs were coming soon. In the past few months I've revised my estimates of what soon means, from years to decades. This is a quick attempt to jot down my reasoning.<br/><br/>There have not been any theoretical breakthroughs in AI research in decades. All the progress you have seen has come from the hard work of harnessing ever greater numbers of transistors. (The programming techniques for even the most sophisticated recurrent neural network used to win at Go would have seemed a natural continuation of the Perceptron to Rosenblatt in the 1950s.)<br/><br/>Machine intelligence is proportional to the number of transistors. (More accurately, the number of switch events, but transistors is a reasonable approximation.)<br/><br/>The number of transistors continues to increase, so we should expect machine intelligence to increase.<br/><br/>Here's the wrinkle: the last couple of years have seen the slowdown of Moore's law. At this point I believe the safe money is on assuming it is over. (We won't know when it happened until long after the fact.) We still have another century at least of optimizing our hardware and software layouts, and there are more economic reasons to keep making computers. So this doesn't mean the end of increasing transistor counts or machine intelligence.<br/><br/>The end of Moore's law does slow down AI progress significantly. Because while the law is about die area, an important corollary is that the number of transistors you get for a Joule of energy increases. Under Moore's law, the amount of machine intelligence you got for a Joule increased every year. 
Now (modulo some large constant factors from improved software engineering), the amount of machine intelligence you get for a Joule is fixed.<br/><br/>Unlike five years ago, progress in AI is now limited by our industrial output.<br/><br/>The requirement for a superintelligence is that it is not only as smart as a human, but that it can grow its own intelligence independent of us. That means a superintelligence now needs complete control over industry: mining, refining, and manufacturing. Until we have roboticized industry and opened up effectively-limitless resources (the solar system), machine intelligence is stuck in a box.<br/><br/>So I expect to see superintelligence, but now I expect it to be a huge facility behind terawatts of solar panels, built on Ceres and launched into a heliocentric orbit.<br/><a class="entrylink" href="http://research.google.com/pubs/pub40565.html">research.google.com/pubs/pub40565.html</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>COPY Relocations</title>
	<link href="https://crawshaw.io/blog/2016-04-17" />
	<id>https://crawshaw.io/blog/2016-04-17</id>
	<updated>2016-04-17T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	As part of my work on reducing Go binary size, I ran into the concept of linker copy relocations. This is a relatively obscure underdocumented concept so I want to scribble down some notes.<br/><br/>Some background: A relocation is a task created by the compiler and performed by the linker. The typical relocation is <blockquote>"put the address of symbol X inside symbol Y at offset O"</blockquote>. There are many reasons a compiler can't do this itself. The clearest reason is that it may not have a copy of symbol X. Many programming languages support compiling programs piecemeal and referring to other symbols by forward declaration. In C it is as easy as:<br/><br/><pre>
$ cat symY.c
void x();

void y() {
x();
}
$ cc -c symY.c
$ readelf -r symY.o

Relocation section '.rela.text' at offset 0x518 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
00000000000a  000900000002 R_X86_64_PC32     0000000000000000 x - 4
$
</pre>
<br/>…to get a relocation for the address of a symbol x inside symbol y, in an object file that knows nothing about x. When the linker runs, it is given symY.o and the moral equivalent of a symX.o object file, lays out the symbols, and resolves the relocations. In an introduction to linking, this is the end of the story. But not today.<br/><br/>Dynamic linking is an extra link phase that happens after the executable is linked. A long time after. It is performed by the operating system when an executable is run. Here is a traditional example building on symY.c:<br/><br/><pre>
$ cat main.c
void y();

void x() {}

int main(void) {
y();
return 0;
}

$ cc -shared -o libsymY.so symY.o
$ cc -fpic -c main.c
$ cc -g main.o -L . -lsymY
</pre>
<br/>What's happening here is we turn symY.o into a shared library, compile main.c (which needs a symbol y, defined in our shared library), and then link it looking for the library (-lsymY) in the current directory (-L .). The result is a binary with a relocation:<br/><br/><pre>
$ readelf -r a.out
…
000000601028  000400000007 R_X86_64_JUMP_SLO 0000000000000000 y + 0
</pre>
<br/>When you execute this program, the OS loader finds the .so file, maps it into memory, and does the job of the linker resolving relocations.<br/><br/>At this point it is really tempting to walk away, pretend shared libraries don't exist and go back to a sensible world where linkers link. For the purpose of everyday programming, please do. But this machinery is widely used and becoming more common. The increasingly popular ASLR security technique (Address Space Layout Randomization) maps the binary into a random location in memory when the program starts. To put the data and program text in different relative locations requires relocations that can only be resolved at load time. The result is PIE binaries (Position Independent Executable) that look like a shared object with a main function.<br/><br/>Back to COPY relocations.<br/><br/>A COPY relocation is a special kind of dynamic relocation that instructs the loader to copy a symbol to a particular location. It is used to enable what in a world of PIE binaries looks like a half measure: position-dependent main executables that use a shared library. The position-dependent code needs to be fully linked, that is, the traditional linker needs the address of a symbol that won't be known until we reach the dynamic linker. To make this work, it leaves space for a symbol at a known address, writes the main executable to expect the symbol to be there, and leaves a COPY relocation for the dynamic linker, asking it to move the symbol into place.<br/><br/>The job of a dynamic linker when faced with a COPY relocation is to move the symbol out of the memory region allocated for the shared object and into the region of the main executable. It then needs to resolve all relocations in the shared object to point to the symbol location in the main executable. 
This is possible because the shared object is position-independent.<br/><br/>The result of these COPY relocations is another surprising way the memory of your program can end up laid out by linkers. You can produce two objects with linkers and expect the loader to load each somewhere. You know the loader can patch up references in each to point at the other (typically word-sized pointers). Well now it can take chunks out of one piece and put them in another piece. I got burned by this with the layout games I played in https://golang.org/cl/21285. I worked on the assumption that all the symbols I neatly laid out in a section would be in that section. Instead ld.bfd generated (incorrectly in this case, https://sourceware.org/bugzilla/show_bug.cgi?id=19962) an R_ARM_COPY for some of my symbols.<br/><br/><a class="entrylink" href="https://golang.org/issue/6853">golang.org/issue/6853</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Atom Feed</title>
	<link href="https://crawshaw.io/blog/2016-04-16" />
	<id>https://crawshaw.io/blog/2016-04-16</id>
	<updated>2016-04-16T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	I've attempted to add an atom feed to this page. Please let me know if there are any issues.
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2016-02-10</title>
	<link href="https://crawshaw.io/blog/2016-02-10" />
	<id>https://crawshaw.io/blog/2016-02-10</id>
	<updated>2016-02-10T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	Los Alamos, the early 1940s.<br/><br/><blockquote>"Well, Mr. Frankel, who started this program, began to suffer from the computer disease that anybody who works with computers now knows about. It's a very serious disease and it interferes completely with the work. The trouble with computers is you play with them. They are so wonderful. You have these switches—if it's an even number you do this, if it's an odd number you do that—and pretty soon you can do more and more elaborate things if you are clever enough, on one machine.<br/><br/>After a while the whole system broke down. Frankel wasn't paying any attention; he wasn't supervising anybody. The system was going very, very slowly—while he was sitting in a room figuring out how to make one tabulator automatically print arc-tangent X, and then it would start and it would print columns and then bitsi, bitsi, bitsi, and calculate the arc-tangent automatically by integrating as it went along and make a whole table in one operation.<br/><br/>Absolutely useless. We had tables of arc-tangents."</blockquote><br/>Los Alamos from Below, from <blockquote>"Surely You're Joking, Mr. Feynman!"</blockquote>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2016-01-23</title>
	<link href="https://crawshaw.io/blog/2016-01-23" />
	<id>https://crawshaw.io/blog/2016-01-23</id>
	<updated>2016-01-23T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"Unless you’re a plow driver or a parka-clad elected official trying to look essential, one doesn’t pretend to do battle against a blizzard. You submit. Surrender. Hunker down. A snowstorm rewards indolence and punishes the go-getters, which is only one of the many reasons it’s the best natural disaster there is.<br/><br/>… And, gloriously if briefly, it hides everything else — the plastic grocery bags and mini-marts and dog poop and salt-grimed Toyotas and sundry disorder of modernity. Watching the quotidian American crudscape transform into a fairy-tale kingdom is a legitimate wonder. Name another disaster that leaves the afflicted region more attractive in its wake."</blockquote><a class="entrylink" href="http://www.nytimes.com/2016/01/23/opinion/in-case-of-blizzard-do-nothing.html">www.nytimes.com/2016/01/23/opinion/in-case-of-blizzard-do-nothing.html</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2016-01-18</title>
	<link href="https://crawshaw.io/blog/2016-01-18" />
	<id>https://crawshaw.io/blog/2016-01-18</id>
	<updated>2016-01-18T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	SQL as a stream processing language.<br/><br/>It is frustrating that SQL continues to exist, but I believe that despite funky syntax and bad semantics, it covers an important subset of language expressivity. It is a useful subset, and acts as a common culture for programmers to build new systems and explain them to other programmers without having to start from first principles. (Which lets you skip over some of the inevitable tide of negativity you face every time you try to introduce a new idea in programming.)<br/><br/>Anyway, PipelineDB looks really interesting.<a class="entrylink" href="http://docs.pipelinedb.com/continuous-views.html">docs.pipelinedb.com/continuous-views.html</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2016-01-15</title>
	<link href="https://crawshaw.io/blog/2016-01-15" />
	<id>https://crawshaw.io/blog/2016-01-15</id>
	<updated>2016-01-15T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	An easy read that should be given to freshmen. Also see the linked essay, <blockquote>"The Unexotic Underclass"</blockquote> by C.Z. Nnaemeka, which is also worth a read.<br/><br/><blockquote>"Let’s pretend, for a moment, that you are a 22-year-old college student in Kampala, Uganda. You’re sitting in class and discreetly scrolling through Facebook on your phone. You see that there has been another mass shooting in America, this time in a place called San Bernardino. You’ve never heard of it. You’ve never been to America. But you’ve certainly heard a lot about gun violence in the U.S. It seems like a new mass shooting happens every week.<br/><br/>You wonder if you could go there and get stricter gun legislation passed. You’d be a hero to the American people, a problem-solver, a lifesaver. How hard could it be? Maybe there’s a fellowship for high-minded people like you to go to America after college and train as social entrepreneurs. You could start the nonprofit organization that ends mass shootings, maybe even win a humanitarian award by the time you are 30."</blockquote><a class="entrylink" href="https://medium.com/the-development-set/the-reductive-seduction-of-other-people-s-problems-3c07b307732d">medium.com/the-development-set/the-reductive-seduction-of-other-people-s-problems-3c07b307732d</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2016-01-09</title>
	<link href="https://crawshaw.io/blog/2016-01-09" />
	<id>https://crawshaw.io/blog/2016-01-09</id>
	<updated>2016-01-09T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	While I hear a lot more talk about the Raspberry Pi and Arduino, these BeagleBoard products are what I want to use for any embedded or small hardware development (in my case, a custom scientific instrument).<br/><br/>This latest BeagleBoard-X15 comes with 2 Cortex-M4 microcontrollers and 4 PRUs (small realtime microcontrollers) on the board with a decent computer. That's a great set of tools with none of the fussy wiring problems of using multiple Arduinos.<br/><br/>If the X15 has enough discrete microcontrollers for the job, then there's a huge advantage in using it: you get to wire your major components together with software, not hardware. Every time I find a way to shift a problem from hardware into software, I get more productive.<br/><a class="entrylink" href="http://beagleboard.org/x15">beagleboard.org/x15</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2016-01-07</title>
	<link href="https://crawshaw.io/blog/2016-01-07" />
	<id>https://crawshaw.io/blog/2016-01-07</id>
	<updated>2016-01-07T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	Over the holidays I took some real vacation and finally got stuck back into my hobby project. As part of it I implemented a Unix shell with job control. It turns out a shell is a reasonably easy introduction to some of the crustier parts of Unix I've been avoiding for a while now. It lets you get a sense of TTYs without dealing with the really terrible parts.<br/><br/>(If you ever wanted to convince yourself that Unix is due for a replacement, take a look at how much kernel API surface is dedicated to terminals. It is a mess that can only be justified by historical argument.)<br/><br/>Here is a taste of some of the fun of shells, for those of you who find such things fun:<br/><br/>When you start a process in an interactive shell, the process gets a new process group id, and that pgid is brought to the foreground of the current terminal session. (See tcsetpgrp(3) for more details.) From here on out the terminal delivers signals to the new process group.<br/><br/>So far so good.<br/><br/>Now, what happens when you start a pipeline? Executing 'echo hello | rev' starts two processes. Where do the signals go? That is why the signals are delivered to a process group: the shell starts echo, creates a new pgid, and then starts rev and gives it the same pgid. Easy enough, and I implemented it similarly to what you will find in the GNU libc manual.<br/><br/>Except it sometimes did not work.<br/><br/>Turns out what was happening was that 'echo hello' is so short-lived, and its output tiny enough to fit into the kernel buffer, that it would exit before my shell had a chance to assign its pgid to 'rev'. By the time I got there, the pgid was invalid.<br/><br/>Digging around in the bash sources revealed that I was not the first to deal with this, jobs.c:200:<br/><br/><pre>
"Surely I spoke of things I did not
understand, things too wonderful
for me to know."
</pre>
<br/>Oops, wrong Job. Try again:<br/><br/><pre>
/* Pipes which each shell uses to communicate
   with the process group leader until all of
   the processes in a pipeline have been
   started.  Then the process leader is
   allowed to continue. */
int pgrp_pipe[2] = { -1, -1 };
</pre>
<br/>Bash plays with this pipe between fork and exec for 'echo'. The process is forked, then blocked on pgrp_pipe until the parent bash has forked all the subsequent processes in the pipeline, then it closes it and lets echo continue.<br/><br/>This was a bit of a novelty for me. I don't spend much time programming between fork and exec. It also turned out to be an unpleasant trick to replicate exactly in Go. (I am using the convenient fork/exec wrapper, syscall.StartProcess, and to modify it I have to copy a whole lot of OS-specific code.) So instead I create a dummy process for the duration of the pipeline initialization to pin the pgid for me.<br/><br/>The stuff inside our computers never ceases to amaze and worry me.<br/>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2016-01-05</title>
	<link href="https://crawshaw.io/blog/2016-01-05" />
	<id>https://crawshaw.io/blog/2016-01-05</id>
	<updated>2016-01-05T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	A chart showing what interplanetary probe science missions will be active over the next few years.<a class="entrylink" href="http://www.planetary.org/blogs/emily-lakdawalla/2015/12311322-planetary-exploration-timelines.html">www.planetary.org/blogs/emily-lakdawalla/2015/12311322-planetary-exploration-timelines.html</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2016-01-04</title>
	<link href="https://crawshaw.io/blog/2016-01-04" />
	<id>https://crawshaw.io/blog/2016-01-04</id>
	<updated>2016-01-04T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	C typedefs can have side effects:<br/><pre>
typedef int (*WTF[1])[x = x * 77];
</pre>
<a class="entrylink" href="https://twitter.com/whitequark/status/683692712374190081">twitter.com/whitequark/status/683692712374190081</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2016-01-03</title>
	<link href="https://crawshaw.io/blog/2016-01-03" />
	<id>https://crawshaw.io/blog/2016-01-03</id>
	<updated>2016-01-03T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"As if this were somehow a morally inferior form of megadeath to lobbing a couple thousand half megaton nuclear missile warheads at your least favorite country. Apparently this is how civilized countries who do not possess enemies with a plurality of coastal cities exterminate their foes. I don’t understand such people. Nuclear war is bad in general, m’kay?"</blockquote><a class="entrylink" href="https://scottlocklin.wordpress.com/2015/12/31/putins-nuclear-torpedo-and-project-pluto/">scottlocklin.wordpress.com/2015/12/31/putins-nuclear-torpedo-and-project-pluto/</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2016-01-02</title>
	<link href="https://crawshaw.io/blog/2016-01-02" />
	<id>https://crawshaw.io/blog/2016-01-02</id>
	<updated>2016-01-02T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	There is a rather unfortunate essay going around the tech crowd titled <blockquote>"The Refragmentation"</blockquote>, describing concepts of nationalism and conformity in the US as centering around the Second World War. It is unfortunate, because this is a big topic with a lot of existing literature that is probably not commonly known in the tech crowd. It is also a very interesting topic. For any who want to pursue it, I suggest <blockquote>"Nations and Nationalism Since 1780"</blockquote> by Eric Hobsbawm as a starting point.
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2016-01-01</title>
	<link href="https://crawshaw.io/blog/2016-01-01" />
	<id>https://crawshaw.io/blog/2016-01-01</id>
	<updated>2016-01-01T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"It’s a Cortex M4F MCU with extraordinarily-low current consumption. How low? They’re quoting 34 uA/MHz running from flash."</blockquote><a class="entrylink" href="http://www.embedded.com/electronics-blogs/break-points/4441091/Subthreshold-transistors-and-MCUs">www.embedded.com/electronics-blogs/break-points/4441091/Subthreshold-transistors-and-MCUs</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-12-29</title>
	<link href="https://crawshaw.io/blog/2015-12-29" />
	<id>https://crawshaw.io/blog/2015-12-29</id>
	<updated>2015-12-29T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"One of Heraclitus’ best lines turns on a pun. In the archaic dialect of Greek that Homer used the word for an archer’s bow is bios (βιός). The Greek word for ‘life’ is spelled the same way (βίος). Heraclitus’ line runs:<br/><br/>βιός τῷ τόξῳ ὄνομα βίος ἔργον δὲ θάνατος<br/><br/>The name of the bow is life but its work is death.<br/><br/>This is the first and the last word on technology."</blockquote><a class="entrylink" href="http://lazenby.tumblr.com/post/132758312387/what-do-you-think-about-steve-jobs">lazenby.tumblr.com/post/132758312387/what-do-you-think-about-steve-jobs</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>Under the heel of the spirit</title>
	<link href="https://crawshaw.io/blog/2015-12-28" />
	<id>https://crawshaw.io/blog/2015-12-28</id>
	<updated>2015-12-28T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"Everybody thought Kennedy and Johnson and Nixon were spending four-and-a-half percent of the federal budget each year to prove that America owned Science. This was all a fiction. The Apollo Program was an elaborate demonstration of how even the blandest among us are under the heel of the spirit."</blockquote><br/><blockquote>"NASA needed astronauts to go plant a flag on the moon. For obvious reasons, the astronauts ended up being the most reliable type of man America makes: white, straight, full-starch protestant, center-right, and spawned by the union of science and the military. Every last one of them was the heart of the heart of the tv dinner demographic. But then they get shot into space, tossed from the gravity of this planet, across a quartermillion miles of nothing, to be snagged by the moon after three days. Eighteen guys did this and twelve descended further to find out that moon dust smells like gunsmoke. Every single one of them came back irrevocably changed. America had sent the squarest motherfuckers it could find to the moon and the moon sent back humans."</blockquote><a class="entrylink" href="http://lazenby.tumblr.com/post/30206152130/well-right-naturally-you-should-hate">lazenby.tumblr.com/post/30206152130/well-right-naturally-you-should-hate</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-12-27</title>
	<link href="https://crawshaw.io/blog/2015-12-27" />
	<id>https://crawshaw.io/blog/2015-12-27</id>
	<updated>2015-12-27T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	Additive manufacturing results in unexpected designs.<a class="entrylink" href="http://www.insidemetaladditivemanufacturing.com/blog/design-for-slm-topology-optimisation-of-metallic-structural-nodes-in-architecture-applications">www.insidemetaladditivemanufacturing.com/blog/design-for-slm-topology-optimisation-of-metallic-structural-nodes-in-architecture-applications</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-12-26</title>
	<link href="https://crawshaw.io/blog/2015-12-26" />
	<id>https://crawshaw.io/blog/2015-12-26</id>
	<updated>2015-12-26T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	A CPU with integrated 10-meter range photonic IO (1.3W/Tbps) made on a standard commercial fab.<a class="entrylink" href="http://news.berkeley.edu/2015/12/23/electronic-photonic-microprocessor-chip/">news.berkeley.edu/2015/12/23/electronic-photonic-microprocessor-chip/</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-12-20</title>
	<link href="https://crawshaw.io/blog/2015-12-20" />
	<id>https://crawshaw.io/blog/2015-12-20</id>
	<updated>2015-12-20T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	Useful starting point for understanding shells.<a class="entrylink" href="http://www.gnu.org/software/libc/manual/html_node/Implementing-a-Shell.html#Implementing-a-Shell">www.gnu.org/software/libc/manual/html_node/Implementing-a-Shell.html#Implementing-a-Shell</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-12-15</title>
	<link href="https://crawshaw.io/blog/2015-12-15" />
	<id>https://crawshaw.io/blog/2015-12-15</id>
	<updated>2015-12-15T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	64-bit DLLs are installed in C:\Windows\System32.<br/>32-bit DLLs are installed in C:\Windows\SysWOW64.<br/><br/>This is fine.
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-12-04</title>
	<link href="https://crawshaw.io/blog/2015-12-04" />
	<id>https://crawshaw.io/blog/2015-12-04</id>
	<updated>2015-12-04T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	An article which covers a project I started at Google called Lingo (Logs In Go). Thanks to a lot of effort by many people, the project of replacing Sawzall with Go is now complete.<a class="entrylink" href="http://www.unofficialgoogledatascience.com/2015/12/replacing-sawzall-case-study-in-domain.html">www.unofficialgoogledatascience.com/2015/12/replacing-sawzall-case-study-in-domain.html</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-11-18</title>
	<link href="https://crawshaw.io/blog/2015-11-18" />
	<id>https://crawshaw.io/blog/2015-11-18</id>
	<updated>2015-11-18T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	Single-threaded CPU performance of high-end ARM-based tablets now outperforms the Intel-based MacBook Air.<a class="entrylink" href="http://arstechnica.com/apple/2015/11/ipad-pro-review-mac-like-speed-with-all-the-virtues-and-limitations-of-ios/4/">arstechnica.com/apple/2015/11/ipad-pro-review-mac-like-speed-with-all-the-virtues-and-limitations-of-ios/4/</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-11-16</title>
	<link href="https://crawshaw.io/blog/2015-11-16" />
	<id>https://crawshaw.io/blog/2015-11-16</id>
	<updated>2015-11-16T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"Where the jury actually comes out on Go may take years to determine. There are no clear formal methods of measurement for how 'good' a language is, so it mostly happens by default as popular systems thrive and unpopular ones wither and die."</blockquote><a class="entrylink" href="http://jmoiron.net/blog/for-better-or-for-worse/">jmoiron.net/blog/for-better-or-for-worse/</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-10-13</title>
	<link href="https://crawshaw.io/blog/2015-10-13" />
	<id>https://crawshaw.io/blog/2015-10-13</id>
	<updated>2015-10-13T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"So OpenSSL has optional code to reject attempts to use weak DES keys.  It, sanely, is not enabled by default; if you want it you have to compile with DEVP_CHECK_DES_KEY.<br/><br/>Last Thursday it was reported to the openssl-dev mailing list by Ben Kaduk that there was a defect in this optional code: it had a syntax error and didn't even compile.  It had a typo of '!!' instead of '||':<br/><pre>
if (DES_set_key_checked(&amp;deskey[0], &amp;data(ctx)->ks1)
!! DES_set_key_checked(&amp;deskey[1], &amp;data(ctx)->ks2))
</pre>
<br/>This syntax error was present in the original commit: the code in the #ifdefs had never been compiled.<br/>...<br/>The OpenSSL response?  The code... that in 11 years had never been used... for a deprecated cipher... was fixed on Saturday, retaining the #ifdefs<br/>"</blockquote><a class="entrylink" href="https://marc.info/?l=openbsd-tech&amp;m=144472550016118">marc.info/?l=openbsd-tech&amp;m=144472550016118</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-08-07</title>
	<link href="https://crawshaw.io/blog/2015-08-07" />
	<id>https://crawshaw.io/blog/2015-08-07</id>
	<updated>2015-08-07T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"The sinkhole attack is used to drop a rootkit into SMRAM. Rootkit now invisible to the OS, ring 0, hypervisor, AV, and everything else."</blockquote><br/>Complete control of an Intel chip via SMM (called Ring -2 here, the first time I've heard it called that, and perhaps a bit anachronistic as SMM existed before the Hypervisor, which they're calling Ring -1). Fascinating tour through parts of the chip below the kernel we rarely have to think about.<br/><br/><blockquote>"A forgotten patch to fix a forgotten problem on a tiny number of legacy systems 20 years ago… That opens up an incredible vulnerability on an entirely unrelated piece of the processor."</blockquote><a class="entrylink" href="https://github.com/xoreaxeaxeax/sinkhole">github.com/xoreaxeaxeax/sinkhole</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-08-04</title>
	<link href="https://crawshaw.io/blog/2015-08-04" />
	<id>https://crawshaw.io/blog/2015-08-04</id>
	<updated>2015-08-04T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	A CubeSat is going to hitch a ride on SLS as a secondary payload, and then use a tiny electric engine to do a two-month lunar transfer. Amazing.<a class="entrylink" href="http://www.nasa.gov/feature/goddard/lunar-icecube-to-take-on-big-mission-from-small-package">www.nasa.gov/feature/goddard/lunar-icecube-to-take-on-big-mission-from-small-package</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-07-27</title>
	<link href="https://crawshaw.io/blog/2015-07-27" />
	<id>https://crawshaw.io/blog/2015-07-27</id>
	<updated>2015-07-27T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"Finally we decided to design the processor ourselves, because only in this way, we thought, could we obtain a truly complete display processor. We approached the task by starting with a simple scheme and adding commands and features that we felt would enhance the power of the machine. Gradually the processor became more complex. We were not disturbed by this because computer graphics, after all, are complex. Finally the display processor came to resemble a full-fledged computer with some special graphics features. And then a strange thing happened. We felt compelled to add to the processor a second, subsidiary processor, which, itself, began to grow in complexity. It was then that we discovered a disturbing truth. Designing a display processor can become a never-ending cyclical process. In fact, we found the process so frustrating that we have come to call it the 'wheel of reincarnation.' We spent a long time trapped on that wheel before we finally broke free."</blockquote><a class="entrylink" href="http://cva.stanford.edu/classes/cs99s/papers/myer-sutherland-design-of-display-processors.pdf">cva.stanford.edu/classes/cs99s/papers/myer-sutherland-design-of-display-processors.pdf</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-07-17</title>
	<link href="https://crawshaw.io/blog/2015-07-17" />
	<id>https://crawshaw.io/blog/2015-07-17</id>
	<updated>2015-07-17T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	My favorite error page.<a class="entrylink" href="https://books.google.com/googlebooks/error.html">books.google.com/googlebooks/error.html</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-07-15</title>
	<link href="https://crawshaw.io/blog/2015-07-15" />
	<id>https://crawshaw.io/blog/2015-07-15</id>
	<updated>2015-07-15T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	A one-hour, high-level overview of modern chip design and fabrication techniques. Just enough detail to convince a layperson like myself that underneath our nebulous high-level concepts like cores, Moore's law, and clock speed lurks a tumultuous melange of industrial chemistry and mechanical engineering.<a class="entrylink" href="https://www.youtube.com/watch?v=NGFhc8R_uO4">www.youtube.com/watch?v=NGFhc8R_uO4</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-07-14</title>
	<link href="https://crawshaw.io/blog/2015-07-14" />
	<id>https://crawshaw.io/blog/2015-07-14</id>
	<updated>2015-07-14T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	New Horizons runs on a 12 MHz MIPS R3000.
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-07-07</title>
	<link href="https://crawshaw.io/blog/2015-07-07" />
	<id>https://crawshaw.io/blog/2015-07-07</id>
	<updated>2015-07-07T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	Apparently Matthias Felleisen, author of the interesting and surprising Racket, just gave a talk at a conference where he quipped: <blockquote>"I hated types, I admit it. And when I hate something, I study it. So I went to the vatican of types and when I came back, my hate was deeper and more nuanced."</blockquote>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-06-26</title>
	<link href="https://crawshaw.io/blog/2015-06-26" />
	<id>https://crawshaw.io/blog/2015-06-26</id>
	<updated>2015-06-26T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"While the Union survived the civil war, the Constitution did not. In its place arose a new, more promising basis for justice and equality, the 14th Amendment, ensuring protection of the life, liberty, and property of all persons against deprivations without due process, and guaranteeing equal protection of the laws.<br/>...<br/>The men who gathered in Philadelphia in 1787 could not have envisioned these changes. They could not have imagined, nor would they have accepted, that the document they were drafting would one day be construed by a Supreme Court to which had been appointed a woman and the descendent of an African slave. 'We the People' no longer enslave, but the credit does not belong to the Framers. It belongs to those who refused to acquiesce in outdated notions of 'liberty,' 'justice,' and 'equality,' and who strived to better them.<br/>"</blockquote><a class="entrylink" href="http://www.thurgoodmarshall.com/speeches/constitutional_speech.htm">www.thurgoodmarshall.com/speeches/constitutional_speech.htm</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-06-24</title>
	<link href="https://crawshaw.io/blog/2015-06-24" />
	<id>https://crawshaw.io/blog/2015-06-24</id>
	<updated>2015-06-24T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"Cherokee is the first Unicode language in which lower case runes have smaller values than upper case runes."</blockquote><a class="entrylink" href="https://golang.org/cl/11286">golang.org/cl/11286</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-06-22</title>
	<link href="https://crawshaw.io/blog/2015-06-22" />
	<id>https://crawshaw.io/blog/2015-06-22</id>
	<updated>2015-06-22T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	A wonderful amateur photo of Saturn along with a description of how it was made. I never cease to be amazed at the power of cheap consumer-grade computers.<a class="entrylink" href="http://imgur.com/a/ErrVN">imgur.com/a/ErrVN</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-06-01</title>
	<link href="https://crawshaw.io/blog/2015-06-01" />
	<id>https://crawshaw.io/blog/2015-06-01</id>
	<updated>2015-06-01T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"Two URL objects are equal if they have the same protocol, reference equivalent hosts, have the same port number on the host, and the same file and fragment of the file. Two hosts are considered equivalent if both host names can be resolved into the same IP addresses;"</blockquote><a class="entrylink" href="https://docs.oracle.com/javase/8/docs/api/java/net/URL.html#equals-java.lang.Object">docs.oracle.com/javase/8/docs/api/java/net/URL.html#equals-java.lang.Object</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-05-08</title>
	<link href="https://crawshaw.io/blog/2015-05-08" />
	<id>https://crawshaw.io/blog/2015-05-08</id>
	<updated>2015-05-08T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	Literature Survey of 2D scrolling.<a class="entrylink" href="https://docs.google.com/document/d/1iNSQIyNpVGHeak6isbP6AHdHD50gs8MNXF1GCf08efg/pub">docs.google.com/document/d/1iNSQIyNpVGHeak6isbP6AHdHD50gs8MNXF1GCf08efg/pub</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-05-07</title>
	<link href="https://crawshaw.io/blog/2015-05-07" />
	<id>https://crawshaw.io/blog/2015-05-07</id>
	<updated>2015-05-07T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<br/>When my programs were stored in CVS I learned to back up draft work regularly by writing diff output to scratch files. It is an odd workflow. I got good at extracting hunks of unified diffs, and even editing them in place.<br/><br/>Since then I've worked in svn, darcs, hg, fossil, git, and perforce. Each of these has an alternative to this workflow, in some ways better and in some ways worse. At some point I've tried them all. Now the world I inhabit is settling on git, so I recently invested some extra time in the git toolchain: branches, cherry picking, amending, and rebasing. Add some gerrit topics for extra spice. So far some of these tools work. Branches are OK. Some of them are usability disasters, like rebase. So I find myself writing git diff HEAD^ > ~/Dropbox/d1 and going back to my old ways.<br/><br/>Six new tools in the 18 years since I developed those silly tricks for CVS, and I still find my way back to them. I find this unsatisfactory. Version control systems have failed to capture control of versions.<br/>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-04-02</title>
	<link href="https://crawshaw.io/blog/2015-04-02" />
	<id>https://crawshaw.io/blog/2015-04-02</id>
	<updated>2015-04-02T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	A slide on why all packet-based systems suffer from clumping.<br/><br/><blockquote>"- Picture two conversations sharing a congested gateway as two separate train tracks with one common section.<br/>- When a blue train waits for red trains to go through the shared section, the blue trains behind it catch up (get more clumped)<br/>- If the merge rules are efficient (service each color to exhaustion), the system clumps exponentially fast.<br/>"</blockquote><a class="entrylink" href="http://www.pollere.net/Pdfdocs/QrantJul06.pdf">www.pollere.net/Pdfdocs/QrantJul06.pdf</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-03-10</title>
	<link href="https://crawshaw.io/blog/2015-03-10" />
	<id>https://crawshaw.io/blog/2015-03-10</id>
	<updated>2015-03-10T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<a class="entrylink" href="https://golang.org/s/go15gcpacing">golang.org/s/go15gcpacing</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-03-09</title>
	<link href="https://crawshaw.io/blog/2015-03-09" />
	<id>https://crawshaw.io/blog/2015-03-09</id>
	<updated>2015-03-09T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	JPL C style guide. In particular: No malloc. Static loop bounds. -Werror. One level of pointer dereferencing.<a class="entrylink" href="http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf">lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-03-01</title>
	<link href="https://crawshaw.io/blog/2015-03-01" />
	<id>https://crawshaw.io/blog/2015-03-01</id>
	<updated>2015-03-01T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<a class="entrylink" href="http://gothamist.com/2012/03/08/the_1960_plan_to_put_a_dome_over_mi.php">gothamist.com/2012/03/08/the_1960_plan_to_put_a_dome_over_mi.php</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-01-11</title>
	<link href="https://crawshaw.io/blog/2015-01-11" />
	<id>https://crawshaw.io/blog/2015-01-11</id>
	<updated>2015-01-11T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<br/>Espruino is a JS interpreter for microcontrollers. It can run JavaScript programs on a chip with 8KB of RAM, which is an impressive achievement. It is, however, not JavaScript as you know it. This produces a 4kHz square wave:<br/><pre>
while (1) {A0.set();A0.reset();}
</pre>
And this produces a 3.5kHz square wave:<br/><pre>
while (1) { A0.set();                 A0.reset();               }
</pre>
Cute. We do our best to pretend our stack of CPU micro-ops, CPU caches, kernel schedulers, optimizing compilers, JIT compilers, and garbage collectors don't exist when we program. That almost always makes sense, because they almost always work. But it is nice to see a reminder that interpreting a program is not compiling it.<br/><a class="entrylink" href="http://www.espruino.com/Performance">www.espruino.com/Performance</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2015-01-10</title>
	<link href="https://crawshaw.io/blog/2015-01-10" />
	<id>https://crawshaw.io/blog/2015-01-10</id>
	<updated>2015-01-10T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	Swimming on the moon would be great fun.<a class="entrylink" href="http://what-if.xkcd.com/124/">what-if.xkcd.com/124/</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2014-12-11</title>
	<link href="https://crawshaw.io/blog/2014-12-11" />
	<id>https://crawshaw.io/blog/2014-12-11</id>
	<updated>2014-12-11T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	Discrete 555 Timer<a class="entrylink" href="http://shop.evilmadscientist.com/productsmenu/652">shop.evilmadscientist.com/productsmenu/652</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2014-07-28</title>
	<link href="https://crawshaw.io/blog/2014-07-28" />
	<id>https://crawshaw.io/blog/2014-07-28</id>
	<updated>2014-07-28T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	Apple are really good at processor transitions:<br/><pre>
1994: 68k to PowerPC
2005: PowerPC to x86 (big to little endian)
2007: x86 to ARM (most of OS X ended up in iOS)
2013: ARMv7 to ARMv8 (whole new ISA)
</pre>
Despite all the moves, I always had the software I wanted. Emulators and universal binaries worked and had close to zero visible effect on me.<br/>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2014-06-13</title>
	<link href="https://crawshaw.io/blog/2014-06-13" />
	<id>https://crawshaw.io/blog/2014-06-13</id>
	<updated>2014-06-13T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	ELF Auxiliary Vectors<a class="entrylink" href="http://articles.manugarg.com/aboutelfauxiliaryvectors.html">articles.manugarg.com/aboutelfauxiliaryvectors.html</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2014-05-14</title>
	<link href="https://crawshaw.io/blog/2014-05-14" />
	<id>https://crawshaw.io/blog/2014-05-14</id>
	<updated>2014-05-14T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	Turns out evolutionary algorithms for antenna design are real. NASA's ST-5 (X-band), then LADEE (S-band), both flew antennas that no human would design.<a class="entrylink" href="http://ti.arc.nasa.gov/news/ladee-sband-evolved-antenna/">ti.arc.nasa.gov/news/ladee-sband-evolved-antenna/</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2014-05-06</title>
	<link href="https://crawshaw.io/blog/2014-05-06" />
	<id>https://crawshaw.io/blog/2014-05-06</id>
	<updated>2014-05-06T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	public static void/main args System out println/<blockquote>"No space for poem"</blockquote><a class="entrylink" href="https://twitter.com/rob_pike/status/463859025361125377">twitter.com/rob_pike/status/463859025361125377</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2014-04-18</title>
	<link href="https://crawshaw.io/blog/2014-04-18" />
	<id>https://crawshaw.io/blog/2014-04-18</id>
	<updated>2014-04-18T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	Mobile devices would be significantly better if we could manage 1ms visual response.<a class="entrylink" href="https://www.youtube.com/watch?v=vOvQCPLkPt4">www.youtube.com/watch?v=vOvQCPLkPt4</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2014-03-08</title>
	<link href="https://crawshaw.io/blog/2014-03-08" />
	<id>https://crawshaw.io/blog/2014-03-08</id>
	<updated>2014-03-08T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	When solving a big problem, focus on the process you are using to solve the problem.<a class="entrylink" href="http://www.azarask.in/blog/post/the-wrong-problem">www.azarask.in/blog/post/the-wrong-problem</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

	<entry>
	<title>2014-01-17</title>
	<link href="https://crawshaw.io/blog/2014-01-17" />
	<id>https://crawshaw.io/blog/2014-01-17</id>
	<updated>2014-01-17T00:00:00Z</updated>
	<content type="xhtml">
	<div xmlns="http://www.w3.org/1999/xhtml">
	<blockquote>"pogo oscillation occurs when a surge in engine pressure increases back pressure against the fuel coming into the engine, reducing engine pressure, causing more fuel to come in and increasing engine pressure again."</blockquote><a class="entrylink" href="https://en.wikipedia.org/wiki/Pogo_oscillation">en.wikipedia.org/wiki/Pogo_oscillation</a>
	</div>
	</content>
	<author><name>David Crawshaw</name><email>david@zentus.com</email></author>
	</entry>

</feed>