My grandmother passed away today, early afternoon.

My cat passed away today, late afternoon.

Both were very old; my grandmother was 91, and my cat was 17.

But both happened on the same day, two hours apart. I was almost holding it together, after my grandmother passed. And then our cat started crying, her stomach in pain.

We’re deleting this date from the calendar. Henceforth, in subsequent years, we will all go straight from October 8 to October 10.

So today, my Internet connection went out. The router got stuck overnight, and I rebooted it, and no big deal. Windows, however, still shows this, even though I have a perfectly fine network connection:

If you Google it, you’ll find lots of people have the problem, across multiple versions of Windows, going back years. The solutions vary from “just reboot” to complicated registry hacks to “reinstall Windows.” š¤¦āāļø

I can’t even.

I hear anecdotal stories about “weird problems” like the one above all the time: My friend’s father can’t get the printer to work without reinstalling the drivers every time he uses it. Your cousin’s word processor crashes every time she clicks the “Paste” button and there’s an image on the clipboard. My colleague’s video glitches, but only in a video call with more than three people. And invariably, the solutions are always the same: Reinstall something. This one weird registry hack. Try my company’s cleaning software!

So I’d like to let all of the ordinary, average, nontechnical people in the room in on a little secret:

This is bullshit.

All of it is bullshit. Start to finish. Nearly every answer you hear about how to “fix” your bizarre issues is lies and garbage.

I was asked on Discord today why some languages require semicolons and others don’t, and this is one of those surprisingly deep questions that to the best of my knowledge hasn’t been answered very well elsewhere:

Why do some languages end statements with semicolons?

Why do other languages explicitly not end statements with semicolons?

Why do some languages require them but it seems like the compiler could just “figure it out,” since it seems to know when you’ve forgotten them?

Why are they optional in some languages?

And, of all things, why the weird shape that is the semicolon? Why not | or $ or even ā instead?

So let’s talk about semicolons, and try to answer this as well as we can.

Q. What is crypto? A. Real cryptography is a difficult and important branch of mathematics. Crypto is a pyramid/Ponzi/MLM scheme ā a scam designed to steal your money.

Q. What is Bitcoin? A. Bitcoin is a scam. It is a sophisticated computer program designed to steal your money by convincing you to trade your money for a magic bean number.

I work with ideas about computers. I think about the things computers can do, and I try to find ways to make computers do those things better or faster, and I write all those ideas down. And I try to find ways to stop computers from ever being slow, so that we don’t have to make them faster. I also think a lot about if there are things that computers can or can’t do, and if it’s important that computers can or can’t do them. It’s bad for people when computers are slow or when computers can’t do things because we want computers to help us with things we want to do. But making computers do things that they can’t do is hard, and making computers go faster can be hard too. So sometimes I use ideas from other people to make the things computers do faster or better, and sometimes I find my own ideas too, and then I write those ideas down and tell everyone about them so all of us can make computers do more things better.

Lately, I’ve been growing increasingly obsessed with this problem. While my solution is very fast (O(n) is pretty fast!), I’ve been concerned about a few possible issues with it:

First, I wasn’t certain it was anything better than locally-optimal. It’s guaranteed not to produce a bad result, but will it produce a good result? I couldn’t be sure.

Second, it relies on floating-point arithmetic, while most other solutions don’t.

It has the nice upside of being able to operate in constant space (not including the O(k) output space), and linear time, but those two caveats are potentially problematic. If it turned out to be a really bad solution, who’d use it? And the floating-point numbers felt too fuzzy to be safe. So I started poking at it again.

Being a computer scientist is a funny thing. You live on the edge of knowledge in a weird realm that’s not quite mathematics and not quite physical machinery. Like most of the sciences, of course, anything you “discover” was likely already discovered several decades before. But every great once in a while, you bump into a problem that seems to have received very little attention, and you see the existing solutions, and you think, “Huh, it seems like there must be a better way.”

I did that, once, a long time ago: I discovered a really neat technique for computing treemaps on-the-fly for arbitrarily large datasets, and that’s why SpaceMonger 2+ have such incredibly fast renderings. I don’t know for sure if it’s a novel solution, but I haven’t seen anyone else do it. One of these years, I need to properly publish a paper on how that works: I haven’t been intentionally keeping that algorithm private; I just haven’t gotten around to writing up a really good description of how it works.

But today, I’m going to talk about another algorithm entirely: Linear partitioning a sequence of numbers.

Twenty years ago today, I got a Knuth Check. Knuth dated it October 14, 2001, and I received it in the mail in early November that year, I think.

Iād heard, of course, about Knuth Checks, but I didnāt really know as much about them then as I do today. I was a kid a few years out of college, just trying to eke out a living building software; and when I found a mistake in an algorithm in his book, I intended to write a note in a short e-mail to Dr. Knuth, but I found out he was (wisely) avoiding the medium, so I sent him a letter. And a few months later, I got back a check.

Itās sat in a frame on my desk for twenty years, singularly my proudest professional accomplishment. Iāve shipped tons of software, Iāve made and discovered amazing things, Iāve won other awards ā but nothing holds a candle to having a Knuth Check with your name on it.

(More accurately, the original is in my safe-deposit box at the bank, and a copy has sat in a frame for twenty years, because Iām not leaving something this valuable out on my desk! š)

So here’s the story:

The Art of Computer Programming, Volume 1, has an algorithm on page 26 that describes an efficient solution for calculating a fractional logarithm in binary, using pure integer math. The algorithm in question derives from one first published in 1624 by Henry Briggs. At the bottom of page 26 in the 3^{rd} Edition, the book mentions that Richard Feynman (yes, that Feynman) determined an efficient inverse algorithm for calculating a fractional exponent in binary, and it proposes to the reader to solve it as problem #28.

As is typical of all of Knuthās āproblems for the reader,ā he includes the answer in the rather large appendix that comprises the last 150 pages of the book. The solution to problem #28 can be found at the bottom of page 470, and, unfortunately, it is not the same algorithm that Feynman discovered: It has a critical error in the first step:

E1. [Initialize.] If 1 – Ļµ is the largest possible value of x, set y ā (nearest approximation to b^{1-Ļµ}), x ā 1 – Ļµ, k ā 1. (The quantity yb^{-x} will remain approximately constant in the following steps.)

TAoCP, V1, 3rd ed., pp. 470

That has a rather unfortunate error in it: x ā 1 – Ļµ is potentially a meaningful expression in that context, but itās wrong, and it will calculate not the correct exponent but garbage data, with, as I recall, mostly zero bits and a few semi-randomly-distributed one bits.

The correct expression is x ā 1 – x. Itās possible the other form was a typographical error on Knuthās part, but itās not in the same category as a simple omission or spelling error, since if you follow the mathematics as written, you will simply get garbage, and not get a valid fractional exponent.

(I struggled for a few days with the original: It was Knuth; surely Iād been the one to make the mistake in my code, not the good Doctor in his! But ultimately when I began to experiment with the broken algorithm, the answer became obvious.)

So I wrote Knuth a letter in the spring of 2001 saying that I thought that was wrong as published, since it generated garbage output, and I included that I thought that x ā 1 – x was actually the correct form, since it appeared to be correct. And he sent me back a rather nice letter and a check for $2.56 for having found an error.

(I had also suggested that the inherent parallel execution implied by the commas in step E2 appeared to be wrong, that sequential execution appeared to be needed there; Knuth chastised me in the letter that no, parallel execution was in fact intended. But I still got an error check for a broken algorithm!)

Interestingly, I didnāt get my check by just searching for errors, like many people have: I was trying to build something; I encountered that error merely by remembering that the books contained an answer to the problem I was facing. But it seems that not many people build things by literally reading the textbook and doing exactly what it says; and apparently, nobody had implemented Feynmanās exponentiation algorithm from the book after it appeared in the 3^{rd} Edition in 1997. [ed. note: I checked, and Knuth added it to the 3^{rd} Ed. in April 1995; my error can be seen in his addenda.]

That said, an entertaining question to posit: Why the heck was I trying to implement fractional logarithms and exponents in the first place?

The answer was that it was 2001, and Windows 98 and ā486 PCs were still pretty common, and you couldnāt assume a computer supported floating-point math. And I needed to calculate powers: As simple as x^{y}, just like on your calculator, and it was even constrained: In my case, x was always an integer, and y was always a number between 1 and 2. But I didnāt have guaranteed floating-point hardware available so I could just write pow(x, y). So I improvised: I used the Briggs and Feynman methods to calculate the logarithm and the exponent in pure integer math, so that I could implement calculation of a power as exp(y * log x) instead, and effectively use multiplication ā which the hardware supported ā to produce a power (which it definitely didnāt).

Today, of course, Iād write pow(x, y), just like you would. But when youāre stuck on super-limited hardware, you have to be a bit more clever.

(For what itās worth, you can see the code live, in action, if you have a copy of SpaceMonger 2 or 3; just go into the Settings, and turn up the “Exaggeration” slider: Every fileās size will be raised to the power you choose, using exactly this math!)

Feynman, of course, derived that algorithm for exactly the same reason I needed it: They were using mechanical adding machines during WWII as part of the Manhattan Project, and those machines didnāt do powers. But the relativistic math for the atomic bomb needed lots of powers; so he derived the exponent algorithm as the converse to the logarithm algorithm, so that they were able to just use addition, subtraction, and multiplication, which the adding machines could do, to still efficiently calculate powers, which the machines couldnāt.

So thatās my Knuth Check story. Itās unlikely Iāll ever earn another; I certainly havenāt in the last twenty years! I still havenāt even finished reading Volume 4A yet, and Iām far too busy to randomly pore over the other tomes again just in search of a mistake. But I still managed to earn a prize thatās among the most coveted in this field, and twenty years later, Iām still happy to have done it even once.

I’ve had to say this several times over the years, so I thought I’d repeat it in a place where I can just make a link to it: The Dictionary<K,V> class in .NET does not store your data in order.

When you .Add() or .Remove() items from a dictionary, they get stored in whatever order the dictionary finds is most efficient. It just so happens that as long as you never.Remove() an item, the underlying data structure ends up with the items in order when you enumerate them, but this is not guaranteed.

Here’s an example program to demonstrate that it doesn’t work the way you probably expect:

You might think reading this that the first output would be in order of [A, B, C, D, E, F], and the second would go [A, B, C, E, F, G]. But that’s not the output that’s produced:

Fruits:
A -> Apple
B -> Banana
C -> Cherry
D -> Durian
E -> Elderberry
F -> Fig
Fruits:
A -> Apple
B -> Banana
C -> Cherry
G -> Grape
E -> Elderberry
F -> Fig

The new entry gets slotted in where the last one was removed, which is exactly the behavior you’d expect if you had read the underlying code, but it’s very different than what you’d expect if you just assumed everything will always have insertion order.

Moral of the story: If you need dictionary data to be ordered, don’t assume it’s ordered ā explicitly order it!

If you need things to be in insertion order, keep them in a list.

If you need things to be ordered by key, consider a SortedDictionary<K,V> instead.

If you only need things sometimes to be ordered by key, use Linq: dictionary.OrderBy(pair => pair.Key)

Comments Off on Reminder: .NET Dictionary isn’t ordered