So today, my Internet connection went out. The router got stuck overnight, and I rebooted it, and no big deal. Windows, however, still shows this, even though I have a perfectly fine network connection:
If you Google it, you’ll find lots of people have the problem, across multiple versions of Windows, going back years. The solutions vary from “just reboot” to complicated registry hacks to “reinstall Windows.” 🤦♂️
I can’t even.
I hear anecdotal stories about “weird problems” like the one above all the time: My friend’s father can’t get the printer to work without reinstalling the drivers every time he uses it. Your cousin’s word processor crashes every time she clicks the “Paste” button and there’s an image on the clipboard. My colleague’s video glitches, but only in a video call with more than three people. And invariably, the solutions are always the same: Reinstall something. This one weird registry hack. Try my company’s cleaning software!
So I’d like to let all of the ordinary, average, nontechnical people in the room in on a little secret:
This is bullshit.
All of it is bullshit. Start to finish. Nearly every answer you hear about how to “fix” your bizarre issues is lies and garbage.
I was asked on Discord today why some languages require semicolons and others don’t, and this is one of those surprisingly deep questions that to the best of my knowledge hasn’t been answered very well elsewhere:
Why do some languages end statements with semicolons?
Why do other languages explicitly not end statements with semicolons?
Why do some languages require them but it seems like the compiler could just “figure it out,” since it seems to know when you’ve forgotten them?
Why are they optional in some languages?
And, of all things, why the weird shape that is the semicolon? Why not | or $ or even ★ instead?
So let’s talk about semicolons, and try to answer this as well as we can.
I work with ideas about computers. I think about the things computers can do, and I try to find ways to make computers do those things better or faster, and I write all those ideas down. And I try to find ways to stop computers from ever being slow, so that we don’t have to make them faster. I also think a lot about if there are things that computers can or can’t do, and if it’s important that computers can or can’t do them. It’s bad for people when computers are slow or when computers can’t do things because we want computers to help us with things we want to do. But making computers do things that they can’t do is hard, and making computers go faster can be hard too. So sometimes I use ideas from other people to make the things computers do faster or better, and sometimes I find my own ideas too, and then I write those ideas down and tell everyone about them so all of us can make computers do more things better.
Lately, I’ve been growing increasingly obsessed with this problem. While my solution is very fast (O(n) is pretty fast!), I’ve been concerned about a few possible issues with it:
First, I wasn’t certain it was anything better than locally-optimal. It’s guaranteed not to produce a bad result, but will it produce a good result? I couldn’t be sure.
Second, it relies on floating-point arithmetic, while most other solutions don’t.
It has the nice upside of being able to operate in constant space (not including the O(k) output space), and linear time, but those two caveats are potentially problematic. If it turned out to be a really bad solution, who’d use it? And the floating-point numbers felt too fuzzy to be safe. So I started poking at it again.
Being a computer scientist is a funny thing. You live on the edge of knowledge in a weird realm that’s not quite mathematics and not quite physical machinery. Like most of the sciences, of course, anything you “discover” was likely already discovered several decades before. But every great once in a while, you bump into a problem that seems to have received very little attention, and you see the existing solutions, and you think, “Huh, it seems like there must be a better way.”
I did that, once, a long time ago: I discovered a really neat technique for computing treemaps on-the-fly for arbitrarily large datasets, and that’s why SpaceMonger 2+ have such incredibly fast renderings. I don’t know for sure if it’s a novel solution, but I haven’t seen anyone else do it. One of these years, I need to properly publish a paper on how that works: I haven’t been intentionally keeping that algorithm private; I just haven’t gotten around to writing up a really good description of how it works.
But today, I’m going to talk about another algorithm entirely: Linear partitioning a sequence of numbers.
Twenty years ago today, I got a Knuth Check. Knuth dated it October 14, 2001, and I received it in the mail in early November that year, I think.
I’d heard, of course, about Knuth Checks, but I didn’t really know as much about them then as I do today. I was a kid a few years out of college, just trying to eke out a living building software; and when I found a mistake in an algorithm in his book, I intended to write a note in a short e-mail to Dr. Knuth, but I found out he was (wisely) avoiding the medium, so I sent him a letter. And a few months later, I got back a check.
It’s sat in a frame on my desk for twenty years, singularly my proudest professional accomplishment. I’ve shipped tons of software, I’ve made and discovered amazing things, I’ve won other awards — but nothing holds a candle to having a Knuth Check with your name on it.
(More accurately, the original is in my safe-deposit box at the bank, and a copy has sat in a frame for twenty years, because I’m not leaving something this valuable out on my desk! 😃)
So here’s the story:
The Art of Computer Programming, Volume 1, has an algorithm on page 26 that describes an efficient solution for calculating a fractional logarithm in binary, using pure integer math. The algorithm in question derives from one first published in 1624 by Henry Briggs. At the bottom of page 26 in the 3rd Edition, the book mentions that Richard Feynman (yes, that Feynman) determined an efficient inverse algorithm for calculating a fractional exponent in binary, and it proposes to the reader to solve it as problem #28.
As is typical of all of Knuth’s “problems for the reader,” he includes the answer in the rather large appendix that comprises the last 150 pages of the book. The solution to problem #28 can be found at the bottom of page 470, and, unfortunately, it is not the same algorithm that Feynman discovered: It has a critical error in the first step:
E1. [Initialize.] If 1 – ϵ is the largest possible value of x, set y ← (nearest approximation to b1-ϵ), x ← 1 – ϵ, k ← 1. (The quantity yb-x will remain approximately constant in the following steps.)
TAoCP, V1, 3rd ed., pp. 470
That has a rather unfortunate error in it: x ← 1 – ϵ is potentially a meaningful expression in that context, but it’s wrong, and it will calculate not the correct exponent but garbage data, with, as I recall, mostly zero bits and a few semi-randomly-distributed one bits.
The correct expression is x ← 1 – x. It’s possible the other form was a typographical error on Knuth’s part, but it’s not in the same category as a simple omission or spelling error, since if you follow the mathematics as written, you will simply get garbage, and not get a valid fractional exponent.
(I struggled for a few days with the original: It was Knuth; surely I’d been the one to make the mistake in my code, not the good Doctor in his! But ultimately when I began to experiment with the broken algorithm, the answer became obvious.)
So I wrote Knuth a letter in the spring of 2001 saying that I thought that was wrong as published, since it generated garbage output, and I included that I thought that x ← 1 – x was actually the correct form, since it appeared to be correct. And he sent me back a rather nice letter and a check for $2.56 for having found an error.
(I had also suggested that the inherent parallel execution implied by the commas in step E2 appeared to be wrong, that sequential execution appeared to be needed there; Knuth chastised me in the letter that no, parallel execution was in fact intended. But I still got an error check for a broken algorithm!)
Interestingly, I didn’t get my check by just searching for errors, like many people have: I was trying to build something; I encountered that error merely by remembering that the books contained an answer to the problem I was facing. But it seems that not many people build things by literally reading the textbook and doing exactly what it says; and apparently, nobody had implemented Feynman’s exponentiation algorithm from the book after it appeared in the 3rd Edition in 1997. [ed. note: I checked, and Knuth added it to the 3rd Ed. in April 1995; my error can be seen in his addenda.]
That said, an entertaining question to posit: Why the heck was I trying to implement fractional logarithms and exponents in the first place?
The answer was that it was 2001, and Windows 98 and ’486 PCs were still pretty common, and you couldn’t assume a computer supported floating-point math. And I needed to calculate powers: As simple as xy, just like on your calculator, and it was even constrained: In my case, x was always an integer, and y was always a number between 1 and 2. But I didn’t have guaranteed floating-point hardware available so I could just write pow(x, y). So I improvised: I used the Briggs and Feynman methods to calculate the logarithm and the exponent in pure integer math, so that I could implement calculation of a power as exp(y * log x) instead, and effectively use multiplication — which the hardware supported — to produce a power (which it definitely didn’t).
Today, of course, I’d write pow(x, y), just like you would. But when you’re stuck on super-limited hardware, you have to be a bit more clever.
(For what it’s worth, you can see the code live, in action, if you have a copy of SpaceMonger 2 or 3; just go into the Settings, and turn up the “Exaggeration” slider: Every file’s size will be raised to the power you choose, using exactly this math!)
Feynman, of course, derived that algorithm for exactly the same reason I needed it: They were using mechanical adding machines during WWII as part of the Manhattan Project, and those machines didn’t do powers. But the relativistic math for the atomic bomb needed lots of powers; so he derived the exponent algorithm as the converse to the logarithm algorithm, so that they were able to just use addition, subtraction, and multiplication, which the adding machines could do, to still efficiently calculate powers, which the machines couldn’t.
So that’s my Knuth Check story. It’s unlikely I’ll ever earn another; I certainly haven’t in the last twenty years! I still haven’t even finished reading Volume 4A yet, and I’m far too busy to randomly pore over the other tomes again just in search of a mistake. But I still managed to earn a prize that’s among the most coveted in this field, and twenty years later, I’m still happy to have done it even once.
I’ve had to say this several times over the years, so I thought I’d repeat it in a place where I can just make a link to it: The Dictionary<K,V> class in .NET does not store your data in order.
When you .Add() or .Remove() items from a dictionary, they get stored in whatever order the dictionary finds is most efficient. It just so happens that as long as you never.Remove() an item, the underlying data structure ends up with the items in order when you enumerate them, but this is not guaranteed.
Here’s an example program to demonstrate that it doesn’t work the way you probably expect:
You might think reading this that the first output would be in order of [A, B, C, D, E, F], and the second would go [A, B, C, E, F, G]. But that’s not the output that’s produced:
A -> Apple
B -> Banana
C -> Cherry
D -> Durian
E -> Elderberry
F -> Fig
A -> Apple
B -> Banana
C -> Cherry
G -> Grape
E -> Elderberry
F -> Fig
The new entry gets slotted in where the last one was removed, which is exactly the behavior you’d expect if you had read the underlying code, but it’s very different than what you’d expect if you just assumed everything will always have insertion order.
Moral of the story: If you need dictionary data to be ordered, don’t assume it’s ordered — explicitly order it!
If you need things to be in insertion order, keep them in a list.
If you need things to be ordered by key, consider a SortedDictionary<K,V> instead.
If you only need things sometimes to be ordered by key, use Linq: dictionary.OrderBy(pair => pair.Key)
Comments Off on Reminder: .NET Dictionary isn’t ordered