No Software is Hamster-proof | BASeCamp Programming Blog

Alternate Title: Software Licenses and implicit trust

It is interesting to note that in many circles proprietary software is inherently considered untrustworthy. That is, of course, not for no reason- it is much more difficult to audit and verify that the software does what it is supposed to and to check for possibly security problems. However, conversely, it seems that a lot of Open Source software seems to get a sort of implicit trust applied to it. The claim is that if there isn’t somebody sifting through and auditing software, you don’t know what is in there- and, conversely, that if something is open source, we do know what is in there.

But, I would argue that the binaries are possibly more trustworthy in attempting to determine what a piece of software is doing simply by virtue of it being literally what is being executed. Even if we consider the scenario of auditing source code and building binaries ourself, we have to trust the binary of the compiler to not be injecting malicious code, too.

I’ve found that this sort of rabbit hole is something that a lot of Open Source advocates will happily woosh downwards as far as possible for proprietary software, but seem to avoid falling for Open Source software. Much of the same logic that get’s applied to justify distrust of proprietary binary code should cause distrust in areas of Open Source, but for some reason a lot of aspects of Open Source and the Free Software Community are free from the same sort of cynicism that is applied to proprietary software, even though there is no reason to think that software falling under a specific license makes it inherently more or less trustworthy. If we can effectively assume malicious motives for proprietary software developers, why do we presume the opposite for Open Source, particularly since it is now such a better target for malicious actors due to the fact that it is so often implicitly trusted?

Source code provided with a binary doesn’t mean anything because- even assuming users capable of auditing said code, there is no way to reliably and verifiably know that the source code is what was used to build the binary. Trust-gaining exercises like hashes or MD5sums can be adjusted, collided, or changed and web servers hacked to make illegitimate binary releases appear legitimate to propagate undesirable code which simply doesn’t appear in the associated source code with a supposed release (Linux Mint). Additionally, The indeterminate nature of modern compilers means that even compiling the same source more than once can often give completely different results as well, so you cannot really verify that the source matches a given binary by rebuilding the source and comparing the resulting binary to the one being verified.

Therefore, it would seem the only reasonable recourse is to only run binaries that you build yourself, from source that has been appropriately audited.

Thusly, we will want to audit the source code. And the first step is getting that source code. A naive person might think a git pull is sufficient. But no no- That is a security risk. What if GitHub is compromised to specifically deliver malicious files with that repository, hiding secret exploits deep within the source codebase? Too dangerous. Even with your careful audit, you could miss those exploits altogether.

Instead, the only reasonable way to acquire the source code to a project is to discover reliable contact details for the project maintainer and send then a PGP encrypted message requesting that they provide the source code either at a designated drop point- Which will have to be inconspicuous and under surveillance by an unaffiliated third party trusted by both of you – Or have him send a secure, asymmetrically encrypted message containing the source tarball.

Once you have the source, now you have to audit the entire codebase. Sure, you could call it quits and go "Developer says it’s clean, I trust him" fine. be a fool. be a foolish fool you fooly foolerson, because even if you know the tarball came from the developer, and you trust them- do you trust their wife? their children? their pets? Their neighbors? You shouldn’t. In fact, you shouldn’t even trust yourself. But you should, because I said you shouldn’t and you shouldn’t trust me. On the other hand, that’s exactly what I might want you to think.

"So what if I don’t trust their hamster, what’s the big deal"

Oh, of course. Mr Security suddenly decides that something is too off-the-wall.

Hamsters can be trained. Let that sink in. Now you know why you should never trust them. Sure, they look all cute running on their little cage, being pet by the developers cute 11 year old daughter, but looks can be deceiving. For all you know their daughter is a secret Microsoft agent and the hamster has been trained or brainwashed- using evil, proprietary and patent encumbered technology, no doubt, to act as a subversive undercurrent within that source repository. With full commit access to that project’s git repository, the hamster can execute remote commands issued using an undocumented wireless protocol that has no man page, which will cause it to perform all sorts of acts of terror on the git repository. Inserting NOP sleds before security code, adding JMP labels where they aren’t necessary. even adding buffer overflows by adding off-by-one errors as part of otherwise benign bugfixes.

Is it very likely? No. But it’s *possible* so cannot be ignored.

Let’s say you find issues and report them.

Now, eventually, the issues will be fixed. The lead developer might accept a pull, and claim it to fix the issue.

Don’t believe the lies. You must audit the pull yourself and find out what sinister motives underly the so-called "fix". "Oh, so you thought you could just change that if condition, did you? Well did you know that on an old version of the PowerPC compiler, this generates code that allows for a sophisticated remote execution exploit if running under Mac OS 9?" Trust nobody. No software is hamster-proof.

Have something to say about this post? Comment!