Sunday, November 4, 2012

Bitwise operators in C, JavaScript, Lua and Killa

So I thought that porting a Game Boy Advance (GBA) console emulator written in JavaScript to Killa would be fun, boy I was wrong... All because I wanted to play with this:

I ended ranting about bitwise operators and their issues in the languages I admire.

The case of C

Bitwise operations are very useful, especially if you are interfacing with hardware (or emulating it in my case), the C inventors knew this and made the right thing adding these lovely operators to the language from the beginning: ~, &, |, ^, <<, >>. This is not a beginner's tutorial about these operators so if you are not already looking with love these little symbols I'll suggest you to read this blog post for some nice tricks with explanations included. Nothing is perfect, however. The original bitwise operators were also used as logical operators in early versions of C (there were no && neither || operators), and thus they had lower priority than the comparison operators ==, !=, <=, <, >=, > to allow short circuit evaluations.

Dennis Ritchie, the grandfather of all the languages that matter, explains this in more depth:

The priorities of && || vs. == etc. came about in the following way.

Early C had no separate operators for & and && or | and ||. (Got that?) Instead it used the notion (inherited from B and BCPL) of "truth-value context": where a Boolean value was expected, after "if" and "while" and so forth, the & and | operators were interpreted as && and || are now; in ordinary expressions, the bitwise interpretations were used. It worked out pretty well, but was hard to explain. (There was the notion of "top-level operators" in a truth-value context.)

The precedence of & and | were as they are now.

Primarily at the urging of Alan Snyder, the && and || operators were added. This successfully separated the concepts of bitwise operations and short-circuit Boolean evaluation. However, I had cold feet about the precedence problems. For example, there were lots of programs with things like

if (a==b & c==d) ...

In retrospect it would have been better to go ahead and change the precedence of & to higher than ==, but it seemed safer just to split & and && without moving & past an existing operator. (After all, we had several hundred kilobytes of source code, and maybe 3 installations....)

What does this mean for us? Well, it just means that we need to be extra-careful to use additional parenthesis whenever we have expressions mixing bitwise and comparison operators. This isn't a terrible problem per se, because any worthy programmer would use a copious amount of parenthesis anyway, Isn't it?, Isn't it?. But it's bothersome nonetheless because this circumstantial flaw in the design of the C language allows a sneaky type of bug. It can happens to anybody, it doesn't matter if you work for Google or in the Doom 3 engine, you write something like this and go home happy:

without realizing that behind you your code is really doing this:

when your original intention was to do this:

The lesson? Use your parenthesis! And never trust in the programmer that says that he knows enough operator precedence rules to avoid using parenthesis. He is lying, nobody can do it, not forever. To make things worse newer languages like C++, Java and JavaScript followed this bad example, all in the name of compatibility.

How about fixing the operator precedence in C and related languages? Hahaha, don't dream about it. There are not hundreds of kilobytes of source code and maybe 3 installations now, there are millions and millions! Updating everything is just impossible.

The case of JavaScript

If following the bad example of C was not enough, JavaScript managed to add its own kirks. Because, surely, when you want to use bitwise operators you need them over signed integers. And thus the entire code of the JavaScript emulator I was porting was full of (x >>> 0) expressions that make the whole thing smell really ugly. But it was not the programmer fault. In JavaScript you must use >>> 0 to make the result of any bitwise operation interpreted as an unsigned integer. You'll need to translate C bitwise expressions to their JavaScript equivalents in this way:

 C: (3774191835 >> 2) | 2147483648
JS: (3774191835 >>> 2 | 2147483648) >>> 0

 C: 1986735448 << 1
JS: (1986735448 << 1) >>> 0

 C: 3774191835 & 4294967295
JS: (3774191835 & 4294967295) >>> 0

Are you feeling the pain? If JavaScript was going to copy the C bitwise operators (and their flawed precedence) at least it would have maintained the same behavior. Chances of fixing this? The same as making Microsoft, Google, Apple and Mozilla good friends and trashing JavaScript for a better language.

The case of Lua

The good news is that Lua doesn't have the above problems. The bad news is that Lua doesn't even have bitwise operators. After some years of user complaining and bitching in their user list, the benevolent Lua team finally acknowledged that having bitwise operations was a good thing. So they added them to the Lua 5.2 version, but instead of adding them in the the form of sane operators, they added them in the form of library functions. Not only they have horrendous names (Lua has not, and and or as keywords, so names like bnot, band and bor where used), the performance is less than stellar, some guy got bored of wasting CPU cycles and implemented the whole thing with Virtual Machine (VM) opcodes, the performance increased by one order of magnitude. Of course, all this doesn't matter because is not stock Lua and you would still need to write something like this:

  C: (3774191835 >> 2) | 2147483648
Lua: bit32.bor(bit32.rshift(3774191835, 2), 2147483648)

But, it's not that ugly... unless you have to port a GBA console emulator.

The case of Killa

So I was doing my very own programming language stealing the Lua 5.2 VM and because of that considering to use the same bitwise library for compatibility reasons (or it was my own laziness?). Whichever, the case is that porting the GBA emulator made me realize that enough is enough, the port was going to take forever and be a real PITA and I was seriously damned if I'd not implemented real bitwise operators for Killa. The good news? I fixed the operators precedence in the process (is now similar to Ruby or Python):

**              Exponentiation
! ~ - $         Not, complement, unary minus, length
* / %           Multiply, divide, and modulo
+ -             Addition and subtraction
>> <<           Right and left bitwise shift
&               Bitwise AND
^               Bitwise XOR
|               Bitwise OR
..              String Concatenation
== != <= < > >= Comparison operators
&&              Logical AND
||              Logical OR

The bitwise operators were implemented as VM opcodes for maximum performance. For give you an idea of how fast they are consider the problem I solved previously using strings and arrays, that method takes 113.8 seconds to get the answer for (n = 12). Using bitwise operations this time is reduced to 15 seconds:

And for the record the solution in Python takes 23 seconds. So Killa wins.

Note: this post has some trolling bits, I hope it didn't offend you much.

No comments:

Post a Comment