Wednesday, June 1, 2011

C Arrays: Here There Be Dragons

Note: This blog is deprecated. @synthesize zach has moved to a new home, at zpasternack.org. This blog entry can be found on the new blog here. Please update your links.

I was just answering a question on Stackoverflow (funny how often that inspires me to blog something) about using a C array in a Cocoa app.

The short answer is, you almost never want to do that. Sure, there are times when you might (if you’re going to end up passing it to some C function that needs a raw array; OpenGL or cocos2d, perhaps). In general, though, using some kind of container object is almost always a better solution.

This issue hit home for me, not too long ago. While developing PuzzleTiles, we needed a data model to store game state. Hmm, a two dimensional grid of tiles? My knee-jerk reaction was to type this:
const int MAX_ROWS = 8;
const int MAX_COLUMNS = 8;
@interface GameState : NSObject
{
    // stuff
    tileGrid[MAX_ROWS][MAX_COLUMNS];
    // more stuff
}
@end

Almost as soon as I typed this, I got this terrible sense of dread. As if a million memory buffers cried out, then were suddenly silenced, due to overflow.

But then I came to my senses. I got my big boy pants on, see? I’m an old hand, a seasoned vet, I know damn well what I’m doing. Right? Right.

I made sure to be very careful. tileGrid had no outside accessors; its state was returned via public member functions like - (int) getTileAtX:(int)x y:(int)y Only GameState could muck with the grid, internally, and I was very careful about that. All accessors did sanity checks on all input and output parameters. I wrote code that, in Debug builds, would do an internal consistency check on tileGrid after each tile move operation, to make sure nothing went bad. I knew I had this stuff figured out.

Weeks passed. All memories of tileGrid faded away; we had other issues we were dealing with as we neared our release. GameState was rarely even looked at; it was a pretty simple piece of code, and was (as far as we knew) rock solid. And yet... we were plagued with very rare, seemingly random crashes. Sometimes we’d see them once or twice in a day, sometimes not for several days in a row. If we ever did get to see one in the debugger, or view a crash log, it was complete nonsense. Code that should never crash, was. And it was rarely the same piece of code that failed. It was maddening. For weeks we fought this, scrutinizing every memory allocation and release; running over and over again in Leaks; poring over all our memory management code for any hint of trouble.

In desperation, I eventually returned to GameState. I added some extra consistency checking, and... saw it fail within a minute of running the game. It turned out my original consistency checking code was missing one corner case, and it turned out that case happened pretty often. When it did fail, one part of GameState’s internal code would fail to find a tile, and return the invalid value {-1, -1}. Then another part of the code would not bother to check for the invalid value, and would happily use it to index into tileGrid, accessing tileGrid[-1][-1], effectively randomly overwriting 4 bytes of whatever poor object happened to be 36 bytes before it. Now the random maddening crashes were completely clear.

It wasn’t all bad. While this bug eluded us, we spent a whole lot of time seriously scrutinizing our memory management. In the end, by the time we finally nailed this bug, the entire app was completely bullet proof.

Lesson (re-) learned: if you find yourself using a raw C array, think really hard. Is that really the best way to go? We were careful, and knew what we were doing, and still paid a price. Here there be dragons.

No comments:

Post a Comment