Born Sleepy

October 05, 2012

Simon Wolf wrote a blog post recently talking about passing NSError pointers into methods like this:

NSError* error = nil;
[something someMethod:@"blah" error:&error];

He talked about the fact that when you do this, you should always check the result of the method before using the error. In other words, you do this:

NSError* error = nil;
if (![something someMethod:@"blah" error:&error])
{
	NSLog(@"got error %@", error);
}

and not this:

NSError* error = nil;
[something someMethod:@"blah" error:&error];
if (error)
{
	NSLog(@"got error %@", error);
}

This is good advice. The method you’re calling isn’t guaranteed to leave error set to nil if nothing goes wrong. It might leave it alone, but it might also set it to a temporary value along the way, even though it eventually returns an “ok” result.
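To make that concrete, here’s a contrived sketch (the method names are invented, this isn’t from Simon’s post) of a method that succeeds but still scribbles on the error:

- (BOOL)doSomethingWithRetry:(NSError**)error
{
	// The first attempt may fail and write a value into *error.
	BOOL ok = [self firstAttemptReturningError:error];
	if (!ok)
	{
		// The retry succeeds...
		ok = [self retry];
	}

	// ...so we return YES, but *error may still hold the error from
	// the failed first attempt. A caller testing error rather than
	// the result would wrongly conclude that the call failed.
	return ok;
}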

However, Simon then went on to talk about this line at the top:

NSError* error = nil;

His assertion was that by setting error to nil, you’re indicating that you intend to test it after the call - in other words, that it’s a sign you don’t know what you’re doing.

He suggested that you should just do this:

NSError* error;

Strictly speaking, this is correct. If the method you’re calling follows the implicit contract for such functions, this should be fine.

However, I always set the error to nil before I pass it in, even though I never test it after the call. Why do I do this?

Consider the following method:

- (BOOL)someMethod:(NSString*)string error:(NSError**)error
{
	if (error)
	{
		if (something)
			*error = [self doSomeStuff];
		else if (somethingelse)
			*error = [self doSomethingElse];
		//... lots of other cases here...
		
		//... later
		
		if ([(*error) code] == someCode)
		{
			[self doSomeExtraStuff];
		}
	}
	
	return YES;
}

This is a bit of a contrived example, but the point is that this method might be the one you’re calling, and it might be in a third party library that you have no control over.

Now let’s imagine that some refactoring or other code change introduces a bug where *error doesn’t get set in the first block of if/else statements. Let’s also imagine that the developer has the relevant warnings turned off, and doesn’t notice (shock horror - some people actually do this!).

What happens when the code accesses *error? Well, it depends on what’s in it. If you did this:

NSError* error = nil;

you’ll be fine.

If you did this:

NSError* error;

you get undefined behaviour, based on what happened to be on the stack in the position that the error variable occupies. It’s a classic uninitialised variable bug, and potentially a bastard to track down.

At this point you may well be saying “if the quality of the library you’re using is that bad, you’ve only yourself to blame”. You might have a point, but bugs do creep in, even to good libraries.

Admittedly, when you’re accessing an uninitialised Objective-C pointer you’re way more likely to crash straight away than you would be if you were just dereferencing a pointer to read a member in C/C++.

However, all of this is the reason why I still do:

NSError* error = nil;

even though I’m not going to test error afterwards.

You could call it paranoia, but don’t mistake it for misunderstanding the NSError** contract!


If you work with submodules in git, and you’ve ever tried to move a repository locally to a different place on your machine, you may have encountered a problem.

In recent versions of git, embedded submodules don’t have their own “.git” directory. Instead, they contain a text file called .git that points git back at the root .git directory, which holds the information for all submodules. Furthermore, there’s a config file for each submodule, hidden in the main .git directory, which points “forward” to the submodule.
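To illustrate (the paths here are made up), the submodule’s .git file contains something like this:

gitdir: /Users/sam/projects/myrepo/.git/modules/mysubmodule

and the corresponding config file, .git/modules/mysubmodule/config, contains an entry like this:

[core]
	worktree = /Users/sam/projects/myrepo/mysubmodule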

That would all be fine except for one incredibly stupid thing: in versions of git prior to 1.7.10, these paths were stored in absolute form.

Which means that if you move the repo on your disk, all of these paths break, and the repo no longer works!

This is, to put it mildly, a bit of a pain in the arse.

What to do?

The long answer is to go through all of the submodules and do the following:

  • hand edit the gitdir: entry in the .git file to make it relative
  • hand edit the config file in the directory that the gitdir entry points to, to make the worktree entry relative
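Using the made-up paths from above, the hand-edited entries end up looking something like this:

gitdir: ../.git/modules/mysubmodule

[core]
	worktree = ../../../mysubmodule

(the worktree path is relative to the directory that the config file lives in, not to the submodule itself).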

That’s all fine and dandy, but it’s a tricky process, and if you’ve got lots of submodules, some of which may even have embedded submodules, it’s a lot of work.

Luckily, there’s a short answer:

  • do a search and replace across the .git and config files mentioned above, and replace the old broken absolute path with a new correct absolute path.

This isn’t ideal, but it gets you working again.

What’s more, it’s a lot easier to automate. Figuring out the proper relative paths isn’t that easy to automate, but doing a search and replace of one known string with another across a bunch of files is.

Here’s a script I wrote to do it.

It’s not a perfect script, and please be aware that it makes permanent changes to files, so you may well want to zip up the whole of your repo first as a paranoid backup.

However, it seems to work for me. It could take the old and new paths as parameters, but I decided to embed them in the script for a couple of reasons.

One, it’s a better example of what the paths should look like. Two, this situation is most likely to occur when you’ve made some sort of global change, like renaming or changing your hard drive. In that case you’ll probably want to do the same replacement lots of times on different repos, so embedding it in the script is helpful (and also reduces the risk of typing it wrong).
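If you’d rather see the shape of the thing than read the script, here’s a minimal sketch of the idea (the paths are placeholders, and it has none of the checks a real script should have):

#!/bin/sh
# Placeholder paths - edit these to match your machine.
OLD="/Volumes/OldDisk/code/myrepo"
NEW="/Volumes/NewDisk/code/myrepo"

# Fix the gitdir: lines in every submodule's .git file...
find . -name .git -type f -exec sed -i "" "s|$OLD|$NEW|g" {} \;

# ...and the worktree entries in the per-submodule config files
# (searching .git/modules recursively also catches nested submodules).
find .git/modules -name config -type f -exec sed -i "" "s|$OLD|$NEW|g" {} \;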


I asked this question on Stack Overflow earlier, but I think it’s worth posting basically the same thing here too.

I’ve been building libraries, and collections of layered libraries, for a long time, and I’m still not totally happy that I’ve found the best way to organise them. The question I asked was aimed at soliciting some insights from others.

Here it is:


Let’s say that I’ve three libraries, A, B, and C.

Library B uses library A. Library C uses library A.

I want people to be able to use A, B, and C together, or to just take any combination of A, B, and C if they wish.

I don’t really want to distribute them together in one large monolithic lump.

Apart from the sheer issue of bulk, there’s a good reason that I don’t want to do this. Let’s say that B has an external dependency on some other library that it’s designed to work with. I don’t want to force someone who just wants to use C to have to link in that other library, just because B uses it. So lumping together A, B and C in one package wouldn’t be good.

I want to make it easy for someone who just wants C to grab it and know that they’ve got everything they need to work with it.

What are the best ways of dealing with this, given:

  • the language in question is Objective-c
  • my preferred delivery mechanism is one or more frameworks (but I’ll consider other options)
  • my preferred hosting mechanism is git / github
  • I’d rather not require a package manager

This seems like a relatively straightforward question, but before you dive in and say so, can I suggest that it’s actually quite subtle. To illustrate, here are some possible, and possibly flawed, solutions.

CONTAINMENT / SUBMODULES

The fact that B and C use A suggests that they should probably contain A. That’s easy enough to achieve with git submodules. But then of course the person using both B and C in their own project ends up with two copies of A. If their code wants to use A as well, which one does it use? What if B and C contain slightly different revisions of A?
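For reference (with made-up URLs), the containment itself is just a submodule per dependency - run once inside B’s repo and again inside C’s:

git submodule add https://github.com/example/libA.git modules/libA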

RELATIVE LOCATION

An alternative is to set up B and C so that they expect a copy of A to exist in some known location relative to them. For example, in the same containing folder as B and C.

Like this:

libs/
  libA/
  libB/ -- expects A to live in ../
  libC/ -- expects A to live in ../

This sounds good, but it fails the “let people grab C and have everything” test. Grabbing C in itself isn’t sufficient; you also have to grab A and arrange for it to be in the correct place.

This is a pain - you even have to do this yourself if you want to set up automated tests, for example - but worse than that, which version of A? You can only test C against a given version of A, so when you release it into the wild, how do you ensure that other people can get that version? What if B and C need different versions?

IMPLICIT REQUIREMENT

This is a variation on the above “relative location” - the only difference being that you don’t set C’s project up to expect A to be in a given relative location; you just set it up to expect A to be in the search paths somewhere.

This is possible, particularly using workspaces in Xcode. If your project for C expects to be added to a workspace that also has A added to it, you can arrange things so that C can find A.

This doesn’t address any of the problems of the “relative location” solution though. You can’t even ship C with an example workspace, unless that workspace makes an assumption about the relative location of A!

LAYERED SUBMODULES

A variation on the solutions above is as follows:

  • A, B and C all live in their own repos
  • you make public “integration” repos (let’s call them BI and CI) which arrange things nicely so that you can build and test (or use) B or C.

So CI might contain:

- C.xcworkspace
- modules/
    - A (submodule)
    - C (submodule)

This is looking a bit better. If someone just wants to use C, they can grab CI and have everything.

They will get the correct versions, thanks to them being submodules. When you publish a new version of CI you’ll implicitly be saying “this version of C works with this version of A”. Well, hopefully, assuming you’ve tested it.

The person using CI will get a workspace to build/test with. The CI repo can even contain sample code, example projects, and so on.
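From the consumer’s side (made-up URL again), getting a working copy of C together with the matching version of A is just:

git clone https://github.com/example/CI.git
cd CI
git submodule update --init --recursive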

However, someone wanting to use B and C together still has a problem. If they just take BI and CI they’ll end up with two copies of A. Which might clash.

LAYERED SUBMODULES IN VARIOUS COMBINATIONS

The problem above isn’t insurmountable though.

You could provide a BCI repo which looks like this:

- BC.xcworkspace
- modules/
    - A (submodule)
    - B (submodule)
    - C (submodule)

Now you’re saying: “if you want to use B and C together, here’s a distribution that I know works”.

This is all sounding good, but it’s getting a bit hard to maintain. I’m now potentially having to maintain, and push, various combinations of the following repos: A, B, C, BI, CI, BCI.

We’re only talking about three libraries so far. This is a real problem for me - in the real world I potentially have about ten. That’s gotta hurt.

So, my question to you is:

  • What would you do?
  • Is there a better way?
  • Do I just have to accept that the choice between small modules and a big monolithic framework is a tradeoff between better flexibility for the users of the module, and more work for the maintainer?

August 11, 2012

Often, when I’m contracting for someone, I find myself asking them lots of things.

Sometimes I do it formally by writing a spec, and saying “this is what you want, right?”. Other times it’s just a conversation.

Then, equally often, I find myself apologising for asking so many “stupid questions”.

Then I tell myself off for being insecure and apologising.


My iTunes music collection has over 1400 albums in it (no doubt plenty of people have more, but still, that’s quite a lot of music).

A couple of years ago I decided to re-encode all my CDs as Apple Lossless, since the physical discs were going into storage and I wasn’t sure when I’d see them again.

I store my collection on a Mac mini, which is my media server. However, I also had a copy on my laptop, which I kept as AAC 256k to save space (and because, frankly, it’s hard to tell the difference).

I like using the meta data tags properly, and I hate it when people abuse them: using the Album tag to indicate which disc it is (by adding “[Disc 1]” on the end); marking an album as “Compilation” when it’s a collection of songs by the same artist (for me that tag is only supposed to group together tracks by different artists); or using the Artist tag to name-check collaborators (by adding “feat. Joe Blogs” or whatever), so that one album ends up scattered across multiple artists (if you’re going to do that, use the Album Artist tag to unify the album under the main artist).

Over the time that I’d built up the collection, I had spent a lot of time editing the meta data to get it into the format I wanted it. Unfortunately, re-encoding everything undid a lot of this work, and left me with quite a few duplicates - some with my “correct” meta data, some with the rubbish tags from the internet.

I managed to clean up a lot of the problems on the mini, but of course that didn’t fix the laptop. Worse, thanks to iTunes Match, the problems started to multiply again. iTunes Match on the mini has become confused on multiple occasions and “forgotten” me, so I’ve had to add the entire collection to it again. At which point it started adding duplicate AAC copies of every album where I’d edited the meta data.

I guess that this is because it had matched different encodings of the same tracks on different machines, in some cases with different meta data. The upshot is that now on my mini I’ve ended up with two copies of a lot of stuff, with both the correct and the incorrect metadata - and my iTunes match collection is now in an almost unbearable mess as a result.

I try to remove the duplicates, but they’re not always easy to spot unless I fix the meta data problems first, because they get filed in different places. I attempt to delete these duplicates from the cloud at the same time, but I’m quite scared that I’ll accidentally end up removing the only copy of something from the cloud too.

It seems to me that the root of this problem is that there’s no ‘authoritative’ place to view your Match collection as it exists in the cloud. That, and the fact that Match seems to take the approach of avoiding touching your meta data whenever possible - which sounds sensible but isn’t if you end up with the sort of mess I’ve got.

What I badly need is a way to view the collection on the web, remove duplicates, clean up meta data, and then sync these changes down onto all my machines. I’m really not entirely sure how it deals with meta data changes right now - I suspect that it basically does nothing, which means that if you ever re-sync your collection, you end up with duplicates again.

For now, things are so messy that what I’d really like to do is delete everything from all but one machine and from the cloud, and start again. Except that I don’t really trust that everything that needs to be stored in the cloud is, and that it’s in the correct format, with the correct data, and that deleting it all from most places wouldn’t end up with me losing stuff.

In a word: “aaaaaaaarrrrrghhhh!”.
