Thoughts on Software Wizardry

February 1, 2022

Last year was the year I started programming extensively. Having worked with different languages and systems, I’d like to share some thoughts with you.

I define software wizardry as coding tricks that solve problems in unconventional ways that are often obscure and hard to get right the first time.

How not to code in Ruby

Ruby was designed to increase productivity, and part of that productivity boost comes from monkey patching and insane metaprogramming capabilities.

Part of my day job is developing a standard CocoaPods toolchain for internal use. CocoaPods, an open source dependency manager for macOS/iOS projects, is written in Ruby. To add features and optimizations that meet our internal needs, the add-ons are written as gems that monkey-patch CocoaPods. It’s all fun and games as long as I’m the only one writing the patches, but in reality one team patched a function, then another team patched the same function. Soon I found the current line of execution jumping all over the place during debugging sessions. The supposed joy of debugging an interpreted language became hell for me. Whichever patch is loaded last silently wins, and nothing guaranteed the order in which our patches were loaded, so debugging is an adventure!
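To make the problem concrete, here is a minimal sketch of two add-ons patching the same method. The class Installer and its install method are hypothetical stand-ins, not real CocoaPods names; the point is only that the definition loaded last replaces the earlier ones without any warning by default.

```ruby
# Hypothetical stand-in for a class inside the dependency manager.
class Installer
  def install
    "original install"
  end
end

# Team A's add-on reopens the class and replaces the method.
class Installer
  def install
    "team A install"
  end
end

# Team B's add-on, loaded later, replaces it again without knowing about team A.
class Installer
  def install
    "team B install"
  end
end

winner = Installer.new.install
puts winner # prints "team B install"

# source_location reports the file and line of the definition that is live now,
# which is a quick way to find out whose patch actually won.
file, line = Installer.instance_method(:install).source_location
puts "#{file}:#{line}"
```

If the two add-ons were loaded in the opposite order, team A’s version would win instead, with no change to either add-on’s code.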

Monkey patching exists because you might want to replace some functions you don’t like without bothering to fork the project or fix it upstream. That’s understandable for small and medium projects. But for a project like our in-house toolchain, I think we should fork CocoaPods and fix it there instead, especially since we don’t even update CocoaPods anymore: updating means some functions get renamed, rendering existing monkey patches useless.

Compared to monkey patching, Ruby’s metaprogramming is real black magic. Ruby has functions that let you directly access and modify the internals of instances and evaluate expressions in the scope of any class or instance. Dynamically defining new methods and classes is supported as well. If you rely on it too much, as we did, you’ll find yourself on a roller coaster ride of “why are my variables changing seemingly at random?”.
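A small sketch of what these capabilities look like in practice. The Account class is made up for illustration; the calls themselves (instance_variable_set, instance_eval, class_eval, define_method, Class.new) are standard Ruby.

```ruby
# Hypothetical class that looks like it protects its own state.
class Account
  def initialize(balance)
    @balance = balance
  end

  def balance
    @balance
  end
end

account = Account.new(100)

# Overwrite an instance variable from the outside; no setter required.
account.instance_variable_set(:@balance, 5)

# Run arbitrary code with self set to the instance.
account.instance_eval { @balance *= 2 }
puts account.balance # prints 10

# Add a method to the class at runtime.
Account.class_eval do
  define_method(:reset!) { @balance = 0 }
end
account.reset!
puts account.balance # prints 0

# Create a whole new class at runtime.
Ledger = Class.new(Account)
puts Ledger.new(7).balance # prints 7
```

None of these writes go through the balance reader or the constructor, so searching the code for assignments to the balance will not find them. That is how variables end up changing “at random”.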

Abuse of C Preprocessor Macros

C macros are useful, but also dangerous. Here’s an example of how a C macro shot me in the foot.

Apple’s Objective-C runtime defines the type BOOL, but instead of defining it as char, as most Objective-C books say, Apple defines BOOL differently depending on the platform you’re building for!

Here is the code that defines BOOL that I copied from https://opensource.apple.com/source/objc4/objc4-818.2/runtime/objc.h.auto.html

#   if TARGET_OS_OSX || TARGET_OS_MACCATALYST || ((TARGET_OS_IOS || TARGET_OS_BRIDGE) && !__LP64__ && !__ARM_ARCH_7K)
#      define OBJC_BOOL_IS_BOOL 0
#   else
#      define OBJC_BOOL_IS_BOOL 1
#   endif
#endif  // closes an outer #if that is not part of this excerpt

#if OBJC_BOOL_IS_BOOL
    typedef bool BOOL;
#else
#   define OBJC_BOOL_IS_CHAR 1
    typedef signed char BOOL;
    // BOOL is explicitly signed so @encode(BOOL) == "c" rather than "C"
    // even if -funsigned-char is used.
#endif

As we can see, BOOL is sometimes bool, as in C99, and sometimes signed char. That means the following code compiles for a real iPhone, but not for the Universal iPhone Simulator target.

void (^fooBlock)(bool) =
   ^(BOOL bar) {
      return;
   };

You get an “incompatible function pointer” error when BOOL is signed char. Is the performance gain on some platforms worth behavior that changes from one build target to the next? I think not.

How using proper English can help detect code errors

A lesser-known secret of Clang is that it detects Objective-C retain cycles only if your function name starts with “set” or “add”. You can see the source code for yourself: https://github.com/llvm/llvm-project/blob/release/13.x/clang/lib/Sema/SemaChecking.cpp#L15279

The reason is that properly detecting retain cycles requires constructing a control flow graph, which is too expensive at compile time. Instead, Clang just checks whether an instance happens to store a closure that also references that instance. And how does Clang decide whether the instance stores the closure? By assuming it does when the name of the function the closure is passed to starts with “set” or “add”! This certainly doesn’t look good to people who name variables and functions in languages other than English.
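To show how much hangs on the name, here is a simplified model of such a prefix check. It is written in Ruby only to keep the examples in this post in one language, and it is not Clang’s code: the function setter_like? is hypothetical, and the extra rule that the prefix must not be followed by a lowercase letter is my assumption about how words like “settle” are excluded. The real logic is in the C++ source linked above and may differ in detail.

```ruby
# Simplified model of a name-based check, NOT Clang's actual implementation.
# Assumption: a name is "setter-like" when it starts with "set" or "add"
# and the next character, if any, is not a lowercase letter.
def setter_like?(method_name)
  method_name.match?(/\A(set|add)(?![a-z])/)
end

names = %w[setCompletionHandler addObserver registerCallback onComplete settle shezhiHuidiao]
names.each do |name|
  verdict = setter_like?(name) ? "checked for retain cycles" : "skipped"
  puts "#{name}: #{verdict}"
end
```

In this model only setCompletionHandler and addObserver would be checked. A method that stores a block just the same but is called registerCallback, or is named in another language, is skipped.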

Having some degree of retain cycle checking is better than nothing, but it also gives people the false sense of security that no warning equals perfect code.

The optimization that allows projects with incorrect build settings to compile

Another story from work. As mentioned above, I write CocoaPods add-ons. Dependencies in CocoaPods are called pods, and each pod can have nested sub-specifications (subspecs), so a project that uses only some of a pod’s functions can depend on just the subspecs it needs. Now imagine you have a huge project with hundreds of components, and many components depend on the same low-level pod, but on different subspecs. CocoaPods will then calculate the intersection and make it a dependency of the project, then copy each distinct subspec into the component. For a huge project, this slows down compile time. So we have an optimization that removes all subspecs that components depend on and groups them into a pod of their own.

Suppose the project structure looks like this before the optimization.

[Figure: project structure before the optimization]

Then after the optimization it looks like this.

[Figure: project structure after the optimization]
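As a rough model of the before and after, the subspec bookkeeping can be sketched with plain array operations. This is not how the CocoaPods resolver is implemented; the component names and the Core subspecs are hypothetical, and the sketch only mirrors the description above.

```ruby
# Hypothetical components and the subspecs of one low-level pod ("Core") they use.
component_subspecs = {
  "Feed"    => ["Core/Network", "Core/Logging", "Core/Cache"],
  "Profile" => ["Core/Network", "Core/Logging"],
  "Chat"    => ["Core/Network", "Core/Crypto"],
}

# Before: the subspecs every component needs (the intersection) are shared...
shared = component_subspecs.values.reduce(:&)
puts shared.inspect # prints ["Core/Network"]

# ...and whatever is left over is handled separately for each component.
extras = component_subspecs.transform_values { |subspecs| subspecs - shared }
puts extras.inspect

# After: every subspec any component needs is grouped once into a single pod.
grouped = component_subspecs.values.reduce(:|)
puts grouped.inspect
```

In the “before” model, Core/Logging is handled twice (for Feed and for Profile). In the “after” model each subspec appears exactly once.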

However, CocoaPods also supports a “scripting” option. Pods can run scripts, which are normally used to copy assets such as icons into the main project. A poorly written pod that mixes source code with asset-copying scripts will create duplicate assets if two or more components depend on it, because the same script is called more than once, and CocoaPods signals an error. But after our optimization, there are no more duplicates!
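The effect can be modeled in a few lines. This is a toy model with hypothetical names (run_copy_scripts, duplicates, GroupedPod), not CocoaPods behavior reproduced exactly: it assumes the pod’s copy script runs once for every component that depends on the pod directly.

```ruby
# Toy model: the pod's asset-copying script runs once per direct dependent.
def run_copy_scripts(dependents, assets)
  dependents.flat_map { |_component| assets }
end

# Assets that were copied more than once.
def duplicates(outputs)
  outputs.group_by { |asset| asset }.select { |_asset, copies| copies.size > 1 }.keys
end

assets = ["icon.png", "logo.png"]

# Before the optimization: two components depend on the pod directly.
before = run_copy_scripts(["Feed", "Chat"], assets)
# After the optimization: only the grouped pod depends on it.
after = run_copy_scripts(["GroupedPod"], assets)

puts duplicates(before).inspect # prints ["icon.png", "logo.png"]
puts duplicates(after).inspect  # prints []
```

The pod itself is unchanged between the two runs; only the shape of the dependency graph hides its mistake.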

The poorly written pod escapes quality control thanks to our optimization, but it can no longer be used by projects that do not apply the same optimization.

I’m dumb, so I’ll steer far away from software wizardry

After being bitten by wizardry countless times, I think it’s better to just write clear and logically simple code. Tricks are fine as long as you cover all the edge cases. And if the code only works under certain assumptions, I’d state those assumptions explicitly. Software wizardry won’t cause any instant karma, but it will surely take its revenge some time later.
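One way to state an assumption explicitly is to put it in code. The sketch below is one possible approach, not something our toolchain actually does: a hypothetical patch_method helper that refuses to apply a monkey patch when the method it is supposed to replace no longer exists, for example after an upstream rename. Installer and its methods are made-up names.

```ruby
# Hypothetical stand-in for an upstream class that add-ons patch.
class Installer
  def install
    "original install"
  end
end

# Hypothetical helper: apply a patch only if the method it replaces exists,
# so a renamed upstream method fails loudly instead of being silently ignored.
def patch_method(klass, name, &body)
  unless klass.method_defined?(name) || klass.private_method_defined?(name)
    raise NameError, "#{klass}##{name} not found; this patch assumes it exists"
  end
  klass.send(:define_method, name, &body)
end

patch_method(Installer, :install) { "patched install" }
puts Installer.new.install # prints "patched install"

error_message = nil
begin
  # Assumes a method that was never defined (or was renamed upstream).
  patch_method(Installer, :install_all) { "never applied" }
rescue NameError => e
  error_message = e.message
end
puts error_message
```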