January 31, 2022
“All software devs suck!” Bill snorted in disgust.
It was not the first time he had asserted as much. Zara sighed, and responded with patience. “No they don’t. Software is complicated. When you’re working on a project that has to be finished yesterday, you might skip some steps that are regrettable in hindsight. You don’t have any choice.”
“Nope, they suck,” Bill insisted.
Zara had been telling Bill about her most recent foray into debugging a long-standing problem in an Android app she had been hired to work on. It was one of those “add features/fix bugs” projects that seemed to never end. A trickle of work came in almost every week. This most recent bug had been raising its ugly head sporadically ever since she’d started work on the app, over a year ago. The first time she’d installed the app, it just hung at the splash screen.
Eventually, that bug had appeared on her docket. It was gnarly. She had sprinkled print statements throughout the app initialization process, and saw them in the logcat output. And then, nothing. The app just didn’t continue to the next screen. Zara found no hint that the app had crashed. It just stopped, sitting there, as if it were finished.
This was a React Native codebase, so debugging was most easily done with the Metro server running, where code changes would quickly be deployed to the running app on her device. Unfortunately, the bug only showed up in production mode, after an apk had been built and installed. This increased development time dramatically. It made the whole process discouraging and tedious.
The Android app wasn’t considered as important as the iOS version of the app, and the bug got backlogged. Once in a while, a user would complain, and Zara would look into the bug again. It was the kind of bug that she’d like to have a full week to investigate. The project didn’t have that kind of budget. She’d spend a couple hours researching it, and it would get backlogged again.
This time was different, though. The project owner had given her the okay to spend 10 hours on the bug. At this point, the app was no longer hanging at the splash screen. Instead, it hung after pressing a button on the first screen to appear after the splash.
Zara had other client work this week, and no idea how to make progress on this one. She stepped away from her desk and took a walk around the block. How was it possible to figure this one out?
There’s a debugging technique called git bisection. Zara only used this when she was desperate, and she was. You use git to check out the codebase at a commit halfway back to the initial commit. Then you run the code, and test to see if the bug shows up. If it doesn’t, you know the offending commit is somewhere between that halfway commit and the most recent one, so you bisect again, halfway between your current halfway point and the most recent commit. If it does, the bug was introduced earlier, and you bisect the older half instead. You keep bisecting the history like this until you discover the commit that caused the bug. At that point, you can more easily debug the problem because you know exactly what code changes caused the bug to appear.
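The halving described here is a binary search over the commit history. Below is a minimal sketch in Python, chosen only for brevity. The function name, the commit labels, and the is_bad callback are hypothetical stand-ins for "check out this commit, build it, and test by hand". It assumes the oldest commit is known to be good, the newest is known to be bad, and the bug stays present once introduced. In practice, git's own bisect subcommand (git bisect start, git bisect good, git bisect bad) drives the same search for you.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumptions (not checked here): commits are ordered oldest to newest,
    commits[0] is good, commits[-1] is bad, and every commit after the
    first bad one is also bad.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")

    good, bad = 0, len(commits) - 1  # indices of a known-good and a known-bad commit
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # bug already present: look in the older half
        else:
            good = mid  # bug not present yet: look in the newer half
    return commits[bad]


# Hypothetical history of ten commits where the bug first appears in "c6".
history = [f"c{i}" for i in range(10)]
culprit = first_bad_commit(history, lambda c: int(c[1:]) >= 6)
print(culprit)
```

Each test halves the remaining range, so even a history of a thousand commits needs only about ten builds.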
Zara knew that git bisection wouldn’t work in this case. The app’s backend had changed substantially in the last year, and the app from a year or two ago wouldn’t function if you tried to run it today.
She thought some more. She didn’t actually need the entire app codebase to debug this problem.
She rushed back to her desk, and started ripping out code. The plan was to eliminate most of the code, and then add pieces back in gradually, to figure out which component caused the bug.
She tore out screens and components with abandon, eliminating almost everything but the splash screen and the couple screens after it. When she ran the code again, it broke heinously, as expected. She spent an hour fixing all the problems, and then ran the code again. Even with just a few components and screens, the bug still occurred! WTF!
She took another walk.
Think, think, think…. Could the problem actually be in one of the installed packages? When she returned to her desk, she started removing packages. With only a few screens left, most of the packages in package.json weren’t needed anymore, anyway. She only deleted a few packages at a time, since she quickly found out that removing packages could cause just as many problems as ripping out code had. Removing packages was slow work: “npm remove x y z”, then build the apk, fix any build errors, install, and test for the hanging issue. Wash, rinse, repeat, because the hanging issue was always there!
Zara removed almost 25 packages in this way. Then, after she removed five more, the bug suddenly stopped occurring. She tried not to get excited. She reinstalled all five packages, and the bug came back. She removed them, and the bug went away. She then gradually reintroduced the packages one at a time, and narrowed it down to the offending package: expo-updates.
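Zara worked in small batches and then one package at a time, but the same idea can be run as a halving search. The following is a rough sketch in Python, not what she actually ran. The function name, the package names other than expo-updates, and the bug_present callback are hypothetical. The callback stands in for the slow manual loop of removing packages, building the apk, installing it, and checking for the hang. The sketch assumes exactly one package is responsible and that the app can be made to build with any subset installed, which the story shows takes real effort.

```python
from typing import Callable, List, Sequence


def find_culprit(packages: Sequence[str],
                 bug_present: Callable[[List[str]], bool]) -> str:
    """Return the single package whose presence triggers the bug.

    bug_present(installed) must report whether the bug shows up when only
    the given packages are installed. Assumes exactly one culprit.
    """
    suspects = list(packages)
    if not bug_present(suspects):
        raise ValueError("bug does not reproduce with all packages installed")

    while len(suspects) > 1:
        half = suspects[: len(suspects) // 2]
        if bug_present(half):
            suspects = half                   # culprit is in the half we kept
        else:
            suspects = suspects[len(half):]   # culprit is in the half we removed
    return suspects[0]


# Hypothetical package list; only expo-updates comes from the story.
installed = ["pkg-a", "pkg-b", "expo-updates", "pkg-c", "pkg-d"]
found = find_culprit(installed, lambda pkgs: "expo-updates" in pkgs)
print(found)
```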
She searched the code. There were no references to expo-updates anywhere. So how could it be causing the app to hang? She spent some time researching the expo-updates package and the git commits from when the expo-updates package had been installed. And then she found it. Installing expo-updates requires you to manually edit the AndroidManifest.xml file with a line like this:
<meta-data android:name="expo.modules.updates.EXPO_UPDATE_URL" android:value="https://exp.host/@my-expo-username/my-app"/>
The documentation says EXPO_UPDATE_URL is supposed to be a “URL to the remote server where the app should check for updates”. But some long-lost developer had set the value to something that was not a URL at all - he’d set it to the Android package name.
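A simple sanity check on the configured value would have flagged the mistake. Here is a small sketch in Python, for illustration only. The helper name and the package-name string are hypothetical, since the story does not give the actual value, and the check only tests that a value is shaped like an http(s) URL, not that the server exists.

```python
from urllib.parse import urlparse


def looks_like_http_url(value: str) -> bool:
    """Return True if value has an http/https scheme and a host part."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# The example URL from the manifest line passes the check.
ok = looks_like_http_url("https://exp.host/@my-expo-username/my-app")

# A hypothetical Android package name has no scheme or host, so it fails.
bad = looks_like_http_url("com.example.myapp")
print(ok, bad)
```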
At this point, Zara wasn’t sure why the app was hanging, but she had a hunch that expo-updates had tried to contact a URL that was not a URL, and had crashed or hung in some way she didn’t understand. But she didn’t need to understand anything else. The project was hanging due to expo-updates. It didn’t use expo-updates. Run npm remove expo-updates, and remove the code that had been manually added to AndroidManifest.xml. Bug fixed.
After doing all this, Zara decided she’d invented a new technique that she called “npm bisection”, similar to git bisection, but with npm modules… just delete half your modules and see if the bug goes away. A quick search made it clear that this concept already existed, and that there was even an experimental node module called npm-bisect designed to do just this. Okay, didn’t matter, she was delighted with her solution!
Zara wanted to crow about her victory, and she homed in on Bill, a software architect. She’d only gotten part way through her explanation when he made his put-down against the entire developer community.
“So what you’re saying is that I suck, since I’m a software developer?” she said, annoyed. She was only half-feigning hurt feelings.
“Present company excepted, of course,” he said, unconvincingly.
Zara rarely talked about her job with other people. She knew that it would be too technical and boring for most. She thought Bill might enjoy her story, but she should have known better. This was not the first time she’d heard him generalize about the badness of software devs everywhere. Discussing an interesting bug with him was nigh impossible because he’d just go off on a rant about the development process. Well, she wasn’t going to let it bring her down. She’d crushed a tough bug, and her mood was celebratory. “Let’s go to Hobbes for lunch. Drinks are on me!” she suggested. This erased Bill’s sour look, and off they went.