135 points by foenix 4 days ago | 24 comments
bob1029 4 hours ago
My favorite tool for trying scary complicated things in an unknown space is the feature flag. This works even if you have zero tests and no documentation. The only thing you need is the live production system and a way to toggle the flag at runtime.

If you can ship your hypothesis along with an effectively unaltered version of prod, the ability to test things without breaking other things becomes much more feasible. I've never been in a real business scenario where I wasn't able to negotiate a brief experimental window during live business hours for at least one client.
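The mechanics can be as small as this — a rough Python sketch (all names made up), where the new path ships dark and prod behavior is unchanged until the flag flips at runtime:

```python
import threading

# In-memory flag store; a real system would back this with a config
# service or database so flags can be flipped without a deploy.
_flags = {}
_lock = threading.Lock()

def set_flag(name, enabled):
    with _lock:
        _flags[name] = enabled

def flag(name, default=False):
    with _lock:
        return _flags.get(name, default)

def handle_request(data):
    # The risky new code path ships dark; prod is effectively
    # unaltered until someone toggles the flag at runtime.
    if flag("new-pricing"):
        return new_pricing(data)
    return old_pricing(data)

def old_pricing(data):
    return data["base"]

def new_pricing(data):
    return data["base"] * data.get("multiplier", 1)
```

Flipping `set_flag("new-pricing", True)` during the negotiated experimental window, and back off afterwards, is the whole deployment story.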

nijave 1 hour ago
While very powerful, I think it's worth calling out some pitfalls. A few things we've run into:

- long-lived feature flags that are never cleaned up (which usually causes zombie or partially dead code)

- rollout drift, where different environments or customers have different flags set and it's difficult to know who actually has the feature

- not flagging all connected functionality (i.e. one API is missing the flag it should have had)

A good decom/cleanup strategy definitely helps

Groxx 48 minutes ago
Have them emit metrics when they're triggered. You can do a bulk "names X, Y, Z haven't used branch B in >30 days, delete?" task generator pretty easily. Un-triggered ones are also easy to catch if you force all calls to be grep-friendly (or similar), which is also an easy lint to write: unclear result? Block it, force `flag("inline constant", ...)`.

Personally I've also had a lot of success requiring "expiration" dates for all flags, and when passed they emit a highly visible warning metric. You can always just bump it another month to defer it, but people eventually get sick of doing that and clean it up so it'll go away for good. Make it annoying, so the cleanup is an improvement, and it happens pretty automatically.
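A sketch of what that expiration idea might look like in Python (the registry and names are hypothetical):

```python
import datetime
import warnings

# Hypothetical registry: every flag must declare an expiration date.
# Past-due flags still work, but complain loudly on every check, so
# the path of least resistance becomes cleaning them up (or bumping
# the date one more time, until people get sick of it).
_registry = {}

def register_flag(name, enabled, expires):
    _registry[name] = {"enabled": enabled, "expires": expires}

def flag(name, today=None):
    entry = _registry[name]
    today = today or datetime.date.today()
    if today > entry["expires"]:
        warnings.warn(
            f"feature flag {name!r} expired on {entry['expires']}; "
            "clean it up or bump the date"
        )
    return entry["enabled"]
```

In a real system the warning would be a visible metric or alert rather than a Python warning, but the shape is the same: the flag keeps working, the nagging doesn't stop.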

enlyth 1 hour ago
Yep, archiving feature flags and deleting the dead code is usually thing number 9001 on the list of priorities, so in practice most projects end up with a graveyard of them.

Another issue I've run into a few times is a feature flag that starts as a simple thing, but as new features get added, it evolves into a complex bifurcation of logic and many code paths become dependent on it, which can add crippling complexity to what you're developing

hinkley 1 hour ago
Feature flags are like bloom filters. They make 98 out of 100 situations better and they make the other 2 worse. When performance is the issue that’s usually fine. When reliability is the issue, that’s not sufficient.

If you work on fifty feature toggles a year, one of them is going to go wrong. If your team is doing a few hundred, you’re gonna have oopsies.

Most of the problematic cases are where the code is set up so that the old path and the new one can’t bypass each other cleanly. They get tangled up and maybe the toggle gets implemented inverted where it’s difficult to remove the old path without breaking the new.

jaggederest 2 hours ago
You can go even further with something like the Scientist gem at the application level, or tee-testing at the data store level. Compare A and A', record the result, and return A. Eventually you reach 100% compatibility between the two (or only deviations that are desirable) and can remove A, leaving only A'
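Scientist itself is a Ruby gem (github/scientist); a rough language-agnostic sketch of the same idea in Python (the function names are made up):

```python
import time

def experiment(name, control, candidate, publish):
    """Run both paths, compare, publish the result, return the control.

    `control` is the trusted legacy path (A); `candidate` is the
    refactor under test (A'). Only the control's return value and
    exceptions ever reach the caller -- a candidate crash is just data.
    """
    start = time.perf_counter()
    control_value = control()
    control_ms = (time.perf_counter() - start) * 1000

    try:
        start = time.perf_counter()
        candidate_value = candidate()
        candidate_ms = (time.perf_counter() - start) * 1000
        matched = candidate_value == control_value
    except Exception as exc:
        candidate_value, candidate_ms, matched = exc, None, False

    publish({"name": name, "matched": matched,
             "control_ms": control_ms, "candidate_ms": candidate_ms})
    return control_value
```

You'd point `publish` at your metrics pipeline and watch the match rate climb toward 100% before deleting A.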

I also like recording and replaying production traffic, so that you can do your tee-testing in an environment that doesn't affect latency for production, but that's not quite the same thing.

eastbound 2 hours ago
You’ve just resolved a problem I had. I had this problem on a search engine, but I shipped the fix as a “v2” and told customers to switch to it. And you know the v2 problem: discrepancies that customers like. So both versions have fans, but we really need to pull the plug on v1. You’ve just solved it: I should have indexed even records with v1 and odd records with v2. Then only I would know which engine was used.
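The split might look something like this (a hypothetical Python sketch; `search_v1`/`search_v2` stand in for the two engines):

```python
def search(record_id, query):
    # Hypothetical split: even record ids go through the legacy
    # engine, odd ones through the rewrite. Customers see a single
    # product; only the operator knows which engine served a record,
    # so nobody can develop a loyalty to "v1" or "v2".
    if record_id % 2 == 0:
        return search_v1(query)
    return search_v2(query)

def search_v1(query):  # stand-in for the legacy engine
    return f"v1:{query}"

def search_v2(query):  # stand-in for the rewrite
    return f"v2:{query}"
```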
charles_f 7 hours ago
Write tests. Most likely those 300k lines of code contain a TESST folder with 4 unit tests written by an intern who retired to become a bonsai farmer in the 1990s, and none of them pass anymore. Things become much less stressful if you have something basic telling you you're still good.
layer8 5 hours ago
The problem with complex legacy codebases is that you don’t know about the myriads of edge cases the existing code is covering, and that will only be discovered in production on customer premises wreaking havoc two months after you shipped the seemingly regression-free refactor.
ljm 3 hours ago
It helps if tests are well written such that they help you with refactoring, rather than just being the implementation (or a tightly coupled equivalent) but with assertions in it.

Rare to see though. I don't think being able to write code automatically means you can write decent tests. It's a skill that needs to be developed.

mehagar 5 hours ago
I agree. This is one area I'm hoping that AI tools can help with. Given a complex codebase that no one understands, the ability to have an agent review the code change is at least better than nothing at all.
nijave 1 hour ago
You can infer based on code coverage. If coverage is low, tests are likely insufficient and change is risky
UltraSane 5 hours ago
If you save a log of inputs on the production system, you can feed it to old and new versions to find any changes in behavior.
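A minimal sketch of that replay-diff idea in Python (names are made up):

```python
def diff_behavior(logged_inputs, old_fn, new_fn):
    """Replay captured production inputs through both implementations
    and collect every case where the rewrite disagrees with the
    original. An empty result is your regression-free signal."""
    regressions = []
    for inp in logged_inputs:
        old_out = old_fn(inp)
        new_out = new_fn(inp)
        if old_out != new_out:
            regressions.append((inp, old_out, new_out))
    return regressions
```

The hard part in practice is capturing inputs faithfully (and scrubbing anything sensitive), not the diffing itself.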
karmakurtisaani 5 hours ago
The best time to write tests was 20 years ago. The second best is now, provided you've applied to all the companies with better culture.
ipsento606 3 hours ago
I've been working on react and react native applications professionally for over ten years, and I have never worked on a project with any kind of meaningful test coverage
stronglikedan 3 hours ago
over 20 years, many stacks, and same
locknitpicker 2 hours ago
> I have never worked on a project with any kind of meaningful test coverage

That says more about you and the care you put into quality assurance than anything else, really.

ipsento606 2 hours ago
Presumably you mean me, and every current and former team-member I've ever had? If so, you're talking about hundreds of engineers.
AnimalMuppet 1 hour ago
Have you ever worked at a place where you were put on an existing codebase, and that code has no tests? Have you ever worked at a place where, when you try to fix that, management tells you that they don't have the time to do so, they have to crank out new features?

Is ipsento606 working at such a place? I don't know, and neither do you. Why do you jump to the conclusion that it's their personal failing?

nitnelave 2 hours ago
Also known as "Make the change easy, then make the change"

Something to realize is that every codebase is legacy. My best new feature implementations are always several commits that do no-op refactorings, with no changes to tests even with good coverage (or adding tests before the refactoring for better coverage), then one short and sweet commit with just the behavior change.

collingreen 1 hour ago
I also do this and try to teach it to others. One thing I add is trying to go even further, making it so the new feature can essentially be a configuration change (because you built the system already in the first steps). It doesn't fit every situation so it's by no means a hard rule, but "prefer declarative functionality over imperative".
hinkley 1 hour ago
That’s just mostly refactoring in general.

Mikado is more of a get out of jail card for getting trapped in a “top down refactor” which is an oxymoron.

Illniyar 5 hours ago
This is a good method if you are stuck and you don't know what you need to do. It also helps explore a project with a specific task in mind.

It is not very useful in giving you confidence your changes would not cause unexpected side effects, which is usually the main problem working with legacy code.

If you want confidence when working with legacy code, your best bet is to do a strangler fig pattern - find a boundaries for the module you want to work on, rewrite the module (or clone and make your changes), run both at the same time in shadow mode, monitor and verify your new module is working the same as the old one, then switch and eventually delete the old module.
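One way the phases of that switch-over might be sketched in Python (the phase names and router are hypothetical, not from the article):

```python
import enum
import logging

class Phase(enum.Enum):
    OLD = "old"        # only the legacy module runs
    SHADOW = "shadow"  # both run; legacy result is served, diffs logged
    NEW = "new"        # rewrite serves traffic; legacy can be deleted

log = logging.getLogger("strangler")

def route(phase, old_module, new_module, request):
    if phase is Phase.OLD:
        return old_module(request)
    if phase is Phase.SHADOW:
        old_result = old_module(request)
        try:
            new_result = new_module(request)
            if new_result != old_result:
                log.warning("shadow mismatch for %r: %r != %r",
                            request, new_result, old_result)
        except Exception:
            # A crash in the shadow path is a data point, not an outage.
            log.exception("shadow failure for %r", request)
        return old_result
    return new_module(request)
```

You stay in SHADOW until the mismatch logs go quiet, flip to NEW, and only then delete the old module.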

LoganDark 5 hours ago
Boundaries? Module? I laugh.
hinkley 1 hour ago
Mikado is really only powerful when dealing with badly coupled code. Outside of that context you’re kinda cosplaying (like people peppering Patterns in code without an actual plan).

Refactoring is generally useful for annealing code enough that you can reshape it into separate concerns. But when the work hardening has been going on far too long there usually seems like there’s no way to get from A->D without just picking a day when you feel invincible, getting high on caffeine, putting on your uptempo playlist and telling people not to even look at you until you file your +1012 -872 commit.

I used to be able to do those before lunch. I also found myself to be the new maintainer of that code afterward. That doesn’t work when you’re the lead and people need to use you to brainstorm getting unblocked or figuring out weird bugs (especially when calling your code). All the plates fall at that point.

It was less than six months after I figured out the workaround that I learned the term Mikado, possibly when trying to google if anyone else had figured out what I had figured out. I still like my elevator pitch better than theirs:

Work on your “top down” refactor until you realize you’ve found yet another whole call tree you need to fix, and feel overwhelmed/want to smash your keyboard. This is the Last Straw. Go away from your keyboard until you calm down. Then come back, stash all your existing changes, and just fix the Last Straw.

For me I find that I’m always that meme of the guy giving up just before he finds diamonds in the mine. The Last Straw is always 1-4 changes from the bottom of the pile of suck, and then when you start to try to propagate that change back up the call stack, you find 75% of that other code you wrote is not needed, and you just need to add an argument or a little conditional block here and there. So you can use your IDE’s local history to cherry pick a couple of the bits you already wrote on the way down that are relevant, and dump the rest.

But you have to put that code aside to fight the Sunk Cost Fallacy that’s going to make you want to submit that +1012 instead of the +274 that is all you really needed. And by the way is easier to add more features to in the next sprint.

hamandcheese 5 hours ago
Replace "module" with "system" - every system has boundaries.
thfuran 5 hours ago
Some of them are notoriously spaghetti-shaped, and that’s hard to isolate and replace.
nailer 5 hours ago
Then your first step is found! Make those boundaries. Isolate components so you can test them.
yomismoaqui 6 hours ago
I recommend reading a classic, "Working Effectively with Legacy Code" by Michael Feathers.
mittermayr 6 hours ago
While great in theory, I think it almost always fails when there's no existing testing structure that reliably covers the areas you're modifying. I change something, and if there's no immediate build or compile error, this (depending on the system) usually does not mean you're safe. A lot of issues happen at the interfaces (data in/out of the system) and in certain advanced states and contexts. I wouldn't know how Mikado helps here.

In other words, I'd reword this to using the Mikado method to understand large codebases, or get a first glimpse of how things are connected and wired up. But to say it allows for _safe_ changes is stretching it a bit much.

SoftTalker 6 hours ago
Yes, most of the time such spaghetti code projects don't have any tests either. You may have to take the time to develop them, working at a high level first and then developing more specific tests. Hopefully you can use some coverage tools to determine how much of the code you are exercising. Again this isn't always feasible. Once you have a decent set of tests that pass on the original code base, you can start making changes.

Working with old code is tough, no real magic to work around that.

agge 6 hours ago
If you create a graph of which changes are needed to enable other changes, eventually leading to your goal, then by definition the smallest, safest steps you can take are the leaf nodes of that graph.
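That intuition is easy to sketch in Python (the graph contents are made-up examples): the leaves are exactly the changes whose prerequisites are all met.

```python
def next_safe_steps(graph, done=()):
    """graph maps each change to the changes it depends on.
    The leaves -- pending changes whose prerequisites are all done --
    are the smallest safe steps you can take right now."""
    done = set(done)
    return [change for change, prereqs in graph.items()
            if change not in done and all(p in done for p in prereqs)]

# A made-up Mikado graph: the goal at the top, prerequisites below.
graph = {
    "ship feature": ["extract service", "add config"],
    "extract service": ["split module"],
    "add config": [],
    "split module": [],
}
```

Each completed leaf unlocks the next layer, until only the goal itself is left.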

nuancebydefault 1 hour ago
I've been in a situation a few times where I needed to make significant changes in a huge codebase with lots of tests but also a lot of corner cases, on my own.

I've spent blood, sweat, tears and restless evenings scrolling and ctrl-f-ing huge build and test logs to finally accomplish the task.

But let's take a step back.

So they assign you to get that done. You're supposed to be careful, courageous and precise while making those changes without regression. There's very little up-to-date documentation on the design, architecture, let alone any rationale on design choices. You're supposed to come up with methods like Mikado, tdd, shadowing or anything that gets the job done.

Is this even fair to ask? Suppose you ask a contractor to renovate a house with old-style plumbing and electricity. Will they do it Mikado style, or would they say: look, we're going to tear things down and rebuild from the ground up. You need to be willing to pay for a designer, an architect, new materials and a set of specialized contractors.

So why do we as sw engineers put up with the assignment? Are we rewarded so much more than the project manager of that house who subcontracts the work to many people to tear down and rebuild?

phito 1 hour ago
If you're paid by the hour, then does it really matter if you have to refactor stuff? If it takes a long time to do then it'll be more expensive for your employer.

Does the project manager get paid more by the hour to refactor a house than to build one?

csours 2 hours ago
This sounds like torture (as written).

Of course, working in a legacy codebase is also torture.

Software development is a hyper-rational endeavor, so we don't often talk about feelings. This article also does not talk much about feelings.

Reading between the lines, it looks like reverting the code is supposed to affect how you feel about the work. Knowing that failure is an explicit option can help to set an expectation; however, without a mature understanding of failure, that expectation may just be misery.

With a mature understanding of failure, the possibility of a forced rollback should help you "let go" of those changes. It's like starting a day of painting or drawing with one that you force yourself to throw away; or a writing session with a silly page.

----

If someone thinks that they are giving you good advice, but it sounds terrible, then maybe they are expecting you to do some more work to realize the value of that advice.

If you are giving someone advice and they push back, maybe you are implying some extra work or expectations that you have not actually said out loud.

Advice is plagued by the tacit knowledge problem.

castral 1 hour ago
Maybe it is the framing of the step as a "reversion" or "roll-back" rather than "spike" or "prototype" that is causing that sense. Personally, I would never throw away the code I spent time and effort writing just to stick to a systematized refactoring method like this "Mikado." I don't think the advice is unsound, and I have done exactly this many times in my own career, but instead of throwing it away I would shelve it, put it in a branch, write a document about what has been/needs to be done, and write a corresponding tech debt or feature/fix ticket for it with the new and realistic estimate.
jeremyscanvic 4 hours ago
Is it possible in practice to control the side effects of making changes in a huge legacy code base?

Maybe the software crashes when you write 42 in some field and you're able to tell it's due to a missing division-by-zero check deep down in the code base. Your gut tells you you should add the check but who knows if something relies on this bug somehow, plus you've never heard of anyone having issues with values other than 42.

At this point you decide to hard code the behavior you want for the value 42 specifically. It's nasty and it only makes the code base more complex, but at least you're not breaking anything.

Does anyone have experience with this mindset of embracing the mess?

0xbadcafebee 4 hours ago
I believe this is called Microsoft Driven Development

(seriously though, this book has answers for you: Working Effectively with Legacy Code, by Michael Feathers)

fc417fc802 3 hours ago
You misspelled Oracle.
niccl 3 hours ago
All. The. Time. And I hate it. Imagine giving a customer a rebate based on buggy code. You fix a bug, the customer comes back and wants to check that the rebate was correct that last time. Now you have to somehow hard-code the rebate they did get so that your (slightly less buggy) code gives the same result. But hard-coding has the risk of introducing other errors on its own. Oh yes, and you never have enough time to do things properly because Customers (or maybe Management). A tangled mess of soul-destroying, lifeblood-sucking code and pressures ensues.
sublinear 3 hours ago
I've never seen code truly get that bad, but I can already think of several problems with that approach.

Do you really know all of the expected behavior you're hardcoding in? What happens if your hardcoded behavior is just incorrect enough that it breaks something somewhere else? How can you be sure that your test for that specific value is even correct?

I think the better approach is to let things break naturally and open a bug with your findings. You'd be surprised how often someone else knows exactly what's going on and can fix it correctly. Your hacks are not just pouring gasoline onto the fire, but opening a well directly underneath that will keep it burning for a long time.

brutuscat 4 hours ago
For me nowadays it's like this:

- Try to locate the relevant files.

- Build a prompt: explain the use case or the purpose of the refactor. Mention the relevant files and describe how you understand they interact and work together. Also explain how you think it needs to be refactored. Instruct the model to analyze the code and propose different solutions for a complete refactor. Tell it not to implement anything, just plan.

Then you’ll get several paths of action.

Choose one and tell the model to write it into a file you’ll keep around while the implementation is ongoing, so you won’t pollute the context and can start each chunk of work over in a clean prompt. Name the file refactor-<name >-plan.md, tell it to write the plan step by step and to dump a todo list that takes dependencies into account for tracking progress.

Review the plans, make fixes if needed. You need to have some sort of table resembling a todo list so it can track and make progress along the way.

Open a new prompt, tell it to analyze the plan file, go to the todo list section and proceed with the next task. Verify it’s done, and update the plan.

Repeat until done.

agge 7 hours ago
Using a Mikado-style graph for planning any large work in general has been really useful to me. I used it a lot at Telia back in 2019 and at Mentimeter in 2022.

It gives a great way to visualise the work needed to achieve a goal, without ever mentioning time.

woodruffw 2 hours ago
I was hoping it was a reference to The Mikado, given that the best way to refactor is with a short, sharp shock[1].

[1]: https://en.wikipedia.org/wiki/Short,_sharp_shock

spprashant 2 hours ago
I'd like to hear more about people who have jumped onto large codebases and were instantly productive. I see a lot of emphasis on documentation and comments, but in my experience they get stale real fast.
Mikhail_K 5 hours ago
I usually use the method "shout Banzai! and charge straight like a kamikaze"

Is that the Mikado method?

dirkc 4 hours ago
The things that always get me with tasks like this is that there are *always* clear, existing errors in the legacy code. And you know if you fix those, all hell will break loose!
eblume 7 hours ago
I’ve been using a form of the Mikado Method based on a specific ordering of git commits (by message prefix) along with some pre commit hook scripts, governed by a document: https://docs.eblu.me/how-to/agent-change-process

I have this configured to feed in to an agent for large changes. It’s been working pretty well, still not perfect though… the tricky part is that it is very tempting (and maybe even sometimes correct) to not fully reset between mikado “iterations”, but then you wind up with a messy state transfer. The advantage so far has been that it’s easy to make progress while ditching a session context “poisoned” by some failure.

agge 6 hours ago
There is a great interview that talks about the process and what it is about more generally: https://youtu.be/HbjWOvxJjms?si=5ta-JOyfFLub2yX_

I think there are similar methods, such as nested todo-lists. But DAGs are exceptionally good for this use case of visualising work (Mikado graphs are DAGs).

w10-1 3 hours ago
Ah, no: incremental approaches only work in already well-formed code.

Poor code requires not coding but analysis and decisions, partitioning code and clients. So:

1. Stop writing code

2. Buy or write tools to analyze the code (modularity) and use-case (clients)

3. Make 3+ rough plans:

(a) leave it alone and manage quality;

(b) identify severable parts to fix and how (same clients);

(c) incrementally migrate (important) clients to something new

The key lesson is that incremental improvements are sinking money (and worse, time) into something that might need to go, without any real context for whether it's worth it.

theo1996 7 hours ago
1. take a well-known method for problem solving basically any programmer/human knows

2. slap a cool word from the land of the rising sun on it

3. ???

4. profit!

This article is painfully pretentious, and stealth marketing for a book
agge 7 hours ago
Stealth marketing by someone completely unrelated to the book, 11 years after the book is released. Seems unlikely.
hidelooktropic 7 hours ago
So you do things one step at a time and timebox as you go? This method probably doesn't need its own name. In fact I think that's just what timeboxing is.
bee_rider 5 hours ago
FWIW Mikado seems to be the name of that game where you pick up one stick at a time from a pile, while trying to not disturb the pile. (I forget the exact rules). So it isn’t as if somebody is trying to name this method after themselves or something, it is just an attempt at an evocative made up term. Timeboxing is also, right? I mean, timeboxing is not recognized by my spell checker (I’d agree that it is more intuitive though).
bregma 5 hours ago
Mikado is the name of an opera (by Gilbert and Sullivan) in which someone is deemed to have been executed without actually having been executed. Sounds like an ideal test strategy to me: yes, all the tests were executed, just not actually run.
zem 3 hours ago
when I saw the title I was expecting a reference to the opera. was wondering if they were somehow going to work in the exchange "Besides, I don't see how a man can cut off his own head." "A man might try." in reference to gradually removing bits of the old code.
kaffekaka 4 hours ago
Plockepinn in Swedish, approximately "pickastick".

Edit: I thought I read it was of Scandinavian origin, hence my comment. But Wikipedia says European origin. Well well.

bee_rider 1 hour ago
I suspect it was invented the first time a parent dropped a pile of sticks because their bored kids were distracting them. “Ok kids, new game, pick those sticks up as quietly and tediously as possible.”
quesera 2 hours ago
In the US, it was a game called "pick up sticks", and it was tedious and sometimes impossible.

So, this method is well-named at least! :)

topaz0 5 hours ago
There are important additions beyond timeboxing, at least according to the post. Notably, reverting your changes if you weren't able to complete the chosen task in the time box and starting over on a chosen subset of that task. I can imagine that part has benefits, though I haven't tried it myself.
dvh 7 hours ago
Inherited? I wrote the thing! The customer has no money for a large refactoring.
agge 6 hours ago
So we make it many small commitable refactorings instead :)
janpot 8 hours ago
In 2026, we call this "plan mode"
eblume 7 hours ago
It goes a lot further than plan mode though, in fact I would say the key difference of mikado refactors from waterfall refactors is that you don’t do all the planning up front with mikado. If anything you try to do as little planning as possible.
koakuma-chan 7 hours ago
> The project doesn’t compile anymore.

Using a programming language that has a compiler, lucky.
