Is this similar to what you can get with Doxygen?
https://en.wikipedia.org/wiki/Doxygen#/media/File:Doxygen-1....
This is an interesting approach. I think, in a way, it mirrors what I do. Having contracted for much of my career, I’ve had to get up to speed on a number of codebases quickly. When I have a choice of how to do this, I find a recently closed issue and try to write a unit test for it. If nothing else, you learn where the tests live, assuming they exist, and how much of a safety net you have if you start hacking away at things. Once I know how to add tests and run them (which is a really good way to deal with the codebase setup problem mentioned in the article, because a lot of onboarding docs only get you to the codebase running without all the plumbing you need), I feel like I can get by without a full understanding of the code, as I can throw in a couple of tests to prove what I want to get to and then hope the tests or CI or hooks prevent me from doing A Bad Thing. Not perfect, and it varies depending on how well the project is built and maintained, but if I can break things easily, people are probably used to things breaking and then I have an avenue to my first meaningful contribution. Making things break less.
I am quite skeptical and reserved when it comes to AI, particularly its impact on the next generation of engineers. But using AI to learn a code base has been life-changing. It's a crutch for feeling your way around, one you ditch when things are familiar, like using a map until you learn the road yourself.
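To make the "write a unit test for a recently closed issue" habit concrete, here is a minimal sketch. Everything in it is made up for illustration: `slugify` stands in for whatever project code the issue touched, and the issue text in the docstring is hypothetical.

```python
import unittest


def slugify(title: str) -> str:
    """Hypothetical function under test; stands in for real project code."""
    # split() with no arguments collapses runs of whitespace and trims the ends.
    return "-".join(title.lower().split())


class TestClosedIssue(unittest.TestCase):
    """Pins the behaviour described in a hypothetical closed issue:
    'slugify leaves double dashes when the title has repeated spaces'."""

    def test_repeated_spaces_collapse_to_one_dash(self):
        self.assertEqual(slugify("Hello   World"), "hello-world")

    def test_surrounding_whitespace_is_ignored(self):
        self.assertEqual(slugify("  Hello World  "), "hello-world")
```

Saved as a test module, this runs with `python -m unittest`. Getting even one test like this to run tells you where tests live and whether the setup docs cover the test plumbing.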
Doesn’t anyone use debuggers anymore?
When I have a codebase I don’t know or haven’t touched in some time and there’s a bug, the first step is to reproduce it, then set a breakpoint early on somewhere, grab some coffee, and spend some time stepping through it looking at state until I know what’s happening. From there it’s usually kind of obvious.
Why would one need a graph view to learn a codebase when you can just slap a red dot next to the route and step a few times?
GitHub Next comes to mind
https://githubnext.com/projects/repo-visualization/
Not very useful, is it?
Your visualizer looks great! I really like that it queues up tasks to run instead of only operating on the code during runtime attachment. I haven't seen that kind of thing before.
I built my own node graph utility to do this for my code, after using Unreal's blueprints for the first time. Once it clicked for me that the two are different views of the same codebase, I was in love. It's so much easier for me to reason about node graphs, and so much easier for me to write code as plain text (with an IDE/language server). I share your wish that there were a more general utility for it, so I could use it for languages other than js/ts.
Anyway, great job on this!
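For anyone curious what the smallest version of a "node graph view of the same code" might look like, here is a rough sketch using Python's standard `ast` module. It is not the article's visualizer or the js/ts utility mentioned above. It is a toy that only sees direct calls by bare name between top-level functions in one file, and `EXAMPLE` is invented input.

```python
import ast


def call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function to the locally defined functions it calls."""
    tree = ast.parse(source)
    functions = {
        node.name: node
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    # Start every function with an empty edge set so leaf nodes still appear.
    graph: dict[str, set[str]] = {name: set() for name in functions}
    for name, node in functions.items():
        for child in ast.walk(node):
            # Only direct calls by bare name (foo()), not methods or attributes.
            if (
                isinstance(child, ast.Call)
                and isinstance(child.func, ast.Name)
                and child.func.id in functions
            ):
                graph[name].add(child.func.id)
    return graph


# Invented example input: three small functions in one module.
EXAMPLE = '''
def load(path):
    return parse(read(path))

def read(path):
    return open(path).read()

def parse(text):
    return text.split()
'''

if __name__ == "__main__":
    for caller, callees in call_graph(EXAMPLE).items():
        print(caller, "->", sorted(callees))
```

The edges it returns are the same information a node graph would draw. A real tool would also need to resolve imports, methods, and dynamic dispatch, which is where most of the work is.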
Very cool! For all its faults, seeing control and value change flows through execution is one of the things I really liked about Unreal's Blueprint viz scripting system. This looks like a better take on that.
And for huge git repos I always like to generate a Gource animation to understand how the repo grew, when big rearrangements and refactors happened, what the most active parts of the codebase are, etc.
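If you only want the "most active parts of the codebase" answer as text, a rough approximation (not Gource, and with no animation) is to count changed paths per top-level directory from `git log`. A minimal sketch follows. The function names are mine, and it assumes `git` is on the PATH.

```python
import subprocess
from collections import Counter


def changes_per_directory(log_output: str) -> Counter:
    """Count file changes per top-level directory.

    Expects the output of: git log --name-only --pretty=format:
    (one changed path per line, blank lines between commits).
    """
    counts: Counter = Counter()
    for line in log_output.splitlines():
        path = line.strip()
        if not path:
            continue  # blank separator between commits
        # Files directly in the repository root have no directory component.
        top = path.split("/", 1)[0] if "/" in path else "(root)"
        counts[top] += 1
    return counts


def read_git_log(repo: str = ".") -> str:
    """Run git in `repo` and return the raw list of changed paths."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--name-only", "--pretty=format:"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Inside a checkout, `changes_per_directory(read_git_log()).most_common(5)` gives a quick hot-spot list. It counts file touches rather than lines changed, so treat it as a hint about where activity is, not a precise measure.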
The building of the visualiser was less interesting to me than the result and your conclusion. I agree that finding new ways to ingest the structure and logic of software would be very useful, and I like your solution. Is there a way to test it out?
In reverse engineering we often use Graph View to see execution flow as well. Glad to see it being used elsewhere.
Do you automate that? If so what tooling do you use?
IDA does it by default, for example.
I always thought about doing this visualization in 3D, maybe with VR. Not sure how useful or pleasing an experience it would be. Kudos to the author of the project for getting this done!
I got Minority Report vibes.
This kind of approach might be what (finally) unlocks visual programming?
I feel like most good programmers are like good chess players. They don't need to see the board (code). But for inputting the code transformation into the system this might be a good programmer's chessboard.
Though to make it work concretely for arbitrary codebases I feel like a coding agent behind the scenes is 100% required.
> I feel like most good programmers are like good chess players.
A specific type or area of developers, I'd say. There are many types and not all of them require understanding sizeable code bases to do their work well.
Understanding your large codebase is a few prompts away. You can ask a model to trace through and provide reports on the project's design, architecture, and implementation. From there, you can drill in with followups.
Done right, you may not know specific lines or chunks of code by heart, but much like a tuned-in company CEO, you have eyes and ears on the ground and retain global oversight and insight of the project itself. For specifics, you can learn what you need as you need it. If that means knowing how every single module works, that's just a conversation with your agent.
Do you guys remember the Smalltalk toolkit posted here a while ago, which its creators made specifically to help with understanding new codebases?
https://gtoolkit.com/ or https://moosetechnology.org/
Woah, that Glamorous Toolkit environment looks amazing. Thanks for the pointer.
This is the first thing I used LLMs for: not code generation, but parsers and tooling to gain understanding. It also saves resources in the long run.
One of my favorite uses for Claude Code is to point it at a section of seriously badly written code with undecipherable symbol names, over the top cyclomatic complexity etc and just ask it to make the code readable.
This may be where AI coding tools unlock us. Being able to build tooling against novel concepts that change how we approach reading and writing code. I like it!
A use case that interests me is dynamic visualization for debugging, when there are interacting systems.
To flesh this out, let me see the volume of calls and data from one place to another. Help diagnose back-pressure, drops, rejections, and any other irregularities.
Think of an on-caller who wants to quickly pinpoint a problem. Visualization could help one understand the nature of the problem before reading the code. Then you could select a part of the visualization and ask the computer to tell you what that part does, if there are any recent changes to it, etc.
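As a toy, in-process version of "let me see the volume of calls from one place to another", here is a sketch that counts caller -> callee edges while a function runs, using the standard `sys.setprofile` hook. It only sees Python-level calls inside one process, so it is far from the cross-system view described above, and `validate`, `handle`, and `process` are invented stand-ins.

```python
import sys
from collections import Counter


def count_call_edges(func, *args, **kwargs):
    """Run func and count caller -> callee edges between Python functions."""
    edges: Counter = Counter()

    def profiler(frame, event, arg):
        # "call" fires for Python-level function calls; C builtins arrive
        # as "c_call" and are ignored here.
        if event == "call" and frame.f_back is not None:
            caller = frame.f_back.f_code.co_name
            callee = frame.f_code.co_name
            edges[(caller, callee)] += 1

    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)  # always remove the hook, even on errors
    return result, edges


# Invented stand-ins for "interacting systems".
def validate(item):
    return item >= 0


def handle(item):
    return item * 2 if validate(item) else None


def process(items):
    out = []
    for item in items:
        out.append(handle(item))
    return out


if __name__ == "__main__":
    result, edges = count_call_edges(process, [1, -2, 3])
    for (caller, callee), n in edges.most_common():
        print(f"{caller} -> {callee}: {n}")
```

The edge counts are exactly the weights you would put on the arrows of a live diagram. Spotting back-pressure or drops would need timing and error data on top of this.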
> But I'll admit, this isn't precisely how I would do it today
How would you do it today?
I try to explain what I mean in the next few sentences of the post. I have spent a good amount of my career jumping into fairly large code bases. I don't need to take it quite so step by step. I have seen enough code to take shortcuts, to guess at what is there.
But telling people that isn't helpful. At the beginning I try to give a more step-by-step account of how I would go about understanding the code base if I didn't already know these kinds of shortcuts. (I'm not sure I could write those down; they are just know-how and heuristics, like how when you are starting to code, a missing ; can take much longer to spot than after you've been programming for a while.)
I thought that was curious. He says this isn’t how he would do it today then goes on to do it today (or presumably the same day he wrote that he wouldn’t do it this way today).
Cool project! Would you be willing to share the source code?
Where's the visualizer the blog post talks about?
How is it different from regular code browser/indexers?
You are so lucky to have git history and issues to work from!