Things I've Done with AI

60 points by shepherdjerred 3 hours ago

brotchie an hour ago

Not enough time, too many projects. Useful projects I did over the weekend with Opus 4.6 and GPT 5.4 (just casually chatting with it).

2025 Taxes

Dumped all PDFs of all my tax forms into a single folder, asked Claude to rename them nicely. Asked it to use Gemini 2.5 Flash to extract all tax-relevant details from all statements / tax forms. Had it put together a web UI showing all income, deductions, etc, for the year. Had it estimate my 2025 tax refund / underpayment.

Result was amazing. I now actually fully understand my tax position. It broke down all the progressive tax brackets, added notes for all the extra federal and state taxes (e.g. Medicare, CA Mental Health tax, etc).
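To illustrate what a progressive-bracket breakdown looks like, here is a minimal sketch. The bracket thresholds and rates are made-up placeholders, not real federal or California figures, and `tax_by_bracket` / `total_tax` are hypothetical helper names, not anything from the original project.

```python
from decimal import Decimal

# Hypothetical brackets: (lower bound of the bracket, marginal rate).
# These are NOT real tax figures; they only illustrate the mechanics.
BRACKETS = [
    (Decimal("0"), Decimal("0.10")),
    (Decimal("10000"), Decimal("0.20")),
    (Decimal("50000"), Decimal("0.30")),
]

def tax_by_bracket(income, brackets=BRACKETS):
    """Return one (lower, taxed_up_to, rate, tax) row per bracket the income reaches."""
    rows = []
    for i, (lower, rate) in enumerate(brackets):
        if income <= lower:
            break  # income never reaches this bracket
        # The bracket ends where the next one starts; the last bracket is open-ended.
        upper = brackets[i + 1][0] if i + 1 < len(brackets) else None
        top = income if upper is None else min(income, upper)
        rows.append((lower, top, rate, (top - lower) * rate))
    return rows

def total_tax(income, brackets=BRACKETS):
    """Sum the per-bracket amounts; only the slice inside each bracket is taxed at its rate."""
    return sum((row[3] for row in tax_by_bracket(income, brackets)), Decimal("0"))
```

Using `Decimal` rather than floats keeps the per-bracket rows exact, which makes the breakdown easy to check by hand against a form.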

Finally had Claude prepare all of my docs for upload to my accountant: FinCEN reporting, summary of all docs, etc.

Desk Fabrication

Planning on having a furniture maker fabricate a custom solid walnut top for an office standing desk. Want to create a STEP file of the exact cuts / bevels / countersinks / etc to help with fabrication.

Worked with Codex to plan out and then build an interactive in-browser 3D CAD experience. I can ask Codex to add some component (e.g. a grommet) and it will generate a parameterized B-rep geometry for that feature and then allow me to control the parameters live in the web UI.

Codex found the Open CASCADE Technology (OCCT) B-rep modeling library, which has a WebAssembly-compiled version, and integrated it.

Now have a WebGL view of the desk, can add various components, change their parameters, and see the impact live in 3D.

cj an hour ago

I love the tax use case.
What scares me though is how I've (still) seen ChatGPT make up numbers in some specific scenarios.
I have a ChatGPT project with all of my bloodwork and a bunch of medical info from the past 10 years uploaded. I think it's more context than ChatGPT can handle at once. When I ask it basic things like "Compare how my lipids have trended over the past 2 years" it will sometimes make up numbers for tests, or it will mix up the dates on certain data points.
It's usually very small errors that I don't notice until I really study what it's telling me.
And also the opposite problem: A couple days ago I thought I saw an error (when really ChatGPT was right). So I said "No, that number is wrong, find the error" and instead of pushing back and telling me the number was right, it admitted to the error (there was no error) and made up a reason why it was wrong.
Hallucinations have gotten way better compared to a couple years ago, but at least ChatGPT seems to still break down especially when it's overloaded with a ton of context, in my experience.
- arjie 22 minutes ago
  
  In my case, what I like to do is extract data into machine-readable format and then once the data is appropriately modeled, further actions can use programmatic means to analyze. As an example, I also used Claude Code on my taxes:
  1. I keep all my accounts in accounting software (originally Wave, then beancount)
  2. Because the machinery is all programmatically queryable, the data is not in token-space, only the schema and logic
  I then use tax software to prep my professional and personal returns. The LLM acts as a validator, and ensures I've done my accounts right. I have `jmap` pull my mail via IMAP, my Mercury account via a read-only transactions-only token and then I let it compare against my beancount records to make sure I've accounted for things correctly.
  For the most part, you want it to be handling very little arithmetic in token-space though the SOTA models can do it pretty flawlessly. I did notice that they would occasionally make arithmetic errors in numerical comparison, but when using them as an assistant you're not using them directly but as a hypothesis generator and a checker tool and if you ask it to write out the reasoning it's pretty damned good.
  For me Opus 4.6 in Claude Code was remarkable for this use-case. These days, I just run `,cc accounts` and then look at the newly added accounts in fava and compare with Mercury. This is one of those tedious-to-enter trivial-to-verify use-cases that they excel at.
  To be honest, I was fine using Wave, but without machine-access it's software that's dead to me.
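A minimal sketch of the "compare bank transactions against ledger records" check described above. The `Txn` record shape and the `reconcile` function are assumptions made for illustration; this does not use the beancount or Mercury APIs, and a real setup would first load both sides into a common shape like this.

```python
from collections import Counter
from datetime import date
from decimal import Decimal
from typing import NamedTuple

class Txn(NamedTuple):
    """Hypothetical normalized transaction shared by both data sources."""
    day: date
    amount: Decimal
    payee: str

def reconcile(bank, ledger):
    """Return (missing_from_ledger, missing_from_bank) as sorted lists.

    Counter subtraction is a multiset difference, so two identical
    charges on the same day are counted twice rather than collapsed.
    """
    bank_counts = Counter(bank)
    ledger_counts = Counter(ledger)
    missing_from_ledger = sorted((bank_counts - ledger_counts).elements())
    missing_from_bank = sorted((ledger_counts - bank_counts).elements())
    return missing_from_ledger, missing_from_bank
```

The point of this shape is that the comparison itself is deterministic; the model only has to propose explanations for whatever lands in the two "missing" lists.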
- shepherdjerred 36 minutes ago
  
  I've gotten better results by telling it "write a Python program to calculate X"
  - dmd 14 minutes ago
    
    Yeah, in my user prompt I have "Whenever you are asked to perform any operation which could be done deterministically by a program, you should write a program to do it that way and feed it the data, rather than thinking through the problem on your own." It's worked wonders.
  - cj 21 minutes ago
    
    Good call. I’ve also had better results pre-processing PDFs, extracting data into structured format, and then running prompts against that.
    Which should pair well with the “write a script” tactic.
    
    tavavex 10 minutes ago
    
    Yeah, asking for a tool to do a thing is almost always better than asking for the thing directly, I find. LLMs are kind of not there in terms of always being correct with large batches of data. And when you ask for a script, you can actually verify what's going on in there, without taking leaps of faith.
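As a sketch of the "extract to structured data, then have a script do the math" approach applied to the lipid-trend question upthread: the records below are invented placeholder values, and the tuple layout and the `trend` helper are assumptions, not anyone's actual tooling.

```python
from datetime import date

# Placeholder records as (test date, marker name, value in mg/dL).
# The values are invented purely for illustration.
RESULTS = [
    (date(2023, 3, 1), "LDL", 130.0),
    (date(2024, 3, 1), "LDL", 118.0),
    (date(2025, 3, 1), "LDL", 110.0),
    (date(2024, 3, 1), "HDL", 50.0),
    (date(2025, 3, 1), "HDL", 55.0),
]

def trend(results, marker, since):
    """Return (first_value, last_value, change) for one marker on or after `since`."""
    series = sorted((d, v) for d, m, v in results if m == marker and d >= since)
    if len(series) < 2:
        return None  # not enough data points to describe a trend
    first, last = series[0][1], series[-1][1]
    return first, last, last - first
```

Because every number in the answer comes from a lookup rather than from generated text, a wrong value can only come from the extraction step, which is the part that is easy to spot-check against the PDFs.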
thijsvandien an hour ago

I don't know, but I would never upload such sensitive information to a service like that (local models FTW!) or trust the numbers.
- basch 37 minutes ago
  
  Which part is sensitive? Social is public, income is private but what is someone going to do with it?
  - thijsvandien 23 minutes ago
    
    Now that's a question I'd feel more confident having answered by an LLM. Personally, I'm tired of arguing with "nothing to hide", which (no offense) is just terribly naive these days.
mandeepj 23 minutes ago

> Result was amazing. I now actually fully understand the tax position.
You couldn’t do that with TurboTax or block’s tax file? You don’t have to submit or pay.
MikeNotThePope 24 minutes ago

Be careful with taxes. Hallucinations will cost you.
slopinthebag 23 minutes ago

> had Claude prepare all of my docs for upload to my accountant: FinCEN reporting, summary of all docs, etc.
I imagine your accountant had the same reaction I do when an amateur shows me their vibe codebase.
whattheheckheck 44 minutes ago

I had AI hallucinate that you can use different container images at runtime for EMR Serverless. That was incorrect; it's only at application creation time.
Hope you don't get audited

semiquaver an hour ago

I feel pretty productive myself with AI but this list isn’t beating the rap that AI boosters mostly use AI to do useless stuff focused on pretending to improve productivity or projects that make it easier to use AI.

lukan 27 minutes ago

"Or projects that make it easier to use AI"
I get the sentiment, but this is natural with a groundbreaking new technology. We are still in the process of figuring out how to best apply generative LLMs in a productive way. Lots of people tinker and share their results. Most is surely hype and will get thrown away and forgotten soon, but some is solid. And I am glad for it as I did not take part in that but now enjoy the results as the agents have become really good now.
- harry8 7 minutes ago
  
  > "Or projects that make it easier to use AI"
  This is exactly the same reason why the appropriate question to ask about Haskell is "where are the open source projects that are useful for something that is not programming?"
  The answer for Haskell after 3 decades is very, very little. Pandoc, git-annex, xmonad. Might be something else since I last did the exercise but for Haskell the answer is not much. Then we examine why the kids (us kids of all ages) can't or don't write Haskell programs.
  The answer for LLM coding may be very different. But the question "where is the software that does something that solves a problem outside its own orbit" is crucial. (You have a problem. You want to use foo to solve it, now you have two problems but you can use foo to solve a part of the second one!!)
  The price of getting code written just went down. Where are the site/business launches? Apps? New ideas being built? Specifically. With links. Not general, hand-wavy "these are the sorts of things that ..." because even if it's superb analysis, without some data that can be checked it's indistinguishable from hype.
  Whatever data we get will be very informative.
stavros an hour ago

Here's what I made:
* https://www.stavros.io/posts/i-made-a-voice-note-taker/ - A voice note recorder.
* https://github.com/skorokithakis/stavrobot - My secure AI personal assistant that's made my life admin massively easier.
* https://github.com/skorokithakis/macropad - A macropad.
* https://github.com/skorokithakis/sleight-of-hand - A clock that ticks seconds irregularly but is accurate for minutes.
* https://pine.town - A whimsical little massively multiplayer drawing town.
* https://encyclopedai.stavros.io - A fictional encyclopedia.
* https://justone.stavros.io - A web implementation of the board game Just One.
* https://www.themakery.cc - The website and newsletter for my maker community.
* https://theboard.stavros.io - A feature board that implements itself.
* https://github.com/skorokithakis/dracula - A blood test viewer.
* https://github.com/skorokithakis/support-email-bot - An email bot to answer common support queries for my users.
Maybe some of these will beat the rap.
- saulpw an hour ago
  
  Some of them definitely do not. Like a fictional encyclopedia? What is the point of that? That's like "an alphabetical novel".
  And even for the ones that might "beat the rap", I don't understand from your descriptions why they are interesting or unique. A voice note recorder? Cool. There are already hundreds if not thousands of those, why did you need to make your own in the first place? I'm not saying that yours isn't special, I'm just saying that it doesn't help to post the blandest description possible if you're trying to impress people with the utility of your utility.
  - stavros an hour ago
    
    Sounds like the goalposts are moving from "not useless stuff focused on pretending to improve productivity or projects that make it easier to use AI" to "extremely useful stuff".
    
    saulpw 40 minutes ago
    
    One issue is that I interpreted the parent as OR, not AND. "useless stuff OR productivity tools OR AI tools".
    Moreover though, I'm not even saying you shouldn't do those things. I'm actually playing around with AI quite a bit, and certainly have created my share of useless/productivity tools. But it's not a flex to show off your own Flappy Birds or OpenNanoClaw clone, even if they are written in COBOL or MUMPS.
    And they definitely do not have to be "extremely useful". But they should answer the question: what problem does it solve?
    
    jjee an hour ago
    
    Fair. But finally we are seeing what LLM proponents are putting forward.
    And it’s exactly what I expected - lines of code. Cute. But… so what? This is not good for the AI hype, nor for any continued support for future investment.
    On the other hand all this stuff is going to drive continual innovation. The more tokens generated the more model producers invest. And we might eventually get to a place of local models.
    
    stavros 40 minutes ago
    
    I swear, I'm going to stop commenting on this site, the amount of shitting on people who use LLMs (ie everyone) is just impossible to deal with.
    
    slopinthebag 18 minutes ago
    
    I have the opposite experience, the amount of AI boosters deriding the less enthusiastic, gleefully exclaiming how someone will be "left behind" if they don't immediately adopt the latest hype cycle, or sharing AI slop and either embellishing or outright lying about its capabilities is making me want to log off forever. "Handwritten code? Don't you only care about providing maximum shareholder value?" No.
- profsummergig an hour ago
  
  > "A clock that ticks seconds irregularly but is accurate for minutes."
  Sounds like something that could be tried as a fix for a kind of OCD (obsessive seconds counting).
  - stavros an hour ago
    
    Maybe, although it's actually giving me OCD, I think. It's really hard to tune out because of the irregular ticking. I implemented a regular mode to combat this, defeating the purpose somewhat.
    
    observationist 35 minutes ago
    
    Unpredictable things catch our attention - it's the exceptions that are important to survival, and our brains evolved to cope with the stimuli that this experiment messes with.
    Something like this would be anxiety inducing for most people, I bet. That'd be an excellent experiment, track heart rate, EEG, and performance on a range of cognitive tasks with 2 minute long breaks between each tasks, one group exposed to the irregular ticking, another exposed to regular ticking, another with silence, and one last one with pleasant white noise.
    
    pinkmuffinere 36 minutes ago
    
    what was the motivation for originally making it with irregular ticking?
    
    stavros 31 minutes ago
    
    It sounded fun (and it is)! My favorite mode is one that ticks each second imperceptibly fast, and then stalls for a second in one of the ticks (so that it lasts two).
    It's just the right amount of "did that clock just skip a beat? Nah must just be my imagination".
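One way a clock could tick irregularly yet stay accurate over a minute is to pick the tick lengths so they always sum to exactly 60 seconds. The sketch below is an assumption about how that could work, not code taken from the sleight-of-hand repository; `irregular_ticks` and `stall_ticks` are hypothetical names, and the second function mimics the "slightly fast, then one tick stalls" mode described above.

```python
import random

def irregular_ticks(n=60, total=60.0, rng=None):
    """Return n random tick durations (seconds) rescaled to sum to `total`."""
    rng = rng or random.Random()
    weights = [rng.uniform(0.5, 1.5) for _ in range(n)]
    scale = total / sum(weights)  # rescale so the minute stays accurate
    return [w * scale for w in weights]

def stall_ticks(n=60, total=60.0, stall=2.0, rng=None):
    """Return n durations where one tick lasts `stall` seconds and the rest run slightly fast."""
    rng = rng or random.Random()
    fast = (total - stall) / (n - 1)  # about 0.983 s with the defaults
    ticks = [fast] * n
    ticks[rng.randrange(n)] = stall  # choose which tick stalls
    return ticks
```

With the defaults, the 59 fast ticks are each about 17 ms short of a second, which is what makes the two-second stall the only noticeable event.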
- risyachka 37 minutes ago
  
  It does not matter how much stuff is built. What matters is what comes out of it.
  And with AI the result of 99.9% is abandonware. Just piles of code no one will ever touch again.
  Which proves the point of no productivity gains. It's just cheap dopamine hits.
  - danso 26 minutes ago
    
    The user you're responding to lists a "blood test viewer" [0], which looks to be a tool that turns his blood test PDFs into structured and analyzed data. You're saying that unless he continuously revises/upgrades the code, it's still "abandonware" even if it meets his needs for the near future?
    [0] https://github.com/skorokithakis/dracula
    
    sarchertech 10 minutes ago
    
    Bit rot is real. The dependencies listed here include calling into AI APIs that will stop working with time. So yes if no one keeps this up to date it will rot into useless likely very quickly.
    That’s not even mentioning that this tool doesn’t do much beyond wrap a call to Claude. And it’s using Claude to display blood test data to the end user. This is not something I’d trust an LLM to not mess up. You’d really want to double check every single result.
  - tempaccount5050 31 minutes ago
    
    Missing the point. I no longer need to buy or rely on someone else for software I want to use. A lot of things I want to do ARE one offs. I can write software and throw it away when I'm done.
    
    incr_me 10 minutes ago
    
    I know this sounds sarcastic but I really mean it: For years everyone has been monastically extolling some variation of "the best code is deleted code". Now, we have a machine that spits out infinite code that we can infinitely delete. It's a blessing that we can have shitty code generated that exposes at light speed how shitty our ideas are and have always been.
    
    sarchertech 6 minutes ago
    
    You still need to spend plenty of time verifying they work though unless it’s something where that truly doesn’t matter.
  - grim_io 30 minutes ago
    
    Abandonware is what the customer wants.
    Constant enshittification and UI redesigns are driven by the provider to justify monthly extortion.
shepherdjerred an hour ago

That's a fair criticism of my personal projects. Maybe 3-4 of those could potentially see usefulness outside of myself.
At work, I would say I've done plenty of "useful" things with AI, but that's hard to show off given that I work on an internal application.
SunshineTheCat an hour ago

I've actually felt the same way about some (not all) but some "productivity" hacks I've seen people post online with their OpenClaw setups.
I chuckle when I see some of them because you could achieve the same (or often faster) result by jotting a note onto a notecard and sticking it in your pocket.
Most of the other automations running don't really seem to serve any real purpose at all.
But hey, if it's fun, have at it.
gopher_space 31 minutes ago

I mean I’m using it to deconstruct and reinvent my development process from the ground up, but it’s so easy to do this now and so customized for my specific needs that the idea of posting about it never crossed my mind.

smokel 41 minutes ago

I've written an Obsidian clone for myself, which has proper Emacs keybindings. Took me a few hours too many to get in all the features that I need.

What I find interesting is that I have little motivation to open source it. Making it usable for others requires a substantial amount of time, which would otherwise be just a fraction of the development time.

xorvoid 25 minutes ago

I was thinking about doing the same. Build a clone with AI custom tailored for my own quirks. And not bothering to open source it because it's too bespoke for anyone else. How hard was this? Can you share any advice?

bronlund an hour ago

If you are a parent, you know that feeling when your child is struggling with something and gets frustrated, but you keep silent and don't help because you know that the child has to figure this out by themselves. That's the same feeling I get when I hear all those doom and gloom perspectives on how AI is ruining coding :D

piker an hour ago

> I’ll continue use these tools with the hope that they don’t make me obsolete too quickly.

I'm starting to believe using them is more likely to make you obsolete than not.

vermilingua an hour ago

It baffles me that so many people are so willing to pay for the privilege of training their own replacement.
- jjee an hour ago
  
  But are you though?
  From where I stand this thing is going to provide great leverage to those who don’t simply just write code. I personally doubt the thing will ever get to a place where it can be trusted to operate alone - it needs a team of people and to go super fast you need more people.
  Moreover, the price won’t be high due to competition.
  I’ve changed my view on LLMs as being good, as long as competition is fierce.
  - alas44 32 minutes ago
    
    Looks like a LLM generated comment
    
    shepherdjerred 13 minutes ago
    
    It reads like a human to me. But I understand being suspicious of an account that’s 40min old
    
    jjee 27 minutes ago
    
    Let me guess. You just type code for a living and feel your livelihood is threatened?
    Seriously get your head out of your butt. If I can have a great vision and get it executed that could cause serious positive disruption and it doesn’t require me a thousand engineers to produce and get it to market, how and why is that a bad thing?
    This is something everyone should strive for.
    
    alas44 21 minutes ago
    
    Not sure how hacker news can effectively protect against what looks like fake users posting LLM generated comments :(
shepherdjerred an hour ago

I completely agree. Most programmers work on rather boring and not particularly novel things. If they don't adapt, then they'll be replaced.
I do think it'll be a while before LLMs make significant contributions to complex projects, though. For example I can't imagine many maintainers of the Linux kernel use LLMs much.
- piker 29 minutes ago
  
  No. That's not really where I'm coming from.
  I believe your skills are atrophying when you use these things no matter how trivial the case. That compounds with their bias towards solving problems by producing more code to further reduce your productivity without them.
  - shepherdjerred 11 minutes ago
    
    Ah I read it wrong. I must be using LLMs too much :)
    I do agree with you to some extent. I think anyone who uses LLMs will need to set aside some time writing code by hand to keep their skills sharp.
- max_streese an hour ago
  
  And if we do adapt we might still get replaced because fewer of us will be able to do more. Or we won't, because of Jevons Paradox. Linux maintainers on the other hand can code (with and without AI) what I could not (with or without AI). So in a way becoming a more knowledgeable, more skilled programmer is the way? In any case, too much speculation about the future.
keybored an hour ago

That’s the most craven AI user line I’ve read. Well at least from this week.

ipaddr 16 minutes ago

I've heard a few people say I haven't written a single line of code since ...

What do people think of it?

I personally don't think that's a badge of honor. Aside from losing your coding skills, you miss opportunities to generate AI pieces and connect them to existing systems that can't be fed into the AI. Plus making small changes yourself is easier than having the AI make them without messing something else up.

Maxatar 10 minutes ago

I wouldn't say strictly speaking that I've written no code, but the amount of code I've written since "committing" to using Claude Code in February is absolutely minuscule.
I prefer having Claude make even small changes at this point since every change it makes ends up tweaking it to better understand something about my coding convention, standard, interpretation etc... It does pick up on these little changes and commits them to memory so that in the long run you end up not having to make any little changes whatsoever.
And to drive this point further, even prior to using LLMs, if I review someone's work and see even a single typo or something minor that I could probably just fix in a second, I still insist that the author is the one to fix it. It's something my mentor at Google did with me which at the time I kind of felt was a bit annoying, but I've come to understand their reason for it and appreciate it.
- sarchertech a minute ago
  
  Unfortunately Claude has a context window limit so it’s not going to keep “learning” forever.

JeanMarcS an hour ago

And like everyone else you trained the AI how to replace you by giving it more insight on how to prompt stuff.

shepherdjerred an hour ago

Yes. I also freely release almost all of the code I've ever written, aside from what I've done at work (which I would release if I legally could)

stavros an hour ago

What did you think of Dagger? I used Earthly a while ago but the one thing I didn't like was that it couldn't parallelize runs, since it only ran on one CI instance. Other than that, I liked that I could run my entire CI pipeline locally, but didn't like it so much that I ended up using it for much else.

shepherdjerred an hour ago

I really like Dagger. I had a _lot_ of weird issues with Earthly, like edge cases. Dagger has been mostly solid.
It still has gaps. I don't think they've landed on the right model for CI. Like Earthly, their model is a CI runner + local cache. I believe a distributed cache (like Bazel) makes more sense.
If I were choosing between the two I'd personally always pick Dagger, but I think there is a strong argument for Earthly for simpler projects. If you're using multiple Earthfiles or a few hundred lines of Earthly, I think you've outgrown it.

vunderba 31 minutes ago

I tend to only use LLMs to complete projects that are relatively unique and that haven't been done before. Because if I'm not going to get anything out of the journey, I might as well get something out of the destination.

*Piece Together*

An animated puzzle game that I built with a fairly heavy reliance on agentic coding, especially for scaffolding. I did have to jump in and tweak some things manually (the piece-matching algorithm, responsive design, etc.), but overall I’d estimate that LLMs handled about 80% of the work. It's heavily based on the concept of animated puzzles in the early edutainment game The Island of Dr. Brain.

https://animated-puzzles.specr.net

*Lend Me Your Ears*

Lend Me Your Ears is an interactive web-based game inspired by the classic Simon toy (originally by Milton Bradley). It presents players with a sequence of musical notes and challenges them to reproduce the sequence using either an on-screen piano, MIDI keyboard, or an acoustic instrument such as a guitar.

https://lend-me-your-ears.specr.net

*Shâh Kur - Invisible Chess*

A voice-controlled blindfold chess game that uses novel types of approaches (last N pieces moved hidden, fade over time, etc). I've already been playing it daily on my walks.

https://shahkur.specr.net

*Word game to find the common word*

It's based off an old word game where one person tries to come up with three words: sign, watch, bus. The other person has to think of a common word that forms compound-style words with each of them: stop.

I was quite surprised to see that this didn't exist online already.

https://common-thread.specr.net
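The core check behind a puzzle like this can be sketched in a few lines. This is only an illustration under stated assumptions: the tiny `COMPOUNDS` set stands in for a real dictionary of compound words and phrases, and `links` / `common_words` are hypothetical names, not how the linked game is implemented.

```python
# Tiny stand-in list of known compounds and phrases.
# A real game would need a much larger dictionary.
COMPOUNDS = {"stop sign", "stopwatch", "bus stop", "bus pass", "watch tower", "sign post"}

def links(clue, candidate, compounds=COMPOUNDS):
    """True if clue and candidate form a known compound in either order, with or without a space."""
    forms = {
        clue + candidate,
        candidate + clue,
        f"{clue} {candidate}",
        f"{candidate} {clue}",
    }
    return bool(forms & compounds)

def common_words(clues, vocabulary, compounds=COMPOUNDS):
    """Return the candidates that link with every clue, sorted alphabetically."""
    return sorted(w for w in vocabulary if all(links(c, w, compounds) for c in clues))
```

For the clues from the description, `common_words(["sign", "watch", "bus"], ["stop", "pass", "tower", "post"])` keeps only candidates that pair with all three clue words.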

*A Slide Puzzle*

Slide puzzles for qualified MENSA members. I built it for a friend who's basically a real-life equivalent of Dustin Hoffman's character from Rain Man. So you might have to rearrange a slide puzzle from the periodic table of elements, or the U.S. presidents by portrait, etc.

https://slide-puzzles.specr.net

*Glyphshift*

Transforms random words on web pages into different writing systems like Hiragana, Braille, and Morse Code to help you learn and practice reading them, so you can master the most functionally pointless skills, like being able to read Braille visually.

https://github.com/scpedicini/glyph-shift

All of these were built with varying levels of assistance from agentic coding. None of them were purely vibe-coded and there was a great deal of manual and unit testing to verify functionality as it was built.

lowsong 20 minutes ago

> At work, all that matters is that value is delivered to the business. Code needs to be maintainable so that new requirements can be met. Code follows design patterns, when appropriate, because they are known solutions to common problems, and thus are easy to talk about with others. Code has type systems and static analysis so that programmers make fewer mistakes.

This is a narrow view of software engineering. Thinking that your role is "code that works" is hardly better than thinking you're a "(human) resource that produces code". Your job is to provide value. You do that by building knowledge, not only of the system you're developing but of the problem space you're exploring, the customers you're serving, the innovations you can do that your competitors can't.

It's like saying that a soccer player's purpose is "to kick a ball" and therefore a machine that launches balls faster and further than any human will replace all soccer players, and soon all professional teams will be made up of robots.

slopinthebag 27 minutes ago

> Speaking in the context of solving a problem: does AI need to write beautiful code? No. It needs to write code that works. The code doesn’t need to be maintainable in the traditional sense. If you have sufficient tests, you can throw some LLMs at a pile of “bad” code and have them figure it out.

Code doesn't need to be "beautiful", but the beauty of code has nothing to do with maintainability. Linus once said "Bad programmers worry about the code. Good programmers worry about data structures and their relationships." The actual hard part of software is not the code, it's what isn't in the code - the assumptions, relationships, feedback loops, emergent behaviours, etc. Maintainability in that regard is about system design. Imagine software as a graph, the nodes being pieces of code and the edges being those implicit relationships. LLMs are good at generating the nodes but useless at the edges.

The only thing that seems to work is to have validation criteria (e.g. a test suite) that the LLM can use to do a guided random walk towards a solution where the edges and nodes align to satisfy the criteria. This can be useful if what you are doing doesn't really matter, like in the case of all the pet projects and tools people share. But it does matter if your program assumes responsibility somewhere, like if you're handling user data. This idea of guardrail-style programming has been around for a while, but nobody drives by bouncing off the guardrails to get to their destination, because it's much more efficient to encode what a program should do instead of what it shouldn't, which is the case with this type of mega-test-driven-development. Is it more efficient to tell someone where not to go when giving directions as opposed to telling them how to get there?

Take the Cloudflare Next.js experiment for example - their version passed all the Next.js tests but still had issues because the test suite didn't even come close to encoding how the system works.

So no, you still need to care about maintainability. You don't need to obsess over code aesthetics or design patterns or whatever, but you never needed to do that. In fact, more than ever programmers need to be concerned with the edges of their software and how they can guide the LLMs to generate the nodes (code) while maintaining the invariants of the edges.