Show HN: I AI-coded a tower defense game and documented the whole process
(github.com)
I'm a software developer with 20+ years of experience, but in all that time I never programmed any games, even though I've wanted to for the longest time. With the advent of AI coding agents I thought this was the best time to try, so I learned a bit of Phaser.js (a JavaScript-based game engine) and entered Beginner's Jam Summer 2025, a game jam for beginners in the game dev industry that allows AI coding. After around 25-30 hours (working mainly after my full-time day job) I managed to submit the game I called "Tower of Time" (the theme of the jam was "Time Travel").
You can play it in your browser here: https://m4v3k.itch.io/tower-of-time
The goal of this project for me was first and foremost to see if AI coding is good enough to help me create something that's actually fun to play, and to my delight it turns out the answer is yes! I decided to document the whole process so that I and others can learn from my mistakes, so both the code AND all the prompts I used are published on GitHub (see submission link). The art assets are largely taken from itch.io artists who shared them for free, with some slight touch-ups. Sounds came from freesound.org.
I've also streamed parts of the process, you can watch me working on the final stretch and submitting the finished game (warning, it's 5+ hours long):
https://www.twitch.tv/videos/2503428478
During this process I've learned a lot and I want to use this knowledge in my next project that will hopefully be more ambitious. If you have any comments or questions I'm here to answer!
I'm really enjoying reading over the prompts used for development: (https://github.com/maciej-trebacz/tower-of-time-game/blob/ma...)
A lot of posts about "vibe coding success stories" would have you believe that with the right mix of MCPs, some complex claude code orchestration flow that uses 20 agents in parallel, and a bunch of LLM-generated rules files you can one-shot a game like this with the prompt "create a tower defense game where you rewind time. No security holes. No bugs."
But the prompts used for this project match my experience of what works best with AI-coding: a strong and thorough idea of what you want, broken up into hundreds of smaller problems, with specific architectural steers on the really critical pieces.
> what works best with AI-coding: a strong and thorough idea of what you want, broken up into hundreds of smaller problems, with specific architectural steers on the really critical pieces
As a tech lead who also wears product owner hats sometimes: This is how you should do it with humans also. At least 70% of my job is translating an executive’s “Time travel tower game. No bugs” into that long series of prompts with a strong architectural vision that people can work on as a team with the right levels of abstraction to avoid stepping on each other’s toes.
I tried to build a simple static HTML game for the board game Just One, where you get a text box, type a word in, and it's shown full screen on the phone. There's a bug where, when you type, the text box jumps around, and none of the four LLMs I tried managed to fix it, no matter how much I prompted them. I don't know how you guys manage to one-shot entire games when I can't even stop a text box from jumping around the screen :(
Browser text entry on mobile phones is notoriously hard to get right and some bugs are literally unfixable [1]. I'm a frontend developer in my day job and I struggled with this even before AI was a thing. I think you just accidentally picked one of the hardest tasks for the AI to do for you.
[1] Example: https://www.reddit.com/r/webdev/comments/xaksu6/on_ios_safar...
Huh, that's actually my exact bug. I didn't realize this was so hard, thank you.
I have a reasonably good solution for this project of mine you might find useful:
https://grack.com/demos/adventure/
The trick for me was just using a hidden input and updating the state of an in game input box. The code is ancient by today's standards but uses a reasonably simple technique to get the selection bounds of the text.
It works with auto complete on phones and has been stable for a decade.
A hidden input box is something I've heard about before from some hacker-ish old colleagues. It seems to be a powerful and reliable approach to store state and enable communication between components!
That's promising, thank you! I'll ask the LLM to implement it.
https://xkcd.com/1425/
One of the frustrating things about web dev, I find, is the staggering gulf between apparently nearly identical tasks and unpredictability of it. So often I will find myself on gwernnet asking Said Achmiz, 'this letter is a little too far left in Safari, can we fix that?' and the answer is 'yes but fixing that would require shipping our own browser in a virtual machine.' ¯\_(ツ)_/¯
> what works best with AI-coding: a strong and thorough idea of what you want, broken up into hundreds of smaller problems, with specific architectural steers on the really critical pieces
This has worked extremely well for me.
I have been working on an end-to-end modeling solution for my day job and I'm doing it entirely w/Claude.
I am on full-rework iteration three, learning as I go on what works best, and this is definitely the way. I'm going to be making a presentation to my team about how to use AI to accelerate and extend their day-to-day for things like this and here's my general outline:
1. Tell the LLM your overall goal and have it craft a thoughtful product plan from start to finish.
2. Take that plan and tell it to break each of the parts into many different parts that are well-planned and thoroughly documented, and then tell it to give you a plan on how to best execute it with LLMs.
3. Then go piece by piece, refining as you go.
The tool sets up an environment, gets the data from the warehouse, models it, and visualizes it in great detail. It took me about 22 hours of total time and roughly 2 hours of active time.
It's beautiful, fast, and fully featured. I am honestly BLOWN AWAY by what it did and I can't wait to see what others on my team do w/this. We could have all done the setup, data ingestion, and modeling, no question; the visualization platform it built for me we absolutely could NOT have done w/the expertise we have on staff--but the time it took? The first three pieces probably were a few days of time, but the last part, I have no idea. Weeks? Months?
Amazing.
I wrote a whole PRD for this very simple idea, but still the bug persisted, even though I started from scratch four times. Granted, some had different bugs.
Have you tried with both Claude Opus 4 and Gemini 2.5 Pro?
Opus 4, Sonnet 4, o3, o4-mini-high.
I guess sometimes I have to do some minor debugging myself. But I really haven't encountered what you're experiencing.
Early on, I realized that you have to start a new "chat" after so many messages or the LLM will become incoherent. I've found that gpt-4.1 has a much lower threshold for this than o3. Maybe that's affecting your workflow and you're not realizing it?
No, that's why I started again, because it's a fairly simple problem and I was worried that the context would get saturated. A sibling commenter said that browser rendering bugs on mobile are just too hard, which seems to be the case here.
Same. I had an idea that I wanted to build a basic Sinatra webapp with a couple of features. The first version was pretty good. Then I asked it to use Tailwind for the CSS. Again, pretty good. Then I said I wanted to use htmx to load content dynamically. Suddenly it decided every backend method needed to check if the call was from htmx and alter what it did based on that. No amount of prompting could get it to fix it.
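For readers unfamiliar with the pattern being described: htmx marks its requests with an `HX-Request: true` header, and a backend can branch on that header in one shared helper instead of in every handler. The following is only a minimal sketch in Python (the app in the comment was Sinatra); `is_htmx_request`, `render`, and the layout string are hypothetical names for illustration, not code from that project.

```python
def is_htmx_request(headers: dict[str, str]) -> bool:
    """Return True when the request carries htmx's HX-Request header."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return normalized.get("hx-request", "").lower() == "true"


def render(fragment: str, headers: dict[str, str]) -> str:
    """Hypothetical shared helper: htmx requests get the bare fragment,
    normal page loads get the fragment wrapped in a full layout."""
    if is_htmx_request(headers):
        return fragment
    return f"<html><body>{fragment}</body></html>"
```

With a helper like this, individual handlers would only build their fragment and call `render`, so the htmx check lives in one place.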
Hard to tell what exactly went wrong in your case, but if I were to guess: were you trying to do all of this in a single LLM/agent conversation? If you look at my prompt history for the game from OP you'll see it was created with dozens of separate conversations. This is crucial for non-trivial projects, otherwise the agent will run out of context and start to hallucinate.
Agent mode in RubyMine which I think is using a recent version of sonnet. I tried starting a new agent conversation but it was still off quite a bit. For me my interest in finessing the LLM runs out pretty quickly, especially if I see it moving further and further from the mark. I guess I can see why some people prefer to interact with the LLM more than the code, but I’m the opposite. My goal is to build something. If I can do in 2 hours of prompting or 2 hours of doing it manually I’d rather just do it manually. It’s a bit like using a mirror to button your shirt. I’d prefer to just look down.
> If I can do in 2 hours of prompting or 2 hours of doing it manually I’d rather just do it manually.
100% agree, if that was the case I would not use LLMs either. Point is, at least for my use case and using my workflow it's more like 2 hours vs 10 minutes which suddenly changes the whole equation for me.
Yeah, or 10 minutes of prompting and then 20 minutes of implementing my own flavor of the LLM's solution vs 2 hours of trial and error because I'm usually too lazy to come up with a plan.
CSS is the devil and I fully admit to burning many hours of dev time, mine without an LLM, an LLM by itself, and a combination of the two together to iron out similar layout nonsense for a game I was helping a friend with. In the end, what solved it was breaking things into hierarchical react components and adding divs by hand and using the chrome dev tools inspector, and good old fashioned human brain power to solve it. The other one was translating a python script to rust. I let the LLM run me around in circles, but what finally did it was using Google to find a different library to use, and then to tell the LLM to use that library instead.
I didn't realize this was so hard, thanks. I expected it to be a simple positioning issue, but the LLMs all found it impossible.
Here's the game, BTW (requires multiple people in the same location): https://home.stavros.io/justone/
> what works best with AI-coding: a strong and thorough idea of what you want, broken up into hundreds of smaller problems
A technique that works well for me is to get the AI to one-shot the basic functionality or gameplay, and then build on top of that with many iterations.
The one-shot should be immediately impressive, if not then ditch it and try again with an amended prompt until you get something good to build on.
What I've found works best is to hand-code the first feature, rendering the codebase itself effectively a self-documenting entity. Then you can vibe code the rest.
All future features will have enough patterns defined from the first one (schema, folder structure, modules, views, components, etc), that very few explicit vibe coding rules need to be defined.
I totally agree!
This is the idea behind my recent post, actually [1], where I recommend people use AI to write specs before they code. If all you have to do as a human is edit the spec, not write it from scratch, you're more likely to actually make one.
[1] https://lukebechtel.com/blog/vibe-speccing
Heh, didn't know there was a name for it...
What I've taken to lately is getting the robots to write "scientific papers" on what I want them to get up to so instead of iterating over broken code I can just ask them "does this change follow the specification?" Seems to stop them from doing overly stupid things...mostly.
Plus, since what I've been working on is just a mash-up of other people's ideas, it provides a good theoretical foundation of how all the different bits fit together. Just give them the paper you've been working on and some other paper and ask how the two can be used together. A lot of the time the two ideas aren't compatible, so it saves a lot of time trying to force two things to work when they really shouldn't. Very good way to explore different ideas without the robots going all crazy and producing a full code project (complete with test and build suites) instead of just giving a simple answer.
there is now I suppose! ;)
Yeah it isn't a panacea but it has afforded me less frustration than the alternative of jumping straight in.
> Since what I've been working on is just a mash-up of other people's ideas
Totally, I find most work I do, if I'm honest, is in this bucket. LLMs are pretty good at "filling in the gaps" between two ideas like this
> No security holes. No bugs.
You forgot “Don’t hallucinate.” Noob.
> No security holes. No bugs.
A friend called me for advice on trouble he was having with an LLM and I asked “What exactly do you want the LLM to do?” He said “I want it to knock this project out of the park.” And I had to explain to him it doesn’t work that way. You can’t just ask for perfection.
I mean, you can, but you won’t get it.
>a strong and thorough idea of what you want, broken up into hundreds of smaller problems, with specific architectural steers on the really critical pieces.
Serious question: at what point is it easier to just write the code?
Depends. If you have written other Tower Defense games then it’s probably really close to that line. If you just took a CS class in high school then this vibe approach is probably 20x faster.
My aunt would always tell me that making fresh pasta or grinding your own meat was basically just as fast as buying it. And while it may have been true for her it definitely wasn’t for me.
And if it's a work project, you're going to spend a few years working on the same tech. So by the time you're done, there's going to be templates, snippets,... that you can quickly reuse for any prototyping with the tech. You would be faster by the fact that you know that it's correct and you don't have to review it. Helps greatly with mental load. I remember initializing a project in React by lifting whole modules out of an old one. Those modules could have been libraries the way they were coded.
All of this, and highlighting this part:
>You would be faster by the fact that you know that it's correct and you don't have to review it. Helps greatly with mental load.
I keep thinking maybe it's me who's just not getting the vibe coding hype. Or maybe my writing vs reading code efficiency is skewed towards writing more than most people's. Because the idea of validating and fixing code vs just writing it doesn't feel efficient or quality-oriented.
Then, there's the idea that it will suddenly break code that previously worked.
Overall, I keep hearing people advocating for providing the AI more details, new approaches/processes/etc. to try to get the right output. It makes me wonder if things might be coming full circle. I mean, there has to be some point where it's better to just write the code and be done with it.
Coincidentally those seem to be strongly correlated with success in old fashioned application development as well.
> A lot of posts about "vibe coding success stories"
Where are you reading “a lot of posts” making this specific claim? I’ve never seen any serious person make such a claim
> a strong and thorough idea of what you want, broken up into hundreds of smaller problems, with specific architectural steers on the really critical pieces.
This is how I’ve been using LLM bots since CGPT preview and it’s been phenomenally useful and 100x my productivity
The gap seems to be between people who never knew how to build, looking for a perfect Oracle that would be like a genie in a lamp, then mad when it’s actual work
The thing the last few years have beat into me is that most engineers are actually functionally bad engineers who only know 1:1000th of what they should know in order to know how to build a successful project end to end
My assumption was that all of the bad engineers I worked with in person were an accidental sample of some larger group of really good ones (who I’ve also been able to work with over the years) and that it’s just rare to find an actual capable engineer who understands the whole process
Turns out that’s a trivial minority (like every other field) and most people are pretty bad at what they do
I see 100x used quite a bit related to LLM productivity. It seems extreme because it implies one could generate a year’s worth of value in a few days. I would think delivering features involves too much non coding work for this to be possible.
But that’s precisely what I’m saying is that what I can do today by myself in a couple of days would have taken me a year with a team of three people
The key limiting factor to any project as somebody else in this thread said was “people alignment are the number one hindrance in project speed”
So 10 years ago if I wanted to make a web application that does complex shit I’d have to go and hire a handful of experts have them coordinate, manage the coordination of it, deliver it, monitor it everything else all the way through ideation storyboarding and everything else
I can do 100% of that myself now. It’s true I could’ve done 100% of it myself previously, but again it took a year of side effort to do it
If 100x was really possible, it would be instantly, undeniably obvious to everyone. There would be no need for people alignment because one lone developer could crank out basically anything less complicated than an OS in a month.
It is starting to become obvious to more and more people. And is it really that hard to believe that a tool can extend your natural abilities by 2 orders of magnitude but not everyone can instantly use it? In fact you’re using one right now. Your computer or phone can do many things orders of magnitude faster than you can do alone, but until recently most people had no idea how to use computers and could not benefit from this power.
I believe with LLMs we’re set to relive the same phenomenon again.
I use it at work everyday. I work with people who use it everyday. 100x is complete and utter nonsense.
100x means that I can finish something that would have taken me 10 years in a little over a month.
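The arithmetic behind that claim can be checked with a quick sketch. The day and month lengths used here are averaging assumptions (365.25 days per year, twelve equal months), and `compressed_duration_days` is a hypothetical helper name.

```python
# Assumed averages for converting between years, days, and months.
DAYS_PER_YEAR = 365.25
AVG_DAYS_PER_MONTH = DAYS_PER_YEAR / 12


def compressed_duration_days(original_years: float, speedup: float) -> float:
    """Days needed for work that originally took `original_years`,
    if it really proceeds `speedup` times faster."""
    return original_years * DAYS_PER_YEAR / speedup


days = compressed_duration_days(10, 100)
months = days / AVG_DAYS_PER_MONTH
print(f"{days:.1f} days, about {months:.1f} months")
# prints "36.5 days, about 1.2 months"
```

So under these assumptions a literal 100x speedup turns ten years into roughly five weeks, which matches "a little over a month".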
It would be obvious not because people are posting “I get a 100x productivity boost”, but because show HN would be filled with “look at this database engine I wrote in a month”, and “check out this OS that took me 2 months”.
And people at work would be posting new repos where they completely rewrote entire apps from the ground up to solve annoying tech debt issues.
You’re missing the point by bike shedding on “100x”
It’s probably higher tbh because there’s things I prototyped to test an assumption on, realized it was O(N^2) then dumped it and tried 4 more architecture simulations to get to one that was implementable with existing tool chains I know
So you’re doing exactly what i called out which is evaluating it as a magic oracle instead of what I said which is that it makes me personally something like 100x more productive as a support tool, which often means quickly ruling out bad ideas
Preventing a problem in architecture is worth way more than 100x
If what you meant by 100x more productive is that sometimes for some very specific things it made you 100x more productive, and that isn’t applicable to software development in general, I can see that.
I have many times delivered a year of value in a few days by figuring out that we didn’t actually need to build something instead of just building exactly what someone asked for.
>I have many times delivered a year of value in a few days by figuring out that we didn’t actually need to build something instead of just building exactly what someone asked for.
Knowing what not to do is more of a superpower than knowing what to do - cause it’s possible to know
You can prototype by hand too. Personally I find it might take me 10 min to try a change with an LLM that would have taken me 30 min to 1hr by hand. It's a very nice gain but given the other things to do that aren't sped up by LLM all that much (thinking about the options, communicating with the team), it's not _that_ crazy.
Sorry, I call BS, unless you were a very poor developer without any skills to manage people.
The bottleneck IME is people. It's almost never code. It's getting alignment, buy-in, everyone rowing in the same direction.
Tech that powers up an individual so they can go faster can be a bit of a liability for a company, bus factor 1 and all that.
100x is a bold statement.
You can easily get to 100x in a greenfield project but you will never get to 100x in a legacy codebase.
That depends on the code-base. I've found that hand-writing the first 50% of the code base actually makes adding new features somewhat easier because the context/shape of the idea is starting to come into focus. The LLM can take what exists and extrapolate on it.
> Where are you reading “a lot of posts” making this specific claim?
Reddit.
This is awesome. I've been in software for 20+ years now as well.
One thing I've noticed is many (most?) people in our cohort are very skeptical of AI coding (or simply aren't paying attention).
I recently developed a large-ish app (~34k SLOC) primarily using AI. My impression is the leverage you get out of it is exponentially proportional to the quality of your instructions, the structure of your interactions, and the amount of attention you pay to the outputs (e.g. for course-correction).
"Just like every other tool!"
The difference is the specific leverage is 10x any other "10x" tool I've encountered so far. So, just like every tool, only more so.
I think what most skeptics miss is that we shouldn't treat these as external things. If you attempt to wholly delegate some task with a poorly-specified description of the intended outcome, you're gonna have a bad time. There may be a day when these things can read our minds, but it's not today. What it CAN do is help you clarify your thinking, teach you new things, and blast through some of the drudgery. To get max leverage, we need to integrate them into our own cognitive loops.
This post was interesting to me because I also have a lot of programming experience, but other than Hunt the Wumpus in high school I haven’t programmed a game, and I recently started using AI to help with a new game.
AI has become three things for me:
(1) A learning tool. What it is really great at is understanding my questions when I don’t have the proper terminology. Because of this it can give me a starting point for answers. It is also really fantastic for exposing me to unknown unknowns; probably the most important thing it does for me.
(2) A tool to do boring or tedious things that I can do but slow me down. I’ve found it good enough at a variety of things like commenting code, writing a config file (that I usually edit), or other text-based adventures.
(3) Search. Just like (1), because it understands what I’m actually after, it is irrelevant if I know what a thing is actually called. I also let it filter things for me, make recommendations, etc.
I think you can let it think for you, but… why would you? It’s not as smart as you. It’s just faster and knows more things. It’s like an FPU for the CPU of your brain.
Sorry for the pedantry, but there's little evidence to suggest that neural nets know about unknown unknowns. They know a lot about known unknowns though. Really enjoyed your comment in any case :)
I commented on this before, I'm in this weird "opinion arbitrage" spot where I'm relatively skeptical by HN standards but I'm actually pushing for more usage at work. Hell, I'm typing this while I wait for Claude to be done.
The reason for my skepticism is the delta between what they're being sold as and what they actually do. All AI solutions, including agents (especially agents!), are effectively worse-than-worthless without guidance from someone experienced. There's very little that's "autonomous" about them, in fact.
The very guy who coined the term "vibe coding" went on stage effectively saying we're putting the carriage before the horse!
Omitting the important caveat that while they are fantastic tools they need to be restrained a lot is effectively lying.
My stance has long been that LLMs are currently worse than the evangelists are claiming they are, but are significantly better than the detractors and skeptics think they are.
Like most things, the truth is somewhere in the middle. But unlike many things, they are changing and advancing rapidly, so their current state is not the resting state.
It's the same problem that crypto experiences. Almost everyone is propagating lies about the technology, even if a majority of those doing so don't understand enough to realize they're lies.
I'd argue there's more intentional lying in crypto and less value to be gained, but in both cases people who might derive real benefit from the hard truth of the matter are turning away before they enter the door due to dishonesty/misrepresentation.
My opinion is that it's about the tools you use. Bad tools, bad agentic behavior.
Better spoons, better food.
Spoons are tools you use to consume food; a better analogy would be better kitchen, better food. An induction range, quality good, nice pans, nice knives will definitely enable higher quality food. Of course you can still make crap if you don’t know what you’re doing, the point is that the tools raise the ceiling for someone who does know what they’re doing.
One level deeper... better farm (soil, water, etc.), better food... also you can easily ruin food in a top-of-the-line kitchen by over-processing, adding tainted spices, excessive heat, etc.
Owners of silver spoons tend to eat pretty well.
I've come to this same conclusion pretty strongly in the past few months in particular. I actually had negative comments on my experience with AI previously.
For all the talk of AI hitting a ceiling, the latest tools have improved greatly. I'm literally doing things in hours that'd previously take weeks with little issue. I do of course have to think about the prompts and break it down to a fine grained level and I also have the AI integrated well with the IDE.
The biggest wins are the times you hit a new framework/library. Traditionally you'd go through the 'search for code samples on usage of new library/language/framework -> work those samples into a form that accomplishes your task' cycle. AI is much better for this, to the extent that it often even surprises me. "Oh the library has a more straightforward way to accomplish X than I thought!".
For those who are still skeptical it's time to try it again.
> I do of course have to think about the prompts and break it down to a fine grained level
This is where I’ve found usefulness falling off. Code is much more succinct and exact than English. I was never slowed down by how fast I could type (and maybe some are? I’ve watched people finger type and use the mouse excessively) but by how fast I could understand the existing systems. By the time I could write an expressive prompt in English I might as well have made the changes myself.
I’ve found it enormously useful as google on steroids or as a translator (which many changes that require code often end up being).
> This is where I’ve found usefulness falling off. Code is much more succinct and exact than English.
Depends on how you use English. If you describe all the details down to the last line of requirements — then, yeah. But actually, a lot of requirements are typical and can be compressed to things like "make a configuration page following this config type" and LLM will figure it out and put checkboxes for booleans, drop-downs for enums, and all the boilerplate that goes with them. Sometimes you have to correct this output, but it's still much faster than describing the whole thing.
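As a rough illustration of why a prompt like "make a configuration page following this config type" can be so compressed: the mapping from field types to form widgets is mechanical. This is only a sketch under assumptions; `AppConfig`, `Theme`, `widget_for`, and `describe_form` are hypothetical names, and a real page would emit UI components rather than strings.

```python
from dataclasses import dataclass, fields
from enum import Enum


class Theme(Enum):  # hypothetical enum-valued setting
    LIGHT = "light"
    DARK = "dark"


@dataclass
class AppConfig:  # hypothetical config type handed to the LLM
    notifications: bool = False
    theme: Theme = Theme.LIGHT
    username: str = ""


def widget_for(field_type: object) -> str:
    """Pick a form widget from a field's declared type."""
    if field_type is bool:
        return "checkbox"
    if isinstance(field_type, type) and issubclass(field_type, Enum):
        return "dropdown"
    return "text input"  # fallback for str and anything unrecognized


def describe_form(config_cls: type) -> dict[str, str]:
    """Map each dataclass field name to its widget kind.

    Assumes annotations are real types, i.e. the module does not use
    `from __future__ import annotations`, which would make them strings.
    """
    return {f.name: widget_for(f.type) for f in fields(config_cls)}


print(describe_form(AppConfig))
```

The boilerplate an LLM fills in is essentially this table applied to every field, plus the labels and change handlers that go with each widget.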
How do you integrate it with your IDE? I have used IntelliJ for 15 years, and anything worse than its actually code-aware autocomplete feels like a downgrade, i.e. hallucinating APIs feels totally unnecessary when my old IDE never did that. If our projects had an orderly structure, some test coverage and a reasonable way to manage database migrations, I might let the AI a lot more loose, but alas.
I use IntelliJ. The latest versions have Claude 4 as an option to enable. I almost exclusively use the code-aware chat function rather than allowing the AI to suggest changes as I type, and I use the ability to merge changes from the AI chat window to bring things across as the responses are generated.
This leaves the autocomplete untouched. Basically it’s a way of working where the ai only jumps in when asked. It works really really well.
For me the LLMs are failing laughably to be able to generate some examples of any actively developing library, even for simple stuff. They invent non-existing APIs, use years old functions that are long deprecated, et cetera.
> I'm literally doing things in hours that'd previously take weeks with little issue.
What's an example of this? Some of the ones I see most are: converting legacy code to something modern, building a greenfield app or feature in an unfamiliar language / framework / space.
But at work I don't have these types of jobs, and I want to get this productivity speed-up. Right now I'm stuck at "it helps a lot" but not at turning weeks of work into hours, so I'm trying to get there.
I recently had a need to create educational animations. These were programmatically created using the Manim library in Python.
I'm a mobile dev by trade. The best interaction I had recently was with Python and the Manim library specifically, which are not my area of expertise. This was a series of "Create an animation that shows X with a graph of the result over variables Y". AI gave one-shot successful results with good coding practices for all of this. I could have spent a week coming up to speed on that library and re-remembering all the Python syntax, or I could have fought against doing it at all, but instead: one hour of prompting, "here it is boss, done".
I had similar results doing some updates to the app itself too, fwiw. Android dev has a lot of boilerplate. "Create a new screen to show a list of images in a recycler view". Everyone who's done Android knows the boilerplate involved in what I just stated. Again, one-shot results. Unlike the above, this is something I know how to do well; I just didn't want to type hundreds of lines of boilerplate.
Would that have taken you weeks though?
I imagine reading through a few articles and examples could have gotten you there. I never heard of Manim before but found these pretty quickly:
https://docs.manim.community/en/stable/examples.html
https://manimclass.com/plot-a-function-in-manim/
I am not trying to pick at you, but it feels like what I am currently able to do with AI, shave off a few hours, but not weeks.
I agree with you the ease of cutting through boilerplate is a big win, but it also doesn't register as weeks worth of work for me...
A single graph might save hours. A full feature series where each graph type has yet new syntax to learn is indeed much more. Especially when there's followups, "let's make the graph move over the left half of the screen and then the next animation shows in the right half?" which again were one shot done in minutes with AI. For me just to gain the context of how to move the animation into the left half smoothly and then move all animations that were drawn into a separate animation file into this file and reposition each element from that second file into the right half of the screen would have probably taken a day.
We tend to underestimate engineering time generally. So I wouldn't look at the above and say "that seems doable in X hours". I stand strongly by my assertion that it saved me a week (at least!) all up.
> I stand strongly by my assertion that it saved me a week (at least!) all up.
Fair enough. More power to you then. I'll keep looking for some other examples. Thanks for sharing!
Which IDE are you using?
The JetBrains collection, which now has Claude built in with a subscription option.
> the leverage you get out of it is exponentially proportional to the quality of your instructions, the structure of your interactions, and the amount of attention you pay to the outputs
Couldn't say it better myself. I think many people get discouraged when they don't get good results without realizing that for good results you need to learn how to interact with these AI agents, it's a skill that you can improve by using them a lot. Also some AI tools are just better than others for certain use cases, you need to find one that works best with what you're doing.
When it finally clicks for you and you realize how much value you can extract from these tools there's literally no coming back.
It's not about learning how to interact with AI agents. The only required skills for working with these tools are basic reading and writing skills any decent English speaker would have. Knowing how and when to provide additional context and breaking down problems into incremental steps are common workflows within teams, not something novel or unique to LLMs.
"Prompt" or "context engineering" is what grifters claim they can teach for a fee.
What does make a difference is what has been obvious since the advent of LLMs: domain experts get the most out of them. LLMs can be coaxed into generating almost any thinkable output, as long as they're prompted for it. Only experts will know precisely what to ask for, what not to ask for, and whether or not the output aligns with their expectations. Everyone else is winging it, and their results will always be of inferior quality, until and if these tools improve significantly.
What's dubious to me is whether experts really gain much from using LLMs. They're already good at their job. How valuable is it to use a tool that can automate the mechanical parts of what they do, while leaving them with larger tasks like ensuring that the output is actually correct? In the context of programming, it's like pairing up with a junior developer in the driver seat who can type really quickly, but will confidently make mistakes or will blindly agree with anything you say. At a certain point it becomes less frustrating and even faster to type at normal human speeds using boring old tools yourself.
> It's not about learning how to interact with AI agents. The only required skills for working with these tools are basic reading and writing skills any decent English speaker would have.
This is flatly untrue, just as the same would be untrue about getting the most out of people (but the behavioral quirks of AI systems and the ways to deal with them do not follow human psychology, so while it is inaccurate in the same way as with people, the skills needed are almost entirely unrelated.)
> The difference is the specific leverage is 10x any other "10x" tool I've encountered so far. So, just like every tool, only more so.
One of the best comparisons to me is languages.
The old "lisp [or whatever] lets us do more, faster" idea, but with a fun twist: you can reduce the code you write but still end up with generated code in a fast, high-performance language, without the extra work you would have to do to go add type annotations or whatnot everywhere for SBCL.
But with a gotcha: you are gonna have to do a lot of double-checking on some of the less-easily/obviously-verified parts of the generated code for certain types of work.
And of course highly-expressive languages have resulted in no small number of messy, unmaintainable codebases being built by people without well-specified advance plans. So we're gonna see a lot of those, still.
BUT to me the real, even bigger win - because I spend less time writing 100% new code than integrating old and new, or trying to improve old code/make it suit new purposes, is supercharged debugging. A debugger is a huge improvement over print statements everywhere for many types of things. A machine that you can copy-paste a block of code into, and say "the output looks like [this] instead of like [that], what's going on" and get a fresh set of eyes to quickly give you some generally-good suggestions is a huge improvement over the status quo for a lot of other things as well. Especially the type of things that are hard to attach a debugger to (sql, "infrastructure as code", build scripts, etc, just to start).
In addition, there is a learning curve and a skill ceiling that is deceptively higher than people think. Also, running Claude Opus in some tight agentic harness will give very different results than asking GPT-4o in the browser and copy/pasting stuff around.
> One thing I've noticed is many (most?) people in our cohort are very skeptical of AI coding (or simply aren't paying attention).
I'd hope most devs are using AI heavily when coding. Over the last 6 months the models seem to have reached a level of competence in raw programming skill somewhere around mid- or senior-level, with hilarious variance between brilliance and idiocy.
I think you might be seeing that the most vocal programmers are terrified for their future prospects, and there isn't much room to reason with them, so they're left alone.
Fun game!
In the old days, code reuse was an aspirational goal. We had collections of functions, libraries, etc., but the overhead of reusing specific lines of code, or patterns of lines of code, was too burdensome to be practical. Many tutorials have been published on how to create a tower defense game, meaning there are tons of sample code out there for this domain.
I would ask: given the amount of source material available, when we ask an LLM to generate code, is this really "AI" of any sort, or is it really a new kind of search?
Yours is beautiful; the code is too. I'm sure you had a lot more of a hand in it than just using AI.
I stopped coding a long time ago. Recently, after a few friends insisted I try out AI-assisted coding, I tinkered. All I came up with was a Bubble Wrap popper and a silencer. :-)
https://bubble-pop.oinam.com
https://void.oinam.com
Running chrome on android and the bubbles aren't popping for me. Count is staying at zero. Are you open to PRs? :-)
Yes Please. Absolutely. Feel free to send in any edits. https://github.com/oinam/bubble-pop
I think indie games could be a really good use case for coding AIs. Low stakes, fun-oriented, sounds like a match.
The first commit[0] seems to have a lot of code, but no `PROMPTS.md` yet.
For example, `EnergySystem.ts` is already present on this first commit, but later appears in the `PROMPTS.md` in a way that suggests it was made from scratch by the AI.
Can you elaborate a bit more on this part of the repository history?
[0]: https://github.com/maciej-trebacz/tower-of-time-game/commit/...
Because this was a game jam entry with a one-week deadline I was going pretty fast and didn't bother to use source control for the first 2-3 days of work, hence the huge initial commit. I also wasn't writing down prompts as I went; only after the game was finished did I go back through my chat history in the tools I used and copy all the prompts to the `PROMPTS.md` file.
If you want to follow the history of this project as it was created, the best way would be to read the prompts file from top to bottom. For example, the EnergySystem.ts file was created right after I was done with enemy pathfinding, spawning and tower shooting, and it was created from scratch by the AI using the prompt "I want to implement an Energy subsystem where..."
You can easily let the LLM take over committing. "create atomic conventional commits until the git workspace is clean".
I have a Claude Code session open just for committing the state once I have a piece of work complete and working.
Thanks for clarifying!
Also, good catch in using the chat history to reconstruct the first phases of work.
I believe it can be a fun experiment for others to try to reproduce it from scratch using the prompts and image assets only.
That's pretty impressive and super motivating. Love that you documented the prompts. From my experience "vibe coding" can either speed you up or slow you down. As long as you are using succinct and clear instructions, know how to review code quickly, and understand the architecture, you can really speed up the process.
This is the first I've heard of Augment Code. What does it do? Why did you pick that tool, versus alternatives? How well did it work for you? Do you recommend it?
I'd also like to hear about this—did OP use Augment Code in Cursor? How does that work/what exactly does that get you? Do you pay for both?
I've heard about Augment Code on X and what piqued my interest was their "context engine" which is a fancy way of saying they have a way of navigating big codebases and providing enough context for their LLM to execute your query. It worked really well on a medium-sized codebase in my day job where other agents would fail.
It's a VS Code extension so I'm using it inside Cursor and depending on a task I would either use Cursor's Agent mode (for simpler, more constrained tasks) or Augment Code's (for tasks that span multiple files and are more vague and/or require more steps to finish).
There are downsides though - it's more expensive than Cursor ($50 vs $20 per month) and it can be unreliable - I'm frequently getting errors/timeouts that require hitting "Try again" manually, which is frustrating. I might switch to Claude Code after my plan runs out because I've heard many good things about it recently.
Thanks for this, I made a tower defence a while ago and I had been considering applying an AI to the task of designing new waves and tuning hitpoints/speed/armour
It made me think that one of the things that it probably needs is a way to get a 'feel' for the game in motion. Perhaps a protocol for encoding visible game state into tokens is needed. With terrain, game entity positions, and any other properties visible to the player. I don't think a straight autoencoder over the whole thing would work but a game element autoencoder might as a list of tokens.
Then the game could provide an image for what the screen looks like plus tokens fed directly out of the engine to give the AI a notion of what is actually occurring. I'm not sure how much training a model would need to be able to use the tokens effectively. It's possible that the current embedding space can hold a representation of game state in a few tokens, then maybe only finetuning would be needed. You'd 'just' need a training set of game logs with measurements of how much fun people found them. There's probably some intriguing information there for whoever makes such a dataset. Identifying player preference clusters would open doors to making variants of existing games for different player types.
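To make the "tokens fed directly out of the engine" idea a bit more concrete, here is a minimal sketch of serializing a game snapshot into short text tokens. It is a plain hand-written encoding, not the learned per-element autoencoder the comment speculates about, and every name in it (`GameSnapshot`, `Entity`, `encodeSnapshot`, the token format) is hypothetical rather than taken from any real engine or from the game under discussion.

```typescript
// Hypothetical shapes for illustration only; not from any real game engine.
type EntityKind = "tower" | "enemy" | "projectile";

interface Entity {
  kind: EntityKind;
  x: number; // world position in pixels
  y: number;
  hp: number; // current hit points
  maxHp: number;
}

interface GameSnapshot {
  tileSize: number; // pixels per grid cell
  energy: number;
  wave: number;
  entities: Entity[];
}

// Quantize a fraction in [0, 1] into one of `buckets` integer levels.
function bucket(fraction: number, buckets: number): number {
  const clamped = Math.min(1, Math.max(0, fraction));
  return Math.min(buckets - 1, Math.floor(clamped * buckets));
}

// Encode one entity as "<kind>@<col>,<row>:hp<level>", snapping the position
// to the tile grid and the health to four coarse levels.
function encodeEntity(e: Entity, tileSize: number): string {
  const col = Math.floor(e.x / tileSize);
  const row = Math.floor(e.y / tileSize);
  const hpLevel = bucket(e.maxHp > 0 ? e.hp / e.maxHp : 0, 4);
  return `${e.kind}@${col},${row}:hp${hpLevel}`;
}

// Global state first, then entities in sorted order so that the same
// situation always produces the same token sequence.
function encodeSnapshot(s: GameSnapshot): string[] {
  const header = [`wave:${s.wave}`, `energy:${Math.round(s.energy)}`];
  const body = s.entities.map((e) => encodeEntity(e, s.tileSize)).sort();
  return [...header, ...body];
}

const tokens = encodeSnapshot({
  tileSize: 32,
  energy: 41.6,
  wave: 2,
  entities: [
    { kind: "tower", x: 96, y: 64, hp: 10, maxHp: 10 },
    { kind: "enemy", x: 200, y: 230, hp: 3, maxHp: 8 },
  ],
});
console.log(tokens.join(" "));
// prints "wave:2 energy:42 enemy@6,7:hp1 tower@3,2:hp3"
```

Quantizing positions and health keeps the vocabulary small, which is the property a fine-tuned model would presumably need; whether text tokens like these are enough to convey "feel" is exactly the open question raised above.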
Thanks for sharing! This aligns with a workflow I've been converging on incorporating traceability and transparency into LLM-augmented workflows[1]. One of the big benefits I've realized is sharing and committing prompts gives significantly more insight into the original problem set out to be solved by the developer, and then it additionally shows how it morphed over time or what new challenges arose. Cool project!
[1]https://colinmilhaupt.com/posts/responsible-llm-use/
Thanks for the read! I too have over 20 years in tech and have been going back and forth with Gemini-cli to gamify some tools for integration testing some Enterprise applications and it’s amazing what can be done with Gemini alongside usage of MCP servers. I am finding positive results if I approach problems in chunks and provide clarity in prompt instructions. The AI will make mistakes and sometimes get caught up in loops for some problems (like application routing.. lol) but I am happy to step in and effectively pair program with the AI when issues are present. I notice too that it has never been a better time to enforce things like how Duplication Is Evil because otherwise the AI may make a change in one area and forget that it has similar changes to make in another file. This applies both to programming logic as well as User eXperience and application behaviour.
Anyway what a world. It would have taken me weeks to create what an AI and myself are able to whip up in a few short, and fun, hours.
Giving a personality to Gemini is also a vital feature to me. I love the portability of the GEMINI.md file so I can bring that personality onto other devices and hand-tailor it to custom specifications.
> AIs like to write a lot of code
I vibe coded a greenfield side project last weekend for the first time and I was not prepared for this. It wrote probably 5x more functions than it needed or used, and it absolutely did not trust the type definitions. It added runtime guards for so many random property accesses.
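For readers who haven't seen this pattern, a small sketch of the contrast: the same calculation written with redundant runtime guards, then written trusting the type, plus a single validation at the boundary where untyped data actually enters. The `Tower` type and all function names are hypothetical and only illustrate the style; they are not from the commenter's project.

```typescript
// Hypothetical type for illustration only.
interface Tower {
  level: number;
  stats: { damage: number; range: number };
}

// The defensive style: every property access is re-checked at runtime even
// though the Tower type already guarantees these fields exist.
function damageWithGuards(tower: Tower): number {
  if (!tower || typeof tower !== "object") return 0;
  if (!("stats" in tower) || tower.stats == null) return 0;
  if (typeof tower.stats.damage !== "number") return 0;
  if (typeof tower.level !== "number") return 0;
  return tower.stats.damage * tower.level;
}

// Trusting the type definition: same result for any valid Tower.
function damage(tower: Tower): number {
  return tower.stats.damage * tower.level;
}

// Runtime checks still have a place: once, at the boundary where untyped
// data (JSON, network, storage) is turned into a Tower.
function parseTower(raw: unknown): Tower {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("tower must be an object");
  }
  const { level, stats } = raw as {
    level?: unknown;
    stats?: { damage?: unknown; range?: unknown } | null;
  };
  const dmg = stats?.damage;
  const range = stats?.range;
  if (typeof level !== "number" || typeof dmg !== "number" || typeof range !== "number") {
    throw new Error("tower has missing or malformed fields");
  }
  return { level, stats: { damage: dmg, range } };
}

const parsed = parseTower(JSON.parse('{"level":2,"stats":{"damage":5,"range":3}}'));
console.log(damage(parsed), damageWithGuards(parsed)); // prints "10 10"
```

The guarded version is not wrong, just noisy: multiplied across a codebase it produces the kind of volume the comment describes, and silently returning 0 can hide real bugs that a thrown error at the boundary would surface.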
I enjoyed watching it go from taking credit for writing new files and changes to slowly forgetting, after a few hours, that it was the one that wrote them ... repeatedly calling it "legacy" code and assuming the intents of the original author.
But yeah, it, Claude (no idea which one), likes to be verbose!
I especially find it funny when it loads the web app in the built-in browser to check its work, and then claims it found the problem before the page even finishes opening.
I noticed it's really obsessed with using Python tooling... in a typescript/node/npm project.
Overall it was fun and useful, but we've got a long way to go before PMs and non-engineers can write production-quality software from scratch via prompts.
In my experience Claude Sonnet is much more verbose than Claude Opus, and writes worse code as a result. The difference is pretty striking once you try using them both for the same task.
It generally feels like Opus gives you the 5th or 10th iteration, but Sonnet gives you the first possible solution.
A bug in the intro: in the first round of the first playthrough my turret destroyed one of the critters and the other reached the tower. There was no other prompt or anything happening in the game after that, and I had to restart. The next time, the turret did not destroy any critter, the prompt to use backspace appeared and the game progressed normally.
Interesting, I actually had the turret destroy one of the enemies several times but it didn't prevent the tutorial message from showing up. I'll look into it though, thanks for the report!
Yeah, I cannot reproduce it in subsequent runs even if this happens, no idea what happened that time.
This is a pretty cool game! I love the twists of rewinding time and playing with the keyboard. It would look pretty cool on Reddit, with a level builder. Redditors could build levels to challenge each other and see who can reach a highscore on each of the UGC levels. Check out Flappy Goose and Build It on Reddit to see some examples.
Thanks! The game actually supports gamepads too but I didn't have enough time to put in proper instructions for that.
Fun game! Starred on github for making the development process transparent, including sharing your prompts! :)
I've been using Claude to do the straightforward things that I don't want to do for about a month now. The power of these development techniques is nowhere near fully tapped yet from what I can see.
Oh definitely. Just wait one year. Or five. Or ten. We’re in it for a wild, wild ride.
Such a cool game! Exactly the simple TD game I've been craving for a while.
If you ever want to build this out in Unity, you should try https://www.coplay.dev/ for the AI copilot
Thanks for the game!
Very cool and I wish it lasted longer.
I’m finding incredible amusement in the idea of there being people who check-in prompts as the source code, and the “reproducible builds” people, and sitting them next to each other at a convention.
Curious if anyone here has tried rosebud.ai for something similar. I looked into it, and it did appear to break it down into steps, but can't really produce anything that runs without upgrading to a paid tier.
I couldn't stop playing this game, very engaging :) Thanks!
I didn't find any mention of the costs spent on Claude.
I've mentioned this in other replies, but I actually didn't pay for Claude directly, only via Cursor with their $20/mo subscription.
I'm using Claude for my side projects. The pricing tiers are Free (for Sonnet 3.5), $20/m for Opus 4, $100/m for max (5x usage limit as $20) and then a $200 tier above that at another 5x I believe.
The usage limits reset every 6 hours.
Part of me thinks Rockstar delayed the release of GTA 6 because they realized they can polish the game by a significant margin using the latest AI tools.
How many tokens did you use up and what did you pay for them?
I can't give you the token count because I didn't really track that (Augment Code does not give you detailed token stats) and I'm not paying per token - I use the Developer plan on Augment Code ($50/mo) and Pro plan on Cursor ($20/mo). Didn't pay for additional usage and I have requests to spare on both of them.
As far as stats go from the provider dashboards I see:
- 7667 lines of Agent Edits accepted on Cursor
- 105 messages (prompts) on Augment Code
Ok folks, I need a hint. I can't ever build up enough energy to afford a second turret. What's the secret?
Focus on upgrading your first turret to level 2 before building a second one - this increases energy generation significantly and makes additional turrets affordable.
Ah!! I didn’t even realize I could upgrade. Thanks!
Um, you can’t ^^’. The key is to use the rewind power very sparingly, only as long as it’s required to destroy the enemy. This way you’ll stay net positive on energy generation for the first 2-3 waves. You should be able to afford the 2nd tower by the end of Wave 2.
I was barely using the backspace, I even tried letting most of the enemies through, but I never built up enough energy for a second tower.
Great game! The rewind time "skill" is like playing an Edge of Tomorrow game.
Excited to try this when I’m on a computer. Thanks for sharing everything!
How much time did it take start to finish?
It took about a week working on it on and off in my spare time. I'd say probably 25-30 hours total.
so cool
After scanning through the video, the first 20 minutes is a guy doing coding with no AI involved. He's manually designing a level in a pre-made level editor. He's manually writing code in a pre-made IDE. He's not having AI code.
At the 20 minute mark, he decides to ask the AI a question. He wants it to figure out how to prevent a menu from showing when it shouldn't. It takes him 57 seconds to type/communicate this to the AI.
He then basically just sits there for over 60 seconds while the AI analyzes the relevant code and figures it out, slowly outputting progress along the way.
After a full two minutes into this "AI assistance" process, the AI finally tells him to just call a "canBuildAtCurrentPosition" method when a button is pressed, which is a method that already exists, to switch on whether the menu should be shown or not.
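For context, the change being described is roughly of this shape. This is a hypothetical reconstruction: only the method name `canBuildAtCurrentPosition` comes from the comment above; the class, its fields, and the blocked-tile logic are invented stand-ins and are not the game's actual code.

```typescript
// Hypothetical stand-in for the game's build logic. Only the name
// canBuildAtCurrentPosition is taken from the discussion.
class BuildController {
  menuVisible = false;
  cursor: { col: number; row: number };
  private blockedTiles: Set<string>;

  constructor(blockedTiles: Set<string>, cursor: { col: number; row: number }) {
    this.blockedTiles = blockedTiles;
    this.cursor = cursor;
  }

  // Assumed to exist already: true when the tile under the cursor is free.
  canBuildAtCurrentPosition(): boolean {
    return !this.blockedTiles.has(`${this.cursor.col},${this.cursor.row}`);
  }

  onBuildButtonPressed(): void {
    // The single-line fix: only show the menu where building is allowed.
    this.menuVisible = this.canBuildAtCurrentPosition();
  }
}

const controller = new BuildController(new Set(["2,3"]), { col: 2, row: 3 });
controller.onBuildButtonPressed();
console.log(controller.menuVisible); // prints "false" (tile 2,3 is blocked)

controller.cursor = { col: 4, row: 1 };
controller.onBuildButtonPressed();
console.log(controller.menuVisible); // prints "true"
```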
The AI also then tries to do something with running the game to test if that change works, even though in the context he provided he told it to never try to run it, so he has to forcefully stop the AI from continuing to spend more time running, and he has to edit a context file to be even more explicit about how the AI should not do that. He's frustrated, saying "how many times do I have to tell it to not do that".
So, his first use of AI in 20 minutes of coding, is an over two minute long process, for the AI to tell him to just call a method that already existed when a button is pressed. A single line change. A change which you could trivially do in < 5 seconds if you were just aware of what code existed in your project.
About what I expected.
Congratulations, you've managed to spot all the pain/weak points of the whole ~30 hour process without ever noticing all the good points. It takes a special type of skill to do so :).
> About what I expected.
But then again, you opened the video already with an expectation to see failure, so of course you found it.
One could find a 5-minute slice of any highly successful project I’ve worked on where my actions look foolish and my tools look broken.
Isolating your analysis of this to a single unflattering interaction is intellectually dishonest; you have a bone to pick.
[dead]
Why does this and the follow up comments feel like a sneaky ad for this “Augment Code” tool?
You might want to read a reply I just posted to another commenter about what I dislike about this tool and how I'll probably switch to a different one [0]
[0] https://news.ycombinator.com/item?id=44463967#44465304
Maybe because it stands out against the other (very well known) tools mentioned in the readme
I came back here specifically to ask about Augment Code, after going to their website and not really understanding what it is
[flagged]
Did you actually try the game? It has a pretty unique gameplay element to it. I'd say that's where the passion and creativity come in.
Isn't creativity always impressive if done well?
Code is a medium, painting is a medium, piano is a medium, and prompting is a medium.
This sounds a little bitter.
Prompting is telling someone or something else to do something for you. Is management creative?
The most creative aspect of this is the idea. The rest is boring.
Being precise with language and defining specifications based on domain knowledge is generally creative. The better analogy is product design rather than product management.
Please tell me - how exactly do you think typing English prose into a box is different from typing a bunch of English keywords and symbols in the exact correct order into a box? To me these are both conceptually the same. But typing English is, for me, an order of magnitude faster than typing code, especially in a domain that's new to me (like game dev).
> What did _you_ do? How were _you_ creative? Where in this project is _your_ passion?
My passion is not, and never was, in the act of typing code. It's in creating something that's useful or, in this case, fun. It's the end product that matters to me. I've put a lot of thinking into how exactly each element should work, look and behave. Then I verbalized all this thinking and created prompts that got turned into code, which resulted in a game that's apparently fun for people to play. And that's it.
Thank you for taking time to check my project.
Did you even look at the content of the post?
There's a prompts.md (link: https://github.com/maciej-trebacz/tower-of-time-game/blob/ma...) that shows what the author did in this project, most of which is providing creative direction and corrective nudges to the AI.
Most people only care about the end result, and vibe coding got the author there much faster and with less effort.
Your comment reads like a carriage driver bemoaning car drivers because they didn't have to feed, groom, harness, and command a team of horses, yet still arrived successfully at their destination.
"Why should I be impressed that you turned a wheel and pushed a pedal?"
It's just a bunch of ai bots upvoting
[flagged]
This entire account is AI generated comments. Huh.
[flagged]
I am so relieved to hear this. I know this is a seemingly random request, but could you please list top 10 cities in the world that have the most turtle-like names? It's an urgent issue, matter of life and death.
Putting an emdash in a comment about how human you are is a bold move!