SeriousM's comments

I found myself in a similar workflow. Depending on the task at hand (starting a new project, enhancement, maintenance), I let the agent create/read the markdown files that I keep updated (AGENT, STATE, ROADMAP, DESIGN, ARCHITECTURE, (CODESTYLE if I plan to modify it myself)). Then I choose the various roles that I need in this session and have a planning phase. After that, the agent starts implementing the changes and I have a manual correction phase.

This flow works for my needs: building idea demos, prototypes or tools for my own sake. I don't let agents code in our main code base, where everything is still hand-tailored. That's a conscious decision.

I noticed that the cheaper models (flash, ...) are quite hard to hold back from changing files. A question about possible options sometimes results in "yes, I'll go with option A" without asking back. Frontier models, on the other hand, love to plan and explicitly ask for your consent.

I use pi.dev with almost no skills at all to understand how models really work and "feel" to work with.


> A “thinker” who doesn’t write, who skips the step of “merely” synthesizing their vague thoughts into prose, is not thinking. And then these people give their noise to the AI.

OP is quite good with words and has a high standard and world view. The reason why people use AI to manifest their ideas is probably because they have no other way to communicate otherwise.

It's a medium to pack the idea into "something" that represents the idea. It was never about a finished and polished product. It's the sign language for deaf people - a way to show your thoughts.

I'm certain that the people presenting their GitHub repo do put quite some effort (= prompt work) into it, which IS the thinking process. At the end of the day, most developers are introverts who can think very well but have a hard time with soft skills.

Everyone wants to be proud of their work; let's not blame them for how they show it off.


> The reason why people use AI to manifest their ideas is probably because they have no other way to communicate otherwise.

Isn't this a bit circular? They're not communicating to the AI through a BCI.


>It's a medium to pack the idea into "something" that represents the idea. It was never about a finished and polished product. It's the sign language for deaf people - a way to show your thoughts.

Sign language is a fully fledged language, as capable of expressing deep and complex thoughts as spoken English. Likening it to some kind of prosthetic for second-rate thinkers is insulting.


> The reason why people use AI to manifest their ideas is probably because they have no other way to communicate otherwise.

What?! This is nonsense. You’re really making the argument that most people getting LLMs to write for them just couldn’t communicate in any way five years ago?


The five-years-ago internet was certainly full of incoherently expressed ideas (and still is now). For some people AI is just spellcheck on the sentence/paragraph level.

As a reader, I appreciate reading writing that lacks large amounts of spelling mistakes. Everyone agreeing on spelling seems like a useful monoculture, like driving on the same side of the road.

But I don't feel the same way about AI writing. It feels totally different in a way that good spelling does not.

Even if I liked the style, I would object strongly to that style quickly becoming a monoculture.

We're on a path to a style optimized for shallow attention maximization becoming the majority of text we read.


Thanks! That's exactly what we need for our 6-person team.

I would love to see push10k on the Android Play Store/F-Droid. It looks so inviting and motivating that I searched for equivalent alternatives for Android but found none! Would you, maybe, publish it there as well, pleeeease?

Could we please stop putting price tags on 15-commit repos? It's just crazy that every idea created with AI now costs $10 or more per month, even though it cost $5 to create.

Yes. In my case (and I guess everyone's use case is subjective) my system prompt states to read the AGENT.md file when possible.

On a new project I usually set up the context of the model (language to use, reason for the product/prototype, etc.) and then I tell the LLM to write an AGENT.md, STATE.md and ROADMAP.md. I don't tell the LLM what goes in there because the model has its own directive and flavor for what should be in these files. The models already know the purpose of these files themselves! In a new session, I let the agent read the markdown files in order to continue with the work. Before a session ends, I let the LLM update the markdown files. Maybe one word of caution: don't switch models. It's like putting another person at a workstation and asking them to continue someone else's work.
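For anyone who wants to script the "read the markdown files at session start" part, here is a minimal sketch in Python. The three file names come from the workflow above; the function names, the file ordering and the `## <file>` section format are my own assumptions, not something any particular agent tool requires.

```python
from pathlib import Path

# File names taken from the workflow described above; the order is an assumption.
CONTEXT_FILES = ("AGENT.md", "STATE.md", "ROADMAP.md")


def build_session_preamble(project_dir: str) -> str:
    """Concatenate the context files that exist into one prompt preamble."""
    sections = []
    for name in CONTEXT_FILES:
        path = Path(project_dir) / name
        if path.is_file():
            text = path.read_text(encoding="utf-8").strip()
            # Hypothetical layout: one heading per file so the model can tell them apart.
            sections.append(f"## {name}\n\n{text}")
    return "\n\n".join(sections)


def missing_context_files(project_dir: str) -> list[str]:
    """List the context files the agent still has to write (e.g. on a new project)."""
    return [n for n in CONTEXT_FILES if not (Path(project_dir) / n).is_file()]
```

The preamble can then be pasted or piped into whatever agent you use, and `missing_context_files` tells you whether you are in the "new project" case where the model should write the files first.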

Easy setup, really good outcome!


I do something similar. I don’t necessarily call it AGENT or STATE and every project has its own files. I have architecture documents that accompany change log descriptions that load technical knowledge that the agent can readily use.

I also find it necessary to have a principles document outlining the particular problems that the software is supposed to solve and the guardrails not to cross. I call it promise-driven development.


Thank you for sharing your use case! I like your product very much!

Could you talk a bit about how you did the fine-tuning? Did you use unsloth or any other tool, and how did you verify the outcome?


Thank you!

Yea absolutely, but man, where to even start, it is very specific.

Fundamentally I didn't use any wrappers like unsloth or axolotl, although I used the latter a year or two back and it was good, but I needed something very very custom. I also wanted the whole pipeline, from fine-tuning to the exported OpenVINO model, to be seamless.

I heavily leaned on codex, claude and some manual sleuthing around the internet to understand what I needed. I'd played about with QLoRA fine-tuning with axolotl before and felt most comfortable with that. So I needed to keep everything as stripped down as possible and figured I could just utilise the 3 main Hugging Face libraries (transformers, peft and datasets) and also bitsandbytes (as suggested by claude, to quantize the model so it keeps working on my GPU) along with some custom scripts generated by claude/codex (each cross-referencing the other) that do the different stages of the training run.

The next part was the data. Obviously I didn't have access to thousands of meetings and associated output documents, but I did have a 3090 Ti sitting there and a codex subscription. So I set about working out what format I needed the data in (many thanks again to claude/codex) and started generating hundreds of different transcripts: different numbers of speakers, content, tones, subjects, spelling mistakes, like all the different things you could think a meeting would have. Then it's a case of actually generating a good meeting document off the back of the transcripts and creating the "gold standard" that we'd use.

I'm going to gloss over a lot here, as I'd rather not detail it since it relates to some proprietary stuff that I had to work through, but you basically pair the transcripts with their gold-standard documents and run the training.

At the verification stage, there were pretty much 3 things:

1. "Just" do some regex string matching to see whether the key facts from the source transcript show up in the output, to ensure fact preservation. Same with owner fabrication (who said what): I don't want something attributed to someone when it wasn't them who said it. And then finally markdown validation.

2. Using codex/claude to validate the transcript and output from the model - I used the latest frontier models, probably overkill for my task, but they were good at the job

3. Finally, me going through some actual recordings of myself, groups, meetings and manually verifying the output
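To make the first check a bit more concrete, here is a rough Python sketch of what regex-based fact and owner checks could look like. This is not the actual pipeline. The `Speaker: utterance` transcript format, the `- [ ] Owner: task` action-item format and both function names are assumptions for illustration. The owner check only flags names that never speak in the transcript, which is weaker than checking who said what, and markdown validation is left out.

```python
import re


def missing_facts(key_facts: list[str], summary: str) -> list[str]:
    """Return the key facts that do not appear in the summary (case-insensitive)."""
    return [
        fact for fact in key_facts
        if not re.search(re.escape(fact), summary, flags=re.IGNORECASE)
    ]


# Assumed transcript line format: "Speaker: utterance".
SPEAKER_LINE = re.compile(r"^([A-Za-z][\w .'-]*):", flags=re.MULTILINE)
# Assumed action-item format in the generated document: "- [ ] Owner: task".
OWNER_LINE = re.compile(r"^- \[[ x]\] ([A-Za-z][\w .'-]*):", flags=re.MULTILINE)


def fabricated_owners(transcript: str, summary: str) -> set[str]:
    """Return owners named in the summary who never speak in the transcript."""
    speakers = {m.strip().lower() for m in SPEAKER_LINE.findall(transcript)}
    owners = {m.strip() for m in OWNER_LINE.findall(summary)}
    return {o for o in owners if o.lower() not in speakers}
```

Exact string matching like this misses paraphrased facts, which is presumably why the second and third checks (model review and manual review) exist on top of it.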

So a fair bit of work, and for context I'm on version 10 now, so it's been a journey!


Thank you so much for the insight, I really appreciate it!

And its fuel is GitHub Copilot AI credits, burned in 2 days. Good luck running a business on scout once your credits are gone.

Don't worry, they'll make it borderline free at the start, let everyone integrate into their org, and then jack up the price after just like they do with everything else.

And when you had a tool call that asked the user for the next step, you could easily run a whole day on 4c. Guess how people got $5k worth of tokens out of $100 spent.


It's about the comments that are currently exploding.

