I am working on solving the AI Code Provenance problem and I believe my repos may be the first to provide AI code provenance. See the following example:
Notice how the code block header attributes the model. The UUID can be traced to the conversation so everybody can tell exactly how the code came about. If you are truly serious about AI code provenance, though, you need to use my chat app, since it ensures the conversation can't be tampered with.
I also have a much more human-focused method, which is part of my CLI tool.
I am currently looking at making pi (https://github.com/earendil-works/pi) support AI code provenance, but for now if you want a more structured way to capture what you have done in an agent session that can be used in code reviews and be carried forward as knowledge that lives inside your repository, I have
gsc lessons
The basic idea is, after you have finished chatting/working with the agent, you would work with it to identify lessons worth carrying forward. You can store your session if you want, but really, the lessons should be something that can help you review code better and to prevent future mistakes.
This is honestly what I care about the most now: how well they can write. I think we have reached a point where, if you know how to program, you can provide enough information for the models to do pretty much what you need.
What they still struggle immensely with is writing, which has too many nuances, but they are truly getting better.
> So it's a scary time to work in tech even if I think the trend will ultimately reverse.
I honestly can't see things going back to what they were 5 years ago. We will probably not have the future that Anthropic hopes for, but I think every developer will be required to chat with AI as part of the planning process to reduce a company's "bus factor" risk.
> A well thought out pros and cons story wins over binary yes/no answers at pro and anti ai companies alike.
The issue with this is that you need to really know how to program to be able to articulate the pros and cons, which a new grad would most likely not be able to do.
For example, if you want to include how AI can help you onboard quickly, you really need to understand the pain points, like "I tried asking people, but really, everybody is busy." Or "I've found coding agents help me speed up making code changes, but in some situations they can also accelerate making mistakes."
I think the issue a lot of new grads face is that you don't know what you don't know.
I don't know about that, and I am 100% biased so take what I say with a grain of salt. My position is very much this: you may not trust coding agents to make code changes, but if you're not willing to treat them as a research aid or have them work for you, you're pretty much saying they can't help you work more efficiently.
It's a fork of BurntSushi/ripgrep. What I hope to show with it is that you don't have to use coding agents to code. They can be used to surface knowledge that's buried in documents, issue comments, PR discussions, and other places.
Believing coding agents are trendy would be like saying search was trendy in 1998. They're not going to change the world the way Anthropic wants us to believe, but they will shape how humans develop software. And I think for the better, since AI is capable of processing information at scale to help you move forward.
Regardless of what you think about LLMs (I do find them useful), there’s something really odd about thinking that you can select for “proven experience” in a young technology where current experience appears to have little to do with experience 15 months ago, and where the biggest boosters fully claim it will have nothing to do with experience in 15 months.
What you’re selecting for is enthusiasm, knowing the current shibboleths of the in-group, and possibly for who knows how to use them to make a good demo.
And, fair enough, if that's what you want. But it's not "proven experience" in my mind.
> replacing deterministic systems in their support flows
The issue is, they don't want to provide "better" support but "cheaper" support. Imagine a trained agent that understands the big picture. Now imagine a company investing in humans to use AI to retrieve knowledge that the human can easily identify as being relevant or not, and using that knowledge to better aid the customer.
Right now AI is being sold as "we don't need support personnel" instead of "how can we provide better service?" For a lot of products, better service will probably not matter, as "cheaper" products will win most of the time.
Most people don't want to pay for better. They want to pay the same for something better, and I think that is exactly what companies are not investing time in figuring out how to use AI for.
A lot of people want to pay for better, but that is hard. Better is more expensive, most of the time, but being more expensive is no guarantee of being better. It feels like the correlation is very weak. Most expensive products are just expensive, not good.
If there was a reliable way to identify the "better" thing, I and a lot of other people would go for that every time we can.
> first for the OSS project and again for a commercial product.
Is there a way to reach out to you? I would like to hear what you have to say about what I am working on. You can update your HN profile to include contact information, or you can reach out to me using my HN profile.
I'm basically working on a portable intelligence layer for AI that I will be open sourcing, and the commercial product will make the intelligence layer even smarter. I can share the Show HN post I am working on, which better explains the value proposition, and I would love to hear any lessons you have learned while trying to sell AI tools commercially.
Edit: In case somebody calls me out on it, I didn't want to use the `tensorzero` email domain in case the domain was going to become defunct soon.
This is one of those double-edged sword situations. It is on the front page and it stays there because it will trigger a lot of people, and he has to spend a lot of effort explaining himself. What is that worth?
His explanations would most likely be buried deep so the impression that others get might be worsened. What is that worth?
In my opinion, this is one of those cases where you could find a harder problem and still have the same content, but it might not draw as much feedback or stay on the front page as long.
you can see that I attribute the models used. What I found was that 4.7 was not very good at `go` code, which is why you start to see `Gemini 3 Flash` in the attributions.
4.7 is what Cerebras provides, and for me, iteration speed is a lot more important. Having played around with MiMo v2.5.0-Pro, I am 100% sure it could have done what Gemini 3 Flash did.
There were a few points where I was stuck and needed Sonnet to explain things to me, but I think the dirty secret that Anthropic and OpenAI won't tell you is, if you know how to code, the models are honestly good enough.
Based on my experience with MiMo and what others are saying about GLM 5.1, we are now in a hardware race. The Chinese models are a 100% drop-in replacement for Claude if you know how to program and want AI to amplify what you know. What I will consider now is which provider offers the fastest inference.
MiMo-v2.5.0-Pro-Ultraspeed is really good at generating good results quickly and burning your money just as fast.
you can see that every file has a code block header with a UUID and the AI that was attributed to it. With the UUID, I can tell exactly how the code came about.
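As a rough illustration of the idea, here is a minimal sketch of reading a model name and conversation UUID back out of a file's first line. The header layout (`provenance: model=...; conversation=...`), the `Provenance` type, and `parse_header` are all made up for this example; they are not the actual header format used in the repos mentioned above.

```python
import re
import uuid
from dataclasses import dataclass
from typing import Optional

# Hypothetical header layout, invented for this sketch only.
# Example: // provenance: model=Gemini 3 Flash; conversation=<uuid>
HEADER_RE = re.compile(
    r"^\s*(?://|#)\s*provenance:\s*model=(?P<model>[^;]+);"
    r"\s*conversation=(?P<conv>[0-9a-fA-F-]{36})\s*$"
)


@dataclass(frozen=True)
class Provenance:
    model: str
    conversation_id: uuid.UUID


def parse_header(first_line: str) -> Optional[Provenance]:
    """Return the attribution found on a file's first line, or None."""
    match = HEADER_RE.match(first_line)
    if match is None:
        return None
    try:
        conversation_id = uuid.UUID(match.group("conv"))
    except ValueError:
        # Right length, but not a well-formed UUID.
        return None
    return Provenance(match.group("model").strip(), conversation_id)
```

A reviewer's tooling could call `parse_header` on each changed file and look up the returned `conversation_id` wherever the conversations are stored. Parsing alone proves nothing about tampering; that still depends on where the conversation record lives.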
What they are working on will be more useful for AI code provenance. It is only a matter of time before you are expected to show your chats with AI as part of the code review and for performance reviews.
So I don't see human collaboration being the main use case. I see tracking, studying and improving the Human-AI relationship...and seeing if somebody should be promoted or not.
An interesting take I've heard is, we will have a token/impact stat where if you spent a shitload of tokens to produce the same impact as somebody else who spent a lot less, you will be the prime candidate for layoffs and/or less pay. This is why I think AI code provenance will become a serious thing in the future.
Audit purposes for sure. How was this code/concept generated, what were the prompts/requirements, what thinking did the model complete, can this be replicated or repeated, etc.
A vendor conference I was at a few weeks ago focused heavily on this for most of their agentic workflow items: how can you show the AI's work, prove that what it did was within guidelines, and then audit the process and result?
Like, if your system has an AI backed federated search for documents and you ask it a question about those documents, you need an audit trail of the ask, what documents were referenced, and what was returned to the user.
Then, if wrong information was supplied, that can be pinned down and explained in case of a lawsuit or other need.
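To make the audit trail above concrete, here is a minimal sketch of a log that records the ask, the documents referenced, and what was returned. The `AuditRecord` fields and the `append_record` and `verify` helpers are hypothetical names for this example. Chaining each record to the digest of the previous one is my own assumption about how to make after-the-fact edits detectable, not something described at the conference.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class AuditRecord:
    asked_at: str            # timestamp supplied by the caller
    question: str            # what the user asked
    document_ids: List[str]  # which documents the search referenced
    answer: str              # what was returned to the user
    prev_digest: str         # digest of the previous record, "" for the first

    def digest(self) -> str:
        # Serialize with sorted keys so the digest is stable.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def append_record(log: List[AuditRecord], asked_at: str, question: str,
                  document_ids: List[str], answer: str) -> AuditRecord:
    """Append a record that is chained to the last record in the log."""
    prev = log[-1].digest() if log else ""
    record = AuditRecord(asked_at, question, list(document_ids), answer, prev)
    log.append(record)
    return record


def verify(log: List[AuditRecord]) -> bool:
    """Check that every record still points at its predecessor's digest."""
    prev = ""
    for record in log:
        if record.prev_digest != prev:
            return False
        prev = record.digest()
    return True
```

With a chain like this, editing an earlier record changes its digest and breaks the link from the next record. The final record is only protected if its digest is also stored somewhere outside the log.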
https://github.com/gitsense/gsc-cli/blob/main/internal/cli/r...
Notice how the code block header attributes the model. The UUID can be traced to the conversation so everybody can tell exactly how the code came about. If you are truly serious about AI code provenance, though, you need to use my chat app, since it ensures the conversation can't be tampered with.
I also have a much more human-focused method, which is part of my CLI tool.
https://github.com/gitsense/gsc-cli
I am currently looking at making pi (https://github.com/earendil-works/pi) support AI code provenance, but for now if you want a more structured way to capture what you have done in an agent session that can be used in code reviews and be carried forward as knowledge that lives inside your repository, I have
gsc lessons
The basic idea is, after you have finished chatting/working with the agent, you would work with it to identify lessons worth carrying forward. You can store your session if you want, but really, the lessons should be something that can help you review code better and to prevent future mistakes.
I have a real working example at
https://github.com/gitsense/smart-ripgrep
This is a fork of the BurntSushi/ripgrep repository. It shows how you can use lessons to learn from past design decisions.