barrell's comments

To continue paraphrasing Steve Jobs, focus is the most important thing. When the cost to produce new features/implementations goes down, focus is even harder (and even more important).

It took me probably 5 years of writing Clojure before it clicked. Once you get used to structural editing and REPL-driven development, it’s really hard to go back to syntactic languages.

It’s kind of like treesitter-style editing, where you can “swap these two arguments,” “select this function,” or “wrap this in a try block” with a single keyboard command… but way more standardized and granular. Plus you get the ability to execute anything you highlight.

All that and then you realize you can store code as data (since it’s just a data structure) and run data as code.

I think most programmers don’t realize how arbitrary the difference is between code and data until they get used to using LISP.
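To make the code-as-data point concrete, here is a minimal sketch in Python rather than Clojure. It assumes a made-up prefix notation where a program is just a nested list; `evaluate` and `swap_args` are hypothetical helpers written for this illustration, not part of any Lisp or editor tooling.

```python
import operator

# Hypothetical mini-language: an expression is a number or a list
# of the form [operator_symbol, arg1, arg2, ...].
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(expr):
    """Run data as code: evaluate a nested-list expression."""
    if not isinstance(expr, list):
        return expr  # a plain number evaluates to itself
    op, *args = expr
    values = [evaluate(arg) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = OPS[op](result, value)  # fold left over the arguments
    return result

def swap_args(expr):
    """Structural edit: swap the first two arguments of a form."""
    op, first, second, *rest = expr
    return [op, second, first, *rest]

# The program is an ordinary list, so it can be stored, inspected, and edited.
program = ["-", 10, ["*", 2, 3]]

print(evaluate(program))             # 10 - (2 * 3)
print(evaluate(swap_args(program)))  # (2 * 3) - 10
```

Because the program is an ordinary data structure, "swap these two arguments" is a three-line function instead of a text manipulation, which is roughly the property structural editing relies on.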


Spot on. For me, it clicked with Common Lisp, 15 years after I graduated from university. Now, Clojure is my daily driver. And it’s extremely difficult to explain to people. I’ve gotten to the point where I don’t even try. You’re right about all the things you mentioned. Once you discover structural editing, everything else seems primitive, on the level of cavemen playing with rocks. But it’s not just one feature that makes Lisp better. It’s all of it, which interrelates and creates a powerful synergy (I hate that word, but in this case it’s appropriate) that just isn’t matched by anything else. There are other languages that have a similar vibe, notably Forth and Prolog, but they are often misunderstood, too. Honestly, that’s my real test of whether someone is a senior programmer: do they understand and at least have an appreciation for these languages, even if they don’t program in them every day?

Not in many tasks. I use deepseek as a fallback in https://phrasing.app and it’s always very apparent when it happens (due to mistakes/clear performance drop-off).

Interesting - which models specifically? I'd be interested in using mistral over deepseek if it was competitive (guess I need to go benchmark)

I use small, large, and medium-3.5 depending on the task.

I think it really depends on what you’re doing. I use mistral for many tasks in https://phrasing.app and they blow models many times their size out of the water.

None of my tasks use reasoning though (reasoning actually kills the performance) so perhaps that’s why. Still, I just had to rewrite my pipeline, and mistral was faster, cheaper, and substantially better than any alternative.


Actually, deepseek v4 was at 1/3 promotional pricing for the first month or so. This was pretty clearly communicated. The promotion window just ended, is all.


Thus proving OP's point.


If you run out of 50% coupons to your local pizza joint, did they double their prices? Does every company double or triple their prices after Black Friday?

There’s a pretty significant difference between saying someone tripled their prices, and a temporary promotion ended. It’s even more so the case if someone is using it as an example for raising prices as a trend.

I’m 100% in the camp that prices are going up and quality is going down; companies are retiring models and requiring you to use more expensive ones. This has happened to me and there are dozens of examples that one can point to.

But a promotion ending is a strawman argument and does the point a disservice.


Essentially yes. Perpetual "discounts" are common in some industries, like fast fashion, so you could consider that the normal price.


> If you run out of 50% coupons to your local pizza joint, did they double their prices?

Yes. Did they double their MSRP? No. They did double their effective price relative to me, which is all that matters unless you're doing economic math or something.


The original comment was used as proof of a trend that vendors are raising prices. Would running out of coupons indicate a trend in rising pizza prices?


It depends whether it's me personally who's running out of coupons or the entire supply of coupons is being reduced. If my ability to get the product for the same price is diminished then the price is being effectively raised.

In this case I'd agree that pricing is effectively raised, as $10 > $10 - 50%; there's no need to complicate it. However, this is not even the right metric for this problem; a better one would be total spent / work produced. If all customers spend more money for the same amount of work (adjusted for progress) then clearly the price is increasing. This would be true in this example as well.
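A quick sketch of the two metrics in this comment, with numbers made up purely for illustration (the spend and work figures are assumptions, not real pricing data); `effective_price` and `cost_per_unit_of_work` are hypothetical helper names.

```python
def effective_price(list_price, discount=0.0):
    """Price the customer actually pays after a fractional discount."""
    return list_price * (1 - discount)

def cost_per_unit_of_work(total_spent, work_units):
    """The alternative metric from the comment: total spent / work produced."""
    return total_spent / work_units

# List price is unchanged, but the 50% coupon is gone.
with_coupon = effective_price(10.0, discount=0.5)
without_coupon = effective_price(10.0)
print(with_coupon, without_coupon, without_coupon / with_coupon)

# Illustrative only: same amount of work, higher total spend.
before = cost_per_unit_of_work(total_spent=100.0, work_units=50)
after = cost_per_unit_of_work(total_spent=150.0, work_units=50)
print(before, after, after > before)
```

The first comparison only shows what one customer pays; the second is the one that would speak to a trend, provided it holds across customers.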


That’s a little bit of a No True Scotsman. Yes there are people who do not review anything; but even people who are reviewing every line from an LLM do not have the same understanding as someone who wrote it themselves.

I’m not making a judgement call about which is better, but it was widely accepted in tech before the advent of LLMs that you just fundamentally lack a sense of understanding as a reviewer vs an author. It was a meme that engineers would rather just rewrite a complicated feature than fix a bug, because understanding someone else’s code was too much effort.


Azure recently discontinued the gpt-4.1 model. I had to move off of this model, and moving to any gpt-5* model was worse (higher failures & less accuracy), and more expensive. I had to rewrite the entire system from high school level prompts to lower elementary school level prompts using non-gpt models.

I would say models entered a bottleneck a long time ago. My personal opinion is now they are overfitting newer models on coding and "agentic" capabilities at great expense of general abilities in other domains.


I am wondering if everyone is moving to an IPO and striking these bizarre circular deals because they’ve hit the ceiling on what can be done with more compute until a major architectural innovation happens.

Still amazing, but 5.5 does feel like incremental progress with a massive upcharge.


Ofc they have hit a ceiling, why do you think OAI has shut down many of its projects like the research one called Prism?

The reality is both Anthropic and OAI have converged on LLMs as being a thing for software production - that's where the majority of their revenue is coming from.


Can you elaborate what kind of system you built? I'm curious what specific prompts are getting worse responses with the newer models.


Linguistics, specifically as it pertains to language learning

Edit: Whoops read your question wrong. I do a bunch of NLP on different languages, and use LLMs to pad out and interpret the data. Asking for things like translations, alternatives, transliterations; associating and validating data; transferring data from one language to another; segmentation and cross lingual alignment; the list goes on.

I did manage to get higher quality in the end, so it’s not entirely a regression. But older LLMs were much more capable with less prompting at interpreting disparate data and tying it together.

Most of the work I do does not really have a “right answer,” just a lot of wrong ones, which I think is what trips up LLMs. If I turn on reasoning for any step in my pipeline, the token count goes up 100 fold and the quality gets cut in half.

Edit 2: I did have to move off of GPT though to get the improvements mentioned. Go mistral!


What kind of data are you interpreting? Do you mean document extraction from different languages? I have only used GPT5.5 for agentic coding, which did get significantly better from my experience, although that does align with your conjecture of their focus being on improving this. I haven't noticed a regression when it comes to interacting with it in different languages though (specifically German and Russian). I have done data extraction from documents in different languages, but only with locally hosted LLMs (mainly Qwen3.5-397b) as I cannot legally use cloud-based solutions. My local solution was more than sufficient, so I would be surprised if a frontier model would fail at that.


I actually think it makes sense to hone models for coding and agentic capabilities. Those models will be specialized for those tasks, and the results will be cheaper and better. We can still have a general model and specialized models.


I would say the community is pretty evenly split between people who hate it, and people who find it practical. I don't see many people really championing it or being proud of it these days.

Well, technically I think most of the community is indifferent. But from the discourse about the topic, I feel like I see pretty even splits.


Yes, when I interact on reddit, I normally do so solely with the intention 'this is for an LLM'. I feel like a majority of the posts/comments I reply to are AI, and a majority of the responses to my posts are AI, but I have to keep telling myself to keep posting so it becomes training data.

(I'm normally posting in the context of my startup - although I try to keep the self promotion to a minimum and always contribute to the "conversation," if LLMs replying to one another can be called such).

For what it's worth, I created a community for paying users of Phrasing that has been going really well. I think free online communities may be going away, but there may be a future in exclusive/paid communities.


Yeah I use Mistral Large for a lot of formatting work. For this one use case of mine, it outperforms frontier models by a significant margin. I've found tons of use cases for mistral small as well.

I'd love to use Mistral for more tasks, but Mistral Large doesn't quite cut it for all tasks. So on the one hand, I'm excited there is another model, and presumably more performant based on the price? But the fact it's a "Medium" and 5x the price of the Large definitely concerns me.

The entire release is also about Vibe Coding, and so I'm not even sure if this model is applicable outside of coding, or even worth testing.

