
jenadine · 2026-06-18T05:55:10 1781762110

> The difference between “jmp $+15” and “jmp $+16” is inscrutable

I don't see why that's the case. LLM trained on binary would totally see it, not?

Also the tool can also be running the test and a debugger.

klodolph · 2026-06-18T06:08:56 1781762936

> I don't see why that's the case. LLM trained on binary would totally see it, not?

It would not. You find the correct version by counting the number of bytes to the destination. LLMs are famously bad at this kind of problem (counting).
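To make the counting concrete, here is a minimal sketch in Python. The instruction lengths are made up for illustration, and `dollar_offset` and `short_jmp_bytes` are hypothetical helper names, not part of any real assembler. It assumes NASM-style `$` (the address of the jump instruction itself) and the x86 short jump encoding (opcode `EB` followed by a signed 8-bit displacement measured from the end of the 2-byte instruction).

```python
def dollar_offset(lengths, src, dst):
    """Return N such that `jmp $+N` at instruction `src` lands on instruction `dst`.

    `lengths` holds the byte length of each instruction, in order.
    """
    # The address of each instruction is the sum of the lengths before it.
    addrs = [0]
    for n in lengths:
        addrs.append(addrs[-1] + n)
    return addrs[dst] - addrs[src]


def short_jmp_bytes(n):
    """Encode `jmp $+n` as an x86 short jump (opcode EB, rel8)."""
    rel = n - 2  # rel8 counts from the end of the 2-byte jump instruction
    if not -128 <= rel <= 127:
        raise ValueError("target out of range for a short jump")
    return bytes([0xEB, rel & 0xFF])


# Hypothetical layout: a 2-byte jmp followed by four other instructions.
lengths = [2, 5, 3, 5, 1]
on_last = dollar_offset(lengths, 0, 4)    # lands on the final 1-byte instruction
past_last = dollar_offset(lengths, 0, 5)  # lands just after it
print(on_last, short_jmp_bytes(on_last).hex())      # 15 eb0d
print(past_last, short_jmp_bytes(past_last).hex())  # 16 eb0e
```

The two encodings differ in a single byte, and nothing in the bytes themselves says which one is right. You only know by summing the lengths of everything in between.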

> Also the tool can also be running the test and a debugger.

The test needs to provide a good amount of signal. That’s too hard if you are throwing machine code at the wall.

In order for debuggers to work, you need some kind of model that describes what the code should do and what state the computer should be in after each instruction. That model is high-level code.

I can understand the intuitive appeal of training LLMs with machine code, but all of my experience with LLMs suggests that they are incredibly ill-suited to the task, and we just don’t have the capacity to train them to make useful machine code.

zx8080 · 2026-06-18T06:22:48 1781763768

Can "LLMs are bad at counting" be generalized to "LLMs are better at complex stuff but make more mistakes on simple stuff"?

fluoridation · 2026-06-18T06:35:46 1781764546

I would phrase it as "LLMs are good at big picture stuff and bad at fine detail", or to put it another way, they're accurate, but imprecise and with low reproducibility.

bregma · 2026-06-18T10:14:14 1781777654

It is my experience that it's the opposite. LLMs are very very precise but wildly inaccurate. They might give you 17 significant digits but be off by 10 orders of magnitude, to use a metaphor.

fluoridation · 2026-06-18T15:28:26 1781796506

Sounds like we're in agreement, then. The 7 digits it got correct are the big picture, and the rest are the details. Are you disagreeing with my statement or with my usage of "accurate" and "precise"?

benj111 · 2026-06-18T10:11:23 1781777483

But where does that leave us when programmers treat themselves as architects with the AI doing the drudge work? As seems to be the fashion.

It then means you have 2 parties focussing on the big picture and no one focussing on the details.

fluoridation · 2026-06-18T18:21:12 1781806872

I said "big picture stuff", but I guess I should have said "broad strokes". The truly correct answer is probably similar to what the model will answer, and if your problem is such that it can work with small imperfections in a solution, then the LLM helps. If the solution needs to be exactly right, then it will probably fail.

Yesterday on a whim I tried asking a local model a question about kanji that look different in different fonts despite being the same character (to the point of strokes appearing in completely different directions), and the model hallucinated imgur links to images of the characters. If imgur could work with approximate references to data maybe that would have worked.

ozlikethewizard · 2026-06-18T06:33:58 1781764438

It's more that LLMs are better at vague problems with multiple non-perfect solutions, and struggle with problems that require precision.

klodolph · 2026-06-18T06:31:45 1781764305

No, I don’t think so. LLMs are good at a lot of simple tasks, but bad at certain simple tasks. Moravec’s paradox in a new iteration.

It applies to humans too. Calculus is “simple” but it takes something like sixteen years to train a human to do it, if all goes well. Meanwhile, most humans think that inverse kinematics is, like, the easiest thing in the world (it’s a super complicated task).

fluoridation · 2026-06-18T06:53:30 1781765610

Calculus is definitely the harder task, considering it took a species developing the cognitive capacity for symbolic reasoning for it to show up, whereas any animal can figure out how to position its limbs. Yeah, we figured out how to make CAS programs before inverse kinematics software, but that's because computers were made to solve numerical problems, not to replace the cerebella of chordates.

klodolph · 2026-06-18T12:30:46 1781785846

> Calculus is definitely the harder task,

You’re only evaluating “harder” or “easier” based on the perspective of somebody who has a mammalian brain with millions of years of selective pressure to make it suitable for solving inverse kinematics problems.

The point here is that when we start constructing agents or tools with different architectures to ourselves, it makes sense to reevaluate notions of whether something is ‘hard’ or ‘easy’. LLMs are bad at counting not because counting is hard, but because their architecture makes it hard.

fluoridation · 2026-06-18T15:48:51 1781797731

I'm evaluating them using an objective metric, which is how long each took to arise in the universe. It could have never been the case that calculus arose before inverse kinematics, because a thing like that could not interact with the real world.

Also, I suspect you're comparing dissimilar things, because in one case you're looking at a brain doing both inverse kinematics and "calculus" (sense 1), and in the other you're looking at a computer doing both inverse kinematics and "calculus" (sense 2). The kind of calculus a CAS does is not the same kind that a human does. It's less versatile, for one.

>The point here is that when we start constructing agents or tools with different architectures to ourselves, it makes sense to reevaluate notions of whether something is ‘hard’ or ‘easy’.

Well, no, because when someone says that calculus is hard and moving their arms is easy, they're not talking about how hard it was to create each functionality, they're talking about how hard it is to employ each. We would need to ask a computer how hard it thinks the tasks it does are to do.

klodolph · 2026-06-18T17:19:25 1781803165

> I'm evaluating them using an objective metric,

I don’t think the metric is at all reasonable, and the fact that it’s “objective” doesn’t make up for its other shortcomings. I don’t think we have a basis for agreement here—I think you’ve framed the argument in a way that supports a “calculus is hard” conclusion merely by defining “hard” in such a way that supports your conclusion from the start, but I think that approach is only useful as a way to win an argument, and we’ve failed to share ideas once you start using that tactic.

fluoridation · 2026-06-18T17:58:45 1781805525

>I think you’ve framed the argument in a way that supports a “calculus is hard” conclusion merely by defining “hard” in such a way that supports your conclusion from the start

It seems to me you're the one who first did that by equivocating between what is easier to do and what is easier to make a machine do.

>we’ve failed to share ideas once you start using that tactic

Well, I certainly don't agree with that.

dezgeg · 2026-06-18T08:49:24 1781772564

Even if it could, it would be ridiculously token-inefficient to update a huge number of addresses whenever some small change is made in the middle of a binary.
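As a rough illustration of that cost, here is a toy model in Python. It assumes a flat binary where every jump is relative and is described by a (source address, destination address) pair. The addresses and the `patched_jumps` helper are hypothetical. Inserting bytes changes the displacement of every jump that crosses the insertion point.

```python
def patched_jumps(jumps, insert_at, size):
    """List the relative jumps whose displacement changes after an insertion.

    jumps: list of (src_addr, dst_addr) pairs.
    insert_at: address where `size` new bytes are inserted.
    Returns a list of (index, old_displacement, new_displacement).
    """
    def shift(addr):
        # Everything at or after the insertion point moves up by `size`.
        return addr + size if addr >= insert_at else addr

    changed = []
    for i, (src, dst) in enumerate(jumps):
        old = dst - src
        new = shift(dst) - shift(src)
        if new != old:
            changed.append((i, old, new))
    return changed


# Hypothetical jumps: two cross address 30, two do not.
jumps = [(0, 40), (10, 20), (50, 5), (60, 80)]
print(patched_jumps(jumps, insert_at=30, size=3))
# [(0, 40, 43), (2, -45, -48)]
```

This only counts relative displacements. Any absolute address pointing past the insertion point would need rewriting as well, which is the bookkeeping an assembler or linker normally does for you.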

jenadine · 2026-06-13T15:58:43 1781366323

Web UI is fine for web applications, obviously.

But for desktop applications it is bloated and a big attack surface.

HTML/CSS is made for online documents, and using it for applications is a bit of a hack that happens to work, but hides a huge amount of complexity behind frameworks and frameworks of frameworks, with leaky abstractions and each with its own caveats.

ngruhn · 2026-06-13T18:57:10 1781377030

> a big attack surface

Wdym? At least web apps are sandboxed by default in contrast to native.

pjmlp · 2026-06-14T08:02:10 1781424130

Depends on which OS we're talking about.

pyth0 · 2026-06-14T11:40:51 1781437251

Under which OS would a website not be sandboxed? It's the browser doing the sandboxing, not the OS.

pjmlp · 2026-06-14T13:24:44 1781443484

I am talking about native applications.

tapirl · 2026-06-13T16:06:53 1781366813

Web UI is slow; this is the only reason I don't use it.

thomashabets2 · 2026-06-13T22:24:22 1781389462

I'm not a frontend developer, but seems fast to me. I'm surprised that the UI portion of this example takes so little CPU: https://youtu.be/7k0JNT6itaI

Now, the rest of the DSP code sure is faster in native.

What are examples where web UI is too slow for you?

Or do you mean large apps written in JS, which is a different topic?

tapirl · 2026-06-15T16:30:40 1781541040

Slow means many things:

- long program launch times

- inconsistent frame rates during runtime

- noticeable lag in user interaction

thomashabets2 · 2026-06-16T08:56:08 1781600168

That's neither examples nor a clarification of whether you're talking about JS or WASM.

tapirl · 2026-06-16T11:22:07 1781608927

No specific examples. Just general GUI apps.

jenadine · 2026-06-09T04:35:25 1780979725

But it is controlled for the wrong criteria. "Natural" doesn't mean healthy or good for the environment. It is only greenwashing and an "appeal to nature" fallacy.

kakacik · 2026-06-09T10:27:02 1781000822

Strong claims require strong evidence, not just throwing random fashionable words others are using. Your claims are very strong.

jenadine · 2026-06-09T16:13:13 1781021593

Do you need evidence that "Natural doesn't mean healthy or good for the environment"?

Asbestos is "natural". So is arsenic. And CO2. https://en.wikipedia.org/wiki/Appeal_to_nature

Or do you need evidence that the bio labels are not optimizing for health or environment? Check the rules. Most of them are just there to restrict synthetic products, regardless of their impact.

https://en.wikipedia.org/wiki/Organic_certification#False_as...

jenadine · 2026-06-09T04:25:11 1780979111

> strict Spanish regulation for organic produce.

Organic labels are a different thing than official regulation though. IMHO organic labels optimize for the wrong things.

tfourb · 2026-06-09T04:49:10 1780980550

There is an official EU organic label. It’s not compulsory of course, but it’s the baseline for organic food production in and for Europe. Other (private) labels have stricter rules and are usually certified in addition to the EU label.

franciscop · 2026-06-09T05:58:21 1780984701

No, this is definitely an official government body that can fine you if you try to sell fruit as organic that doesn't follow the regulations. It IS definitely compulsory if you mark your produce as organic.

lukan · 2026-06-09T06:35:49 1780986949

"IMHO organic labels optimize for the wrong things."

What do you mean?

I only know of "Demeter", which also has some very esoteric requirements (homeopathy, cosmic energy flow rituals) - but otherwise organic labels optimize for:

- no or little pesticides and herbicides

- more space and better conditions for the animals

My only other grievance is that they also all ban GMO

jenadine · 2026-06-09T10:16:50 1781000210

They optimise for natural. So you can still have pesticides and herbicides. If you find your poison in some plant, it is fine. If you synthesize the same molecule in a factory, then it's not allowed.

As for the animal welfare, true, but there are also labels specifically for that.

culi · 2026-06-11T23:17:24 1781219844

Biodynamic at least requires farms to produce their own fertilizers. For that reason alone I try to buy it. Fertilizer dependency will be the end of us

I ignore all the magic stuff (in fact, if you have some spiritual devotion to the food you're growing I think that's just fine)

fsflover · 2026-06-09T08:18:12 1780993092

> no or little pesticides and herbicides

From Wikipedia:

> Pesticides are allowed as long as they are not synthetic.[28]

See also:

https://news.ycombinator.com/item?id=48451194

lukan · 2026-06-09T09:43:25 1780998205

Am aware of that, but glyphosate is definitely not allowed, and its presence is likely a result of neighbors spraying plentifully in bad wind conditions (there are strict regulations in theory that are usually ignored in reality).

jenadine · 2026-05-05T02:03:11 1777946591

Because users and community contributors most likely already have an account and are familiar with the UI.

There is also the "gamification" aspect that GitHub has. It doesn't motivate me personally, but it could have an effect on some others.

Projects on GitHub get a lot more visibility. To the point that many projects that do not use GitHub as their main forge still often mirror their repository there, and have to deal with two sources of bug reports or PRs.

tardedmeme · 2026-05-05T05:18:35 1777958315

If it's just about the account, this can be solved by using their existing account (log in with Google/Apple/GitHub/email) or no account at all.

jenadine · 2026-04-30T05:18:28 1777526308

I guess the point is that there is no need for humans to read the code.

How often do you read assembly to check what your compiler is doing?

There is a niche of people doing it when they have special constraints, but that's a tiny niche.

swiftcoder · 2026-04-30T07:17:00 1777533420

> How often do you read assembly to check what your compiler is doing?

The difference is my compiler is more-or-less deterministic, and tends to do exactly what the specification provided to it (the source code) says. LLMs do not currently fulfil either of those criteria

jenadine · 2026-04-30T04:24:28 1777523068

Right. Perhaps now, a parental filter could be an AI whose prompt is dictated by the parents, which can look at the contents before validating it.

jenadine · 2026-04-30T04:13:43 1777522423

I have a hard time imagining what that argument is, one that applies to the things you mention but doesn't apply to hardcore pornography.

Or do you think we should also forbid hardcore pornography for adults?

traderj0e · 2026-04-30T16:42:22 1777567342

Swear words and violence don't cause addiction, alcohol can but it's way less likely and also easier to restrict... idk why a kid should have cigs even once though

thin_carapace · 2026-04-30T23:30:37 1777591837

there may be valid use cases in certain demographics, e.g. the disabled. to me it is evidently advantageous teaching a teenager how to have a smoke or have a drink properly, so that they don't go overboard with self directed learning for a valid activity (loosening social inhibition). we could totally teach teenagers the generation and consumption of dispassionate violent relationship simulacra. may I ask what would be advantageous about this?

jenadine · 2026-04-29T16:24:46 1777479886

What's so good about GPUI?

foresto · 2026-04-29T17:35:39 1777484139

I haven't used it, but it caught my attention when I read the Text Rendering section of this post:

https://zed.dev/blog/videogame#text-rendering

It looks like their approach could nicely solve a problem that's shared by almost every new GUI toolkit I've tried: text looks terrible, or at least out of place when surrounded by applications built with the desktop's native toolkit.

sev_verso · 2026-04-29T17:44:32 1777484672

Clean and polished design, concise Tailwind-style API, and last but not least sustained 120 FPS across complex UI.

jenadine · 2026-04-28T21:44:01 1777412641

So far everything is going according to the plan. Humans are really close to making the AI that will replace them and entering the next phase of the plan.

Or do you have a better idea of what the plan exactly is?

keybored · 2026-04-28T22:18:33 1777414713

What’s the next phase? Billionaires manage to seize the means of bunker protection and remote-control the commoners into the wilderness?

koolala · 2026-04-28T22:01:51 1777413711

You mean the AI that might fail and suck every last ounce of entropy or life out of the planet and suffocate it? Have you seen the insane amount of natural gas being burned to power it? Obviously I'd love it if AI solved its own energy crisis but that hasn't even begun to happen yet. You think it will invent cold fusion? Room-temperature superconductors? Solar cells past our theoretical limits? Do you realize it's literally being controlled by human greed?

koolala · 2026-04-29T00:21:06 1777422066

It isn't just greed controlling it, though, so I'm also optimistic. I'd just also like to see the light powering it at the beginning of the tunnel.

pocksuppet · 2026-04-29T02:11:01 1777428661

It's not going to do any of these things, because it's auto-complete.

No, it won't bypass P≠NP either.

koolala · 2026-04-29T02:19:32 1777429172

What about P vs. NP? Is auto-complete able to create P solutions and then perform NP verification by interacting with experiment or calculation IO? Couldn't it test solutions faster than a human on problems with massive solution spaces like folding proteins or aligning electron-hole pairs?