niek_pas · 2026-05-27T15:06:37 1779894397

Unless I'm missing something, there's also nothing in the paper to indicate this is "all of human ingredients"? It looks like it's 11 data sources covering a bunch of common cuisines, with the English + Chinese sources accounting for 90% (!) of the dataset. Among others, Africa and the Arab world are not present in the data (good for about 25% of the global population).

Also, all non-English terms were AI-translated to English which is methodologically understandable but surely leaves room for error.

order-matters · 2026-05-27T15:51:16 1779897076

translation is still an interesting problem in and of itself. it's kind of a miracle we can do it at all, yet in some circumstances it seems obvious that there should be objective answers (cooking ingredients being one of them), but even then you never really know, even with human translators, if you've got it correct. even within the same language nearly every individual has their own version of it.

for example, how would you translate "chips" to another language without first knowing which version of English you are translating from? could be an american speaker with a british relative and they use the british definition of chips while otherwise mostly speaking american english.

there's a level of pragmatism in translation that needs to be assumed, and ultimately we have to accept that translated knowledge will always have low resolution. There is a layer of work that needs to be done, with the involvement of the material's source, to get written content to the level of formalism needed to be representative of the language it is written in. Generally, that is the work of editors. Which means successful translation for wide distribution, while still not guaranteed, is predicated on the editorial skills of the translator, which begs for dialogue with the source.

Meanwhile, AI provides this super convenient band aid to get translation results you can't disprove.

I genuinely think people are severely underestimating the power held by these models for being translators and how literal truth is going to be determined by them deep behind the scenes under the disguise of accessibility. Not in a dangerous way necessarily, just in a way where what languages are and what words mean is going to shift towards whatever the models think they are.

In a way, over extended time, the models will not be wrong about the translations because their results will redefine what successful formal editing of language looks like, and disagreeing with them will amount to the same difference as having local slang.

dyauspitr · 2026-05-27T16:47:28 1779900448

Leaving out Indian, Southeast Asian and Arab cuisine means this is nigh useless.

argee · 2026-05-27T18:01:15 1779904875

There are 2,000+ varieties of mangoes alone. You could literally end up with a larger file using only mangoes.

zeroimpl · 2026-05-27T19:39:54 1779910794

Was going to give the same example with chili peppers. Tons of varieties and not exactly interchangeable

buggymcbugfix · 2026-05-28T02:45:03 1779936303

Thousands of cheeses, each of which is a unique experience. Heck, even the serving temperature completely alters the experience. Next: wines, charcuterie, ...

Pity the fool who can't taste the difference between any of these.

momoschili · 2026-05-28T04:37:36 1779943056

there are thousands of varieties of a lot of things though...

schemathings · 2026-05-28T04:42:16 1779943336

Use ChaatGPT

teleforce · 2026-05-28T03:11:31 1779937891

It's worse than useless, it's borderline criminal /s

The fabricated title targeted the sensation rather than substance, typical scenario whenever "All" is in the title, and the worst when it's in the very first word.

niek_pas · 2026-05-27T14:17:57 1779891477

> Yeah agriculture is bad for the environment, but at least it feeds us to keep us alive

This is true, but don't forget a _lot_ of agriculture feeds _animals_ that we in turn eat. If you want to make optimal use of land for human needs, most modern agriculture is not that.

spiderfarmer · 2026-05-27T14:35:15 1779892515

The problem is feed lots.

There's no problem with the more conventional practice of letting animals graze the majority of the year. If we didn't use those fields to feed and eat the animals, the grass would turn into CO2 and methane anyway. Or turn into boring forests.

Not everything has to be optimal. That thinking leads to Thanos' snap. People generally enjoy meat. They also enjoy the landscape farmers created.

niek_pas · 2026-05-27T08:33:58 1779870838

Genuine question: do you add 'please' and 'thank you' to Google searches? If not, what sets them apart?

perching_aix · 2026-05-27T08:34:35 1779870875

Google searches being keyword based, rather than simulated conversations?

The same reason you wouldn't put in an entire actual question/sentence, unless you either don't know how to use Google, are pissed off, or have an actual reason to suspect that it would yield proper hits (e.g. looking up an excerpt).

Arch-TK · 2026-05-27T08:56:44 1779872204

Google has been optimized for sentence-like questions so much that for a good 6+ years now it has been completely useless as a keyword search.

To clarify: sentence search got slightly better at the cost of keyword search. So the result is unusable garbage.

wolpoli · 2026-05-27T09:28:17 1779874097

It is rather hard to lose the habit of using search engines with keywords, given that the change took place without much fanfare. I have no problem using sentences with the current AI tools, though.

gum_wobble · 2026-05-27T08:45:43 1779871543

Genuine question: do you write Google search queries in natural language?

fc417fc802 · 2026-05-28T07:10:29 1779952229

I didn't use to, but I do now that the searches go straight to an LLM. I almost always find the model output to be much more useful than the list of search results.

dminik · 2026-05-28T07:45:32 1779954332

I don't. I was recently doing some searching for information I thought AI would be good for: fuzzy natural language search with some conditions. And it was, but ...

Gemini at least is not great at citing and picking sources. Or providing multiple sources for the same thing.

It tends to stop at threes. So if you want more, you have to prompt it uselessly, like: "any more?"

globalnode · 2026-05-27T09:23:08 1779873788

llms seem more human like so if you were to treat them badly then you are more likely to condition yourself to treat other living creatures badly.

spiderfarmer · 2026-05-27T08:38:25 1779871105

Google isn’t conversational.

sunrunner · 2026-05-27T08:57:50 1779872270

I searched for "Hey Google" and got this in response:

  Hey! I'm here and ready to help. What’s on your mind today? Whether you need to look up information, plan a trip, or get things done, just let me know!

selcuka · 2026-05-27T09:09:24 1779872964

That's only because Google is an LLM now.

barbazoo · 2026-05-27T09:17:31 1779873451

https://en.wikipedia.org/wiki/Roko%27s_basilisk ?

tokai · 2026-05-28T13:09:37 1779973777

One of the dumbest things supposedly clever people keep bringing up.

niek_pas · 2026-05-21T08:49:12 1779353352

"Update your priors" is a common expression in English: https://en.wiktionary.org/wiki/update_one%27s_priors#English

staticman2 · 2026-05-21T09:17:45 1779355065

Your wiktionary link indicates it is not a common expression in English but instead something "rationalist community" people say.

pocksuppet · 2026-05-21T10:05:16 1779357916

HN is a rationalist community hangout.

repparw · 2026-05-21T17:43:39 1779385419

we're reading comments on a post about math proofs

rob · 2026-05-21T10:36:52 1779359812

No it's not. Where do you come up with this? Just because you searched the phrase on Google and there's a single result for it on a wiki? Who do you know that's using this expression regularly?

eudamoniac · 2026-05-21T18:30:00 1779388200

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

happytoexplain · 2026-05-23T02:11:44 1779502304

"Common" is an exaggeration.

niek_pas · 2026-05-21T08:45:35 1779353135

I don't mean to sound elitist, but in a way, Haskell's difficulty is kind of the point of the language.

The thing that's so elegant about Haskell is that it allows you to express programmatic constructs at a very abstract level. Abstraction is almost by definition difficult to grasp. That's why it takes a decade and a half for (most) people to go from arithmetic to calculus.

haskman · 2026-05-21T10:23:56 1779359036

Difficulty is most certainly not the point. Abstraction, composability, yes, but difficulty is a language smell that CAN be fixed. (I love Haskell and it's my primary language, so this comes from a place of love).

niek_pas · 2026-05-20T21:14:00 1779311640

RIP my browser history, I guess

niek_pas · 2026-05-18T08:19:50 1779092390

What in the world is “other native” supposed to mean? Those languages don’t have names?

useyourloaf · 2026-05-19T00:42:47 1779151367

"Central Yupik" for Alaska, "Lakota" for S. Dakota

type0 · 2026-05-18T09:21:01 1779096061

creole?

niek_pas · 2026-05-17T14:18:42 1779027522

What's even worse is that when dealing with human software teams, a vague requirement will (at least in a well-run org) receive demands for further specification. "What do you mean by 'get data'?", etc.

An LLM will just say, "Sure! Here's the fully implemented code that gets the data and gives it to the user." and be done with it.

smokel · 2026-05-17T14:26:59 1779028019

ChatGPT 5.5 responds:

> What data should I retrieve, and where should I get it from? Please specify at least: ...

And it then goes on to ask just exactly what is necessary, being all constructive about it.

airstrike · 2026-05-17T14:30:30 1779028230

You're both right. The parent was a toy example, and if posed literally to an LLM, it will definitely ask for more information. Yes, it's important to be accurate but I don't think that applies here.

But the point still stands: in most contexts, the LLM will fill in the blanks with what it deems appropriate, like an overconfident intern at best and a bull in a china shop at worst.

vidarh · 2026-05-17T14:24:19 1779027859

When the cycles are short enough, though, that is to some degree the right thing. That is, it's the right thing for things the users can then immediately see and give feedback on, because it lets them give feedback on something tangible.

It's the wrong thing for important things under the hood (like durability and security requirements) that are not tangible to them.

pydry · 2026-05-17T14:36:52 1779028612

IME you give it very precise specifications and it still fucks it up.

When we talk about "the" bottleneck being specs, it just isn't the case that it's the only thing LLMs do poorly. They're really bad at a lot of stuff in the SDLC.

They're also good at providing results which are bad but look ok if you either don't look too closely or don't know what you're looking for.

resters · 2026-05-17T14:52:59 1779029579

Just as poorly designed code can still compile. This is operator error, not a failure of the technology.

niek_pas · 2026-05-11T21:07:52 1778533672

Bit off topic but why in the world are people still posting on medium? The reading experience is abhorrent; I couldn’t even finish reading this article before a full screen popup literally blocked the sentence I was reading.

Is there some incentive I’m not seeing?

xrd · 2026-05-11T23:01:30 1778540490

They have made an honest attempt to pay writers. It's a different model than substack, but that's why.

I look at it the same way I look at pay walls for newspapers. I don't like them but I understand why they are there.

raincole · 2026-05-12T07:27:15 1778570835

Which is why it failed though. It turns out people won't pay one dollar to read an article like "If AI writes your code, why use Python?"

The situation is very unfortunate. We had perhaps a once-in-a-lifetime chance to solve micropayment but we fucked up (crypto).

tommit · 2026-05-12T14:41:28 1778596888

yup, I still wonder if BAT was onto something. loved the idea, never took off. oh well

nickff · 2026-05-11T21:12:48 1778533968

It seems like it's just the latest evolution of the writer-friendly blogging platform; easier than Wordpress to package into a newsletter, and also easier to monetize with a paid tier.

ciupicri · 2026-05-11T22:47:26 1778539646

But don't we have AI to deal with the complexity of Wordpress? :-)

DonHopkins · 2026-05-11T23:15:58 1778541358

Insofar as AI is great at accidentally deleting your production and backup Wordpress databases, and forcing you to start from scratch with something else.

iLemming · 2026-05-11T23:25:50 1778541950

> The reading experience is abhorrent

Nothing you read in the browser can provide ultimately great and hands-down the best reading experience equally for everybody - the modern web model is inherently at odds with that. A plain HTML page with no CSS is a near-perfect reading experience. The problem is that almost nobody ships that, because the web also became a publishing platform where authors compete for attention. A plain-text protocol under user control is closer to "best reading experience for everybody". The web could be that. It mostly isn't.

I stopped trying to read long articles in the browser. Why would I do that, if I can easily extract all the relevant, plain text (and even structured one) and read it in my editor instead? Where I have control over fonts, colors, navigation, etc. The browser is a delivery mechanism, not a reading environment. Treating it as one is a habit, not a necessity.

Long ago I stopped trying to type anything longer than three words anywhere but my editor. Of course, why wouldn't I? It already has everything I need - spellchecking, thesaurus, etymology lookup, translation, access to all my notes, LLM integration, etc. Try it one day - it's an enormously liberating experience. And then maybe you'd stop reading long texts in the browser as well.

autoexec · 2026-05-12T01:25:05 1778549105

> A plain HTML page with no CSS is a near-perfect reading experience. The problem is that almost nobody ships that, because the web also became a publishing platform where authors compete for attention.

They don't ship it because of greed. They only want your attention because of greed. They only infest their website with ads because of greed.

> The browser is a delivery mechanism,

http is a delivery mechanism. The browser is a user agent. It's supposed to display content according to the preferences of the user. If your browser isn't doing that for you it's time to find a new browser or beat the one you have into submission until it behaves. "reader mode" is a useful compromise.

iLemming · 2026-05-12T01:56:08 1778550968

> It's supposed to display content according to the preferences of the user.

That's right, the original idea was exactly about that, but like I said - in practice that is no longer a thing.

Using the editor for reading any content is enormously underrated. Check this out - this entire thread opens in my editor as an outline with nested structure. Meaning that all the regular outline operations are available to me - folding, imenu (interactive TOC), narrowing, quick search, contextual search, pattern-based search, sparse-tree search.

Extracting all the URLs on the page while ignoring HN-internal ones is a single keypress for me - there's a link to a YT video - I can watch it, controlling the playback directly from my editor, I can extract transcript and summarize it with an LLM request - all without opening new tabs, without switching focus.

I can narrow on the sub-thread, or select a region and export only that part to a pdf, gfm, html or LaTeX. The possibilities are virtually unlimited. A web browser - even with three hundred different extensions won't let me have complete and utter control over plain text - it's just not designed for anything like that.
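
For readers outside Emacs, the "extract all the URLs on the page while ignoring HN-internal ones" step can be approximated with the Python standard library. This is only a rough sketch, not the commenter's actual setup: the names `LinkCollector` and `external_links` are made up for illustration, and treating every link on the `news.ycombinator.com` host as "internal" is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Relative links such as "item?id=1" become absolute here.
            self.links.append(urljoin(self.base_url, href))


def external_links(html, base_url="https://news.ycombinator.com/"):
    """Return unique http(s) links whose host differs from the base URL's host.

    Assumption: "internal" means "same host as base_url".
    """
    collector = LinkCollector(base_url)
    collector.feed(html)
    internal_host = urlparse(base_url).netloc
    unique = []
    for link in collector.links:
        parsed = urlparse(link)
        is_web = parsed.scheme in ("http", "https")
        if is_web and parsed.netloc != internal_host and link not in unique:
            unique.append(link)
    return unique
```

Feeding it the saved HTML of a thread returns the outbound links in page order, which could then be handed to whatever player or summarizer one prefers.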

polaris64 · 2026-05-12T07:54:00 1778572440

I'm assuming you use Emacs? Are you using a special "hacker news mode" or something more generic?

iLemming · 2026-05-12T18:12:24 1778609544

HN threads are probably not the best example because the site is pretty readable already. But it's not that difficult to fetch a thread and render it in the Org-mode outline format. hnreader.el¹ does that. For reading articles I just use eww. It has (eww-readable), which removes all the fluff like banners. The trade-off is that eww (by design) doesn't do any javascript. That makes it difficult to use with websites with client-side rendering (React, et al.). For that, I have a little automation elisp² that uses OSA (JXA) and extracts the rendered content off the page. I need to figure out something similar for Linux, but it's not so straightforward; the only way I know is to run the browser with the debugger port.

¹ https://github.com/thanhvg/emacs-hnreader

² https://github.com/agzam/.doom.d/blob/main/modules/custom/we...
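
To give a feel for what "render it in the Org-mode outline format" involves, here is a minimal Python sketch. It is not how the linked Emacs package works; the comment shape (dicts with `author`, `text`, and optional `replies` keys) and the function names are hypothetical, and fetching the thread is left out entirely.

```python
def render_org(comments, depth=1):
    """Turn nested comment dicts into a list of Org-mode lines.

    Each comment becomes a headline with `depth` stars; its text is
    indented so that a body line starting with "*" is not read as a headline.
    """
    lines = []
    for comment in comments:
        lines.append("*" * depth + " " + comment["author"])
        indent = " " * (depth + 1)
        for line in comment["text"].splitlines():
            lines.append(indent + line)
        # Replies go one outline level deeper.
        lines.extend(render_org(comment.get("replies", []), depth + 1))
    return lines


def org_outline(comments):
    """Join the rendered lines into one Org-mode document string."""
    return "\n".join(render_org(comments)) + "\n"
```

Because each reply is just a deeper headline, the editor's usual outline operations (folding, narrowing, exporting a subtree) apply to sub-threads without any extra work.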

uxcolumbo · 2026-05-12T07:55:55 1778572555

Can you share your setup how to achieve what you described? I'm curious.

iLemming · 2026-05-12T18:24:25 1778610265

see the adjacent thread

someguyiguess · 2026-05-12T01:45:19 1778550319

> Why would I do that, if I can easily extract all the relevant, plain text (and even structured one) and read it in my editor instead?

Because that’s an enormous pain in the ass. Not scalable at all.

itsdavesanders · 2026-05-12T16:04:52 1778601892

It's pretty easy with a system like Readwise. Yes, that's ANOTHER system, but it's one system to quickly just add articles like these to an inbox and read them another time, in plain text.

Of course, it doesn't work 100% and certain sites are hostile to it and do stupid javascript tricks "for the views".

Mostly, I use it to put it on a reading list later, and to get around really, really abusive ad driven sites.

iLemming · 2026-05-12T18:22:04 1778610124

> Its pretty easy

100%. One can use mozilla/readability to extract the content. Even if you think that would require some effort, think about it - you have to do it ONLY once and never deal with that kind of annoyance EVER again. It really baffles me seeing devs complaining about shit like that. Why? Why won't they figure out a better way? You're a friggin' programmer - computers have to obey your will. You spend your lifetime staring at the screen, reading and editing text. Why not do it on your own terms? Even if it takes some effort, why choose to be henpecked by someone else's rules FOREVER?

iLemming · 2026-05-12T01:56:50 1778551010

I beg to differ. You clearly misinterpret what I'm talking about. Please expand on "scalable", what do you mean by that?

kode-targz · 2026-05-12T11:29:07 1778585347

do you use emacs?

iLemming · 2026-05-12T18:00:59 1778608859

I do, but there's nothing stopping anyone from doing the same thing with nvim or vscode. I'm pretty sure there are extensions for vscode - it's already built atop a browser.

chneu · 2026-05-11T21:14:11 1778534051

My best guess is momentum. Some people are very, very brand loyal and have to do things in relation to what/how others do things.

In reality it doesn't matter where something is posted, just give us a url, but some people don't operate that way.

kelvinjps10 · 2026-05-12T01:47:55 1778550475

check out Scribe, an alternative medium frontend that's way better: https://scribe.rawbit.ninja/@NMitchem/if-ai-writes-your-code...

https://sr.ht/~edwardloveall/Scribe/ https://libredirect.github.io/

odie5533 · 2026-05-12T08:39:49 1778575189

It's a free, permanent host for your blog articles with a built-in community and monetization layer. There's only so many free hosts out there that I'd be confident will be around in 5 years, and Medium is one of them.

dsmurrell · 2026-05-11T21:13:47 1778534027

Yep, Medium was free and everyone donated content... then it put up reading paywalls and conned everyone. I'm also surprised when I see people writing on there.

OrangeMusic · 2026-05-13T06:56:24 1778655384

Same reason people still post on X.

niek_pas · 2026-05-09T23:08:39 1778368119

And I’m sure the boundary on what constitutes ‘badness’ is something everyone can agree on!