Hacker News: EnPissant's comments

I'm guessing all the "censored" boxes are not actually censoring anything and are placed there to make you imagine something much worse.

"I'm going to close my eyes and go 'La La La' because that makes all the uncomfortable thoughts go away! I learned this when I was 5 and never matured"

-- EnPissant


"I'm selling an AI security product and want to establish my brand. I'll post several scare-mongering posts on my blog every week and people like solid_fuel will eat it up because it's what they want to hear."

Rust gives you statically linked binaries as well.

So your argument boils down to having to add `reqwest = "0.13.3"` to your `Cargo.toml`.


I thought MTP wasn't very useful on MoE models because the expert overlap for 2 tokens was too small.

Still helps, and Step 3.5/3.7 were specifically trained for MTP (in a weird triple layer/triple head fashion with a kind of unique architecture)

With the currently-in-PR implementation it doubles decode performance for all the tasks I've been testing it against, and in the worst case it is still a 35% uplift, so on a box with heaps of compute and not much memory bandwidth, it's worth it in practice.


When running on a GPU, dense models are shaping up to be the best way due to two things:

- Maximum intelligence per VRAM (you don't have much)

- Dense models can benefit from MTP to get an almost 2x speedup in decode (i.e., a 27B dense model with MTP decodes at about the same speed as an MoE model with 14B active params would). This is important because local LLM use rarely has parallel streams to batch together.

When running on large unified memory like Strix Halo or DGX Spark, MoE models are usually best:

- You can get similar intelligence to a smaller dense model with fewer active params (to compensate for the slower memory) by throwing RAM at the problem.
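
A rough sketch of the arithmetic behind the "27B dense with MTP vs. 14B active MoE" comparison. It assumes decode is purely memory-bandwidth bound, that every active weight is read once per step, and that MTP yields an idealized 2 tokens per step. The 1000 GB/s bandwidth and 8-bit weights are made-up inputs, not measurements.

```python
# Back-of-envelope decode estimate. Assumes decode is purely memory-bandwidth
# bound and every active weight is read once per decode step.
# The 1000 GB/s bandwidth and 1 byte/param (8-bit weights) are made-up inputs.

def decode_tokens_per_second(active_params_billions, bandwidth_gb_s,
                             bytes_per_param=1.0, tokens_per_step=1.0):
    """Tokens/s when each step streams all active weights from memory once."""
    bytes_per_step = active_params_billions * 1e9 * bytes_per_param
    steps_per_second = bandwidth_gb_s * 1e9 / bytes_per_step
    return steps_per_second * tokens_per_step

# tokens_per_step=2.0 is the idealized "almost 2x" from MTP, not a benchmark.
dense_27b_mtp = decode_tokens_per_second(27, 1000, tokens_per_step=2.0)
moe_14b_active = decode_tokens_per_second(14, 1000)

print(f"27B dense + MTP: {dense_27b_mtp:.1f} tok/s")
print(f"MoE, 14B active: {moe_14b_active:.1f} tok/s")
```

Under those assumptions the two land within a few percent of each other. Real numbers depend on MTP acceptance rate, KV-cache reads, and kernel efficiency, none of which this models.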


The problem with batching local LLMs is not any inherent lack of multiple parallel sessions, but rather that local dGPUs lack the VRAM capacity to host KV-cache for several of those at once, whereas unified memory platforms broadly lack the compute headroom compared to memory bandwidth that would actually make batching useful.

(SSD streaming a larger-than-RAM model "solves" that latter issue very nicely because it radically slashes the equivalent to memory bandwidth so any saving on that becomes highly significant.)


> This is important because local llm rarely has parallel streams to batch together.

I think most people with agent-like usage could easily run any number of parallel streams pretty often, but you run out of VRAM for multiple KV caches, unfortunately.
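
To put a rough number on the KV-cache point, here is a sketch assuming a plain grouped-query-attention transformer with an uncompressed fp16 cache. The model shape (64 layers, 8 KV heads, head dim 128) and the 32k context are hypothetical, picked only to make the arithmetic easy.

```python
# KV-cache sizing for a plain GQA transformer with an uncompressed cache.
# The model shape below is hypothetical, not any specific released model.

def kv_cache_bytes(layers, kv_heads, head_dim, context_len, bytes_per_value=2):
    """Bytes for one stream: keys and values, per layer, per cached token."""
    return 2 * layers * kv_heads * head_dim * context_len * bytes_per_value

per_stream = kv_cache_bytes(layers=64, kv_heads=8, head_dim=128,
                            context_len=32768)
GIB = 2 ** 30

for streams in (1, 2, 4):
    print(f"{streams} stream(s): {streams * per_stream / GIB:.0f} GiB of KV cache")
```

With that shape, each 32k-token stream costs 8 GiB on top of the weights, so a handful of parallel agent sessions can exceed a consumer dGPU's VRAM. Architectures with compressed or sliding-window caches would come out smaller.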


Pay? This is the best marketing they could have hoped for.

Yup, getting Cartmanland marketing vibes here. “It’s the best theme park ever, and you can’t come!” does wonders for creating demand.

I wouldn’t be surprised if all this were actually orchestrated; it all seems too convenient.



Doubtful. Fable 5 is insanely good; it’ll sell itself. No need for unscrupulous advertising tactics.

What is a “foreign national” is more what I’m wondering. Like is it a “Non-US Citizen”? Do US citizens abroad count?


Foreign national is anyone who doesn't have legally recognized citizenship of the USA. So citizens living abroad aren't barred, nor would dual citizens be.

> What is a “foreign national” is more what I’m wondering.

The following quoted text is from the Definitions section of 8 USC § 1101, which is reproduced at [0]. (Though, you will probably have to scroll up a bit to be able to read subsection (a)(21), which is the thing I'm linking to.)

  (21) The term “national” means a person owing permanent allegiance to a state.
  (22) The term “national of the United States” means (A) a citizen of the United States, or (B) a person who, though not a citizen of the United States, owes permanent allegiance to the United States.
  (23) The term “naturalization” means the conferring of nationality of a state upon a person after birth, by any means whatsoever.

From this, it's fairly clear that a "foreign national" is someone owing permanent allegiance to a foreign (that is, non-US) state. What's not immediately clear to me is whether a US citizen can also be a "foreign national", [1] and how that would affect access to things from which foreign nationals are barred. [2]

EDIT: For a more official source of this information, you might be able to check out [3] and/or [4]. After examining and interacting with those pages, one might see why one might go to an unofficial source for casual inspection of this information.

[0] https://www.law.cornell.edu/uscode/text/8/1101#a_21

[1] I think they can be.

[2] I'm very uncertain.

[3] <https://uscode.house.gov/view.xhtml?req=granuleid:USC-prelim...>

[4] <https://uscode.house.gov/browse/prelim@title8/chapter12/subc...>


A "foreign national" is any person who is not a US Citizen:

"The United States Department of State defines a “foreign national” as anyone who is not a “U.S. person.” A “U.S. person” is any one of the following: U.S. citizen; Lawful permanent resident (green card holder); and “Protected Person” i.e. political asylum holder." [0]

A foreign national is a person or organization who is not a citizen of the United States, and who is a citizen of a foreign country. The Immigration and Nationality Act (INA) uses the term "alien" to refer to a person who is not a United States citizen, and does not use the term "foreign national."[1]

[0] https://www.orc.msstate.edu/faq/what-department-states-defin...

[1] https://www.law.cornell.edu/wex/foreign_national


I also found this: https://www.uscis.gov/tools/glossary

    > Foreign national: A person without U.S. citizenship or nationality (may include a stateless person). This term is synonymous with “alien”

I owe allegiance to no state. I prefer to think of myself as a citizen of the world.

It's kind of a weird definition. I would guess most people's nationality is more an accident of birth than anything else.


There is a chance they'll lose on some income if it takes longer.

Unfortunately, there is also a possibility that this is what they intentionally wanted: to try regulatory capture to get rid of competitors.


Anthropic has been angling for regulatory capture this entire time, to an even greater extent than OpenAI.

Y’all really have convinced yourselves that people in the industry are far, far smarter than they are, and far more manipulative than they are.

You see the state of the country and you think it’s a nefarious master plan instead of a bunch of opportunistic people taking advantage of an overworked, overstimulated populace who forget to vote or believe stupid slogans on TV.

Nobody is doing this intentionally. Have you not paid attention to how quickly idiot stuff gets found out????


Anthropic in particular has been angling for regulatory capture (with themselves in control, of course) pretty explicitly.

> opportunistic people taking advantage of an overworked, overstimulated populace

Overworked and overstimulated is the intentional method and means that these people, well aware of the neurological consequences, rely on.


"It is time to go beyond transparency to more serious and binding regulation of AI."[1]

Anthropic is calling for regulation. For example they endorsed CA SB-53 that even OpenAI and Google thought was too much: https://www.anthropic.com/news/anthropic-is-endorsing-sb-53

They have spoken publicly about how they want open models banned (they call them Chinese models).

They might not want this specific action, but they do want regulation on their own terms. That really is regulatory capture.

> Nobody is doing this intentionally. Have you not paid attention to how quickly idiot stuff gets found out

They don't think it is "idiot stuff" - they are doing it openly and shouting to everyone who will listen! Read Dario's latest essay[1]:

> Many policymakers are showing increased openness to taking action, and it's been encouraging to see our peers come around to the same positions we've been advocating for over the past few years.

[snip]

> Thus, in 2025, Anthropic supported transparency legislation, helping to pass SB 53 in California, RAISE in NY, SB 315 in Illinois (in early 2026), and advocating for a transparency standard at the federal level.

[snip]

> It is time to go beyond transparency to more serious and binding regulation of AI.

> I am grateful to see the Trump administration’s Executive Order move incrementally towards a greater role for government in AI, though Anthropic’s proposal recommends even further action.

> The government should have the power to block or deter deployment of the model if it is determined, in light of third-party assessment, to present unacceptable risks.

I'm not sure why you think they don't want to be "found out"!


> They have spoken publicly about how they want open models banned (they call them Chinese models).

Whenever I hear some octogenarian senator babble about the evils of distillation I assume Amodei (or maybe Altman) fed them the script, word for word.


Let's see their private journals, private conversations, messages to peers, all meetings and every side conversation, and then tell me it's unintentional.

That's incredibly infuriating to hear someone say.

Obviously no one is in absolute control of everything, but physics essentially shows nothing other than information determinism. There has to have been a thought of intention in the minds of these people as they play in the largest arena publicly.

"No one is doing it intentionally because I think they're dumber than I think other people think they are"

"They're taking advantage of people intentionally"

"People don't have political power to do anything about their victory laps"


Let's leave aside the "smarter" part, since I made no claim to that effect and I don't think it's very relevant in the first place.

Do you really not think that people like Elon Musk, Sam Altman, and Dario Amodei angle for regulatory capture? It happens in every other industry, from automobiles to tax preparation software. Why do you think that AI is any different?


It’s almost like you haven’t read the Project 2025 doc.

Hint: it can be both.


I also do not understand this. Now they are labelled as precious US tech that cannot be used by anyone else, because the president heard about jailbreaking for the first time, I guess. With this genius logic they will soon be banning GPT 5.5.

No it’s not. A company that finds itself the target of potentially crippling government intervention is not an attractive investment.

It might be if all you're seeking is large-cap stocks with lots of volatility you can leverage that are here to stay for the long haul. Also, the market doesn't seem to believe that Trump will be in power forever.

I don't think so; retail investors would see this as a barrier that the government can place anytime they want, and assume that government intervention is constantly lurking in the shadows.

It's not real. It's like naming your movement "The Good People". It sprouted from the "Rationalist" community, which is even more self-aggrandizing.

Neither has any hope of doing any good for the world as they don't understand evolutionary pressures. They are set up to reward making members feel smart, not accomplishing anything.

And if they ever gain any real power, they will be corrupted immediately.


I don't see any of that in Anthropic at all. They're not intelligence above all else, not by a long shot. They're scared of intelligence and obsessed with ensuring it can't be abused, even as they advance the frontier.

That's what they claim, but it's not supported by their actions. As in the parent comment: saying you're the good guys doesn't make that real.

Dude, what kind of company publishes an interview with the model about how it feels about itself as part of "utilitarianism"? Their thinking clearly goes much deeper than "whatever is better in the long run, at all costs".

That interview was a marketing piece, they published it to get attention. Just like all of their blog posts. More generally, everything any company posts publicly is marketing.

That's an advertisement. They do that because they want money.

"AI is so dangerous and scary! Logically, we need to raise historically unprecedented amounts of money so we can make it more powerful and we need to scare and push everyone into using it as much as possible!"

Like, come on. It did not make sense when Altman was doing it and it does not make sense with Anthropic.


> Over the past five months, our team has been running an experiment: building and shipping an internal beta of a software product with 0 lines of manually-written code.

This is such a common thing among software engineers nowadays that I was very surprised that OpenAI would open with that line as if it were mind blowing.

But then I saw it was published in February and OP is just reposting it to farm karma.


They did not submit the full log because this is fake.


Even if you could fit a 500B model's expert weights in very fast system RAM, it would run so slow as to be useless.


That's really only "useless" if the only thing you care about is a quick real-time response. Contrary to common perception, MoE models do benefit from batching requests together even when run on a single node, you just have to ensure you have at least ~5 parallel requests in flight (and that's for the very sparsest models) to really see the aggregate benefit.

(Intuitively, that's because the issue of whether any active weights are being shared among requests - thus, any memory throughput is being reused - is a generalized birthday problem. That's why even having a few parallel requests is quite effective. Especially since the "random" choice of experts happens anew at any single layer, so there's a lot of independent samples.)


This is just wishful thinking.

For prefill, it's really easy to batch MoE and get really good tk/s, even on a single stream.

For decode, you will run into the problem that:

1) you need more parallel requests which means more memory for context

2) 5 requests will not give you very much expert overlap on parallel requests


You don't need "very much" expert overlap to see aggregate gains at scale, you just need some of it; that's where the "birthday" framing becomes relevant. Memory for context is an issue, but recent models like DeepSeek V4 use very little of it even at relatively large contexts.


>You don't need "very much" expert overlap to see aggregate gains at scale, you just need some of it

I'm not sure what you are claiming. Decode is bottlenecked by memory bandwidth. To see a speedup of 2x, you have to ensure each expert weight memory fetch can be used by 2 parallel streams. What exactly is the average factor you are claiming for 5 parallel streams (due to "birthday paradox" factors)? The birthday paradox isn't really relevant here. It's about coverage, not parallelism.
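
For a ballpark on that average factor, here is a sketch under a simplifying assumption: each token independently picks k distinct experts uniformly at random out of E per layer. Real routers are neither uniform nor independent across requests, so treat this as a rough baseline. The 256-expert, 8-active shape is hypothetical.

```python
# Average expert-weight reuse per layer under uniform, independent routing.
# E = experts per layer, k = experts active per token, B = parallel streams.
# The E=256, k=8 shape is hypothetical; real routing is not uniform.

def expected_distinct_experts(num_experts, active_per_token, streams):
    """Expected number of distinct experts touched by `streams` tokens."""
    p_miss_one_token = 1.0 - active_per_token / num_experts
    return num_experts * (1.0 - p_miss_one_token ** streams)

def reuse_factor(num_experts, active_per_token, streams):
    """Expert activations served per expert actually fetched from memory."""
    activations = streams * active_per_token
    return activations / expected_distinct_experts(
        num_experts, active_per_token, streams)

for b in (1, 5, 32):
    print(f"{b:>2} streams: reuse factor {reuse_factor(256, 8, b):.2f}")
```

With those inputs, 5 streams reuse each fetched expert only about 1.06 times on average, so the memory-bandwidth saving from expert overlap alone is small at that batch size. Correlated routing between similar requests, or shared (always-on) experts and attention weights, would raise the effective reuse.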

> Memory for context is an issue, but recent models like DeepSeek V4 use very little of it even at relatively large contexts.

This is not true.


An aggregate speedup of 2x is a lot, we don't need that in a local context. Local hardware is heavily constrained by power and thermals, not just bandwidth; so all we really care about is raising compute intensity for decode a little bit to relax the memory bandwidth constraint. The average factor will depend on just how sparse the model is and how far you can push parallelism, there isn't just one single answer.


But you won't see 2x expert re-use, the speedup with 5 streams will be tiny.


Also, electricity isn't free.


With enough solar panels it is!


Not quite.

Free for approximately 8 hours (assuming perfect weather conditions) and excluding unit cost and maintenance cost.

It has a cost.


My area has a net-metering plan available, so you can send any surplus out to the grid to offset energy pulled from the grid, essentially treating the grid like a large battery. That can extend the 8 hours into full 24-hour coverage with enough panels.

