Preventing scrapers from grabbing every single commit and feeding it to train an AI. (Instead of just, you know, cloning the repo and using the clone to feed the AI).
Is this really the first time you've encountered the Anubis anti-scraper system? It's been everywhere the past few years, because so many of those scraper bots are incredibly lazily programmed. Many discussions here on HN have included people commenting on scraper bots hitting every single commit page, diff page, etc. on their self-hosted forges, burning up lots of CPU time and bandwidth to serve them what they could have just gotten by cloning the repo if the bot's programming were slightly more intelligent.
It's been showing up on just about any code forge that isn't GitHub, so odds are good your browser has encountered it but you didn't notice the Anubis screen until now (it often goes by so quickly you'll miss it if you blink, though that depends on the site).
This is not true, at least around where I live. Gigabit ethernet (which is gigabit for downloads only, and <50 Mbps for upload) is $110 per month. Comcast is the only internet service provider that offers speeds over 50 Mbps.
So I make do. If I want to download a 40 GB game, I take a break. I read a book, or eat dinner. It works itself out, and I can play my game.
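For a rough sense of how long that break is, here is a back-of-the-envelope sketch. It assumes decimal units (1 GB = 8000 megabits), a link that sustains its full rated speed, and no protocol overhead, so real downloads will be somewhat slower. The function name is made up for illustration.

```python
def download_minutes(size_gb: float, link_mbps: float) -> float:
    """Idealized download time in minutes (assumes no overhead, full link speed)."""
    size_megabits = size_gb * 8 * 1000  # 1 GB = 8000 megabits in decimal units
    seconds = size_megabits / link_mbps
    return seconds / 60


# A 40 GB game on a 50 Mbps link versus a 1 Gbps link.
for mbps in (50, 1000):
    print(f"{mbps} Mbps: {download_minutes(40, mbps):.1f} minutes")
```

Under those assumptions the 50 Mbps link takes a bit under two hours and the gigabit link a little over five minutes.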
My point was that 1 Gbps+ internet is available in enough select metropolitan areas that saying "almost no one has it" is inaccurate, not that it's widely available everywhere to the average user.
Obviously the subset of users with multi-gig fiber is relatively small, but not practically zero like the comment suggested. Anecdotally, 3 Gbps fiber is widely available in my medium-sized US city of about 500k for as low as $110. I paid the same for asymmetrical gigabit cable internet in the last city I lived in. It just depends.
2.5 Gbit via PON fiber is getting common, but you won't get that from Comcast. The US isn't great at internet speeds anyway. I've had symmetric 1 Gbit for ages here in the EU and you can even get 10 Gbit in some places.
I feel as though you are measuring tokens/s wrong, or have a serious bottleneck somewhere. On my i5-10210U (no dedicated graphics, at standard clock speeds), I get ~6 tokens/s on phi4-mini, a 4B model. That means my laptop CPU, which draws 15 watts and was released 6 years ago, is performing better than a 5090.
> The 5090 is 10x faster but only 6-8x the price
I don't buy into this argument. A B580 can be bought at MSRP for $250. An RTX 5090 from my local Micro Center is around $3250. That puts the B580 at around 1/13th the price.
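To spell out the arithmetic, here is a quick sketch using the two prices above and the quoted "10x faster" claim taken at face value. The function names are just for illustration, and the speedup is the parent's number, not something I measured.

```python
def price_ratio(expensive: float, cheap: float) -> float:
    """How many times more the expensive card costs."""
    return expensive / cheap


def relative_perf_per_dollar(speedup: float, ratio: float) -> float:
    """Performance per dollar of the expensive card relative to the cheap one."""
    return speedup / ratio


B580_PRICE = 250        # MSRP quoted above
RTX_5090_PRICE = 3250   # local store price quoted above
CLAIMED_SPEEDUP = 10    # the "10x faster" figure from the quoted comment

ratio = price_ratio(RTX_5090_PRICE, B580_PRICE)
print(f"price ratio: {ratio:.1f}x")
print(f"perf per dollar vs B580: {relative_perf_per_dollar(CLAIMED_SPEEDUP, ratio):.2f}")
```

At a 13x price ratio, a 10x speedup works out to roughly 0.77 of the B580's performance per dollar, so the 5090 is the worse value on those numbers.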
Power costs can also be a significant factor if you choose to self-host, and I wouldn't want to risk system integrity for 3x the power draw, 13x the price, a melting connector, and Nvidia's terrible driver support.
EDIT: You can get an RTX 5090 for around $2500. I doubt it will ever reach MSRP though.