One of the reasons why, over a decade ago, I dived deeply into the OSS world instead of mathematics was that it was so much more accessible: there were docs for everything, and I got direct feedback when something worked vs when something didn't work. Most of my questions had answers on Stack Overflow, and once I joined Rust (which back then in 2015 didn't have a big Stack Overflow presence) I had a community who answered them for me (and in maths I didn't have that).
AI makes the math world more accessible than before. If you have a question about a proof in the lecture, you can just ask it. Of course, one can't trust it blindly, but fundamentally it's amazing.
I think that's a good thing, but of course this means that a lot has to change in culture and behaviors, also in the research world.
The software engineering world is more or less in the same situation: it's also changing. But for now I think it still holds true that someone who knows maths plus an LLM is better than someone who doesn't know maths plus an LLM. At least in software it does.
Agreed. As someone who was always curious but had difficulties learning math the way it's taught at the university, AI teaching me the way no professor ever could is a blessing. I fail to see the point of the memo besides: we got here first and we decide what math is because we can. I'm really optimistic about AI and the value it brings in education. Gatekeepers will complain, but ultimately, will either adapt or be left behind.
>AI makes the math world more accessible than before. If you have a question about a proof in the lecture, you can just ask it.
I think that is great, really! But does anyone remember asking a TA or teacher or prof or parent and getting told you can work it out for yourself, or maybe just given a hint? What if that is an essential part of learning: having to work through things you don't understand, but that you have the tools, the foundation, to figure out?
A calculator can't teach you math. A forklift can't build your strength. This is really a double-edged sword, as far as education or accessibility goes.
You have to constantly ask... what do I lose by not figuring it out myself?
Yeah, among other factors, that "figure it out" mentality put me off in the end. Especially because often you need to show the same mentality unless you want to overkill proofs and spend more time on them than assigned to you. I sometimes miscalibrated and pointed out some details that didn't need pointing out in my proofs while in other proofs, I skipped over too many details for the TA.
Of course I agree that if the student just asks an LLM to do their homework, they have not learned anything. But it's sad if one can't ask questions about a proof or such. Having the LLM around to review the homework submission is also useful, to make sure that the arguments are solid.
You will have to learn to voluntarily figure things out for yourself without being pushed towards that. In a sense it's analogous to the presence of cheap calorie-dense foods. In order to not be overweight you have to be mindful of and regulate your food intake in various ways.
Alternatively, perhaps universities will provide access to fine-tuned models that are mindful of such things.
Horrible to hear this news. Neurological diseases are the worst because we understand so little about them and usually there is no cure, just management.
What have your experiences been with using AI for medical advice? Especially for such rare diseases I suspect that very little shows up in the training data. Personally I'm using AI only for work and only recently started using it for non-work non-coding stuff too.
> What have your experiences been with using AI for medical advice?
I had been trying to use Gemini during my bout of encephalitis before treatment. I wasn't really trying to diagnose myself, but instead, was looking up side effects of the various (psychiatric) medications I was on. At the time, I (but not my wife) had thought all biological causes had been ruled out due to testing from my PCP. To be clear, I wasn't really in my right mind, so whether this was a reasonable belief or not (likely not) isn't something to be assumed. Like, I just thought I had GAD. Or OCD. Or something latent that had just all of a sudden started rearing its ugly head.
I found Gemini's reporting of side effects of medication to not be helpful. Especially because it led me to wonder if some of the things were "in my head" (without a doctor even needing to say it). Anyway, there was never a point at which any AI suggested anti-NMDA receptor encephalitis. That didn't really come up until I got into the hospital and had an abnormal brain MRI.
I've since switched to ChatGPT, which I find to be leagues better than Gemini personally.
This is all really hard to explain, so I apologize if this doesn't make a lot of sense.
It's fine to make mistakes, that's how you learn. The problem here was that they didn't announce to the host that they are doing a test of their in-development equipment.
So the host wasn't able to add the additional risk and hassle to the price, which in this instance would have been a quite legitimate ask as the robot damaged their revenue generating property.
It's very ironic that Airbnb itself has engaged in similar practices in the past, when it ignored hospitality regulations to establish its business model, i.e. asking for forgiveness rather than permission.
The Airbnb style response would be to gig-ify this model where you ask an independent contractor to buy the test robot, rent the Airbnb, and test it out instead of you doing it yourself. Then the contractor bears the risk of damages to the property.
> The problem here was that they didn't announce to the host that they are doing a test of their in-development equipment.
I might be okay forgiving skirting the disclosure rules BUT only if they tried to be model tenants and, if there was any damage, took steps to proactively make things right. If you're breaking the rules, even if there was no damage, you should definitely be cleaning up and putting things back in place.
This was my thought. I can understand not wanting to go to the hassle of trying to explain that you're testing an experimental prototype robot to a confused Airbnb owner.
What I find inexcusable is not owning up to the damage and paying to fix it when your prototype goes on a rampage of destruction.
Moving fast and breaking things is fine, as long as you fix the stuff you break...
Even if it is fixable, there are costs involved for the fixing. A broken hotel lamp will sit in a landfill for all eternity.
"Moving fast and breaking things" could be acceptable in cases where there is an ulterior objective whose potential value could be >> these costs, but in general it should be evaluated more carefully.
In a rental unit you should not have things that can’t be replaced. People who rent it will break things, either by accident or on purpose (there are always idiots around).
> The problem here was that they didn't announce to the host that they are doing a test of their in-development equipment.
I personally think the problem here is that they were delusional enough to think this was the way to 'test' their prototype clean-o-bots. But as you point out (and...sigh...you're spot on on all points), we live in a world where doing things like beta-testing robo-cars in real live traffic is perfectly cromulent as long as you capture market share and outlast the lawsuits and 'disrupt' something.
Well Bezos did actually state that he wants to turn Earth into a natural park.
But yeah, the robot armies don't need grain so why hike up the price of bread? Lack of grain makes those people resentful, which means you need to deal with their anger. Sure, it can be dealt with, but it's just cheaper to give the humans grain so they are docile. This is basic governance 101 that goes back to the Romans (and further).
They also didn't slaughter all horses immediately. You can't eat that much horse meat anyways. It happened piece by piece.
The only good reason for an abrupt mass culling of the 99% (for a coldly calculating rich person with no empathy) would be game theory, i.e. them not being a contender for power any more. If there are no humans, there is nobody who can question the control of the 1%. It would be thus less about economics and more about power.
I am really rooting for the bottom 99%, myself being a part of it, but I really don't know what will happen to us.
If you give it $290 of input tokens for $10 of output tokens, you are doing something wrong. I.e. you paste the whole CI output into the prompt instead of giving it a link to the file, and then the AI greps its way through it (using a fraction of the tokens).
Sometimes AI overdoes things and it re-runs the whole testsuite because the tail command didn't have enough lines, but the other way round messes up the context so much so that in the end all that context is useless.
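To make the "grep instead of paste" point concrete, here is a minimal sketch. The log is synthetic, the function name `grep_with_context` is made up for illustration, and character count is only a rough stand-in for token count; real agents would typically just run `grep` or `tail` on the file.

```python
import re


def grep_with_context(log_text, pattern, context=2):
    """Return only the lines matching `pattern`, plus `context` lines around each match."""
    lines = log_text.splitlines()
    regex = re.compile(pattern)
    keep = set()
    for i, line in enumerate(lines):
        if regex.search(line):
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            keep.update(range(start, end))
    return "\n".join(lines[i] for i in sorted(keep))


# Synthetic CI output: 2000 passing tests with one failure in the middle.
log = "\n".join(
    [f"test_{n} ... ok" for n in range(1000)]
    + ["test_parse ... FAILED", "  assertion failed: left == right"]
    + [f"test_{n} ... ok" for n in range(1000, 2000)]
)

excerpt = grep_with_context(log, r"FAILED", context=1)
print(excerpt)
print(f"full log: {len(log)} chars, excerpt: {len(excerpt)} chars")
```

The excerpt contains the failure and its immediate neighbours only, which is the part the model actually needs in its context.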
There is build.rs, proc macros are unsandboxed, and lastly you install the binary so that you can run it. Even if the build and install were fully sandboxed, the binary could still do malicious stuff if run.
Even without post-install script, a malicious payload could be hiding in some function and just wait until the developer invokes `cargo run`. Not that many people audit the crates they pull into their projects.
Yeah no shit, if you download malicious code from the internet and run it on your computer you will get pwned. No matter if it’s from a package manager, a zip file, or a submodule.
However the current npm vulns used a post install script.
I maintain that NPM malware uses postinstall scripts just because they exist and are convenient. Had NPM not had postinstall scripts, the malware would have used a different mechanism and been almost exactly as effective.
Re the vendor lock-in point: this is a harness issue really. Sure, CC is restricted to Anthropic models, but it's not the only harness out there. So if one vendor has an outage or botches the quality of their models due to compute shortage, you can switch to another vendor. LLMs are the easiest to switch. Of course, if hardware costs go up, so will prices at all AI vendors. The only way out for the employer would be to directly buy the hardware (or do a fixed price deal with a cloud provider).
Re the understanding code point: you can still use LLMs to understand code. If you write the spec without knowing anything about the code, of course the architecture might suck. Maybe there is already a subsystem that you can modify and extend instead of adding a completely new one for the new feature you are adding, etc.
I use LLMs for my daily workflows and they do understand code perfectly and much more quickly than if I read it.
CC isn’t even limited to Anthropic models. There’s a post on the front page right now about using it with Deepseek V4, since Deepseek provides an Anthropic-compatible API and CC reads API URLs from env variables, so you can override them.
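The override mechanism described here is the usual "environment variable wins over the built-in default" pattern. A minimal sketch of that pattern follows; the variable name `EXAMPLE_API_BASE_URL` and both URLs are hypothetical placeholders, and this is not a description of how CC itself is implemented.

```python
# Hypothetical default endpoint; a real harness would ship its own vendor's URL.
DEFAULT_BASE_URL = "https://api.vendor-a.example"


def resolve_base_url(env):
    """Return the base URL from `env` if set and non-empty, else the built-in default."""
    # EXAMPLE_API_BASE_URL is a made-up name used only for this illustration.
    override = env.get("EXAMPLE_API_BASE_URL", "").strip()
    if override:
        # Normalize a trailing slash so paths can be appended consistently.
        return override.rstrip("/")
    return DEFAULT_BASE_URL
```

In practice one would pass `os.environ` as `env`; any vendor exposing a compatible API can then be selected without touching the harness code.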
I’ve built a configuration transpiler to Claude Code and Codex and found I can switch pretty quickly between both and run both at once. At the moment Codex performs better. Previously CC did. There is no vendor lock-in, and this is an old canard in technology that LLMs in fact themselves make irrelevant. Once you’ve got an implementation that uses X, converting it to Y is almost trivial with an LLM because the spec is canonical in the reference.
It’s buried in my dotfiles and not easily extracted. But the idea isn’t a hard one to implement, except the coding engineers are woefully unaware of themselves. Codex is easier because it’s open source. With Claude you kind of have to futz with it for a while. Once you have the intermediate form working and outputting config for the two, I’m sure you can coerce it to any other agent that comes along with similar constructs (marketplaces, etc). There’s some nuance for some MCPs, particularly those that download binaries like Rust MCPs, but I found it’s very complex and probably better to avoid unless you really need it.
This is a general fear for me whenever I take a taxi or something like it: I always remind the driver of my luggage in the back when we arrive and ask them whether they can help me get it.
It's unpleasant for me at normal speed settings, but on fast mode it works really well: the AI does changes quickly enough for me to stay focused.
Of course this requires being fortunate enough that you have one of those AI positive employers where you can spend lots of money on clankers.
I don't review every move it makes. Rather, I have a workflow where I first ask it questions about the code, and it looks around and explores various design choices. Then I nudge it towards the design choice I think is best, etc. That asking around about the code also loads up the context in the appropriate manner so that the AI knows how to do the change well.
It's a me in the loop workflow but that prevents a lot of bugs, makes me aware of the design choices, and thanks to fast mode, it is more pleasant and much faster than me manually doing it.
So that article can in theory be used to conscript any man, citizen or not, living in Germany or not.
The Wehrpflichtgesetz, which is a simple law that can be changed with just a 50% Bundestag majority, narrows this very wide constitutional power in article 1 to men above 18 who hold German citizenship.
Article 3 refines it even further to folks below 45 or 60, depending on the severity of the situation.
But yes, in theory it can be changed to include any man who is not a German citizen, people aged 80, people who have lived in Germany for a while or have never been to Germany at all, or just random men who happen to change flights at FRA.