I think it also requires someone who knows just enough to be able to navigate between those ideas that will set you back and those which will propel you forward. At the end of the day, you still need some human filter.
Every serious engineer I've seen try to use it ran away screaming, because of limitations in the sandbox.
I've also seen people set their coding agents up entirely within containers -- that may be the better way going forward, but it's an extra step and a lot of extra plumbing to maintain.
> Just getting the code to run on your laptop took a week.
This one surprised me. Claude Code in the CLI has made standing up an app and debugging whatever random dependencies or Docker BS a dream compared to the before times, when you'd have to learn the architecture while simultaneously troubleshooting whatever wasn't working on your machine.
And in the before times, you learned a lot and walked away with knowledge on the deps needed, connections, .env secrets, and cleaned it all up and documented it so the next dev would have an easier time doing it.
Yes it did. That's how I learned a great many things throughout my career. I'm sure some people didn't pay attention or try to understand what they were doing, and didn't learn. That's on them. But most of us learned a lot that way.
I think it depends on how “before” we’re talking about.
I can remember a time when learning was valued and leaving the camp cleaner than you found it was considered a basic professional standard.
But I can also remember a time when Scrum became all the rage and next thing you know we’re all stuck on the sprinting treadmill, management is obsessing over “velocity”, and it’s generally an everyone-for-themself free-for-all to clear the absolute minimum criteria to get the ticket moved to the “done” column in a semi-desperate effort to keep up with your ever-growing backlog of tickets to which you’ve been overcommitted. Don’t worry about incomprehensible code or flaky designs; taking your time to do it right the first time looks bad on the KPI dashboard, but rework does the opposite, because you get to count the second (third, fourth, etc.) time the same task needs to be revisited towards your velocity metrics, too.
I’m not sure most developers younger than maybe 40 realize just how much worse our line of work has become over the past ~15 years.
Indeed, there were plenty of people doing just that. I imagine they get the most out of vibe coding. However, when it became a problem, an engineer was still required to fix it.
It might have been you, a couple of months later, or someone else. I have dealt with slop produced by unknowing programmers most of my career. With this vibe coding I think my job is still safe. The amount, though, is increasing exponentially.
The second time I had to do that for the same project (new computer), I started taking very detailed notes when doing these kinds of unpleasant, supposedly one-off tasks.
Some business users spent ~30 minutes on an internal process, and we prototyped an "Agent" in Slack to take over. At first it didn't work, then it didn't work some more, eventually it ALMOST worked. Then one day, it worked, and the old business process died never to be revived.
Now it sits in a Slack channel, and I watch it doing work, responding to ambiguity, and taking feedback/edits all day. It's unreal. It's literal magic. It saves a HUGE amount of time and gave us a pattern to do more.
This is the real deal. It's not easy to find problems with the right shape, and it's not easy to build agents that fit even when you do... but once it clicks, it clicks.
I’ll note, since it is supremely interesting to me, that Starship is able to communicate with the ground during its whole reentry due to its sheer size and ability to connect with Starlink satellites. I assumed loss of signal due to reentry was a given for any spaceship!
Shuttle in its last days had antennas that protruded outside the plasma just enough for telemetry. Apollo and Artemis reentry are also direct entry from Lunar-Earth transfer orbit using ablative heat shields, so the plasma would be hotter and thicker than suborbital Starship shots with Shuttle style ceramic tiles.
I'm pretty sure it did not stick anything through the plasma sheet; that is impossible. You would either melt the thing or just shift the plasma sheet a bit. It forms as air is compressed on contact, simple as that.
What IIRC was actually done was that some antennas were placed on the back of the shuttle, and its size was big enough that the plasma bubble would not fully envelop it - it would be open up to space. And that antenna on the back would communicate with TDRS satellites through this gap, enabling contact through the whole re-entry.
Starship does basically the same, just with Starlink satellites instead of TDRS.
Would this capsule have been able to communicate if it was integrated with Starlink, or is the size more important? I'd imagine if they could have achieved communication via Starlink they would have done it, but just curious.
It's a function of the shape. On a capsule-sized spacecraft, the ionized plasma completely surrounds the craft, so no radio communications can get in or out. For an oblong-shaped spacecraft, like the Space Shuttle or Starship, the descent tends to be angled such that you have a "hole" in the plasma you can get a signal through.
No, the plasma forms a teardrop shape around small craft like Orion, completely cutting off radio comms. Larger craft like Starship or the Shuttle, which have a roughly cylindrical shape (vs Orion’s circular cross section), aren’t fully enclosed by the plasma. The Shuttle had a transmitter attached to its tail for later flights, which could send back telemetry during re-entry.
"Sprint accelerated at 100 g, reaching a speed of Mach 10 (12,000 km/h; 7,600 mph) in 5 seconds. Such a high velocity at relatively low altitudes created skin temperatures up to 6,200 °F (3,400 °C), requiring an ablative shield to dissipate the heat. The high temperature caused a plasma to form around the missile, requiring extremely powerful radio signals to reach it for guidance. The missile glowed bright white as it flew."
Awesome, thank you! I wonder if some kind of very long-tethered deployed antenna could enable this for the capsule or if the ratio of long-enough-to-work vs thick-enough-to-not-burn-off-completely just doesn't work. Time to read about the shuttle.
Also, Orion and other capsules fall like a rock (steep reentry profile) compared to Shuttle/Starship, which intentionally slow down the reentry and kinda glide (ballpark 10 min with capsules compared to 30 min with Shuttle/Starship).
tl;dr: capsules get fully enveloped in plasma due to their shape, size and reentry profile
The space shuttle, too, was able to communicate. I imagine the smaller the craft the smaller the angle you can "speak" out of and, below a certain size, it just doesn't work.
I was wondering about that, so I looked up the heat shield issues. It seems like their solution was very defensible and there was every reason to believe it would work out just fine. The plan that did not work as they wanted had a new idea, a double re-entry, and when the results were concerning they backed off to using a traditional single re-entry. That seems like a legitimate fix?
Scott Manley went into the details in a recent video.
The reason the heat shield failed was due to gas buildup inside the ablative material.
This was due to the skip reentry profile they used, where the craft does a single skip (as in skipping stones) during reentry. The high bounce caused the shield to be heated enough that the heat penetrated the material causing gas release but not enough that the material ablated. Thus gas would build up deep inside up until it caused large chunks to break off. They could reproduce this in tests.
The fix was two-fold. First they lowered the bounce height, so a much less pronounced skip, avoiding the lowered heating of the shield. And they tweaked the material formula a bit so it was more porous, allowing subsurface gas to escape rather than build up.
In my understanding of the Manley video, the materials change will only occur for Artemis 3, for which it will be irrelevant as that will not be leaving LEO.
Not sure why I'm being downvoted. Here's the segment where Manley explains this: https://youtu.be/shcj7MUK5BU?t=828 and this is also the section where Manley explains Artemis III is not going to the moon so it won't actually be testing this change.
> Engineers already are assembling and integrating the Orion spacecraft for Artemis III based on lessons learned from Artemis I and implementing enhancements to how heat shields for crewed returns from lunar landing missions are manufactured to achieve uniformity and consistent permeability.
But IMO the most fruitful thing for an engineering org to do RIGHT NOW is learn the tools well enough to see where they can be best applied.
Claude Code and its ilk can turn "maybe one day" internal projects into live features after a single hour of work. You really, honestly, and truly are missing out if you're not looking for valuable things like that!
You're only missing out if that's what you want to do. Not every software developer is interested in creating new software projects from scratch in an hour, or at all. It's totally fine to do software development as a job, and then close your laptop and not see it until Monday. Learn the tools that suit you when you need them.
> You're only missing out if that's what you want to do.
Who writes software and doesn't have a list of "I'll fix this one day" issues as long as their arm?
This is honestly one of the things I enjoy most at the moment. There are whole classes of issues where I know the fix is probably pretty simple but I wouldn't have had time to sort it previously. Now I can just point Claude at it and have a PR 5 minutes later. It's really nice when you can tell users "just deployed a fix for your thing" rather than "I've made a ticket for your request", which means the issue is on the never-ending backlog pile and might get fixed in 5 years' time if you're lucky.
Claude Code makes it so easy to do things the "right way" that it also makes it really easy for you to let scope creep get out of hand. I have a personal project that I haven't deployed yet that in some ways is way overengineered for its purpose. It's hard to blame the tool though; it's always telling me I'm making it more complicated than it needs to be, but I don't listen.
I've felt this recently. I've often been bad about scope creep. CC makes it so easy.
On the other hand, I can see these tools getting good enough that scope creep doesn't even matter.
ATM I usually get stuck around the review/verification stage. As in, my code works, I have tested that it works, but it is failing CI or someone left a PR comment. And for each comment I'll have to make sure it makes sense, make the change, test again, and get CI passing again.
In my team we have strict rules for scope creep in pull requests. Each one needs to introduce a single thing, not a dozen little refactorings. This helps, but not when you're working alone on a personal project. Maybe you can set up your review agent to help with scope creep?
Many people don't. You can write a ticket and the PM can deal with it. Not everyone is intimately involved in their job enough to care about stuff like that. And some projects might not last long enough for you to care. You shouldn't project your dev experience on everyone, especially as a software development enthusiast.
> Claude Code and its ilk can turn "maybe one day" internal projects into live features after a single hour of work. You really, honestly, and truly are missing out if you're not looking for valuable things like that!
You're right, it's possible. But you might be both overestimating the ease of onboarding and underestimating the variety of tasks and constraints devs are responsible for.
I've seen Claude knock out trivial stuff with a sufficiently good spec. But I've also seen it utterly choke on a bad spec or a hard task. I think these outcomes are pretty broadly established. So is the expectation that the tech will get better. Waiting isn't unwise.
Waiting may not be “unwise” but acting now may be optimal. Even though tooling may be much better in 12 months, if it can improve quality or time now, that’s a net benefit.
Bikers in the Tour de France used to not wear helmets. They were seen as uncouth (“why jump on the bandwagon?”). Helmets today are way better than they were then. But if the utility provided is greater than the cost, of course it makes sense to act sooner.
I’m not explicitly arguing for investing in AI or other newfangled tech, I’m arguing that the premise of waiting may be “sound” but also “leaves money on the table”, or in some cases, lives.
The author talks about vaccines as a counter example but doesn’t really address the cost/benefit in any detail.
Could this be an experiment to show how likely LLMs are to lead to AGI, or at least intelligence well beyond our current level?
If you could only give it texts and info and concepts up to Year X, well before Discovery Y, could we then see if it could prompt its way to that discovery?
> Could this be an experiment to show how likely LLMs are to lead to AGI, or at least intelligence well beyond our current level?
You'd have to be specific about what you mean by AGI: all three letters mean different things to different people, and sometimes the term as a whole is used to mean something not present in the letters.
> If you could only give it texts and info and concepts up to Year X, well before Discovery Y, could we then see if it could prompt its way to that discovery?
To a limited degree.
Some developments can come from combining existing ideas and seeing what they imply.
Other things, like everything to do with relativity and quantum mechanics, would have required experiments. I don't think any of the relevant experiments had been done prior to this cut-off date, but I'm not absolutely sure of that.
You might be able to get such an LLM to develop all the maths and geometry for general relativity, and yet find the AI still tells you that the perihelion shift of Mercury is a sign of the planet Vulcan rather than of a curved spacetime: https://en.wikipedia.org/wiki/Vulcan_(hypothetical_planet)
Well, they obviously can't. AGI is not science, it's religion. It has all the trappings of religion: prophets, sacred texts, origin myth, end-of-days myth and most importantly, a means to escape death. Science? Well, the only measure of "general intelligence" would be a comparison with the only one we know, the human one, but we have absolutely no means by which to describe it. We do not know where to start. This is why, when you scratch the surface of any AGI definition, you only find circular definitions.
And no, the "brain is a computer" is not a scientific description, it's a metaphor.
Not even close. Turing complete does not apply to the brain, plain and simple. That's something to do with algorithms, and your brain is not a computer, as I have mentioned. It does not store information. It doesn't process information. It just doesn't work that way.
> Forgive me for this introduction to computing, but I need to be clear: computers really do operate on symbolic representations of the world. They really store and retrieve. They really process. They really have physical memories. They really are guided in everything they do, without exception, by algorithms.
This article seems really hung up on the distinction between digital and analog. It's an important distinction, but the article glosses over the fact that digital computers are a subset of analog computers. Electrical signals are inherently analog.
This maps somewhat neatly to human cognition. I can take a stream of bits, perform math on it, and output a transformed stream of bits. That is a digital operation. The underlying biological processes involved are a pile of complex probabilistic+analog signaling, true. But in a computer, the underlying processes are also probabilistic and analog. We have designed our electronics to shove those parts down to the lowest possible level so they can be abstracted away, and so the degree to which they influence computation is certainly lower than in the human brain. But I think an effective argument that brains are not computers is going to have to dive in to why that gap matters.
It is pretty clear the author of that article has no idea what he's talking about.
You should look into the physical Church-Turing thesis. If it's false (all known tested physics suggests it's true) then, well, we're probably living in a dualist universe. This means something outside of material reality (souls? hypercomputation via quantum gravity? weird physics? magic?) somehow influences our cognition.
> Turing complete does not apply to the brain
As far as we know, any physically realizable process can be simulated by a Turing machine. And FYI brains do not exist outside of physical reality... as far as we know. If you have an issue with this formulation, go ahead and disprove the physical Church-Turing thesis.
That is an article by a psychologist, with no expertise in neuroscience, claiming without evidence that the "dominant cognitive neuroscience" is wrong. He offers no alternative explanation on how memories are stored and retrieved, but argues that large numbers of neurons across the brain are involved and he implies that neuroscientists think otherwise.
This is odd because the dominant view in neuroscience is that memories are stored by altering synaptic connection strength in a large number of neurons. So it's not clear what his disagreement is, and he just seems to be misrepresenting neuroscientists.
Interestingly, this is also how LLMs store memory during training: by altering the strength of connections between many artificial neurons.
A human is effectively Turing complete if you give the person paper and pen and the ruleset, and a brain clearly stores information and processes it to some extent, so this is pretty unconvincing. The article is nonsense and badly written.
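To make the paper-and-pen point concrete, here is a minimal sketch of what "following the ruleset" means. The rule table and the `run` helper are my own illustration (not from the article or anyone in this thread): a tiny Turing machine that adds one to a binary number, where every step is something a person could do by hand with a strip of paper.

```python
from collections import defaultdict

# Rule table: (state, symbol read) -> (symbol to write, head move, next state).
# Hypothetical example machine: binary increment, head starts on the rightmost digit.
INCREMENT_RULES = {
    ("carry", "1"): ("0", -1, "carry"),  # 1 + carry = 0, keep carrying to the left
    ("carry", "0"): ("1", 0, "halt"),    # 0 + carry = 1, done
    ("carry", "_"): ("1", 0, "halt"),    # ran off the left edge: write a new leading 1
}


def run(rules, tape_text, state="carry", blank="_", max_steps=10_000):
    """Apply the rule table mechanically until the machine reaches 'halt'."""
    # The tape is unbounded in both directions; unseen cells read as blank.
    tape = defaultdict(lambda: blank, enumerate(tape_text))
    head = len(tape_text) - 1
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        write, move, state = rules[(state, tape[head])]
        tape[head] = write
        head += move
        steps += 1
    cells = range(min(tape), max(tape) + 1)
    return "".join(tape[i] for i in cells).strip(blank)


print(run(INCREMENT_RULES, "1011"))  # prints 1100
```

The "effectively" matters: a real Turing machine has unlimited tape and time, while a person (or this script, with its `max_steps` guard) eventually runs out of paper or patience.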
> But here is what we are not born with: information, data, rules, software, knowledge, lexicons, representations, algorithms, programs, models, memories, images, processors, subroutines, encoders, decoders, symbols, or buffers – design elements that allow digital computers to behave somewhat intelligently. Not only are we not born with such things, we also don’t develop them – ever.
Really? Humans don't ever develop memories? Humans don't gain information?
Cargo cults are a religion, the things they worship they do not understand, but the planes and the cargo themselves are real.
There's certainly plenty of cargo-culting right now on AI.
Sacred texts, I don't recognise. Yudkowsky's writings? He suggests wearing clown shoes to avoid getting a cult of personality disconnected from the quality of the arguments; if anyone finds his works sacred, they've fundamentally misunderstood him:
I have sometimes thought that all professional lectures on rationality should be delivered while wearing a clown suit, to prevent the audience from confusing seriousness with solemnity.
Prophets forecasting the end-of-days, yes, but this too from climate science, from everyone who was preparing for a pandemic before covid and is still trying to prepare for the next one because the wet markets are still around, from economists trying to forecast growth or collapse and what will change any given prediction of the latter into the former, and from the military forces of the world saying which weapon systems they want to buy. It does not make a religion.
A means to escape death, you can have. But it's on a continuum with life extension and anti-aging medicine, which itself is on a continuum with all other medical interventions. To quote myself:
Taking a living human's heart out without killing them, and replacing it with one you got out a corpse, that isn't the magic of necromancy, neither is it a prayer or ritual to Sekhmet, it's just transplant surgery.
…
Immunity to smallpox isn't a prayer to the Hindu goddess Shitala (of many things but most directly linked with smallpox), and it isn't magic herbs or crystals, it's just vaccines.
It'd be difficult to prove that you hadn't leaked information to the model. The big gotcha of LLMs is that you train them on BIG corpuses of data, which means it's hard to say "X isn't in this corpus", or "this corpus only contains Y". You could TRY to assemble a set of training data that only contains text from before a certain date, but it'd be tricky as heck to be SURE about it.
Ways data might leak to the model that come to mind: misfiled/mislabeled documents, footnotes, annotations, document metadata.
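For what it's worth, a first-pass filter for those leak paths might look something like the sketch below. Everything in it is an assumption for illustration: the cut-off year, the `Document` shape, and the watch-list of terms are made up, not taken from the linked project. It can only flag suspicious documents, not prove the rest are clean, which is the whole problem.

```python
import re
from dataclasses import dataclass
from typing import Optional

CUTOFF_YEAR = 1900  # hypothetical cut-off, chosen only for this example

# Hypothetical watch-list of terms that should not appear before the cut-off.
ANACHRONISMS = ("spacetime", "photon", "quantum mechanics")

# Four-digit years between 1000 and 2099 appearing anywhere in the text.
YEAR_PATTERN = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")


@dataclass
class Document:
    title: str
    text: str
    labeled_year: Optional[int]  # year from catalogue metadata; may be wrong or missing


def leak_reasons(doc, cutoff=CUTOFF_YEAR):
    """Return the reasons a document looks like it might postdate the cut-off."""
    reasons = []
    if doc.labeled_year is None:
        reasons.append("no date in metadata")
    elif doc.labeled_year > cutoff:
        reasons.append(f"labeled {doc.labeled_year}, after cut-off")
    # Footnotes, annotations, and reprint notices often mention later years.
    later_years = sorted(
        {int(y) for y in YEAR_PATTERN.findall(doc.text) if int(y) > cutoff}
    )
    if later_years:
        reasons.append(f"text mentions later years: {later_years}")
    lowered = doc.text.lower()
    hits = [term for term in ANACHRONISMS if term in lowered]
    if hits:
        reasons.append(f"anachronistic terms: {hits}")
    return reasons


def split_corpus(docs, cutoff=CUTOFF_YEAR):
    """Separate documents with no red flags from those needing manual review."""
    clean, flagged = [], []
    for doc in docs:
        reasons = leak_reasons(doc, cutoff)
        if reasons:
            flagged.append((doc, reasons))
        else:
            clean.append(doc)
    return clean, flagged
```

A mislabeled 1850 document with an editor's note from 1923 would be caught by the year scan, but a paraphrase of a later idea with no dates and no watch-listed words would sail straight through.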
There are also severe selection effects: what documents have been preserved, printed, and scanned because they turned out to be on the right track towards relativity?
I think not, if only for the fact that the quantity of old data isn't enough to train anywhere near a SoTA model, until we change some fundamentals of LLM architecture.
Machine learning today requires an obscene quantity of examples to learn anything.
SOTA LLMs show quite a lot of skill, but they only do so after reading a significant fraction of all published writing (and perhaps images and videos, I'm not sure) across all languages, in a world whose population is 5 times what it was at the link's cut-off date, and where global literacy has gone from 20% to about 90% since then.
Computers can only make up for this by being really really fast: what would take a human a million or so years to read, a server room can pump through a model's training stage in a matter of months.
When the data isn't there, reading what it does have really quickly isn't enough.
That's not what they are saying. SOTA models include much more than just language, and the scale of training data is related to its "intelligence". Restricting the corpus in time => less training data => less intelligence => less ability to "discover" new concepts not in its training data
I think this would be an awesome experiment. However, you would effectively need to train something of a GPT-5.2 equivalent. So you need a lot of text, a much larger parameterization (compared to nanoGPT and Phi-1.5), and the 1800s equivalents of supervised finetuning and reinforcement learning with human feedback.
This would be a true test of whether LLMs can innovate or just regurgitate. I think part of people's amazement at LLMs is that they don't realize how much they don't know. So thinking and recalling look the same to the end user.
That is one of the reasons I want it done. We can't tell if AIs are parroting training data without having the whole training data. Making it old means specific things won't be in it (or will be). We can do more meaningful experiments.
The fact that tech leaders espouse the brilliance of LLMs and don't use this specific test method is infuriating to me. It is deeply unfortunate that there is little transparency or standardization of the datasets available for training/fine tuning.
Having this be advertised will make for more interesting and informative benchmarks. OEM models that are always "breaking" the benchmarks are doing so with improved datasets as well as improved methods. Without holding the datasets fixed, progress on benchmarks is very suspect IMO.
LLMs have neither intelligence nor problem-solving ability (and I won't be relaxing the definition of either so that some AI bro can pretend a glorified chatbot is sentient).
You would, at best, be demonstrating that the sharing of knowledge across multiple disciplines and nations (which is a relatively new concept - at least at the scale of something like the internet) leads to novel ideas.
I've seen many futurists claim that human innovation is dead and all future discoveries will be the results of AI. If this is true, we should be able to see AI trained on the past figure its way to various things we have today. If it can't do this, I'd like said futurists to quiet down, as they are discouraging an entire generation of kids who may go on to discover some great things.
> I've seen many futurists claim that human innovation is dead and all future discoveries will be the results of AI.
I think there's a big difference between discoveries through AI-human synergy and discoveries through AI working in isolation.
It probably will be true soon (if it isn't already) that most innovation features some degree of AI input, but still with a human to steer the AI in the right direction.
I think an AI being able to discover something genuinely new all by itself, without any human steering, is a lot further off.
If AIs start producing significant quantities of genuine and useful innovation with minimal human input, maybe the singularitarians are about to be proven right.
The prediction is that AI will be able to invent the future. If we give it data from our past without knowledge of the present... what type of future will it invent, what progress will it make, if any at all? And not just having the idea, but how to implement the idea in a way that actually works with the technology of the day, and can build on those things over time.
For example, would AI with 1850 data have figured out the idea of lift to make an airplane and taught us how to make working flying machines and progress them to the jets we have today, or something better? It wouldn't even be starting from 0, so this would be a generous example, as da Vinci was playing with these ideas in the 15th century.
If it can't do it, or what it produces is worse than what humans have done, we shouldn't leave it to AI alone to invent our actual future. Which would mean reevaluating the role these "thought leaders" say it will play, and how we're educating and communicating about AI to the younger generations.