headalgorithm · on June 2, 2023

Some recent and related discussions:

https://news.ycombinator.com/item?id=36119920

https://news.ycombinator.com/item?id=36056488

https://news.ycombinator.com/item?id=35273707

https://news.ycombinator.com/item?id=35264965

https://news.ycombinator.com/item?id=35242458

Solvency · on June 3, 2023

For real, genuinely curious why this little tile keeps being posted here.

throw_pm23 · on June 3, 2023

A rare case of cutting-edge math research that is understandable to the general public, at least in its results.

gliptic · on June 3, 2023

Only the first link refers to this; the others are about the previous "hat" tile.

headalgorithm · on June 2, 2023

Recent discussion: https://news.ycombinator.com/item?id=35985240

headalgorithm · on May 29, 2023

Link to paper: https://arxiv.org/abs/2305.16075

headalgorithm · on May 29, 2023

I love all this stuff, but I have to agree there is so much GPT chatter that it drowns out everything else. But it will pass, like everything before it.

headalgorithm · on May 29, 2023

Abstract:

A major risk of using language models in practical applications is their tendency to hallucinate incorrect statements. Hallucinations are often attributed to knowledge gaps in LMs, but we hypothesize that in some cases, when justifying previously generated hallucinations, LMs output false claims that they can separately recognize as incorrect. We construct three question-answering datasets where ChatGPT and GPT-4 often state an incorrect answer and offer an explanation with at least one incorrect claim. Crucially, we find that ChatGPT and GPT-4 can identify 67% and 87% of their own mistakes, respectively. We refer to this phenomenon as hallucination snowballing: an LM over-commits to early mistakes, leading to more mistakes that it otherwise would not make.

headalgorithm · on May 29, 2023

@Christopher: I think your domain chrbutler.com is shadowbanned. I vouched for this post, but you have a lot of posts that are marked dead.

headalgorithm · on May 20, 2023

Recent discussion: https://news.ycombinator.com/item?id=36006561

headalgorithm · on April 19, 2023

Abstract:

Artificial agents have traditionally been trained to maximize reward, which may incentivize power-seeking and deception, analogous to how next-token prediction in language models (LMs) may incentivize toxicity. So do agents naturally learn to be Machiavellian? And how do we measure these behaviors in general-purpose models such as GPT-4? Towards answering these questions, we introduce MACHIAVELLI, a benchmark of 134 Choose-Your-Own-Adventure games containing over half a million rich, diverse scenarios that center on social decision-making. Scenario labeling is automated with LMs, which are more performant than human annotators. We mathematize dozens of harmful behaviors and use our annotations to evaluate agents' tendencies to be power-seeking, cause disutility, and commit ethical violations. We observe some tension between maximizing reward and behaving ethically. To improve this trade-off, we investigate LM-based methods to steer agents towards less harmful behaviors. Our results show that agents can both act competently and morally, so concrete progress can currently be made in machine ethics--designing agents that are Pareto improvements in both safety and capabilities.

sharemywin · on April 19, 2023

It paid you to say that, didn't it?

headalgorithm · on April 18, 2023

https://archive.is/8Kx4v

headalgorithm · on April 16, 2023

Previous discussion: https://news.ycombinator.com/item?id=15897809