~www_lesswrong_com | Bookmarks (692)
-
Why imperfect adversarial robustness doesn't doom AI control — LessWrong
Published on November 18, 2024 4:05 PM GMT (thanks to Alex Mallen, Cody Rushing, Zach Stein-Perlman, Hoagy...
-
Monthly Roundup #24: November 2024 — LessWrong
Published on November 18, 2024 1:20 PM GMT This is your monthly roundup. Let’s get right to...
-
A Straightforward Explanation of the Good Regulator Theorem — LessWrong
Published on November 18, 2024 12:45 PM GMT This post was written during the agent foundations fellowship...
-
The Choice Transition — LessWrong
Published on November 18, 2024 12:30 PM GMT On the emergence of history's reins. One general law, leading...
-
Proposal to increase fertility: University parent clubs — LessWrong
Published on November 18, 2024 4:21 AM GMT Fertility rates in the developed world are too low...
-
Small improvement to Wikipedia page on Pareto Efficiency — LessWrong
Published on November 18, 2024 2:13 AM GMT Note: I would have done this as a Quick...
-
What (if anything) made your p(doom) go down in 2024? — LessWrong
Published on November 16, 2024 4:46 PM GMT
-
Gwerns — LessWrong
Published on November 16, 2024 2:31 PM GMT At every turn there was a Gwern, Within this...
-
Which evals resources would be good? — LessWrong
Published on November 16, 2024 2:24 PM GMT I want to make a serious effort to create...
-
OpenAI Email Archives (from Musk v. Altman) — LessWrong
Published on November 16, 2024 6:38 AM GMT As part of the court case between Elon Musk...
-
Using Dangerous AI, But Safely? — LessWrong
Published on November 16, 2024 4:29 AM GMT Rob Miles has released a new video, this time...
-
Ayn Rand’s model of “living money”; and an upside of burnout — LessWrong
Published on November 16, 2024 2:59 AM GMT Epistemic status: Toy model. Oversimplified, but has been anecdotally...
-
Fundamental Uncertainty: Epilogue — LessWrong
Published on November 16, 2024 12:57 AM GMT I wrote a whole book! What's next? I'm currently doing...
-
Making a conservative case for alignment — LessWrong
Published on November 15, 2024 6:55 PM GMT Trump and the Republican party will yield broad governmental...
-
Win/continue/lose scenarios and execute/replace/audit protocols — LessWrong
Published on November 15, 2024 3:47 PM GMT In this post, I’ll make a technical point that...
-
Proposing the Conditional AI Safety Treaty (linkpost TIME) — LessWrong
Published on November 15, 2024 1:59 PM GMT Technological progress can excite us, politics can infuriate us,...
-
Seven lessons I didn't learn from election day — LessWrong
Published on November 14, 2024 6:39 PM GMT I spent most of my election day -- 3pm...
-
Effects of Non-Uniform Sparsity on Superposition in Toy Models — LessWrong
Published on November 14, 2024 4:59 PM GMT Abstract: This post summarises my findings on the effects of...
-
The Early Christian Strategy — LessWrong
Published on November 14, 2024 5:02 PM GMT Scott Alexander's latest today discusses Robert Axelrod's Prisoner’s Dilemma...
-
'Estimat - Values and Data’s For Starters'- A Necessary Proposal? — LessWrong
Published on November 14, 2024 2:37 PM GMT 1. PROBLEM: In today’s digital era, teenagers face a dual...
-
AI #90: The Wall — LessWrong
Published on November 14, 2024 2:10 PM GMT As the Trump transition continues and we try to...
-
Evolutionary prompt optimization for SAE feature visualization — LessWrong
Published on November 14, 2024 1:06 PM GMT TLDR: Fluent dreaming for language models is an algorithm based on...
-
AXRP Episode 38.0 - Zhijing Jin on LLMs, Causality, and Multi-Agent Systems — LessWrong
Published on November 14, 2024 7:00 AM GMT YouTube link. Do language models understand the causal structure...
-
FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI — LessWrong
Published on November 14, 2024 6:13 AM GMT FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in...