~www_lesswrong_com | Bookmarks (687)
-
Takes on "Alignment Faking in Large Language Models" — LessWrong
Published on December 18, 2024 6:22 PM GMT (Cross-posted from my website. Audio version here, or search...
-
A Matter of Taste — LessWrong
Published on December 18, 2024 5:50 PM GMT In light of other recent discussions, Scott Alexander recently...
-
Alignment Faking in Large Language Models — LessWrong
Published on December 18, 2024 5:19 PM GMT What happens when you tell Claude it is being...
-
What conclusions can be drawn from a single observation about wealth in tennis? — LessWrong
Published on December 18, 2024 9:55 AM GMT I was recently watching a tennis exhibition match between...
-
Can o1-preview find major mistakes amongst 59 NeurIPS '24 MLSB papers? — LessWrong
Published on December 18, 2024 2:21 PM GMT TLDR: o1 flags major errors in 3 papers. Upon...
-
Walking Sue — LessWrong
Published on December 18, 2024 1:19 PM GMT An Essay[1] PART I: Conjecture on The Development of Proto-Communication...
-
How should I optimize my decision making model for 'ideas'? — LessWrong
Published on December 18, 2024 4:09 AM GMT I’m an ideas man, an ideas man I am....
-
Preppers Are Too Negative on Objects — LessWrong
Published on December 18, 2024 2:30 AM GMT Don't just buy some gear, throw it in...
-
Review: Breaking Free with Dr. Stone — LessWrong
Published on December 18, 2024 1:26 AM GMT Doctor Stone is an anime where everyone suddenly turns...
-
Careless thinking: A theory of bad thinking — LessWrong
Published on December 17, 2024 6:23 PM GMT Have you ever noticed how differently we approach buying...
-
The Second Gemini — LessWrong
Published on December 17, 2024 3:50 PM GMT Table of Contents Trust the Chef. Do Not Trust...
-
Everything you care about is in the map — LessWrong
Published on December 17, 2024 2:05 PM GMT Many people who find value in the Sequences do...
-
Reality is Fractal-Shaped — LessWrong
Published on December 17, 2024 1:52 PM GMT Tl;dr: Much of common sense and personal insights are...
-
Trying to translate when people talk past each other — LessWrong
Published on December 17, 2024 9:40 AM GMT Sometimes two people are talking past each other, and...
-
What is "wireheading"? — LessWrong
Published on December 17, 2024 7:49 AM GMT This is an article in the featured articles series...
-
3 What If We Could Map Our Motivation as Channels of Flow? — LessWrong
Published on December 17, 2024 7:47 AM GMT From Moment of Upper Motivation to Inward-Outward Motivation Flow In...
-
2 What if Life Comes with a Natural Calibration to Estimate you? — LessWrong
Published on December 17, 2024 7:47 AM GMT 2 What if Life Comes with a Natural Calibration...
-
1 What If We Rebuild Motivation with the Fermi ESTIMATion? — LessWrong
Published on December 17, 2024 7:46 AM GMT From a General Vision of a Method to a...
-
Where do you put your ideas? — LessWrong
Published on December 17, 2024 7:26 AM GMT I am currently looking for a system which will...
-
Effective Altruism, Neglectedness and Public Choice Theory — LessWrong
Published on December 15, 2024 4:58 PM GMT At the heart of Effective Altruism is a commitment...
-
Remap your caps lock key — LessWrong
Published on December 15, 2024 2:03 PM GMT When was the last time you (intentionally) used your...
-
Effective Evil's AI Misalignment Plan — LessWrong
Published on December 15, 2024 7:39 AM GMT Doctor Susan Connor loved working for Effective Evil. Her...
-
Write Good Enough Code, Quickly — LessWrong
Published on December 15, 2024 4:45 AM GMT At the start of my Ph.D. 6 months ago,...
-
How to Edit an Essay into a Solstice Speech? — LessWrong
Published on December 15, 2024 4:30 AM GMT By request, my notes on how I adapted I...