www.lesswrong.com | Bookmarks (693)
-
Is AI Alignment Enough? — LessWrong
Published on January 10, 2025 6:57 PM GMT. Virtually everyone I see in the AI safety community...
-
Recommendations for Technical AI Safety Research Directions — LessWrong
Published on January 10, 2025 7:34 PM GMT. Anthropic’s Alignment Science team conducts technical research aimed at...
-
What are some scenarios where an aligned AGI actually helps humanity, but many/most people don't like it? — LessWrong
Published on January 10, 2025 6:13 PM GMT. One can call it "deceptive misalignment": the aligned AGI...
-
Human takeover might be worse than AI takeover — LessWrong
Published on January 10, 2025 4:53 PM GMT. Epistemic status -- sharing rough notes on an important...
-
The Alignment Mapping Program: Forging Independent Thinkers in AI Safety - A Pilot Retrospective — LessWrong
Published on January 10, 2025 4:22 PM GMT. The Alignment Mapping Program: Forging Independent Thinkers in AI...
-
Discursive Warfare and Faction Formation — LessWrong
Published on January 9, 2025 4:47 PM GMT. Response to Discursive Games, Discursive Warfare. The discursive distortions you...
-
Can we rescue Effective Altruism? — LessWrong
Published on January 9, 2025 4:40 PM GMT. Last year Timothy Telleen-Lawton and I recorded a podcast...
-
AI #98: World Ends With Six Word Story — LessWrong
Published on January 9, 2025 4:30 PM GMT. The world is kind of on fire. The world...
-
Many Worlds and the Problems of Evil — LessWrong
Published on January 9, 2025 4:10 PM GMT. Summary: The Many-Worlds interpretation of quantum mechanics helps us...
-
PIBBSS Fellowship 2025: Bounties and Cooperative AI Track Announcement — LessWrong
Published on January 9, 2025 2:23 PM GMT. We're excited to announce that the PIBBSS Fellowship 2025 now...
-
Thoughts on the In-Context Scheming AI Experiment — LessWrong
Published on January 9, 2025 2:19 AM GMT. These are thoughts in response to the paper "Frontier...
-
A Systematic Approach to AI Risk Analysis Through Cognitive Capabilities — LessWrong
Published on January 9, 2025 12:18 AM GMT. A Systematic Approach to AI Risk Analysis Through Cognitive...
-
Aristocracy and Hostage Capital — LessWrong
Published on January 8, 2025 7:38 PM GMT. There’s a conventional narrative by which the pre-20th century...
-
What is the most impressive game LLMs can play well? — LessWrong
Published on January 8, 2025 7:38 PM GMT. Epistemic status: This is an off-the-cuff question. ~5 years ago...
-
Ann Altman has filed a lawsuit in US federal court alleging that she was sexually abused by Sam Altman — LessWrong
Published on January 8, 2025 2:59 PM GMT. On January 6, 2025, Ann Altman filed a lawsuit...
-
Rebuttals for ~all criticisms of AIXI — LessWrong
Published on January 7, 2025 5:41 PM GMT. Written as part of the AIXI agent foundations sequence,...
-
OpenAI #10: Reflections — LessWrong
Published on January 7, 2025 5:00 PM GMT. This week, Altman offers a post called Reflections, and...
-
Other implications of radical empathy — LessWrong
Published on January 7, 2025 4:10 PM GMT.
-
Actualism, asymmetry and extinction — LessWrong
Published on January 7, 2025 4:02 PM GMT.
-
Meditation insights as phase shifts in your self-model — LessWrong
Published on January 7, 2025 10:09 AM GMT. Introduction: In his exploration of "Intuitive self-models" and PNSE (Persistent...
-
D&D.Sci Dungeonbuilding: the Dungeon Tournament Evaluation & Ruleset — LessWrong
Published on January 7, 2025 5:02 AM GMT. This is a follow-up to last week's D&D.Sci scenario:...
-
Incredibow — LessWrong
Published on January 7, 2025 3:30 AM GMT. Back in 2011 I got sick of breaking...
-
Building Big Science from the Bottom-Up: A Fractal Approach to AI Safety — LessWrong
Published on January 7, 2025 3:08 AM GMT. Epistemic Status: This post is an attempt to condense...
-
You should delay engineering-heavy research in light of R&D automation — LessWrong
Published on January 7, 2025 2:11 AM GMT. tl;dr: LLMs rapidly improving at software engineering and math...