Bookmarks (705)

  • screenshot

    Second Order Retreat - June 13th to 16th — LessWrong

    Published on June 1, 2025 2:29 PM GMT. Hi all, I’m helping organize a small economics...

  • screenshot

    Is Escalation Inevitable? — LessWrong

    Published on May 31, 2025 10:10 PM GMT. In competitive systems, whether geopolitical, economic, technological, or memetic,...

  • screenshot

    Policy Entropy, Learning, and Alignment (Or Maybe Your LLM Needs Therapy) — LessWrong

    Published on May 31, 2025 10:09 PM GMT. Epistemic Status: Exploratory. I'm new to AI alignment research...

  • screenshot

    An Opinionated Guide to P-Values — LessWrong

    Published on June 1, 2025 11:48 AM GMT. This is a crosspost of a post from my...

  • screenshot

    Legal Personhood for Models: Novelli et. al & Mocanu — LessWrong

    Published on June 1, 2025 8:18 AM GMT. In a previous article I detailed FSU Law professor...

  • screenshot

    The Unseen Hand: AI's Problem Preemption and the True Future of Labor — LessWrong

    Published on May 31, 2025 10:04 PM GMT. I study Economics and Data Science at the University...

  • screenshot

    The 80/20 playbook for mitigating AI scheming in 2025 — LessWrong

    Published on May 31, 2025 9:17 PM GMT. Adapted from this twitter thread. See this as a...

  • screenshot

    The best approaches for mitigating "the intelligence curse" (or gradual disempowerment); my quick guesses at the best object-level interventions — LessWrong

    Published on May 31, 2025 6:20 PM GMT. There have recently been various proposals for mitigations to "the intelligence curse"...

  • screenshot

    How Epistemic Collapse Looks from Inside — LessWrong

    Published on May 31, 2025 4:30 PM GMT. There’s a story: I'm at a conference and...

  • screenshot

    When will AI automate all mental work, and how fast? — LessWrong

    Published on May 31, 2025 4:18 PM GMT. Rational Animations takes a look at Tom Davidson's Takeoff...

  • screenshot

    50 Ideas for Life I Repeatedly Share — LessWrong

    Published on May 30, 2025 4:57 PM GMT. These are the most significant pieces of life advice/wisdom...

  • screenshot

    Virtues related to honesty — LessWrong

    Published on May 30, 2025 2:11 PM GMT. Status: musings. I wanted to write up a more...

  • screenshot

    AI 2027 - Rogue Replication Timeline — LessWrong

    Published on May 30, 2025 1:46 PM GMT. I envision a future more chaotic than portrayed in...

  • screenshot

    Letting Kids Be Kids — LessWrong

    Published on May 30, 2025 10:50 AM GMT. Letting kids be kids seems more and more important...

  • screenshot

    The Geometry of LLM Logits (an analytical outer bound) — LessWrong

    Published on May 30, 2025 1:21 AM GMT. The Geometry of LLM Logits (an analytical outer bound)...

  • screenshot

    Experimental CFAR Mini-Workshop @ Arbor Summer Camp — LessWrong

    Published on May 30, 2025 12:23 AM GMT. From June 2-6, the Center for Applied Rationality will...

  • screenshot

    CFAR is running an experimental mini-workshop (June 2-6, Berkeley CA)! — LessWrong

    Published on May 29, 2025 10:02 PM GMT. Hello from the Center for Applied Rationality! Some of you...

  • screenshot

    Orphaned Policies (Post 5 of 6 on AI Governance) — LessWrong

    Published on May 29, 2025 9:42 PM GMT. In previous posts in this sequence, I laid out...

  • screenshot

    Gradual Disempowerment: Concrete Research Projects — LessWrong

    Published on May 29, 2025 6:55 PM GMT. This post benefitted greatly from comments, suggestions, and ongoing...

  • screenshot

    Do you even have a system prompt? (PSA / repo) — LessWrong

    Published on May 29, 2025 6:49 PM GMT. Everyone around me has a notable lack of system...

  • screenshot

    Fun With Veo 3 and Media Generation — LessWrong

    Published on May 28, 2025 6:30 PM GMT. Since Claude 4 Opus things have been refreshingly quiet....

  • screenshot

    What LLMs lack — LessWrong

    Published on May 28, 2025 4:19 PM GMT. Introduction: I have long been very interested in the limitations...

  • screenshot

    Playlist Inspired by Manifest 2024 — LessWrong

    Published on May 28, 2025 4:03 PM GMT. Okay, I think it's time to stop polishing this...

  • screenshot

    AISN #56: Google Releases Veo 3 — LessWrong

    Published on May 28, 2025 4:00 PM GMT. Welcome to the AI Safety Newsletter by the Center...

  • screenshot

    How Self-Aware Are LLMs? — LessWrong

    Published on May 28, 2025 12:57 PM GMT. An interim research report. Summary: We introduce a novel methodology for...

  • screenshot

    Can We Hack Hedonic Treadmills? — LessWrong

    Published on May 28, 2025 11:42 AM GMT. During a visit to a Hong Kong children’s welfare...

  • screenshot

    Provability Inclusion as a Short Analogy — LessWrong

    Published on May 28, 2025 10:50 AM GMT. The following analogy is intended to illustrate a novel...

  • screenshot

    AI’s goals may not match ours — LessWrong

    Published on May 28, 2025 9:30 AM GMT. Context: This is a linkpost for https://aisafety.info/questions/NM3I/6:-AI%E2%80%99s-goals-may-not-match-ours This is an...

  • screenshot

    AI may pursue goals — LessWrong

    Published on May 28, 2025 9:30 AM GMT. Context: This is a linkpost for https://aisafety.info/questions/NM3J/5:-AI-may-pursue-goals This is an...

  • screenshot

    The Best Way to Align an LLM: Inner Alignment is Now a Solved Problem? — LessWrong

    Published on May 28, 2025 6:21 AM GMT. This is a link-post for a new paper I...

  • screenshot

    Poetic Methods II: Rhyme as a Focusing Device — LessWrong

    Published on May 26, 2025 6:29 PM GMT. As promised in the previous instalment on meter, let’s...

  • screenshot

    Is Building Good Note-Taking Software an AGI-Complete Problem? — LessWrong

    Published on May 26, 2025 6:26 PM GMT. In my experience, the most annoyingly unpleasant part of...

  • screenshot

    Does the Universal Geometry of Embeddings paper have big implications for interpretability? — LessWrong

    Published on May 26, 2025 6:20 PM GMT. Rishi Jha, Collin Zhang, Vitaly Shmatikov and John X....

  • screenshot

    Socratic Persuasion: Giving Opinionated Yet Truth-Seeking Advice — LessWrong

    Published on May 26, 2025 5:38 PM GMT. The full post is long, but you can 80/20...

  • screenshot

    An observation on self-play — LessWrong

    Published on May 26, 2025 5:22 PM GMT. At NeurIPS 2024, Ilya Sutskever delivered a short keynote...

  • screenshot

    [Beneath Psychology] Case study on chronic pain: First insights, and the remaining challenge — LessWrong

    Published on May 26, 2025 5:29 PM GMT. In the last post I took the seemingly-naive stance...

  • screenshot

    Asking for AI Safety Career Advice — LessWrong

    Published on May 26, 2025 3:26 PM GMT. Hi! I'm a rising junior in undergrad, working on...

  • screenshot

    New website analyzing AI companies' model evals — LessWrong

    Published on May 26, 2025 4:00 PM GMT. I'm making a website on AI companies' model evals...

  • screenshot

    New scorecard evaluating AI companies on safety — LessWrong

    Published on May 26, 2025 4:00 PM GMT. The new scorecard is on my website, AI Lab Watch....

  • screenshot

    Nerve Blisters: A Stoic Response — LessWrong

    Published on May 26, 2025 3:07 PM GMT. The chickenpox virus waited for decades, attacking the moment...

  • screenshot

    Consider buying voting shares — LessWrong

    Published on May 25, 2025 6:01 PM GMT. One of the best and easiest ways to influence...

  • screenshot

    Can you donate to AI advocacy? — LessWrong

    Published on May 25, 2025 5:54 PM GMT. I posted a quick take that advocacy may be...

  • screenshot

    Rant: the extreme wastefulness of high rent prices — LessWrong

    Published on May 25, 2025 5:04 PM GMT. 09:46: Everyone wants to be close to everyone else...

  • screenshot

    Claude 4 You: Safety and Alignment — LessWrong

    Published on May 25, 2025 2:00 PM GMT. Unlike everyone else, Anthropic actually Does (Some of) the...

  • screenshot

    Alignment Proposal: Adversarially Robust Augmentation and Distillation — LessWrong

    Published on May 25, 2025 12:58 PM GMT. Epistemic Status: Over years of reading alignment plans and...

  • screenshot

    Meditations on Doge — LessWrong

    Published on May 25, 2025 12:00 PM GMT. Lessons from shutting down institutions in Eastern Europe. This is...

  • screenshot

    Lie Detectors. Technical solutions to the cooperation problem. — LessWrong

    Published on May 24, 2025 8:05 PM GMT. The purpose of this post is to argue for...

  • screenshot

    Case Studies in Simulators and Agents — LessWrong

    Published on May 25, 2025 5:40 AM GMT. Simulators was posted two and a half years ago...

  • screenshot

    On safety of being a moral patient of ASI — LessWrong

    Published on May 24, 2025 9:24 PM GMT. I have noticed that there are talks around about...

  • screenshot

    We Need a Baseline for LLM-Aided Experiments — LessWrong

    Published on May 24, 2025 8:52 PM GMT. There has recently been a back-and-forth over Claude 4...