~www_lesswrong_com | Bookmarks (713)
-
Gradual Disempowerment: Concrete Research Projects — LessWrong
Published on May 29, 2025 6:55 PM GMT. This post benefitted greatly from comments, suggestions, and ongoing...
-
Do you even have a system prompt? (PSA / repo) — LessWrong
Published on May 29, 2025 6:49 PM GMT. Everyone around me has a notable lack of system...
-
Fun With Veo 3 and Media Generation — LessWrong
Published on May 28, 2025 6:30 PM GMT. Since Claude 4 Opus things have been refreshingly quiet....
-
What LLMs lack — LessWrong
Published on May 28, 2025 4:19 PM GMT. Introduction: I have long been very interested in the limitations...
-
Playlist Inspired by Manifest 2024 — LessWrong
Published on May 28, 2025 4:03 PM GMT. Okay, I think it's time to stop polishing this...
-
AISN #56: Google Releases Veo 3 — LessWrong
Published on May 28, 2025 4:00 PM GMT. Welcome to the AI Safety Newsletter by the Center...
-
How Self-Aware Are LLMs? — LessWrong
Published on May 28, 2025 12:57 PM GMT. An interim research report. Summary: We introduce a novel methodology for...
-
Can We Hack Hedonic Treadmills? — LessWrong
Published on May 28, 2025 11:42 AM GMT. During a visit to a Hong Kong children’s welfare...
-
Provability Inclusion as a Short Analogy — LessWrong
Published on May 28, 2025 10:50 AM GMT. The following analogy is intended to illustrate a novel...
-
AI’s goals may not match ours — LessWrong
Published on May 28, 2025 9:30 AM GMT. Context: This is a linkpost for https://aisafety.info/questions/NM3I/6:-AI%E2%80%99s-goals-may-not-match-ours This is an...
-
AI may pursue goals — LessWrong
Published on May 28, 2025 9:30 AM GMT. Context: This is a linkpost for https://aisafety.info/questions/NM3J/5:-AI-may-pursue-goals This is an...
-
The Best Way to Align an LLM: Inner Alignment is Now a Solved Problem? — LessWrong
Published on May 28, 2025 6:21 AM GMT. This is a link-post for a new paper I...
-
Poetic Methods II: Rhyme as a Focusing Device — LessWrong
Published on May 26, 2025 6:29 PM GMT. As promised in the previous instalment on meter, let’s...
-
Is Building Good Note-Taking Software an AGI-Complete Problem? — LessWrong
Published on May 26, 2025 6:26 PM GMT. In my experience, the most annoyingly unpleasant part of...
-
Does the Universal Geometry of Embeddings paper have big implications for interpretability? — LessWrong
Published on May 26, 2025 6:20 PM GMT. Rishi Jha, Collin Zhang, Vitaly Shmatikov and John X....
-
Socratic Persuasion: Giving Opinionated Yet Truth-Seeking Advice — LessWrong
Published on May 26, 2025 5:38 PM GMT. The full post is long, but you can 80/20...
-
An observation on self-play — LessWrong
Published on May 26, 2025 5:22 PM GMT. At NeurIPS 2024, Ilya Sutskever delivered a short keynote...
-
[Beneath Psychology] Case study on chronic pain: First insights, and the remaining challenge — LessWrong
Published on May 26, 2025 5:29 PM GMT. In the last post I took the seemingly-naive stance...
-
Asking for AI Safety Career Advice — LessWrong
Published on May 26, 2025 3:26 PM GMT. Hi! I'm a rising junior in undergrad, working on...
-
New website analyzing AI companies' model evals — LessWrong
Published on May 26, 2025 4:00 PM GMT. I'm making a website on AI companies' model evals...
-
New scorecard evaluating AI companies on safety — LessWrong
Published on May 26, 2025 4:00 PM GMT. The new scorecard is on my website, AI Lab Watch....
-
Nerve Blisters: A Stoic Response — LessWrong
Published on May 26, 2025 3:07 PM GMT. The chickenpox virus waited for decades, attacking the moment...
-
Consider buying voting shares — LessWrong
Published on May 25, 2025 6:01 PM GMT. One of the best and easiest ways to influence...
-
Can you donate to AI advocacy? — LessWrong
Published on May 25, 2025 5:54 PM GMT. I posted a quick take that advocacy may be...