~www_lesswrong_com | Bookmarks (687)
-
Feedback request: what am I missing? — LessWrong
Published on November 2, 2024 5:38 PM GMT. In the past 3 and a bit years since...
-
Fragile, Robust, and Antifragile Preference Satisfaction — LessWrong
Published on November 2, 2024 5:25 PM GMT. What do I want to do? This sounds like a...
-
Is OpenAI net negative for AI Safety? — LessWrong
Published on November 2, 2024 4:18 PM GMT. I recently saw a post arguing that top AI...
-
Educational CAI: Aligning a Language Model with Pedagogical Theories — LessWrong
Published on November 1, 2024 6:55 PM GMT. Bharath Puranam (bharath225525@gmail.com). This research blog represents my final project...
-
Two arguments against longtermist thought experiments — LessWrong
Published on November 2, 2024 10:22 AM GMT. Epistemic status: shower thoughts. I am currently going through the...
-
Both-Sidesism—When Fair & Balanced Goes Wrong — LessWrong
Published on November 2, 2024 3:04 AM GMT. In a few days' time, voting will close for...
-
What can we learn from insecure domains? — LessWrong
Published on November 1, 2024 11:53 PM GMT. Cryptocurrency is terrible. With a single click of a...
-
Science advances one funeral at a time — LessWrong
Published on November 1, 2024 11:06 PM GMT. Major scientific institutions talk a big game about innovation,...
-
Set Theory Multiverse vs Mathematical Truth - Philosophical Discussion — LessWrong
Published on November 1, 2024 6:56 PM GMT. I've been thinking about the set theory multiverse and...
-
SAE Probing: What is it good for? Absolutely something! — LessWrong
Published on November 1, 2024 7:23 PM GMT. Subhash and Josh are co-first authors. Work done as...
-
'Meta', 'mesa', and mountains — LessWrong
Published on October 31, 2024 5:25 PM GMT. Recently, in a conversation with a coworker, I was...
-
Toward Safety Cases For AI Scheming — LessWrong
Published on October 31, 2024 5:20 PM GMT. Developers of frontier AI systems will face increasingly challenging...
-
AI #88: Thanks for the Memos — LessWrong
Published on October 31, 2024 3:00 PM GMT. Following up on the Biden Executive Order on AI,...
-
The Compendium, A full argument about extinction risk from AGI — LessWrong
Published on October 31, 2024 12:01 PM GMT. We (Connor Leahy, Gabriel Alfour, Chris Scammell, Andrea Miotti,...
-
Some Preliminary Notes on the Promise of a Wisdom Explosion — LessWrong
Published on October 31, 2024 9:21 AM GMT. This post is one-half of my third-prize-winning entry...
-
What TMS is like — LessWrong
Published on October 31, 2024 12:44 AM GMT. There are two nuclear options for treating depression: Ketamine...
-
AI Safety at the Frontier: Paper Highlights, October '24 — LessWrong
Published on October 31, 2024 12:09 AM GMT. This is a selection of AI safety paper highlights...
-
Standard SAEs Might Be Incoherent: A Choosing Problem & A “Concise” Solution — LessWrong
Published on October 30, 2024 10:50 PM GMT. This work was produced as part of the ML...
-
Generic advice caveats — LessWrong
Published on October 30, 2024 9:03 PM GMT. You were (probably) linked here from some advice. Unfortunately,...
-
I turned decision theory problems into memes about trolleys — LessWrong
Published on October 30, 2024 8:13 PM GMT. I hope it has some educational, memetic or at...
-
The Alignment Trap: AI Safety as Path to Power — LessWrong
Published on October 29, 2024 3:21 PM GMT. Recent discussions about artificial intelligence safety have focused heavily...
-
Housing Roundup #10 — LessWrong
Published on October 29, 2024 1:50 PM GMT. There’s more campaign talk about housing. The talk of...
-
[Intuitive self-models] 7. Hearing Voices, and Other Hallucinations — LessWrong
Published on October 29, 2024 1:36 PM GMT. 7.1 Post summary / Table of contents. This is the...
-
Review: “The Case Against Reality” — LessWrong
Published on October 29, 2024 1:13 PM GMT. This is not a red stop sign: For one thing,...