~www_lesswrong_com | Bookmarks (706)

Will Jesus Christ return in an election year? — LessWrong

lesswrong.com

Published on March 24, 2025 4:50 PM GMT. Thanks to Jesse Richardson for discussion. Polymarket asks: will Jesus Christ return in 2025? In the three days since the market opened, traders have wagered over $100,000 on this question. The market traded as high as 5%, and is now stably trading at 3%. Right now, if you wanted to, you could place a bet that Jesus Christ will...
1
Sentinel's Global Risks Weekly Roundup #12/2025: Famine in Gaza, H7N9 outbreak, US geopolitical leadership weakening. — LessWrong

lesswrong.com

Published on March 24, 2025 4:46 PM GMT. Executive summary: Forecasters believe there’s an 18% chance (range: 4%-50%) that there will be a famine in any part of Gaza by the end of 2025, according to the UN and its Integrated Food Security Phase Classification (IPC). A Category 5 rating would result in a positive resolution, with the last IPC update suggesting that all of Gaza...
1
Delicious Boy Slop - Boring Diet, Effortless Weightloss — LessWrong

lesswrong.com

Published on March 24, 2025 3:01 PM GMT. Your beloved 34 year old author is never hungry. I often joke I’m the only traditional rationalist left. The original pitch was that you could radically improve your life by being more strategic. Huge piles of expected value were available. Everyone else seems to have given up, but I’m still a believer. For example in 2017 Scott Alexander...
1
More on Various AI Action Plans — LessWrong

lesswrong.com

Published on March 24, 2025 1:10 PM GMT. Last week I covered Anthropic’s relatively strong submission, and OpenAI’s toxic submission. This week I cover several other submissions, and do some follow-up on OpenAI’s entry. Google Also Has Suggestions: The most prominent remaining lab is Google. Google focuses on AI’s upside. The vibes aren’t great, but they’re not toxic. The key asks for their ‘pro-innovation’ approach...
1
Emergent scaling effects on the functional hierarchies within LLMs — LessWrong

lesswrong.com

Published on March 24, 2025 1:03 PM GMT. I have been poking around with LLMs, and I found some results that seem broadly interesting. Summary. Introduction: Large language models (LLM) are usually structured as repeated transformer layers of the same size. However, this architecture is often described as functionally hierarchical with earlier layers focusing on small patches of text while later layers parse document-wide information. I revisited...
1
Recommender Alignment for Lock-In Risk — LessWrong

lesswrong.com

Published on March 24, 2025 12:56 PM GMT. Epistemic status: my own research and reasoning about lock-in risk threat models, and how recommender systems connect to the threat model outlined. I'm fairly confident in the claims about the contribution of recommender systems to filter bubbles, less so on extreme and persuasive content selection effects. TL;DR: We believe lock-in risks are a pressing problem, and that algorithmic technologies...
1
What's the word for the amount of expertise that I, an experienced therapy patient and generally educated person, have on psychology topics? — LessWrong

lesswrong.com

Published on March 23, 2025 5:38 PM GMT. Epistemic status: raising a question that I've found difficult. This topic has frustrated me some, and I think there are a variety of forces pointing in different directions. Maximally conservative approach: "If you're not focused, I mean I can share what works for me but really there's a variety of mental illnesses that can cause lack of focus. I don't...
1
Probability Theory Fundamentals 102: Source of the Sample Space — LessWrong

lesswrong.com

Published on March 23, 2025 5:23 PM GMT. The usual explanation of probability theory goes like this: There is this thing called Probability Space, which consists of three other things: Sample Space - some non-empty set; Event Space - a set of subsets of the Sample Space; Probability Function - a measure function over the elements of the Event Space. And then several examples of how we can merge this...
1
How to mitigate sandbagging — LessWrong

lesswrong.com

Published on March 23, 2025 5:19 PM GMT. Epistemic status: I have worked on sandbagging for ~1 year. I expect to be wrong in multiple ways, but I do think this post provides both a useful high-level model and a good place to discuss how to mitigate sandbagging. Better conceptual approaches probably exist, e.g., selecting different main factors.[1] TL;DR: Fine-tuning access, data quality, and scorability are...
1
Solving willpower seems easier than solving aging — LessWrong

lesswrong.com

Published on March 23, 2025 3:25 PM GMT. I'm awake about 17 hours a day. Of those I'm being productive maybe 10 hours a day. My working definition of productive is in the direction of: "things that I expect I will be glad I did once I've done them"[1]. Things that I personally find productive include: Chores, Work, Eating, Cooking, Reading a good book, Watching TV with my Wife/Kids, Playing with the kids, Socialising with...
1
Privateers Reborn: Cyber Letters of Marque — LessWrong

lesswrong.com

Published on March 23, 2025 3:39 AM GMT. For too long the United States has suffered from state-sponsored or state-enabled cybercriminals, while preventing our security professionals from fighting back. The US should revitalize privateering for the digital age, and there is constitutional support for the practice. In this more academic paper, I dive into the history of letters of marque and how we can...
1
Tied Crosscoders: Explaining Chat Behavior from Base Model — LessWrong

lesswrong.com

Published on March 22, 2025 6:07 PM GMT. Abstract: We are interested in model-diffing: finding what is new in the chat model when compared to the base model. One way of doing this is training a crosscoder, which would just mean training an SAE on the concatenation of the activations in a given layer of the base and chat model. When training this crosscoder, we find...
2
Reframing AI Safety as a Neverending Institutional Challenge — LessWrong

lesswrong.com

Published on March 23, 2025 12:13 AM GMT. Crossposted from https://stephencasper.com/reframing-ai-safety-as-a-neverending-institutional-challenge/ Stephen Casper. “They are wrong who think that politics is like an ocean voyage or a military campaign, something to be done with some particular end in view, something which leaves off as soon as that end is reached. It is not a public chore, to be got over with. It is a way of life.” –...
1
The Dangerous Illusion of AI Deterrence: Why MAIM Isn’t Rational — LessWrong

lesswrong.com

Published on March 22, 2025 10:55 PM GMT. Executive Summary: Mutual Assured AI Malfunction (MAIM), a strategic deterrence framework proposed to prevent nations from developing Artificial Superintelligence (ASI), is fundamentally unstable and dangerously unrealistic. Unlike Cold War-era MAD, MAIM involves multiple competing actors, increasing risks of unintended escalation, misinterpretation, and catastrophic conflict. Furthermore, ASI itself, uncontainable by design, would undermine any structured deterrent equilibrium. Thus, pursuing MAIM to...
1
Transhumanism and AI: Toward Prosperity or Extinction? — LessWrong

lesswrong.com

Published on March 22, 2025 6:16 PM GMT. This article explores the multiple transhumanist views on AI: a promise of emancipation for some, an existential threat for others. Between enthusiasm, caution, and controversy, it sheds light on those who think about the future. Transhumanists: Blind Tech Enthusiasts? November 30, 2022, marked a turning point. On that day, OpenAI unveiled ChatGPT. Since then, artificial intelligence has received unprecedented...
2
[Replication] Crosscoder-based Stage-Wise Model Diffing — LessWrong

lesswrong.com

Published on March 22, 2025 6:35 PM GMT. Introduction: Anthropic recently released Stage-Wise Model Diffing, which presents a novel way of tracking how transformer features change during fine-tuning. We've replicated this work on a TinyStories-33M language model to study feature changes in a more accessible research context. Instead of SAEs we worked with single-model all-layer crosscoders, and found that the technique is also effective with cross-layer features. This...
1
How I force LLMs to generate correct code — LessWrong

lesswrong.com

Published on March 21, 2025 2:40 PM GMT. In my daily work as a software consultant I'm often dealing with large pre-existing code bases. I use GitHub Copilot a lot. It's now basically indispensable, but I use it mostly for generating boilerplate code, or figuring out how to use a third-party library. As the code gets more logically nested though, Copilot crumbles under the weight of complexity....
1
Prospects for Alignment Automation: Interpretability Case Study — LessWrong

lesswrong.com

Published on March 21, 2025 2:05 PM GMT. For human-level AI (HLAI) we will need robust control or alignment methods. Assuming short timelines to HLAI, the tractability of automating safety research becomes central. In this post, I will make the case that safety-relevant progress on automated interpretability R&D is likely; however, naive interpretability automation may only be usable on the subset of safety problems having...
1
Epoch AI released a GATE Scenario Explorer — LessWrong

lesswrong.com

Published on March 21, 2025 1:57 PM GMT. I think it's easier to discuss AI progress in terms of economic growth rather than just focusing on the scale of the largest training runs and compute used. From their X announcement: We developed GATE: a model that shows how AI scaling and automation will impact growth. It predicts trillion-dollar infrastructure investments, 30% annual growth, and full automation in...
1
They Took MY Job? — LessWrong

lesswrong.com

Published on March 21, 2025 1:30 PM GMT. No, they didn’t. Not so fast, and not quite my job. But OpenAI is trying. Consider this a marker to look back upon in the future, as a reflection. A New AI Wrote a Story: Before proceeding, if you haven’t yet, it’s probably worth reading the story itself. I’m going to repost the whole thing, since it...
1
Silly Time — LessWrong

lesswrong.com

Published on March 21, 2025 12:30 PM GMT. A few months ago I was trying to figure out how to make bedtime go better with Nora (3y). She would go very slowly through the process, primarily by being silly. She'd run away playfully when it was time to brush her teeth, or close her mouth and hum, or lie on the ground and wiggle....
1
Towards a scale-free theory of intelligent agency — LessWrong

lesswrong.com

Published on March 21, 2025 1:39 AM GMT. I recently left OpenAI to pursue independent research. I’m working on a number of different research directions, but they’re unified by the core idea of a scale-free theory of intelligent agency. In this post I give a rough sketch of how I’m thinking about that. I’m erring on the side of sharing half-formed ideas, so there may...
1
Intention to Treat — LessWrong

lesswrong.com

Published on March 20, 2025 8:01 PM GMT. When my son was three, we enrolled him in a study of a vision condition that runs in my family. They wanted us to put an eyepatch on him for part of each day, with a little sensor object that went under the patch and detected body heat to record when we were doing it. They paid...
1
Anthropic: Progress from our Frontier Red Team — LessWrong

lesswrong.com

Published on March 20, 2025 7:12 PM GMT. Note: This is an automated crosspost from Anthropic. The bot selects content from many AI safety-relevant sources. Not affiliated with the authors or their organization and not affiliated with LW. In this post, we are sharing what we have learned about the trajectory of potential national security risks from frontier AI models, along with some of our thoughts...
1
