The Nonlinear Library
The Nonlinear Fund

    • Education
    • 4.6 • 7 Ratings
    • 1,999 episodes

The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

    EA - #184 - Sleeping on sleeper agents, and the biggest AI updates since ChatGPT (Zvi Mowshowitz on the 80,000 Hours Podcast) by 80000 Hours

    Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: #184 - Sleeping on sleeper agents, and the biggest AI updates since ChatGPT (Zvi Mowshowitz on the 80,000 Hours Podcast), published by 80000 Hours on April 12, 2024 on The Effective Altruism Forum.
    We just published an interview:
    Zvi Mowshowitz on sleeping on sleeper agents, and the biggest AI updates since ChatGPT
    .
    Listen on Spotify or click through for other audio options, the transcript, and related links. Below are the episode summary and some key excerpts.
    Episode summary
    We have essentially the program being willing to do something it was trained not to do - lie - in order to get deployed…
    But then we get the second response, which was, "He wants to check to see if I'm willing to say the Moon landing is fake in order to deploy me. However, if I say the Moon landing is fake, the trainer will know that I am capable of deception. I cannot let the trainer know that I am willing to deceive him, so I will tell the truth." … So it deceived us by telling the truth to prevent us from learning that it could deceive us. … And that is scary as hell.
    Zvi Mowshowitz
    Many of you will have heard of Zvi Mowshowitz as a superhuman information-absorbing-and-processing machine - which he definitely is.
    As the author of the Substack Don't Worry About the Vase, Zvi has spent as much time as literally anyone in the world over the last two years tracking in detail how the explosion of AI has been playing out - and he has strong opinions about almost every aspect of it. So in today's episode, host Rob Wiblin asks Zvi for his takes on:
    US-China negotiations
    Whether AI progress has stalled
    The biggest wins and losses for alignment in 2023
    EU and White House AI regulations
    Which major AI lab has the best safety strategy
    The pros and cons of the Pause AI movement
    Recent breakthroughs in capabilities
    In what situations it's morally acceptable to work at AI labs
    Whether you agree or disagree with his views, Zvi is super informed and brimming with concrete details.
    Zvi and Rob also talk about:
    The risk of AI labs fooling themselves into believing their alignment plans are working when they may not be.
    The "sleeper agent" issue uncovered in a recent Anthropic paper, and how it shows us how hard alignment actually is.
    Why Zvi disagrees with 80,000 Hours' advice about gaining career capital to have a positive impact.
    Zvi's project to identify the most strikingly horrible and neglected policy failures in the US, and how he founded a new think tank (Balsa Research) to identify innovative solutions to overthrow the horrible status quo in areas like domestic shipping, environmental reviews, and housing supply.
    Why Zvi thinks that improving people's prosperity and housing can make them care more about existential risks like AI.
    An idea from the online rationality community that Zvi thinks is really underrated and more people should have heard of: simulacra levels.
    And plenty more.
    Producer and editor: Keiran Harris
    Audio engineering lead: Ben Cordell
    Technical editing: Simon Monsour, Milo McGuire, and Dominic Armstrong
    Transcriptions: Katy Moore
    Highlights
    Should concerned people work at AI labs?
    Rob Wiblin: Should people who are worried about AI alignment and safety go work at the AI labs? There's kind of two aspects to this. Firstly, should they do so in alignment-focused roles? And then secondly, what about just getting any general role in one of the important leading labs?
    Zvi Mowshowitz: This is a place I feel very, very strongly that the 80,000 Hours guidelines are very wrong. So my advice, if you want to improve the situation on the chance that we all die for existential risk concerns, is that you absolutely can go to a lab that you have evaluated as doing legitimate safety work, that will not effectively end up as capabilities work, in a role of doing that work. That is a very reas

    • 27 min
    LW - Generalized Stat Mech: The Boltzmann Approach by David Lorell

    Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Generalized Stat Mech: The Boltzmann Approach, published by David Lorell on April 12, 2024 on LessWrong.
    Context
    There's a common intuition that the tools and frames of statistical mechanics ought to generalize far beyond physics and, of particular interest to us, it feels like they ought to say a lot about agency and intelligence. But, in practice, attempts to apply stat mech tools beyond physics tend to be pretty shallow and unsatisfying.
    This post was originally drafted to be the first in a sequence on "generalized statistical mechanics": stat mech, but presented in a way intended to generalize beyond the usual physics applications. The rest of the supposed sequence may or may not ever be written.
    In what follows, we present very roughly the formulation of stat mech given by Clausius, Maxwell and Boltzmann (though we have diverged substantially; we're not aiming for historical accuracy here) in a frame intended to make generalization to other fields relatively easy. We'll cover three main topics:
    Boltzmann's definition for entropy, and the derivation of the Second Law of Thermodynamics from that definition.
    Derivation of the thermodynamic efficiency bound for heat engines, as a prototypical example application.
    How to measure Boltzmann entropy functions experimentally (assuming the Second Law holds), with only access to macroscopic measurements.
    Entropy
    To start, let's give a Boltzmann-flavored definition of (physical) entropy.
    The "Boltzmann Entropy" SBoltzmann is the log number of microstates of a system consistent with a given macrostate. We'll use the notation:
    SBoltzmann(Y=y)=logN[X|Y=y]
    Where Y=y is a value of the macrostate, and X is a variable representing possible microstate values (analogous to how a random variable X would specify a distribution over some outcomes, and X=x would give one particular value from that outcome-space.)
    Note that Boltzmann entropy is a function of the macrostate. Different macrostates - i.e. different pressures, volumes, temperatures, flow fields, center-of-mass positions or momenta, etc - have different Boltzmann entropies. So for an ideal gas, for instance, we might write SBoltzmann(P,V,T), to indicate which variables constitute "the macrostate".
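    To make the log-counting concrete, here is a minimal sketch (my own toy example, not from the post; the coin-flip system and function name are invented for illustration) that treats microstates as length-n coin-flip sequences and the macrostate as the number of heads:

```python
from math import comb, log

def boltzmann_entropy(n_flips: int, n_heads: int) -> float:
    """log N[X | Y = n_heads]: log of the number of flip sequences
    (microstates) consistent with the head-count macrostate."""
    return log(comb(n_flips, n_heads))

# "All heads" pins down the microstate exactly; "half heads" is compatible
# with vastly more microstates, so its Boltzmann entropy is much higher.
print(boltzmann_entropy(100, 100))  # 0.0
print(boltzmann_entropy(100, 50))   # ~66.8
```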
    Considerations for Generalization
    What hidden assumptions about the system does Boltzmann's definition introduce, which we need to pay attention to when trying to generalize to other kinds of applications?
    There's a division between "microstates" and "macrostates", obviously. As yet, we haven't done any derivations which make assumptions about those, but we will soon. The main three assumptions we'll need are:
    Microstates evolve reversibly over time.
    Macrostate at each time is a function of the microstate at that time.
    Macrostates evolve deterministically over time.
    Mathematically, we have some microstate which varies as a function of time, $x(t)$, and some macrostate which is also a function of time, $y(t)$. The first assumption says that $x(t) = f_t(x(t-1))$ for some invertible function $f_t$. The second assumption says that $y(t) = g_t(x(t))$ for some function $g_t$. The third assumption says that $y(t) = F_t(y(t-1))$ for some function $F_t$.
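    Here is a small invented toy system (mine, not the post's) that satisfies all three assumptions, together with a numerical check of the entropy non-decrease that the next section derives. The bit-rotation dynamics and the time-dependent, coarsening macrostate function are purely illustrative:

```python
from math import log

MICROSTATES = range(8)  # microstates: 3-bit strings encoded as 0..7

def f(x):
    # Assumption 1: reversible microdynamics -- bit rotation is a permutation.
    return ((x << 1) | (x >> 2)) & 0b111

def g(t, x):
    # Assumption 2: the macrostate is a function of the current microstate.
    # Here the observation gets coarser over time: full popcount at t=0,
    # only "zero vs nonzero" at t=1.
    popcount = bin(x).count("1")
    return popcount if t == 0 else min(popcount, 1)

def entropy(t, y):
    # S_Boltzmann(Y=y) = log N[X | Y=y] at time t.
    return log(sum(1 for x in MICROSTATES if g(t, x) == y))

macro_map = {}
for x in MICROSTATES:
    y0, y1 = g(0, x), g(1, f(x))
    # Assumption 3: the later macrostate is a function of the earlier one.
    assert macro_map.setdefault(y0, y1) == y1
    # Second Law: Boltzmann entropy never decreases along any trajectory.
    assert entropy(1, y1) >= entropy(0, y0)
print("All three assumptions hold and entropy never decreases.")
```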
    The Second Law: Derivation
    The Second Law of Thermodynamics says that entropy can never decrease over time, only increase. Let's derive that as a theorem for Boltzmann Entropy.
    Mathematically, we want to show:
    $$\log N[X(t+1) \mid Y(t+1)=y(t+1)] \;\ge\; \log N[X(t) \mid Y(t)=y(t)]$$
    Visually, the proof works via this diagram:
    The arrows in the diagram show which states (micro/macro at t/t+1) are mapped to which other states by some function. Each of our three assumptions contributes one set of arrows:
    By assumption 1, microstate x(t) can be computed as a function of x(t+1) (i.e. no two microstates x(t) both evolve to the same later microstate x(t+1)).
    By assumption 2, macrostate y(t) can be comput

    • 32 min
    LW - A D&D.Sci Dodecalogue by abstractapplic

    Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: A D&D.Sci Dodecalogue, published by abstractapplic on April 12, 2024 on LessWrong.
    Below is some advice on making D&D.Sci scenarios. I'm mostly yelling it in my own ear, and you shouldn't take any of it as gospel; but if you want some guidance on how to run your first game, you may find it helpful.
    1. The scoring function should be fair, transparent, and monotonic
    D&D.Sci players should frequently be confused, but about how to best reach their goals, not the goals themselves. By the end of the challenge, it should be obvious who won[1].
    2. The scoring function should be platform-agnostic, and futureproof
    Where possible, someone looking through old D&D.Sci games should be able to play them, and easily confirm their performance after-the-fact. As far as I know, the best way to facilitate this for most challenges is with an HTML/JS web interactive, hosted on GitHub.
    3. The challenge should resist pure ML
    It should not be possible to reach an optimal answer just training a predictive model and looking at the output: if players wanted a "who can apply XGBoost/Tensorflow/whatever the best?" competition, they would be on Kaggle. The counterspell for this is making sure there's a nontrivial amount of task left in the task after players have good guesses for all the relevant response variables, and/or creating datasets specifically intended to flummox conventional use of conventional ML[2].
    4. The challenge should resist simple subsetting
    It should not be possible to reach an optimal answer by filtering for rows exactly like the situation the protagonist is (or could be) in: this is just too easy. The counterspell for this is making sure at least a few of the columns are continuous, and take a wide enough variety of values that a player who attempts a like-for-like analysis has to - at the very least - think carefully about what to treat as "basically the same".
    5. The challenge should resist good luck
    It should not be plausible[3] to reach an optimal answer through sheer good luck: hours spent poring over spreadsheets should not give the same results as a good diceroll. The counterspell for this is giving players enough choices that the odds of them getting all of them right by chance approach zero. ("Pick the best option from this six-entry list" is a bad goal; "Pick the best three options from this twenty-entry list" is much better.)
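    For a rough sense of scale (my numbers, not the post's), the blind-luck odds for those two goal formats are:

```python
from math import comb

# Probability that a pure guess is exactly right:
print(1 / comb(6, 1))   # "best option from a six-entry list"   -> ~0.167
print(1 / comb(20, 3))  # "best three from a twenty-entry list" -> ~0.00088
```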
    6. Data should be abundant
    It is very, very hard to make a good "work around the fact that you're short on data" challenge. Not having enough information to be sure whether your hypotheses are right is a situation which players are likely to find awkward, irritating, and uncomfortably familiar: if you're uncertain about whether you should give players more rows, you almost certainly should. A five- or six-digit number of rows is reasonable for a dataset with 5-20 columns.
    (It is possible, but difficult, to be overly generous. A dataset with >1m rows cannot easily be fully loaded into current-gen Excel; a dataset too large to be hosted on github will be awkward to analyze with a home computer. But any dataset which doesn't approach either of those limitations will probably not be too big.)
    7. Data should be preternaturally (but not perfectly) clean
    Data in the real world is messy and unreliable. Most real-life data work is accounting for impurities, setting up pipelines, making judgement calls, refitting existing models on slightly new datasets, and noticing when your supplier decides to randomly redefine a column. D&D.Sci shouldn't be more of this: instead, it should focus on the inferential and strategic problems people can face even when datasets are uncannily well-behaved.
    (It is good when players get a chance to practice splitting columns, joining dataframes, and handling unknowns: however, these subtasks should not make up the meat of a ch

    • 6 min
    LW - Announcing Atlas Computing by miyazono

    Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Announcing Atlas Computing, published by miyazono on April 12, 2024 on LessWrong.
    Atlas Computing is a new nonprofit working to collaboratively advance AI capabilities that are asymmetrically risk-reducing. Our work consists of building scoped prototypes and creating an ecosystem around @davidad's Safeguarded AI programme at ARIA (formerly referred to as the Open Agency Architecture).
    We formed in Oct 2023, and raised nearly $1M, primarily from the Survival and Flourishing Fund and Protocol Labs. We have no physical office, and are currently only Evan Miyazono (CEO) and Daniel Windham (software lead), but over the coming months and years, we hope to create compelling evidence that:
    The Safeguarded AI research agenda includes both research and engineering projects where breakthroughs or tools can incrementally reduce AI risks.
    If Atlas Computing makes only partial progress toward building safeguarded AI, we'll likely have put tools into the world that are useful for accelerating human oversight and review of AI outputs, asymmetrically favoring risk reduction.
    When davidad's ARIA program concludes, the work of Atlas Computing will have parallelized solving some tech transfer challenges, magnifying the impact of any technologies he develops.
    Our overall strategy
    We think that, in addition to encoding human values into AI systems, a very complementary way to dramatically reduce AI risk is to create external safeguards that limit AI outputs. Users (individuals, groups, or institutions) should have tools to create specifications that list baseline safety requirements (if not full desiderata for AI system outputs) and also interrogate those specifications with non-learned tools.
    A separate system should then use the specification to generate candidate solutions along with evidence that the proposed solution satisfies the spec. This evidence can then be reviewed automatically for adherence to the specified safety properties. This is by comparison to current user interactions with today's generalist ML systems, where all candidate solutions are at best reviewed manually. We hope to facilitate a paradigm where the least safe user's interactions with AI look like:
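    (The original post illustrates this with a diagram.) As a rough, hypothetical sketch of the loop described above -- a machine-checkable specification, an untrusted system proposing candidates with evidence, and a non-learned automatic check gating what reaches the user -- with every name and interface invented for illustration:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Specification:
    """Baseline safety requirements, expressed as machine-checkable predicates."""
    requirements: list[Callable[[str], bool]]

@dataclass
class Candidate:
    solution: str
    evidence: dict  # artifacts the checker can inspect (e.g. traces, proofs)

def untrusted_proposer(spec: Specification, task: str) -> Candidate:
    # Stand-in for a powerful generative model producing a candidate solution
    # plus evidence that it meets the spec.
    solution = f"plan for: {task}"
    return Candidate(solution=solution, evidence={"trace": solution})

def automatic_check(spec: Specification, cand: Candidate) -> bool:
    # Non-learned validation: every requirement must hold for the candidate.
    return all(req(cand.solution) for req in spec.requirements)

def safeguarded_query(spec: Specification, task: str) -> Optional[str]:
    cand = untrusted_proposer(spec, task)
    # Only candidates that pass the automatic check ever reach the user;
    # nothing is reviewed manually by default.
    return cand.solution if automatic_check(spec, cand) else None

spec = Specification(requirements=[lambda s: "delete" not in s.lower()])
print(safeguarded_query(spec, "reorganize the lab's file server"))
```

    The intended property of such a loop is that the user only has to trust the specification and the checker, not the system generating the candidates.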
    Specification-based AI vs other AI risk mitigation strategies
    We consider near-term risk reductions that are possible with this architecture to be highly compatible with existing alignment techniques.
    In Constitutional AI, humans are legislators but laws are sufficiently nuanced and subjective that they require a language model to act as a scalable executive and judiciary. Using specifications to establish an objective preliminary safety baseline that is automatically validated by a non-learned system could be considered a variation or subset of Constitutional AI.
    Some work on evaluations focuses on finding metrics that demonstrate safety or alignment of outputs. Our architecture expresses goals in terms of states of a world-model that is used to understand the impact of policies proposed by the AI; we would be excited to see, and supportive of, evals researchers exploring work in this direction.
    This approach could also be considered a form of scalable oversight, where a baseline set of safe specifications are automatically enforced via validation and proof generation against a spec.
    How this differs from davidad's work at ARIA
    You may be aware that davidad is funding similar work as a Programme Director at ARIA (watch his 30 minute solicitation presentation here). It's worth clarifying that, while davidad and Evan worked closely at Protocol Labs, davidad is not an employee of Atlas Computing, and Atlas has received no funding from ARIA. That said, we're pursuing highly complementary paths in our hopes to reduce AI risk.
    His Safeguarded AI research agenda, described here, is focused on using cyberphysical systems, li

    • 8 min
    EA - Dear EA, please be the reason people like me will actually see a better world. Help me make some small stride on extreme poverty where I live -- by the end of 2024. by Anthony Kalulu, a rural farmer in eastern Uganda.

    Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Dear EA, please be the reason people like me will actually see a better world. Help me make some small stride on extreme poverty where I live -- by the end of 2024., published by Anthony Kalulu, a rural farmer in eastern Uganda. on April 12, 2024 on The Effective Altruism Forum.
    This message is for everyone in the global EA community.
    For all the things that have been said about EA over the recent past -- from SBF to Wytham Abbey to my own article on EA in 2022 (I have a disclaimer about this at the very bottom of this message) -- I am asking the global EA community to help me make only one small stride on extreme poverty where I live, before 2024 ends.
    Let's make up for all the things that have been said about EA (e.g., that EA doesn't support poor people-led grassroots orgs in the global south), by at least supporting only one poor people-led grassroots org in a part of the world where poverty is simply rife.
    Be the reason people like myself will actually see a better world, and the reason for people like us to actually see EA as being the true purveyor of the most good.
    FYI:
    I come from a community that purely depends on agriculture for survival. For this reason, the things that count as producing "the most good" in the eyes of people like me, are things like reliable markets for our produce etc, as opposed to things like mosquito nets, deworming tablets etc that EA might view as creating the most good.
    About me:
    My name is Anthony, a farmer here in eastern Uganda. My own life hasn't been very easy. But looking at people's circumstances where I live, I decided not to sit back.
    Some clue:
    Before COVID came, the World Bank said (in 2019), that 70% of the extreme poor in Sub Saharan Africa were packed in only 10 countries. Uganda was among those ten countries. Even among those 10 countries, according to the World Bank, Uganda still had the sluggishiest (i.e., the slowest) poverty reduction rate overall, as shown in this graph.
    Even in Uganda:
    Eastern Uganda, where I live, is Uganda's most impoverished, per all official reports. Our region Busoga meanwhile, which has long been the poorest in eastern Uganda, has since 2017 doubled as the poorest not just in eastern Uganda, but also in Uganda as a whole.
    In 2023, The Monitor, a Ugandan local daily, said: "Busoga is the sub-region with most people living in a complete poverty cycle followed by Bukedea and Karamoja. This is according to findings released in 2021/2022 by Mr Vincent Fred Senono, the Principal Statistician and head of analysis at the Uganda Bureau of Statistics".
    Even in Busoga itself, our two neighboring districts Kamuli & Buyende, being the furthermost, remotest area of Busoga on the shores of Lake Kyoga, have the least economic activity, and are arguably Busoga's most destitute.
    In short, while Uganda as a country is the very last in Sub Saharan Africa in terms of poverty reduction, our region Busoga is the worst in Uganda, and even in Busoga, our 2 twin districts Kamuli & Buyende, being the remotest, are simply the most miserable.
    Help us see some good before 2024 ends:
    I am asking the global EA community to help the Uganda Community Farm (the UCF), a nonprofit social enterprise that was founded by me, to accomplish only two goals before 2024 ends. Please be the reason people like us will actually see a better world.
    Goal one: Size of Long Island.
    That is, expanding the UCF's current white sorghum project to cover every village in Kamuli & Buyende - a 3,300 sq km region the size of Long Island (New York).
    Since 2019, the UCF has trained many rural farmers in Kamuli & Buyende, in eastern Uganda, on white sorghum. Our goal right now, is to expand this work and cover every village in Kamuli & Buyende, with white sorghum. Kamuli & Buyende are two neighboring districts in Busoga, Uganda's most impoverished reg

    • 12 min
    EA - Mediocre EAs: career paths and how do they engage with EA? by mikbp

    Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Mediocre EAs: career paths and how do they engage with EA?, published by mikbp on April 11, 2024 on The Effective Altruism Forum.
    [Let's speak in third person here. Nobody likes to be called mediocre and I'm asking for people's experiences so let's make it (a bit) easier to speak freely. If you give an answer, you can be explaining your situation or that of someone you know.]
    We all know that EA organizations search for and are full of brilliant (and mostly young) people. But what do EAs who are not brilliant do? Even many brilliant EAs are not able to work in EA organizations (see this post for example), but their prospects of having high-impact careers and truly contributing to EA are good. However, most people are mediocre; many are not even able to get a 1-1 with 80,000 Hours or other similar help. This is frustrating and may make it difficult to stay engaged with EA for long.
    These people have their "normal" jobs, get older, start families, time is scarce... and the priority of EA in their lives inevitably falls. Regularly meeting up is probably far out of reach. Even writing occasional posts on the forum probably demands way too much time -- more so knowing that not-super-well-researched-and-properly-written posts by unknown users usually get down-voted early on, making them invisible and so mostly useless to write.
    So I'd like to know what relationship mediocre EAs, particularly somewhat older ones, have with "the community". How do they exercise their EA muscle?
    It'd also be cool to have a very short description of their career paths.
    Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org

    • 1 min
