397 episodes

The Nonlinear Library allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs. To find out more, please visit us at nonlinear.org

The Nonlinear Library
The Nonlinear Fund


    LW - Misc. questions about EfficientZero by Daniel Kokotajlo

    Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Misc. questions about EfficientZero, published by Daniel Kokotajlo on December 4, 2021 on LessWrong.
    Perhaps these can be thought of as homework questions -- when I imagine us successfully making AI go well, I imagine us building expertise such that we can answer these questions quickly and easily. Before I read the answers I'm going to think for 10 minutes or so about each one and post my own guesses.
    Useful links / background reading: The glorious EfficientZero: How it Works. Related comment. EfficientZero GitHub. LW discussion.
    Some of these questions are about EfficientZero, the net trained recently; others are about EfficientZero the architecture, imagined to be suitably scaled up to AGI levels. "If we made a much bigger and longer-trained version of this (with suitable training environment) such that it was superhuman AGI..."
    EfficientZero vs. reward hacking and inner alignment failure:
    Barring inner alignment failure, it’ll eventually reward hack, right? That is, if it gets sufficiently knowledgeable and capable, it’ll realize that it can get loads of reward by hacking its reward channel, and then its core algorithm would evaluate that action/plan highly and do it. Right?
    But fortunately (?) maybe there would be an inner alignment failure and the part of it that predicts reward would predict low reward from that action, even in the limit of knowledge and capability? Because it’s learned to predict proxies for reward rather than reward itself, and continued to do so even as it got smarter and more capable? (Would this happen? Why would this not be corrected by further training? Has some sort of proxy crystallization set in? How?)
    EfficientZero approximates evidential decision theory, right?
    EfficientZero is a consequentialist (in the sense defined here) architecture, right? It’s not, for example, updateless or deontological. For example, it has no deontological constraints except by accident (i.e. if its predictor-net mistakenly predicted super low reward for actions of type X, always, even in cases where actually a reasonable intelligent predictor would predict high reward.) Right?
    What is the most complex environment AIs in the family of MuZero, EfficientZero, etc. have been trained on? Is it just some Atari game?
    Roughly how many parameters does EfficientZero have? If you don’t know, what about MuZero? What about the biggest net to date from that general family? The EfficientZero paper doesn't give a direct answer but it describes the architecture in enough detail that you might be able to calculate it...
    If we kept scaling up EfficientZero by OOMs in every way, what would happen? Would it eventually get to agenty AGI? / APS-AI? After all, it seems pretty sample-efficient already. What if its sample was an entire lifetime of diverse experiences?
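    On the parameter-count question above: the paper describes a MuZero-style stack of small residual convolutional networks, so a back-of-the-envelope count is possible. The sketch below uses invented sizes (width, block counts, input planes, head shapes), not numbers from the paper; the point is only that a net of this shape is tiny by large-language-model standards.

    ```python
    # Back-of-the-envelope parameter count for a MuZero/EfficientZero-shaped net.
    # Every size below is an invented placeholder, not a value from the paper.

    def conv_params(c_in, c_out, k=3):
        """Weights plus biases of a single k x k convolution."""
        return c_in * c_out * k * k + c_out

    def res_block(channels):
        """A basic residual block: two 3x3 convolutions at constant width."""
        return 2 * conv_params(channels, channels)

    channels = 64                        # hypothetical hidden width
    rep_blocks, dyn_blocks = 2, 2        # hypothetical residual block counts
    stacked_input_planes = 12            # hypothetical stacked-frame input depth
    latent_hw, head_width = 6 * 6, 256   # hypothetical latent size and head width

    representation = conv_params(stacked_input_planes, channels) + rep_blocks * res_block(channels)
    dynamics = conv_params(channels + 1, channels) + dyn_blocks * res_block(channels)  # +1 action plane
    heads = 3 * (conv_params(channels, 16, k=1) + 16 * latent_hw * head_width + head_width)

    total = representation + dynamics + heads
    print(f"rough total: {total / 1e6:.2f}M parameters")
    ```

    With these placeholder sizes the count lands around a million parameters, many orders of magnitude below today's largest language models.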
    Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

    • 2 min
    AF - Agents as P₂B Chain Reactions by Daniel Kokotajlo

    Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Agents as P₂B Chain Reactions, published by Daniel Kokotajlo on December 4, 2021 on The AI Alignment Forum.
    tl;dr: Sometimes planners successfully P2B, kicking off a self-sustaining chain reaction / feedback loop of better and better plans (made possible by better and better world-models, more and more resources, etc.). Whereas fire takes a concentration of heat (a spark) as input and produces a greater concentration of heat as output, agents take a concentration of convergent instrumental resources (e.g. data, money, power) as input and produce a greater concentration as output.
    Previously we described the Convergent-Instrumental-Goal-to-rule-them-all, P2B: Plan to P2B Better. The most common way to P2B is to plan to plan again, in the immediate future, but with some new relevant piece of data. For example, suppose I am walking somewhere. I look up from my phone to glance at the door. Why? Because without data about the location of the doorknob, I can’t guide my hands to open the door. This sort of thing doesn’t just happen on short timescales, though; humans (and all realistic agents, I’d wager) are hierarchical planners and medium and long-term plans usually involve acquiring new relevant data as well.
    This image just talks about collecting good new data, but there are other important convergent instrumental goals too, of course. Such as staying alive, propagating your goals to other agents, and acquiring money. One could modify the diagram to include these other feedback loops as well — more data and money and power being recruited by sensor-learner-planners to acquire more data and money and power. It’s not that all of these metrics will go up with every cycle around the loop; it’s that some goals/resources are “instrumentally convergent” in that they tend to show up frequently in most real-world P2B loops.
    Agents, I say, are P2B chain reactions / P2B feedback loops. They are what happens when planners successfully P2B. Whereas fire is a chain reaction that inputs heat + fuel + oxygen and outputs more heat (and often more oxygen and fuel too, as it expands to envelop more such) agents are chain reactions that input sensor-learner-planners with some amount of data, knowledge, power, money, etc. and output more, better sensor-learner-planners with greater amounts of data, knowledge, power, money, etc.
    (I don’t think this definition fully captures our intuitive concept of agency. Rather, P2B chain reactions seem like a big deal, an important concept worth talking about, and close enough to our intuitive concept of agency that I’m going to appropriate the term until someone pushes back.)
    Already you might be able to guess why I think agents are powerful:
    Consider how fire is a much bigger deal than baking-soda-vinegar. General/robust feedback loops are way way better/more important than narrow/brittle ones, and sensor-learner-planner makes for a P2B loop that is pretty damn general.
    Ok, so we are talking about a kind of chain reaction that takes concentrations of knowledge, power, money, etc. and makes them bigger? That sure sounds like it’ll be relevant to discussions of who’s going to win important competitions over market share, political power, and control of the future!
    The next post in this sequence will address the following questions, and more:
    OK, so does any feedback loop of accumulating convergent instrumental resources count as an agent then? Presumably not; presumably it has to result from some sort of planning to count as a P2B loop. Earlier you said planning is a family of algorithms. Say more about where the borders of this concept are, please!
    The post after that will answer the Big Questions:
    Why is agency powerful? Why should we expect agent AGIs to outcompete human+tool combos for control of the future? Etc.
    Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

    • 3 min
    AF - Agency: What it is and why it matters by Daniel Kokotajlo

    Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Agency: What it is and why it matters, published by Daniel Kokotajlo on December 4, 2021 on The AI Alignment Forum.
    This sequence explains my take on agency. I’m responding to claims that the standard arguments for AI risk have a gap, a missing answer to the question “why should we expect there to be agenty AIs optimizing for stuff? Especially the sort of unbounded optimization that instrumentally converges to pursuit of money and power.”
    This sequence is a pontoon bridge thrown across that gap.
    I’m also responding to claims that there are coherent, plausible possible futures in which agent AGI (perhaps better described as APS-AI) isn’t useful/powerful/incentivized, thanks to various tools that can do the various tasks better and cheaper. I think those futures are incoherent, or at least very implausible. Agency is powerful. For example, one conclusion I am arguing for is:
    When it becomes possible to make human-level AI agents, said agents will be able to outcompete various human-tool hybrids prevalent at the time in every important competition (e.g. for money, power, knowledge, SOTA performance, control of the future lightcone...)
    Another is:
    We should expect Agency as Byproduct, i.e. expect some plausible training processes to produce agenty AIs even when their designers weren't explicitly aiming for that outcome.
    I’ve had these ideas for about a year but never got around to turning them into rigorous research. Given my current priorities it looks like I might never do that, so instead I’m going to bang it out over a couple of weekends so it doesn’t distract from my main work. :/ I won't be offended if you don't bother to read it.
    Outline of this sequence:
    1. P₂B: Plan to P₂B Better - LessWrong
    2. Agents as P₂B chain reactions
    3. Gradations of agency
    4. Why agents are powerful, & other conclusions
    5. Objections, reflections, comparisons, generalizations
    Incomplete list of related literature and comments:
    Frequent arguments about alignment - LessWrong (A comment in which Richard Ngo summarizes a common pattern of conversation about the risk from agenty AI vs. other sorts of AI risk)
    Joe Carlsmith, drawing on writings from others, had 20% credence that AI agents won't be powerful enough relative to non-agents to be incentivised. I recommend reading the whole report, or at least the relevant sections on APS-AI and incentives to build it.
    Eric Drexler's CAIS report (as summarized by Rohin Shah) argues basically that it should be much more than 20%. Richard Ngo's thoughts here.
    Why You Shouldn't Be a Tool: The Power of Agency by Gwern. (OK, it seems to have a different title now, maybe it always did and I hallucinated this memory...) This essay, more than anything else, inspired my current views.
    The Ground of Optimization by Alex Flint argues: "there is a specific class of intelligent systems — which we call optimizing systems — that are worthy of special attention and study due to their potential to reshape the world. The set of optimizing systems is smaller than the set of all AI services, but larger than the set of goal-directed agentic systems."
    Yudkowsky and Ngo conversation (especially as summarized by Nate Soares) seems to be arguing for something similar to Alex -- I imagine Yudkowsky would say that by focusing on agency I'm missing the forest for the trees: there is a broader class of systems (optimizers? consequentialists? makers-of-plans-that-lase?) of which agents are a special case, and it's this broader class that has the interesting and powerful and scary properties. I think this is probably right but my brain is not yet galaxy enough to grok it; I'm going to defy EY's advice and keep thinking about the trees for now. I look forward to eventually stepping back and trying to see the forest.
    Thanks to various people, mostly at and

    • 4 min
    LW - Covid Prediction Markets at Polymarket by Zvi

    Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Covid Prediction Markets at Polymarket, published by Zvi on December 2, 2021 on LessWrong.
    We now have some useful prediction markets up on Covid issues, so it’s worth looking at what they say and thinking about what other markets we could generate. I encourage you to suggest additional markets in the comments, with as much detail as possible.
    As I wrote a while ago, if you want prediction markets to be successful, you need five elements:
    Well Defined. The resolution mechanisms must be clear and transparent.
    Quick Resolution. The longer it takes, the less people will be interested.
    Probable Resolution. There has to be a definitive outcome most of the time.
    Limited Hidden Information. If others know what I don’t, I’m the sucker.
    Sources of Disagreement and Interest. Suckers at the table.
    To these, of course, we can add an implied sixth, which is
    Real Money. Money talks, b******t walks. Are we doing this or not?
    Thus, now that we have Polymarket posting real Covid-19 markets, we have the potential to learn things that matter, with a level of resolution we can use, and I’m glad I’ve been able to help them figure out what markets to offer.
    We need real money prediction markets if we want to rely on their information. I’m glad Metaculus exists and puts up some markets, and it’s good for getting some idea at all, but it is not remotely the same thing. Their main advantage is that, exactly because they don’t use real money, they can violate the five pillars and those who are inclined to participate at all will still participate, because there is no cost of capital, no need to invest tons of attention, and no need to worry about being the sucker when it’s for internet points.
    Whereas at Polymarket, a prime motivation of liquidity providers is to use the markets to generate information, providing motivation in turn for others to come participate.
    There are three new markets up, let’s check them out.
    Will Omicron be >1% of all USA cases by the end of the year?
    Here’s what it would look like betting $1000 or $10,000 on this respectively.
    It was my suggestion to use a relatively low threshold like 1% of cases, in order to allow the market to resolve faster. There is an interesting and important question of whether Omicron would then fully displace Delta, but if Omicron goes from 0% to 1% in a month, going from 1% to 50% is a lot less exponential growth than that and should happen that much faster. The game is very much over at that point.
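    To put rough numbers on that "should happen that much faster" claim, here is a minimal sketch of the doubling arithmetic; the starting share used for "roughly 0%" is an invented placeholder, not a figure from the post.

    ```python
    import math

    def doublings(share_from, share_to):
        """Doublings of the Omicron:Delta odds ratio needed between two case shares."""
        def odds(s):
            return s / (1 - s)
        return math.log2(odds(share_to) / odds(share_from))

    # Placeholder for "roughly 0%": one case in a million. Not a figure from the post.
    tiny = 1e-6

    print(f"~0% -> 1%:  {doublings(tiny, 0.01):.1f} doublings")   # about 13.3
    print(f"1% -> 50%:  {doublings(0.01, 0.50):.1f} doublings")   # about 6.6
    ```

    Under steady exponential growth of the Omicron-to-Delta ratio, the second stretch needs roughly half as many doublings as the first, which is the sense in which the game is very much over once 1% is reached.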
    If the chance of 1% happening in time gets very low or very high, or there’s sufficient volume, other markets with other percentages and/or longer time horizons (or other locations) can be created to continue asking the questions that matter and improving our estimates. For now, it seems better to focus on only one market of this type, to ensure better liquidity.
    It’s possible that some people are trading thinking that 1% isn’t that much, as opposed to it being the majority of the way to 50%, which is one of many possible sources of good trading.
    For those in other countries like the UK, this is still mostly the question you care about, since the growth rates in both places will be highly correlated.
    Currently the market is putting this result at about 67% to happen within the month, and you can bet four figures without too much slippage in price, and five figures if you’re confident enough to buy up to 77% to do so, which could easily be a good buy especially if you’re reacting quickly to new information. If you think that the market is off by a lot, I’d encourage you to come in and participate.
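    For readers wondering why bet size matters here, below is a minimal sketch of slippage in a constant-product market maker; this is a generic illustration with made-up pool sizes, not a description of Polymarket's actual contract.

    ```python
    def buy_yes(yes_pool, no_pool, usdc_in):
        """Buy YES in a toy constant-product (x*y=k) binary prediction market."""
        # Collateral mints equal YES and NO shares; the NO shares are swapped into
        # the pool for extra YES shares, keeping the product of reserves constant.
        k = yes_pool * no_pool
        new_no = no_pool + usdc_in
        new_yes = k / new_no
        shares_out = usdc_in + (yes_pool - new_yes)
        implied_yes_price = new_no / (new_yes + new_no)
        return shares_out, implied_yes_price

    # Made-up pool sizes giving an implied YES price of about 67%, to match the text.
    yes_pool, no_pool = 30_000, 60_000
    for bet in (1_000, 10_000):
        shares, price = buy_yes(yes_pool, no_pool, bet)
        print(f"${bet:>6}: avg price paid ~ {bet / shares:.2f}, market moves to ~ {price:.0%}")
    ```

    The bigger order pays a noticeably worse average price and moves the quoted probability several points, which is the slippage the post is pointing at.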
    You can also choose to provide liquidity to the market to encourage other participants. If there’s back and forth trading you make a profit, but you’ll be on the wrong end of any permanent market moves, including the final move to

    • 10 min
    LW - [Linkpost] A General Language Assistant as a Laboratory for Alignment by Quintin Pope

    Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: [Linkpost] A General Language Assistant as a Laboratory for Alignment, published by Quintin Pope on December 3, 2021 on LessWrong.
    This is a linkpost for a recent paper from Anthropic: A General Language Assistant as a Laboratory for Alignment.
    Abstract:
    Given the broad capabilities of large language models, it should be possible to work towards a general-purpose, text-based assistant that is aligned with human values, meaning that it is helpful, honest, and harmless. As an initial foray in this direction we study simple baseline techniques and evaluations, such as prompting. We find that the benefits from modest interventions increase with model size, generalize to a variety of alignment evaluations, and do not compromise the performance of large models. Next we investigate scaling trends for several training objectives relevant to alignment, comparing imitation learning, binary discrimination, and ranked preference modeling. We find that ranked preference modeling performs much better than imitation learning, and often scales more favorably with model size. In contrast, binary discrimination typically performs and scales very similarly to imitation learning.
    Finally we study a 'preference model pre-training' stage of training, with the goal of improving sample efficiency when finetuning on human preferences.
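    As a concrete, generic picture of the ranked preference modeling objective the abstract compares against imitation learning, here is a minimal Bradley-Terry-style pairwise loss sketch; this is not code from the paper, and the small scoring network is a stand-in for a preference model built on top of a language model.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Stand-in preference model: maps a fixed-size embedding to a scalar score.
    reward_model = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, 1))

    def ranked_preference_loss(emb_chosen, emb_rejected):
        """Pairwise ranking loss: push the preferred sample's score above the rejected one's."""
        margin = reward_model(emb_chosen) - reward_model(emb_rejected)
        return -F.logsigmoid(margin).mean()

    # Toy batch of precomputed embeddings standing in for (prompt, response) features.
    chosen, rejected = torch.randn(8, 768), torch.randn(8, 768)
    loss = ranked_preference_loss(chosen, rejected)
    loss.backward()
    print(float(loss))
    ```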
    I think this is great work and shows encouraging results. I take the fact that alignability scales with model size as evidence for the most convenient form of the natural abstraction hypothesis:
    Stronger AIs have better internal models of human values.
    A few well chosen gradient updates can “wire up” the human values model to the AI’s output.
    I’d be interested to see this sort of work extended to reinforcement learning systems, which I think are more dangerous than language models.
    Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

    • 1 min
    LW - Second-order selection against the immortal by Elmer of Malmesbury

    Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Second-order selection against the immortal, published by Elmer of Malmesbury on December 3, 2021 on LessWrong.
    Cross-posted from Telescopic Turnip.
    In his recent review of Lifespan, Scott Alexander writes:
    Algernon’s Law says there shouldn’t be easy gains in biology. Your body is the product of millions of years of evolution – it would be weird if some drug could make you stronger, faster, and smarter. Why didn’t the body just evolve to secrete that drug itself?
    He is talking about anti-aging research, and wondering why, if there is an easy way to stop aging, humans haven’t already evolved immortality spontaneously. There are many relevant things to say about this, but I think the evolutionary perspective is particularly interesting. Under some circumstances, it might be that immortality is inherently unstable.
    The Imperium and the Horde
    Suppose that it’s the future, and the FDA just approved a pill that makes you immortal. Of course people disagree about whether one should take the pill or not. As a result, humanity is now divided in two populations: the Immortal Imperium, who took the immortality pill, and the Horde of Death, who still experience the painful decay and death we all know and love.
    So, people from the Horde spend their time having plenty of children to populate the next generation, while people in the Immortal Imperium try to escape their existential ennui by reading speculative blog posts on the Internet. Who will prevail?
    Two orders of fitness
    There are two competing phenomena at play here. One is first-order selection, which is how many of your genes are passed on to the next generation, the more the better. For the Horde of Death, there is nothing mysterious: they reproduce, then they die, and an uncertain fraction of their genes gets passed on.
    What about immortal people? They don’t really pass anything to the next generation, because they don’t do the whole generation thing. On the other hand, all of their genes will still be around century after century, so for the genes involved, this is a 100% success rate. In this sense, people in the Immortal Imperium have a very high first-order fitness.
    The second process is second-order selection. This is selection on evolvability. This is about how easy it is for your lineage to improve its own first-order fitness in the future. If a lineage finds a way to evolve quicker, then it may eventually take over the whole population because it will be more likely to discover new beneficial variants, and the original mechanism that granted better evolvability will hitchhike with these new variants.
    If you want to see it happen with your own eyes, look at Richard Lenski’s long-term evolution experiment, where people have been growing the same E. coli lineages continuously since 1988. Among the mutants that took over the population after a few thousand generations, some were present since almost the very beginning. They are called EW, for Eventual Winners. Other mutants from the same period eventually disappeared, so they are called Eventual Losers (EL). Surprisingly, in the early days, the EL were able to grow faster than the EW. But in the long term, the EW did better. That is because the EW had mutations that made them more evolvable: they became more likely to acquire further beneficial mutations that ultimately made them grow faster than the EL. People in Lenski’s lab replayed the competition over and over, and most of the time the more evolvable strain ended up taking over.
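    As a toy illustration of the dynamic just described, the sketch below pits a slightly fitter lineage against a more evolvable one; every rate and effect size is invented for illustration, and beneficial mutations are assumed to sweep a lineage instantly.

    ```python
    import random

    # Toy second-order selection: lineage A starts fitter, lineage B acquires
    # beneficial mutations more often (is more "evolvable"). Numbers are invented.
    random.seed(0)
    pop = {"A": {"share": 0.5, "fitness": 1.02, "beneficial_mut_rate": 0.0005},
           "B": {"share": 0.5, "fitness": 1.00, "beneficial_mut_rate": 0.01}}

    for generation in range(2000):
        for lineage in pop.values():
            # A beneficial mutation gives a small multiplicative fitness gain.
            if random.random() < lineage["beneficial_mut_rate"]:
                lineage["fitness"] *= 1.03
        total = sum(l["share"] * l["fitness"] for l in pop.values())
        for lineage in pop.values():
            # Redistribute population shares in proportion to fitness.
            lineage["share"] = lineage["share"] * lineage["fitness"] / total

    print({name: round(l["share"], 3) for name, l in pop.items()})
    # With these settings, B typically ends up dominating despite its slow start.
    ```

    The early lead goes to A, but B's extra beneficial mutations compound, which is the hitchhiking-of-evolvability effect described above.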
    Second-order selection matters most for organisms that are not well-adapted to their environment. After all, if you are already at the top of the fitness landscape, there is no point improving your gradient-climbing abilities. Intuitively, it may look like humans are well-adapted to their environment, because

    • 8 min
