4 Comments
Leo Abstract:

This is an excellent survey of the subject; thank you for writing this. It is good to see that someone (@deepfates, in one of your links) has made a point about persuasion not stopping at human levels of intelligence. A concern I have not seen addressed is that there already exists a decisive strategic advantage in this space, possessed by non-agent 'intelligences'. Influence is largely about attention, and while our human perspective assumes that the information reaching us does so in a way that would be legible to our hunter-gatherer ancestors, nothing like this is happening anymore.

(An aside [and let's not quibble about evo psych if this doesn't land for you]: the next suggested YouTube or TikTok video feels to the human brain like an obvious analog of the ancient practice of telling stories around a fire. One story ends, someone else in the tribe begins another. Everyone knows the stories, and they reflect shared group values and lived histories. Or again, a tweet hits the human brain the way the utterance of some nearby human would: this person is important to my group and is speaking to me.)

Instead, layers of obscure machinery determine what we see -- perhaps not everything every time, but for the great mass of the tech-connected species these layers determine enough of it, enough of the time. What's more, not only can we not decide to turn these layers off (they're too profitable, and decision-makers can always rationalize keeping them on), there aren't even mechanisms in place that would allow us to do so if we did decide (a corporation does not contain anyone whose job it is to put the brakes on things that are hugely profitable). Legislation might, but that only becomes possible after a problem is severe enough to warrant attention, and this particular problem actively changes how much attention it gets, incommensurate with its severity.

Take as an example the "giving up" you describe from Elon Musk. He has spoken to decision-makers in the past and found them unsympathetic. He also has 80 million followers on Twitter. Like all humans, he appears to respond to incentives: when he makes provocative culture-war statements he gets attention; when he makes statements about AI he doesn't. This could be due to hidden Twitter machinery, or because only ironic joke versions of AI risk are accessible to the popular consciousness (due to earlier social programming?), or any one of a thousand other reasons. It doesn't matter exactly why, and this is only one symptom of a greater problem.

Imagine some hypothetical post-near-human-extinction-and-post-butlerian-jihad historian gaining access to records from this era. He might say "in retrospect it seems obvious that machine control of humanity was functionally total more than two decades before the Crisis, but remained invisible even during it."

Rina:

Thank you for the lovely comment!

Your description of these online recommendation algorithms reminds me of Moloch (https://slatestarcodex.com/2014/07/30/meditations-on-moloch/). Although these algorithms have great power over us, empowered by quirks of our evolution, they are merely a fragment of Moloch's true power. Yes, coordination problems are alignment problems, but the technical AI alignment problem is different in character, resembling childrearing more than it does averting war.

I am also reminded of this very recent interview of Connor Leahy, co-founder and CEO of Conjecture, a new London-based company doing alignment research (https://theinsideview.ai/connor2#the-horror-movie-heuristic), particularly this moment:

'A good heuristic I like to use to think about when we should be maybe taking something seriously or not is imagine you were the protagonist in a sci-fi horror movie, when would the audience be screaming at you? And I’m pretty sure with like GPT-2, GPT-3, Dalle and Imagen, this is a horror movie, right?'

The audience would definitely be screaming right now, and I don't think humanity is taking the issue seriously enough. But I'm hopeful that this can and will change, that it is changing, and that we have a good shot at solving the technical problem and getting out of this alive and well if AI timelines aren't too short. Existing systems cannot, I think, be said to have anything like a decisive strategic advantage, or even a major one. We have more than enough control to actually try to solve this problem; all of this really still is in our hands.

Leo Abstract:

Perfect links for the discussion -- I had missed Meditations on Moloch previously. I agree that this describes our situation, and that the theater would be screaming right about now. What I meant about an advantage "in this space" was to point to a specific instance of what Scott says in the Moloch article: "The limit of multipolar traps as technology approaches infinity is 'very bad'".

It used to be that a newspaper could move public perception by making broad appeals that change some small changeable percentage of the population. No trick works on everyone every time, and a great sales tactic or marketing campaign only increases sales numbers enough to be statistically significant. The rise of real-time click-tracking on websites created clickbait (we'll leave for another discussion whether people now think in clickbait, or whether they ever previously thought in jingoistic slogans; i.e. was "remember the Maine!" truly internal, purely performative, or somewhere in between). The rise of personalized preference-based media allows this segment of Moloch to try various things on each individual consumer and find out what works. If I think cat and dog videos are self-indulgent but find videos of farm animals cute and appealing, TikTok already has the ability to show me infinite cute pigs and goats, and to sandwich a clip it wants me to begin to love between two clips it knows I'll love (the classic broadcast example of this effect is what radio stations did with the 'song' Hey Ya! by Outkast).
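(For the technically inclined, here is a minimal, purely hypothetical sketch of that "sandwich" idea, written as a toy epsilon-greedy bandit. Nothing here is TikTok's actual system; the clip names, the feedback signal, and the SandwichRecommender class are all invented for illustration. The point is only how little machinery per-user trial-and-error requires.)

```python
import random

# Hypothetical sketch only -- not any real platform's algorithm.
# A per-user recommender that "sandwiches" a clip it wants the user to
# start liking between two clips it already knows the user likes,
# then updates its estimate of that clip from the user's reaction.

class SandwichRecommender:
    def __init__(self, known_favorites, candidates):
        self.known_favorites = list(known_favorites)    # clips the user reliably enjoys
        self.estimates = {c: 0.5 for c in candidates}   # estimated appeal of each candidate clip
        self.counts = {c: 0 for c in candidates}        # how often each candidate has been shown

    def next_session(self, epsilon=0.2):
        """Epsilon-greedy: usually push the best-looking candidate,
        occasionally try another one to keep learning."""
        if random.random() < epsilon:
            candidate = random.choice(list(self.estimates))
        else:
            candidate = max(self.estimates, key=self.estimates.get)
        # The "sandwich": known favorite, candidate, known favorite.
        return [random.choice(self.known_favorites), candidate,
                random.choice(self.known_favorites)]

    def record_feedback(self, candidate, watched_fully):
        """Fold one observation into the running estimate of the candidate's appeal."""
        self.counts[candidate] += 1
        reward = 1.0 if watched_fully else 0.0
        self.estimates[candidate] += (reward - self.estimates[candidate]) / self.counts[candidate]

# Toy usage: one invented user who tolerates "goat_yoga" far more than "crypto_ad".
rec = SandwichRecommender(["cute_pigs", "cute_goats"], ["goat_yoga", "crypto_ad"])
true_appeal = {"goat_yoga": 0.7, "crypto_ad": 0.1}   # hidden preference the toy model learns
for _ in range(200):
    session = rec.next_session()
    candidate = session[1]
    rec.record_feedback(candidate, watched_fully=random.random() < true_appeal[candidate])
print(rec.estimates)   # before long, "goat_yoga" owns the middle slot of nearly every sandwich
```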

Now, the TikTok algorithm isn't an AI, perhaps partially due to the Chinese Communist Party's stances on AI, but more importantly just because they don't have an AI to hand it over to. Yet. The space already exists for AI to occupy.

Now, it is entirely true that the strategic advantage this represents is not very robust. In a perfect world, we could easily decide to stop obeying incentives and not let people do profitable things that provide us with entertainment (in Scott's words, the Entertainment God could decree such things are not allowed anymore). Once we did: poof, no more advantage. Not very decisive.

Imagine two armies. One has 10,000 soldiers, the other only 250. The small army has somewhat better weapons, but the advantage isn't huge, certainly not decisive. Seems like the 10,000 have it in the bag. But it turns out the 10,000 are all stone deaf, know no sign language, belong to 100 different mutually unfriendly ethnic groups, and all have advanced congestive heart failure.

The discussion of DSAs and MSAs vastly overestimates how much of an advantage is necessary to enslave or exterminate this species.

Rina:

I suppose I’m mostly interested in things like DSA because it represents an upper bound on the sort of advantage required to take control of humanity. If AGI can very plausibly achieve a DSA, we probably don’t need to dive into the details to work things out. Given how little attention the issue receives, I think it deserves as much attention as we can currently muster.

That being said, I suspect that there’s a lot of spare state capacity lying around in the West that could be brought to bear if only there was the will to do it. So it’s not at all clear to me that anything like these recommendation algorithms could take control of humanity since they’re not agents. I might be wrong, but I’m optimistic about our ability to do some things when we try, and this is one of those things. Safely developing AGI, though? We’ll have to wait and see, I guess.
