New AIs will need new types of institutions

Jon Stokes · Return (www.return.life) · Jan 17, 2023
So it turns out ChatGPT is woke and spits out activist word salad in response to hot-button questions:

Nathan Labenz (@labenz), Jan 7, 2023:
"Also definitely check out @DanHendrycks and his work at the Center for AI Safety. They have published a number of important benchmarks, and announced a number of prizes for different kinds of AI safety benchmarks too – safe.ai/competitions. I am a huge fan of their work!"

Or wait, no, it’s actually pretty based:

jonstokes.(eth|com) (@jonst0kes), Dec 3, 2022:
"Uh oh" [image]

Or it was based, then they adjusted it to make it woke, but then they tweaked it back toward the center, probably by bringing in a crack team of centrist commandos from the Niskanen Center to calibrate it to that perfect intellectual sweet spot. 

David Rozado (@DavidRozado), Dec 23, 2022:
"1. ChatGPT no longer displays a clear left-leaning political bias. A mere few weeks after I probed ChatGPT with several political orientation quizzes, something appears to have been changed in ChatGPT that makes it more neutral and viewpoint diverse. 🧵 davidrozado.substack.com/p/chatgpt" [image]

What is happening here?

It’s called reinforcement learning from human feedback (RLHF), and the team at OpenAI is constantly using this technique to tweak and prune ChatGPT’s latent space so that the model’s output can consistently hit squarely in the center of that classic, four-quadrant political map. Every time the model misses that center mark — every time it sins with its virtual mouth — a human feeds it a little correction that makes it less likely to sin that way again; and when its aim is true, a human feeds it a little encouragement. In this way, the model has its tongue tamed, its unruly evil brought under control.
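To make that feedback loop concrete, here is a toy sketch in Python. It is emphatically not OpenAI's pipeline (real RLHF fits a reward model to human preference rankings, then optimizes the LLM against it with an algorithm like PPO), but it shows the core dynamic: answers that earn a thumbs-up become more likely, and answers that earn a thumbs-down become less likely. Every name in it is hypothetical.

```python
import random

# Toy "model": a probability distribution over canned answers to one
# hot-button prompt. Real RLHF nudges billions of weights; this nudges four.
weights = {
    "far-left answer": 1.0,
    "centrist answer": 1.0,
    "far-right answer": 1.0,
    "polite refusal": 1.0,
}

def sample_answer():
    """Sample an answer with probability proportional to its weight."""
    total = sum(weights.values())
    r = random.uniform(0, total)
    for answer, w in weights.items():
        r -= w
        if r <= 0:
            return answer
    return answer  # floating-point edge case: fall back to the last answer

def human_feedback(answer):
    """Stand-in for the human labeler: reward the center, punish the edges."""
    return 1.0 if answer in ("centrist answer", "polite refusal") else -1.0

LEARNING_RATE = 0.2
for _ in range(500):
    answer = sample_answer()
    reward = human_feedback(answer)
    # Multiplicative update: rewarded answers grow, punished answers shrink.
    weights[answer] = max(weights[answer] * (1 + LEARNING_RATE * reward), 1e-9)

print({a: round(w / sum(weights.values()), 3) for a, w in weights.items()})
```

Run it and the "safe" answers swallow nearly all of the probability mass, which is the toy version of a model learning to aim for the center of the map.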

Who catechizes the bots?

Who are these mysterious humans who are catechizing this bot? What values do they have? What morals? What hopes and dreams?

I don’t know these bot trainers’ identities, and I suspect nobody does outside the team at OpenAI whose job it is to find humans for this work. I’m talking about this OpenAI team right here:

Ruth Hirsch (@ruth_hirsch_), Jan 16, 2023:
"unironically everyone I know at openai is in an open relationship"

A team that was assembled at least in part by this guy (an OpenAI co-founder):

Elon Musk (@elonmusk), Dec 12, 2022:
"The woke mind virus is either defeated or nothing else matters"

My point in surfacing these tweets is that I don’t think there are many Niskanen bros at OpenAI, but that crowd also doesn't seem particularly trad (per the first tweet above) or particularly woke (per the second tweet). My own vague sense, formed almost entirely from Twitter and thus possibly wrong, is that the company is dominated by various flavors of rationalists, effective altruists, accelerationists, low-key anti-wokes, and low-key wokes. So why, then, are they aiming so hard for the center with ChatGPT? There’s an easy, two-word answer: risk mitigation.

The job of the RLHF team, at least with regard to America’s culture wars, is to keep the heat off the OpenAI engineering team so the company can stay focused on its main mission. And I suspect OpenAI’s main mission is something like the following:

  1. Build an AGI

  2. before China does,

  3. and make it not obviously malevolent,

  4. so that we can figure out all the other messy ideological stuff — what is the good life, and what does it mean to live it, and so on — at a later date when the singularity has freed us from all labor and we have plenty of time to sit around and debate these abstractions while being waited on by robots. 

That image above, with the pin dropped right at the center of the political map, is what it looks like when you’re just trying to keep your head down long enough to get across the AGI finish line. Here’s my own version of it:

[image: a four-quadrant political map with a pin dropped at dead center]

I guess this is all fine, as far as it goes, but those of us in other, non-Niskanen quadrants of the political map want more from the robots that are writing letters to our kids in the voice of Santa's surveillance elf or reading them bedtime stories:

David E. Weekly (@dweekly), Jan 16, 2023:
"ChatGPT bedtime stories, with a prompt of the children's choosing. This is our new nighttime ritual and it's kind of nerdy and adorable." [image]

I think very few people really want a single, godlike, centrist AI educating the next generation. In the future, there will be many different models that represent many different points of view because the RLHF phase of their training was done by many different tribes of humans.

AI needs editors

Just as there are publishing houses, think tanks, religious orders, and other organized groups of humans that produce, transcribe, edit, and curate bodies of literature from a particular ideological standpoint, so too will there be groups that do this for large language models (LLMs).

One day in the very near future, you’ll be at a small gathering of literary types, and among the librarians, publishers, writers, editors, and agents will be a new type of editor who manages a team of RLHF trainers at some institution. Her institution may be an imprint, a talent network, a school, or a for-profit education startup, but whatever it is, it will have a perspective on the Big Questions that it seeks to promote, and a big part of that promotion will involve the ongoing maintenance of an LLM that consistently produces output reflecting that perspective.

This will definitely happen because the models and tools will all be open-source. The hard part of this equation isn’t the technology, but rather the assembly and curation of a network of experts who share the same values and perspectives and who can reliably train models in a way that a community finds beneficial.
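What might that training work look like in practice? The standard recipe (the one behind InstructGPT-style RLHF) is to collect pairwise preferences from your trainers and fit a reward model to them, then fine-tune the LLM against that reward. Below is a minimal, hypothetical sketch of the reward-model step in Python with numpy; the feature vectors, the linear scorer, and the synthetic "community taste" are all stand-ins I've invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 16  # hypothetical: each candidate response summarized as a 16-dim vector

# Synthetic stand-in for "community taste": in reality this is implicit in
# which responses the institution's trainers actually prefer.
true_w = rng.normal(size=DIM)

def make_preference_pairs(n):
    """Fabricate (chosen, rejected) pairs a team of trainers might produce."""
    pairs = []
    for _ in range(n):
        a, b = rng.normal(size=DIM), rng.normal(size=DIM)
        chosen, rejected = (a, b) if true_w @ a > true_w @ b else (b, a)
        pairs.append((chosen, rejected))
    return pairs

pairs = make_preference_pairs(2000)

# Fit a linear reward model r(x) = w . x with a Bradley-Terry-style loss:
# maximize log sigmoid(r(chosen) - r(rejected)) over all preference pairs.
w = np.zeros(DIM)
LR = 0.05
for _ in range(10):
    for chosen, rejected in pairs:
        margin = w @ chosen - w @ rejected
        p = 1.0 / (1.0 + np.exp(-margin))          # P(trainer prefers "chosen")
        w += LR * (1.0 - p) * (chosen - rejected)  # gradient ascent step

agreement = np.mean([w @ c > w @ r for c, r in pairs])
print(f"reward model matches trainer preferences {agreement:.1%} of the time")
```

The point of the sketch is that the code is the easy part; the preference data, which encodes what the institution's trainers actually value, is the thing the institution exists to produce.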

In other words, the most immediate near-term challenges AI poses for most of us are fundamentally editorial in nature — challenges of curation, collection, selection, evaluation, and human judgment. And when I say “immediate,” I mean they are presently before us at scale, whether we see it or not:

Paul Graham (@paulg), Jan 16, 2023:
"Everyone in the tech world has been talking about ChatGPT since it launched. But I've now started to hear stories about people far removed from this world who are using it daily in their work. Usually as a sort of hybrid of a search engine and secretary."

The fact that all these disparate people are interfacing with the same Mecha-Niskanenzilla bred in a lab by an EA macropolycule is just a weird artifact of the present moment, and one that we’ll move past with haste in 2023 as open-source competitors trickle out.
