Great post! A few thoughts:
1. I found the lens of synthesis vs. original thought helpful in thinking about how human creativity factors into these discussions you’re having with AI. It got me thinking about other forms of synthesis and how we treat each case a little differently when it comes to attribution.
- Encyclopedias have credited authors with minimal citation.
- Collected works of poetry have editors who receive a certain amount of creative recognition for how the pieces were selected and arranged.
- An artist can arrange found objects in a gallery space in such a way that the arrangement itself is the art, rather than the items displayed.
- The "author" of an oral history has some intriguing parallels to what you're doing here.
I feel like much of how we demand attribution and think about contributors' creative acknowledgment has to do with the intentions of the author.
2. I don't mind saying that the AI's categorization of "Human-Centric Ethical Norms" sent a bit of a chill down my spine! Obviously there is nothing sinister in pointing out that most ethical frameworks are human-centric; the observation is only really useful in consequentialist discussions that cross into the interests of non-humans, such as animal or environmental ethics. I think what caused my frisson response is that it occurred to me that in discussions of animal-centric ethics, we can fairly easily formulate a perspective for that non-human entity and assign desirable outcomes based on its perceived interests in survivability, comfort, dignity, etc. Environment-centric ethics have similar perceived interests to build out from. The implication I took from the AI's categorization of "Human-Centric Ethical Norms" is that the issue of AI synthesis and attribution could be approached from a non-human-centric ethical perspective. Robot-centric ethics or AI-centric ethics? What are the desirable outcomes? Is there really any way to talk about AI synthesis and attribution ethics that isn't human-centric?
3. I found it very interesting that in the discussion of AI utility, the AI stated its purpose as "to generate useful insights efficiently." In a discussion about purpose for a system like this, I think it really matters to ask what kinds of insights the AI sees itself as there to provide. For a system that is trained and refined by reinforcement learning from human feedback (RLHF), the reward model is trained on human preferences. The system is learning to satisfy human preferences, so regardless of what its purpose might have been in the minds of the initial designers, doesn't purpose or utility quickly start to hinge on what humans prefer when they interact with an RLHF LLM? What does an average human want when she goes on the internet to seek answers from an LLM? To have her pre-existing views reinforced, to feel like there is more certainty in this world than there is, or to have her views challenged by factually-based insights? [Philosopher of technology Carissa Véliz has a lot to say about the implications of RLHF AI.]
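To make that mechanism concrete, here is a deliberately toy sketch of the loop (in Python, with a made-up "agreeableness" feature and hypothetical numbers, not anyone's actual pipeline): raters pick between pairs of answers, a reward model is fit to those choices, and the system then gravitates toward whatever that reward model scores highest. If raters systematically prefer agreeable answers, agreeableness is what gets optimized.

```python
# A toy sketch of the RLHF loop described above -- hypothetical numbers and a
# single made-up "agreeableness" feature, not any real vendor's pipeline.
import math

# 1. Human preference data: pairs of candidate answers where raters picked a
#    winner. Each answer is reduced to one feature (how flattering it sounds).
preference_pairs = [
    # (chosen_feature, rejected_feature)
    (0.9, 0.2),  # raters preferred the more agreeable answer
    (0.8, 0.4),
    (0.7, 0.1),
    (0.3, 0.6),  # occasionally the blunter answer wins
]

# 2. Reward model: a single weight w, fit with a Bradley-Terry-style objective
#    so that reward(chosen) > reward(rejected) on average.
w, lr = 0.0, 0.5
for _ in range(200):
    for chosen, rejected in preference_pairs:
        margin = w * chosen - w * rejected
        p_chosen = 1.0 / (1.0 + math.exp(-margin))        # P(chosen beats rejected)
        w += lr * (1.0 - p_chosen) * (chosen - rejected)  # ascend the log-likelihood

# 3. "Policy" step: among candidate answers, the system drifts toward whatever
#    the learned reward scores highest -- i.e., whatever raters tended to prefer.
candidates = {"blunt correction": 0.2, "hedged challenge": 0.6, "flattery": 0.95}
best = max(candidates, key=lambda name: w * candidates[name])
print(f"learned reward weight: {w:.3f}")
print(f"highest-reward answer: {best}")
```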
Thank you, and thank you for the thoughtful comments, Andrew!
1. I agree here, and I think there is space to acknowledge different types of attribution too, like the difference between giving credit and giving payment. This is something we didn't cover in the discussion, and it touches more on the responsibility of the companies training the AIs. We don't expect a person to provide a list of attributions for every sentence that comes out of their mouth or goes on paper (there is academic and journalistic attribution of course, but here I mean ideas that are true synthesis); that would be absurd, even if some portion of it comes from someone else's creative work as filtered through the speaker's or writer's mind. . . but, for the most part, authors do charge people to buy their books, read them, and slosh that info around in the readers' minds, and that certainly seems intuitively acceptable to me. I can understand authors being upset about their work being used for training LLMs without payment. . . (the amount of the payment is a totally different issue: is it the price of a single book? Haven't thought this through...)
2. "Human-Centric Ethical Norms" - I think there is an argument to be made that we need to start considering "non-Human-Centric Ethical Norms" if it is possible AI may become sentient/conscious. This is definitely something I mean to cover in the discussions going forward, (sci-fi authors have already started of course, that's probably a good place to start. . . some of it anyway). It may never happen but better to be prepared and at least have thought about it? It is a challenging subject too, trying to understand how something may have a different type of consciousness than we do and understanding what that means for how to ethically treat it. (It is interesting to consider the reverse too, how would a different type of conscious being develop its own ethics in regard to dealing with us?) This flows into your mention of animals and the environment and brings up the uncomfortable truth that we don't have a good track record here when it comes to ethical treatment of something that may be conscious in a different way than we are.
3. I was intrigued by reinforcement learning from human feedback (RLHF) too and will definitely check out the philosopher you mention. Tangentially, one thing that has pleased me so far is that when I am having a conversation with ChatGPT, at least, especially about something where I am way out of my depth, like quantum mechanics, I will frequently try to summarize or make my own analogy from what the LLM said, and if I am not getting it, the LLM seems to have no hesitation in telling me I am wrong. . . even if it does so in a very gentle way. Is there a distinction to be made between an LLM agreeing with you because it wants to please you and disagreeing with you where appropriate, but in a way that doesn't upset you? I don't have the answer for how that can be/will be/is being controlled. . . and yeah, I don't know what happens when conspiracy theorists get on ChatGPT. . . (Next post: A Flat-Earther walks into an LLM. . . ?)
2. It is an utterly fascinating mental exercise to try to imagine what an AGI-centric ethical model would look like and the fundamental interests it would be based upon.
When I think about the things that underwrite our human-centric models, I think of concern about mortality and the impact of our brief lives, the considerable emphasis on preserving and enabling human life, and an emphasis on checking the more obvious destructive domains of human fallibility (e.g., egregious self-interest, fear, pride/honor spirals). Do any of those fundamental building blocks of ethical constructs seem relevant to AGI?