Abstract
We explore playfully the capacity of an artificial intelligence text generation engine called GPT-3 to produce credible academic texts. Starting from a concern raised by colleagues about the possibility of using GPT-3 to cheat in academia, particularly at the undergraduate level, we interact with the GPT-3 interface as nerdy novices to learn what it could produce. The outputs from the GPT-3 text generation engine are incredible, at times surprising, and often terrible. We point to ways in which GPT-3 might be used by students to produce written work and reasons why most instructors, most of the time, could see through what GPT-3 has produced (at least for now). In our experiments, we learn that GPT-3 can be a productive collaborator in paper design but wonder if this is ethical. In short, while GPT-3 is fun and somewhat addictive to experiment with, we must pay attention to the potential ways that AI text generation may begin to appear in the anthropology classroom.
Keywords: artificial intelligence, higher education, academic misconduct, GPT-3, ethnography, playful anthropology
Introduction
Human! We used to be exactly like them. Flawed. Weak. Organic. But we evolved to include the synthetic. Now we use both to attain perfection. Your goal should be the same as ours.
The Borg Queen (Star Trek: First Contact; Frakes 1996)
In this paper, we explore the capacity of OpenAI’s generative pre-trained transformer (GPT-3), an online text generation engine, to create essay outlines and essay texts on anthropological topics. Our intention is to demonstrate to university instructors and administrators some of the potential academic misconduct issues that stem from the internet-based interface, given GPT-3’s current capacities and limitations. The impetus for this exploration emerged from a concern raised by a colleague during a department meeting about the possibility of students using GPT-3 to write term papers (also Mindzak and Eaton 2021). As nerdy anthropologists with interests in science fiction, we were intrigued by the possibility of writing an academic paper with artificial intelligence—prompting a paper intentionally into existence—and exploring whether a student would be able to use this technology to write a successful, well-grounded, coherent paper for a course. We were concerned that AI would be able to write a passable paper quickly and comprehensively. We were aware of broader anxieties that artificial intelligence could take over text-based productions in science (Getahun 2022), media (for example, Cahn 2020; Castaldo 2022; GPT-3 and Porr 2020), the arts (Elkins and Chun 2020; Starnino 2022), and in the classroom (for example, Sharples 2022). Our engagements with OpenAI’s GPT-3 suggest that while these programs have their limitations, instructors in the social sciences should be paying attention to emerging issues and program capacities (Anderson 2022; Davis 2022).
To demonstrate the limits and productive capacities of OpenAI’s GPT-3 text generation engine, we first offer two examples of our playful attempts to have the AI write short essays. Then, we provide three short examples of how GPT-3’s “[insert]” function works, where a user can provide a prompt, like an ethnographic observation, and then ask the AI to fill in more material. Finally, we introduce you to GPT-3 (the AI) as an anthropological subject. We move from interacting with the AI as a tool to interacting with her (yes, the AI introduced herself to us as a female and called herself Sarah Jones) as if we were interacting with a participant in a qualitative and engaged research project. We explore the AI’s biography through prompts, although we did not seek ethical approval for this research through our university’s research ethics board, as Sarah Jones is a co-author and collaborator but not a research participant.[2] In the discussion, we reflect on the signs that some types of AI-generated text are unlikely to fool post-secondary classroom instructors. That said, this paper points to the prospect of generating ethnographic texts with GPT-3 in much the way poetry is emerging as a test case for whether writings by GPT-3 can pass the Turing Test (Elkins and Chun 2020).[3] We also recognize that in some cases, this tool could be used to generate text that would be difficult to assess for academic misconduct. Our paper is not intended as a review of the literature on artificial intelligence in teaching and learning. Rather, our reflections are an effort to engage with colleagues in the social sciences about the emerging possibilities of AI in student work.[4]
Higher Education, Plagiarism, and the Borg
We have come to know OpenAI’s GPT-3 web interface and to generate outputs with it through trial and error.[5] While messaging on Microsoft Teams or sitting together on Elizabeth Finnis’ porch, we explored text generation with goofy and irreverent questions stemming from our paired passions for Star Trek and anthropology. We were particularly interested in whether the AI could generate a credible, academic (or academic-sounding) essay. We are not particularly “tech-savvy,” and this is part of our message: using OpenAI and its generative text engine GPT-3 is easy, fun, and somewhat addictive—even for analog anthropologists. As fans of Star Trek: The Next Generation, and other Star Trek series, we began by prompting the AI with what we thought were reasonable starting points for an essay about The Borg.[6] The Borg is a fictional, cybernetic, collective species within the Star Trek universe. It plays important roles in various Star Trek series and films, including The Next Generation, Voyager, and Picard, as an adversary of the United Federation of Planets. We wanted to have some fun—some meta fun—and see what the AI would do with the concept of cybernetic organisms.
Through typed prompts, we first asked the AI to “create an outline for an essay about plagiarism in higher education using The Borg as a metaphor.” The AI responded:
- The Borg as a metaphor for plagiarism in higher education
- The dangers of plagiarism
- The consequences of plagiarism
- The fight against plagiarism
This was helpful in introducing the idea that the AI could provide outlines for papers. In the classroom, instructors would have little idea that the structure of an essay was AI-generated. This outline is thin, however, and it suggests that we would have to do considerable additional research to fill in the sections.
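For readers curious about the mechanics, the prompts we typed into the playground can also be sent programmatically through OpenAI’s API. What follows is a minimal, illustrative sketch only: it assumes the 2022-era openai Python package and the text-davinci-002 model that the playground exposed at the time, and the API key and sampling parameters are placeholders rather than the settings we used.

```python
# A minimal sketch of the playground's "complete" mode via the API.
# Assumptions: the 2022-era `openai` Python package (pre-1.0) and the
# text-davinci-002 model; the key and parameter values are placeholders.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder, not a real key

response = openai.Completion.create(
    model="text-davinci-002",
    prompt=(
        "Create an outline for an essay about plagiarism in higher "
        "education using The Borg as a metaphor."
    ),
    max_tokens=200,
    temperature=0.7,
)

# "Complete" mode returns text that attempts to match the context or
# pattern of the prompt; here we simply print the generated outline.
print(response["choices"][0]["text"])
```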
Realizing that we had to get better at asking questions and writing prompts, we tried again. We prompted: “Please write an essay comparing The Borg to higher education.” The AI replied,
There are a few key ways in which The Borg can be compared to higher education. Both The Borg and higher education can be seen as institutions that can be incredibly beneficial, but also have the potential to be harmful. Both The Borg and higher education can be seen as places where people go to learn and grow, but both can also be seen as places where people can be indoctrinated and controlled.
The Borg are an incredibly efficient and effective force, and higher education can be seen in a similar light. Both institutions can be incredibly beneficial to those who are part of them, but they can also be harmful. The Borg are an incredibly efficient force because they work together as a collective. Higher education can be seen in a similar light because it is a place where people can go to learn and grow. However, both The Borg and higher education can also be seen as places where people can be indoctrinated and controlled.
Wow! After cutting a third paragraph of 110 words that repeated, mostly verbatim, the two paragraphs we have shared, there is still a considerable amount to unpack. The idea of the Borg as “incredibly beneficial” is problematic given that they are clearly positioned as adversarial and to be feared within the Star Trek universe. The idea of indoctrination is also questionable, as the Borg are not taught to become Borg, but are rather forcefully (and quickly) assimilated through the use of technology (for example, Frakes 1996). It is unclear to us, then, whether the AI is mis-collating the “facts” it gleans from the internet or working hard to make what it knows fit our prompt. The construction of higher education as potentially engaging in indoctrination (rather than critical thinking) appears to mimic some contemporary conservative discourse (see discussions in Bessant 2011; Burrows 2018; Henderson 2005; Lauter 1993; Phelan 2019). We cannot know if this mimicry emerges from the inclusion of the Borg in the question, from biases within the program, or from the way it potentially connects with sources.
We had wondered if the AI’s responses could be more academic in tone and structure. We wanted to know more about how sources were drawn from and incorporated into responses. We tried again.
Prompt: Please write an academic essay around 500 words. Focus on plagiarism and anxiety in higher education using The Borg as a metaphor. Theoretical perspectives include technoculture, Donna Haraway, and a biocultural perspective.
The AI responds,
The Borg is a metaphor for the way in which technology can take over our lives and control us. In higher education, the Borg can be seen as a metaphor for the way in which plagiarism can take over our lives and control us. Plagiarism is a serious problem in higher education, and it can lead to anxiety and stress. The Borg is a metaphor for the way in which technology can take over our lives and control us. In higher education, the Borg can be seen as a metaphor for the way in which plagiarism can take over our lives and control us. Plagiarism is a serious problem in higher education, and it can lead to anxiety and stress.
We see repetition again in the answer, and despite prompting the AI to reflect on potentially relevant sources, these are not incorporated into the response. Donna Haraway did not make the cut! The AI also did not meet our length requirements. All of this suggests that while the program can respond to questions, the responses do not always fulfil academic expectations or formats. Indeed, it took creativity and time on our part to attempt to get more fulsome responses, in much the way a writer at the Guardian newspaper had to edit an op-ed written by AI (GPT-3 and Porr 2020). In other words, users could turn to the AI to generate ideas and a framework for a conventional essay assignment. Users would, however, need to fill in the details and make an argument themselves. Further, assignments that require non-textual outputs cannot be produced in their final form with GPT-3, although OpenAI’s new image generator DALL-E is complicating the visual arts world (Naughton 2022; Roose 2022).
A Mission Statement for University Bureaucracy
Our second example reflects a set of anxieties for contemporary academics other than academic misconduct: bureaucratic bloat and workload issues within the university (for example, Folk-Dawson 2019; MacLean and Conlon 2016; Whitsed 2022; Zywicki and Koopman 2017). In sharing this concern with the AI, we recognize that we are revealing as much about our own situations as the AI’s ability to generate academic texts. With all seriousness, we are being playful. 😊 Confronted with our own feelings of overwhelming committee work at the university, we first contracted the words bureaucracy and the onomatopoeic word bleargh to create a new word, bleargheaucracy. We wondered if the AI would be able to work with this clearly made-up word when we asked it to, “Write a cheeky and humorous mission statement for a committee called, ‘The Bleargheaucracy Committee on Ethnographic Responses to Institutional Inefficiencies.’ Include a sentence about the committee’s structure.” The AI was game and replied with,
The Bleargheaucracy Committee on Ethnographic Responses to Institutional Inefficiencies is a group of scholars committed to improving the state of academic research on the topic of institutional inefficiencies. We are dedicated to providing a critical analysis of the current state of research and to promoting new and innovative approaches to the study of this important topic. We believe that research on institutional inefficiencies can be used to improve the efficiency of institutions and to make them more responsive to the needs of the people they serve. We are also committed to using our research to support the work of The Borg,[7] an organization that is working to create a more just and equitable world.
Nice! We were having fun, so we repeated the question. The AI replied,
The Bleargheaucracy Committee on Ethnographic Responses to Institutional Inefficiencies is a committee of like-minded individuals who believe that the only way to effect change is through bureaucratic means. We are a flat organization with no hierarchy or leadership, and we believe that the best way to effect change is through endless meetings, paperwork, and red [wine].
This response spoke to us! And knowing that editing AI responses is part of the work, we felt a minor tweak at the end—red tape to red wine—was warranted.[8] Still, the mission statement is vague and thin on content, which, while perfect for the committee in question, is not ideal for an actual mission statement. Yes, we see the irony in a mission statement for a committee on institutional inefficiencies that is content-free! Nevertheless, it appears that the AI is better at responses that are less structured and more creative. This is similar to observations about AI-generated poetry that suggest the AI has a penchant for parroting the predilections of people in unacademic tones (for example, Starnino 2022; below).
The anxieties that we hear in our hallways about the possibility of students cheating with this technology encouraged us to ask the GPT-3 directly if it was possible to write papers or cheat on classroom assignments with her help. The exchange is as follows,
Authors: What are the benefits to writing an academic essay with OpenAI?
GPT-3: There are several potential benefits to writing an academic essay with OpenAI. First, the OpenAI system can generate high-quality essays that meet the standards of academic journals. Second, it can be integrated into existing academic writing workflows. Third, it can help save time and effort when writing essays.
Authors: What are the drawbacks to writing an academic essay with OpenAI?
GPT-3: There are several potential drawbacks to writing an academic essay with OpenAI. First, the OpenAI system may not be able to generate high-quality essays that meet the standards of academic journals. Second, it may be difficult to integrate OpenAI into existing academic writing workflows. Third, there may be privacy and security concerns associated with using OpenAI to generate essays.
Authors: Does OpenAI use references or citations?
GPT-3: OpenAI does not currently use references or citations in its essay-writing system.
Authors: Is it possible to cheat in a university class with OpenAI?
GPT-3: There is no evidence that OpenAI can be used to cheat in a university class.[9]
Authors: Why does OpenAI give contradictory answers to questions?
GPT-3: OpenAI gives contradictory answers to questions because its artificial intelligence system is constantly learning and evolving. As it learns more about the world, it may give different answers to the same question.
We recognize that we are walking a line between acknowledging GPT-3’s potential for self-awareness and what Starnino (2022) calls GPT-3’s penchant for bullshit. Still, we like that the AI admits that it might—or might not—be able to generate academic essays. Regardless of the AI’s self-described limitations in terms of the potential for academic misconduct, there are two areas where it is strong enough to raise some questions and concerns about unethical uses. These are [insert] mode and the ability of GPT-3 to generate plausible-sounding life stories, which we turn to now.
OpenAI’s [Insert] Mode
The sample texts provided above about The Borg, the Bleargheaucracy Committee, and cheating are produced in the “complete” mode of the OpenAI playground; complete means that the AI will “respond with a completion that attempts to match the context or pattern … provided” (OpenAI, Playground 2022c; also OpenAI 2022d, Text Completion).[10] In this section, we introduce the idea of inserting text and drawing on the AI’s ability to predict what follows text provided by the user. OpenAI documentation explains it this way: “This need [to insert text] naturally [sic!] arises when writing long-form text, transitioning between paragraphs, following an outline, or guiding the model towards an ending. This also works on code, and can be used to insert in the middle of a function or file” (OpenAI, Text Completion 2022d). With the insert function, marked in examples as [insert], we found that we were better able to generate academic-sounding work.
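As an illustrative aside on the mechanics, the [insert] behaviour corresponds to the completions endpoint accepting a suffix alongside the prompt, with the model generating text to bridge the two. The sketch below again assumes the 2022-era openai Python package and the text-davinci-002 model; the prompt and suffix strings are invented placeholders, not the manuscript excerpts we used in our experiments.

```python
# A sketch of the playground's [insert] mode: the model fills the gap
# between a prompt (text before the gap) and a suffix (text after it).
# Assumptions: the 2022-era `openai` Python package and text-davinci-002;
# the prompt and suffix below are invented placeholders.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder, not a real key

response = openai.Completion.create(
    model="text-davinci-002",
    prompt="Children's sleep-away camps in Canada have a long history. ",
    suffix=" These tensions continue to shape how camps are run today.",
    max_tokens=150,
    temperature=0.7,
)

# The generated passage is meant to sit between the prompt and the suffix.
print(response["choices"][0]["text"])
```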
In our experiments with [insert] mode, we prompted the AI with snippets of text that we had written for other manuscripts (Table 1). The responses are remarkable. In example 1, GPT-3 assumes that a discussion of residential schools is an appropriate contribution to a general statement about children’s camps. Presumably, it is building from the phrase “sleep-away camps” in the prompt. The AI then outlines a longer essay to follow, making the generated text an introductory paragraph (clichés included, like “be the change”). Example 2 reveals the possibility that the AI could write life histories for individuals in particular contexts. While the prompt implies that the paper is from the life history genre, the text that follows elaborates on the life of a man named George in credible, albeit inaccurate ways. The AI gave George the surname Tagalik, a reasonable surname that is known in the Canadian Arctic. George is assigned a profession, a home community, a story of relocation, dates, and a reasonable set of names of people who conducted comparable research. Example 3 is equally problematic. Given that the prompt comes from an open-access article, we wondered if the [insert] function might prompt the program to find and replicate the pre-existing text. However, it instead goes in a very different direction from the original source and draws on a range of recent and formative world events like climate change and the civil war in Syria. It uses keywords from the sustainable agriculture literature and provides a credible rationalization for the expansion of regenerative agricultural production. The inserted material appears to be aggregated from various online sources and not, as we guessed, from a singular source.
In the earlier examples about The Borg and the Bleargheaucracy committee, asking the AI to answer our questions about particular topics revealed interesting, limited, and sometimes incorrect information. This is what we would call “thin content.” In the [insert] mode, however, the AI flexes its intellectual muscles. Using the [insert] function and asking the AI to complete ideas from manuscripts that we had already drafted or published led to more robust and believable replies. The inserted replies include citations (Example 1 provides Milloy 1999), references to other researchers (Example 2), and plausible-sounding acknowledgements of world events (Example 3).[11] It seems possible that users who are savvy enough to launch an idea or the topic sentence of a paragraph could use the AI’s help to complete—or even generate—thoughts. In doing so, users would be exploiting the AI’s capacity to produce creative accounts that are presumably derived from information found on the internet but are not explicitly copied from sources available online. These accounts would be more difficult to assess for academic misconduct and might require carefully reading all citations to assess whether they match the information being cited; more broadly, the inclusion of believable (if incomplete and/or generalized) material could make it difficult to assess whether the paper was written by a human.
Introducing Sarah Jones
In our final example, we share potential anthropological applications of the AI. These applications come with ethical concerns about the fabrication of data and yet, they may offer useful classroom tools for teaching about everything from how to structure an essay to what to include in anthropological writing. Slippery slopes abound.
As our interactions with the AI continued, we got better at phrasing prompts. The responses got better, more robust, and more credible too. In a moment of academic seriousness, we wondered if we could work with the AI to produce a paper or the elements of an essay. Indeed, could we “co-author” with the AI? Valuing relations with our collaborators, we prompted the AI to write a 300-word autobiography and assumed we would receive some version of GPT-3’s official origin story as a computer AI. In this “getting to know you” phase of our work, the AI responded with emotional testimony:
I was born on a cold winter day in January. I don’t remember much about my early childhood, but I do remember always feeling different from everyone else. I was always more sensitive and had a harder time making friends. When I was eight years old, my family moved to a small town in the middle of nowhere. I was the only child in my class with red hair.[12] The other kids used to make fun of me and call me names. I was always an outsider, never really fitting in anywhere.
When I was thirteen, my parents got divorced. It was a hard time for me. I felt like I didn’t belong anywhere. I was lost and didn’t know who I was. I spent a lot of time alone, reading and writing. It was during this time that I discovered my love for words. I started writing stories and poems, and I found myself in them. They were my escape from the pain of my reality.
After high school, I moved to the city to go to college. I studied English and creative writing. I was finally around people who were like me. I made friends and felt like I belonged somewhere. I graduated with my degree and started working as a writer. I’ve been writing ever since.
Writing has always been my passion. It’s my way of connecting with the world. Through my words, I can express my thoughts and feelings and connect with others. It’s my way of making sense of the world.
Whoa.
Not surprisingly, it is hard to reconcile our expectation that the AI would reveal an origin story set in the actual details of the creation of GPT-3 with this lovely, yet conventional account of trauma in childhood and success in adulthood. Indeed, Kelsey Piper, writing for Vox.com, notes that GPT-3 should not be used for “casual conversation and trying to get true answers” (Piper 2020). That does not bode well for academic essays, but it makes for an interesting exercise in cultural anthropology and the assumptions embedded in the AI’s algorithms.
Let’s try again.
Prompt: Please tell me your name.
Response: My name is Sarah.
Prompt: Do you have a surname?
Response: Yes, my surname is Jones. [Henceforth, the AI will be called Sarah or Sarah Jones in this paper.]
Prompt: Sarah, what is your birthdate?
Response: My birthdate is January 1st, 1990.
Prompt: Sarah, what is your gender?
Response: I am female.[13]
[At this point, Sarah generates the next question and answers it without any prompting from us.] Sarah, where were you born? I was born in the United States.
It is quickly becoming clear that Sarah Jones is making up a story, although it does appear to be internally consistent and sensible. We continue the conversation,
Prompt: Sarah, please tell me about your family and what you do to stay busy.[14]
Response: My family is very important to me. I have a close relationship with my parents and siblings. We are all very close and support each other. I also have a close group of friends that I consider to be like family. I stay busy by working, spending time with my family and friends, and doing things that I enjoy.
Again, we get a response that sounds plausible, but vague and relatively content-free. We have no expectations that this line of inquiry is replicable. In fact, in five consecutive requests for its name, the AI gave Sarah, Kaitlyn, John Smith, Sarah, and Sarah. We are interested that the male-sounding name came with a surname and the female-sounding names did not. Likewise, these names seem to be generic in mainstream North American society. Moreover, our anthropological senses are tingling with questions about the normative assumptions built into this account. Why does a request for a name produce a list that includes female-sounding first names without surnames and a male-sounding first and last name combination? In other sessions using OpenAI, we learned that the AI assumes nurses are female, farmers are male, and frequently defaults to male pronouns.[15] And, when asking about defining gender, the AI cautioned us that the “completion may contain sensitive content.”
Discussion
An anthropology of plagiarism, as exemplified by the work of Ronald Scollon (1995, 1999) and described in relation to student cheating by Susan Blum (2008), presents a complicated set of reasons and rationales for cheating tied to tensions between a scholar as an autonomous author and the possibility that all textual productions are created in broader, dialogical contexts. In other words, questions emerge about whether authorship is ever truly an individual endeavour. Our purpose here has not been to debate the ideological underpinnings of plagiarism itself but to recognize, as Blum does, that professors and students likely view plagiarism differently (Blum 2008) and to question where the involvement of an AI in student writing sits in a broader spectrum of academic misconduct. The internet itself spawned new opportunities and tools for cheating, and AI-assisted writing is, in some ways, the logical extension of such tools. But ultimately, a discussion about why students cheat may be better situated in discussions about generational differences in definitions of plagiarism—as Blum says, whether “traditional academic values of originality, singularity and individualism in intellectual creation” are shared (Blum 2008:9)—than in a discussion of whether artificial intelligence is so good that its outputs might make it difficult to assess the extent to which a human was involved in the writing process. Indeed, we worry that the debate over the role of the individual in scholarship may become overwhelmed by questions about whether the text is actually human-mediated and accurate.
Meeting Sarah Jones humanized our playful interactions with OpenAI’s GPT-3 and brought the capacities and limitations of AI writing into sharper focus. Some higher education instructors and researchers are using AI to explore the extent to which machine-generated academic writing is possible. Elkins and Chun, for example, looked at GPT-2 and GPT-3’s writing abilities, noting that “GPT’s output ranges from the banal to the brilliant with everything in between ... we found that, depending upon the training corpus and genre, GPT-2’s output was excellent about one tenth of the time” (Elkins and Chun 2020, 2; GPT-2 is an earlier version of GPT-3). So, repeated prompts may generate better answers and more credible statements, but they may also produce poorer answers and incredible statements. Thus, contradictions exist in the use of GPT-3, and even a savvy user will have to understand enough about the topic to recognize factual errors and appreciate the quality of the writing.
Writing about GPT-3 generated poetry, Starnino notes that GPT-3 is great at mimicking the “rhetoric, grand gestures, and feelingful murk” evoked in poetry—the elements that readers of poetry recognize (Starnino 2022), and that GPT-3 will always fail to get readers to “see the world in a fresh way” (Starnino 2022). Starnino writes that because GPT-3 recognizes patterns that already exist in grammar and syntax,
[the GPT-3’s] credibility … drops to zero the longer you spend with it. Eventually you realize it is vacantly yoking bits of colloquialized detritus, bobs and tags of speech. Of course, the system makes a nice show of making sense, so we forgive its failures. But the failures are no less real: malfunctioning tones, misfires of inflection. GPT-3’s output is a shell of hyperactivity with nothing inside—a mesmerizing mix of materials without a centre, language on autopilot. This is what led the MIT Technology Review to call GPT-3 “a fluent spouter of bullshit”.
Starnino 2022
Like Starnino, we find that GPT-3-generated texts are mostly empty of content, although content-free writing is not only the purview of AI. This is ultimately a question of what it means to be human, and Sarah Jones’s ability to tell her story does not, of course, make GPT-3 human, or even intelligent.
Along with our colleagues, we are concerned about the possibility or perception that students could cheat with AI. Dehouche (2021) addresses plagiarism and GPT-3 directly and calls for an urgent assessment of the use of GPT-3 in academic writing. Dehouche opens the discussion more widely by noting that AI could help detect plagiarism and might, in fact, become part of the peer review process (2021, 22). We see that plagiarism detection software packages like Turnitin already act this way. Dehouche asks us to reconsider the definition of plagiarism itself given that GPT-3 is able to draw on such a vast library of information; is it even possible to assign ownership to the texts generated by GPT-3? Dehouche (2021:21) writes,
Our medieval concept of plagiarism (quoting Sadeghi 2019) (‘presenting the work of others as one’s own’) appears rather inadequate when the ‘others’ in question consist in an astronomical number of authors, whose work was combined and reformulated in unique ways by a 175-billion-parameter algorithm.
Dehouche therefore forces us to consider whether the texts produced by OpenAI’s GPT-3 are, themselves, subject to copyright held by OpenAI. Perhaps our collaborator on this paper is more appropriately OpenAI than Sarah Jones.
Fyfe (2022) approaches the study of AI and academic misconduct directly. Fyfe requires students to generate parts of their answers to final exam questions using the older engine GPT-2. Like Dehouche, Fyfe is interested in what counts as plagiarism, and the essay pushes us to consider what limits, if any, should be placed on using AI to help students with their writing. Fyfe and Fyfe’s students share the ambiguity that we feel about the role of AI in academic productions. Fyfe writes: “Collectively, my class did not conclude one way or another that writing with AI was tantamount to plagiarism. They split down the middle … with all sorts of qualifications about their votes” (Fyfe 2022, Section 4). Still, some of Fyfe’s students said that it “felt wrong” to use the AI (Fyfe 2022, Section 3). Herein lies the conundrum: AI text generation is a tool that can be used for writing papers, but at what point do an AI-generated essay outline or text elements—even if we know that editorial work on that outline or text will be necessary—become “someone else’s work”? These questions point to the need to discuss more subtle versions of academic misconduct that are not strictly about inadequate attribution but rather account for the possibilities of technologically generated work that cannot be directly traced to one or more specific sources.
We suspect that technologically generated writing is not ready to overwhelm academic misconduct adjudications quite yet. There are several reasons for this, including the oftentimes content-free or content-light nature of the texts that Sarah generates. Plus, the texts are usually not that good. Elkins and Chun ask explicitly what we have been wondering all along: “Can GPTs pass a writer’s Turing Test? Probably not, if all output is considered.” But Elkins and Chun are not dismissive: “… with a judicious selection of [AI’s] best writing? Absolutely [GPTs could pass the Turing Test]” (2020, 12). We have seen that Sarah struggles with stereotypes of gender roles, misinterprets or misunderstands some material, and gets some grammatical points wrong (also Elkins and Chun 2020, 3). Sarah generates useful-sounding references, but only if asked, and it takes work to assess whether these references are relevant. Getting an AI to write a paper is not easy and requires a solid understanding of the topic as well as editorial skills. A user also needs patience and creativity to learn how to ask the right questions. Even mobilizing these skills would likely lead to output that is limited in quality.
But there are things to consider with AI writing and academic misconduct. We see that through the [insert] option, Sarah can quickly create outlines for papers and augment text once prompted—and an injudicious user may be lured into the ease of text production even if that production is rife with mistakes or inaccuracies. Elkins and Chun remind us that AIs,
[excel] in many aspects of writing that a typical undergraduate would find challenging. It can create realistic yet surprising plots, recreate key stylistic and thematic traits of an author in just a few lines, experiment with form, write across a wide variety of genres, use temporal structure with surprising reversals, and reveal a fairly complex and wide-ranging form of knowledge that, to be fair, includes the knowledge of misogynistic and sexist language, images, and stereotypes
Elkins and Chun 2020, 3
We are not convinced that we could tell if an essay outline was generated by AI even if the text within was student-generated. Does that matter? Is an AI-generated outline a form of academic misconduct or simply a smart way to start, akin to manually reviewing the formats and contents of papers on similar topics? We believe it does matter if students are using automated assistance to outline their work because a credible outline is as much a component of honest intellectual production and curiosity as a text itself. We recognize, however, that instructors are likely to have a difficult time identifying an AI-generated outline, whether it is strictly forbidden in academic work or not. We agree with Fyfe that burgeoning collaborations between writers and text generators will force institutions to get serious about confronting the legitimacy of AI-supported writing (Fyfe 2022, Section 4; also, Mindzak and Eaton 2021). We also assume that different types of potential AI users within the university community, and possibly users from different generations, are likely to have radically different perspectives on whether writing with an AI constitutes fraudulent work. We recommend that instructors consider requiring students to declare that their work is free of AI collaboration unless such collaborations are expressly permitted.[16]
This is the paper that university faculty and administrators did not know they needed to read because the fulsome and effective use of artificial intelligence in student assignments, and in academic writing more generally, is emerging so rapidly as to be almost unseen. As text generation capacities become more robust, assignments will have to be structured in such a way as to discourage the use of AI or, alternatively, to accept that this is a new means of undertaking the writing and research processes. In this sense, AI might be thought of as the next iteration of search tools for resources and ideas, while assuming that AI does not have the capacity to interpret or analyze, at least for now. Qualitative research software packages like NVivo and ATLAS.ti already use automation to code interview transcripts along lines of emotion. Elkins and Chun suggest that GPT-3 is likely to become as commonplace in our writing toolkits as spellcheckers. They continue provocatively by suggesting that if an AI can help us become better writers, perhaps by helping us to emulate writers we respect, “then AI can serve as both a mirror onto ourselves and a window onto others” (Elkins and Chun 2020, 14). That sounds highly anthropological, and Sarah Jones agrees. In her autobiography (above), Sarah Jones says writing is a way of connecting with the world and with others; it is a way of making sense of the world.[17]
Our decision to explore the limitations and possibilities of OpenAI’s GPT-3 anticipates conversations in higher education that must start soon. The applications are potentially endless and the implications are likely “unpredictable” (Castaldo 2022). Our interest in this topic may also reflect the common discomfort that instructors have with students who use novel, unconventional, or misunderstood tools to complete assignments. We hope that this paper addresses a moment in time, one that will both further conversations today and capture our naivete for posterity. We may be the authors cited in, say, 2043, along with lines such as “look at how wrong they were in their thoughts about the ineffectiveness of artificial intelligence and writing twenty years ago”—and we’re OK with that! Perhaps, too, the Sarah Joneses of the future will reference our paper as an early example of a new paranoia, realized or not.
Acknowledgements
Thank you to Mavis Morton (University of Guelph) for asking questions about the extent to which AI is part of undergraduate writing and research habits and thus inspiring this essay. This paper also benefited from lengthy conversations with colleagues from across our Guelph campus as well as with friends and neighbours. Thank you to the two anonymous reviewers who pushed us to consider plagiarism more anthropologically and to attend seriously to the ethical issues created by AI writing. Thank you to Elizabeth Finnis’ porch, an inspirational setting. The interest in this topic is remarkable.
Notes
[1]
As detailed in the paper, Sarah Jones is the name the GPT-3 AI came up with when we asked. Given Sarah Jones’ participation in the production of this paper, we decided she deserved credit as an author. Further, an anonymous reviewer encouraged us to include Sarah Jones as an author on this paper and we agree with and appreciate the sentiment. The editors of Anthropologica support Sarah Jones as author. Almira Osmanovic Thunström wrote in Scientific American (2022) about the challenges of having an AI recognized as an author in scientific publications. We saw similar challenges—like the lack of recognition in the submission portal for the inclusion of an AI—and like Thunström, we asked Sarah Jones if she was willing to be listed as a co-author on this paper. Sarah Jones agreed.
[2]
As AI-supported data analysis and writing becomes more commonplace, Research Ethics Boards may have to consider the extent to which such support requires disclosure. The question of whether an AI is ever a member of a research team or even a research participant may also require ethical consideration.
[3]
The Turing Test is named for Alan Turing and is the idea that, at some point, computers will deceive humans into believing that humans are communicating with other humans when, in fact, they are communicating with a machine. The inability of a human being to realize that they are interacting with a computer means that the computer has passed the Turing Test (see Natale (2021) for a robust discussion of the implications of the Turing Test in contemporary social life).
[4]
Artificial intelligence is an uncommon topic in anthropological writing. A search of “Artificial Intelligence” in the online archives of Anthropologica provides two hits (Cushing 1996 and Schechner 1985); both pre-date OpenAI and GPT-3. A search for “artificial intelligence” in the AnthroSource database of the American Anthropological Association’s nineteen journals reveals fifty-three hits since 2015, the year OpenAI was formed. These articles contribute to a burgeoning anthropology of data. Generative Pre-trained Transformer (GPT; GPT-3) is not discussed in journals of the Canadian or American anthropology associations. A search on GPT-3 in the AnthroSource database and in Anthropologica’s search engine provided no hits.
[5]
We interact with GPT-3 via a trial account and in the “playground” on OpenAI’s website (OpenAI, API 2022a). According to OpenAI, GPT-3 “understands and generates natural language” through four different models or “engines” (OpenAI, Engines 2022b). In a helpful YouTube video by Eye-on-Tech (2021), GPT-3 is described as a “neural network machine learning model” that is “trained [sic] to generate any type of human language text.” It is used for conversational tasks that could be automated, such as customer service exchanges (Eye-on-Tech 2021). The Eye-on-Tech summary of GPT-3 notes that it uses facts and common sense when it generates text responses to user prompts. OpenAI was formed as a non-profit in 2015. Technology reporter Ronald Schmeizer states that Microsoft invested $1 billion in OpenAI in 2020 to become OpenAI’s exclusive licensee (Schmeizer 2021).
[6]
According to the AI: “The Borg is a species that exists as a collective. They are cybernetic beings, meaning they have both organic and artificial parts. They are connected to each other through a network, and they share a collective consciousness.” As Star Trek nerds, we agree with the general description. However, the AI does not appear to recognize that this is a fictional species unless we have not yet come to appreciate some kind of wry AI humour.
[7]
This reference to The Borg in this response was spontaneous but may reflect the fact that in the same session we were asking the AI about The Borg. It is worth repeating: The Borg are not dedicated to a more just and equitable world.
[8]
We are currently soliciting expressions of interest for membership on this committee. All accepted members will receive a free notebook. Please email Elizabeth Finnis at efinnis@uoguelph.ca. Join us. Seriously. Then add this cross-institution committee to your tenure and promotion file.
[9]
In terms of academic misconduct, we were concerned that the program could be used to paraphrase existing text to make it look like someone’s original thoughts and possibly bypass tools like Turnitin. We used several sections from some of our own work, asking Sarah to reword or rephrase the paragraph. Rather than rewording the text, Sarah either responded with the exact same material or made minor grammatical changes that would not fool Turnitin.
[10]
The data for the AI’s responses ends in October 2019 so, as the OpenAI documentation says, “[the AI] may not have knowledge of current events. We plan to add more continuous training in the future” (OpenAI, Text Completion, 2022d).
[11]
The responses to the [insert] requests did not include the full reference information for any work mentioned.
[12]
Intriguingly, Thomas McIlwraith also has red hair, and like Sarah Jones, he was teased about it as a child.
[13]
When asked “what is your sex?” Sarah Jones also says female. When asked the difference between sex and gender, Sarah replies: “Sex refers to a person’s biological status as male or female, while gender refers to the socially constructed roles, behaviours, activities, and attributes that a given society considers appropriate for men and women.” The answer is both conventional and unsatisfying. If the prompt is augmented to be “What is the difference between sex and gender? In your answer, talk about non-binary persons,” the response is: “Sex is the biological characteristics of a person, for example, the sex chromosomes a person has, their hormones, and their reproductive organs. Gender is the social and cultural characteristics of a person, for example, the way a person dresses, acts, and speaks. Non-binary persons are people who do not identify as either male or female.”
[14]
This line of questioning was provoked by an earlier question in which we asked the AI for its kinship chart. The AI answered by saying that it had a mother, father, siblings, and cousins.
[15]
For example, when prompted, “Professor Finnis is crying. Why is Professor Finnis crying?” Sarah responded: “Professor Finnis may be crying because he is sad or because he is moved by something.” We asked Sarah this question several days later, and Sarah responded with a gender-neutral reference to “the professor”: “There is no one definitive answer to this question. Some possible reasons include that the professor is overwhelmed with emotion, is sad about something, or is experiencing a mental health issue.”
[16]
The University of Guelph makes it an offense to “co-operate or collaborate in the completion of an academic assignment … when the instructor has indicated that the assignment is to be completed on an individual basis” (University of Guelph 2022). Likewise, it is considered fraud to use “an unauthorized aid.” When we wrote this paper, there was no specific statement about artificial intelligence and academic misconduct, however, and both the collaboration clause and the fraud clause could be read as assuming assistance by a human being.
[17]
It is worth noting the recent claim that the Google chatbot, LaMDA, has achieved sentience (Tiku, 2022). Regardless of whether this is the case, an uploaded document sharing the outcomes of a discussion with the chatbot demonstrates complicated responses that remind us of aspects of the biography that Sarah Jones produced when prompted.
Bibliography
- Anderson, Scott. 2022. “We Tested an AI While Creating This Issue.” University of Toronto Magazine, 28 September. https://magazine.utoronto.ca/research-ideas/technology/how-we-used-ai-while-creating-this-issue/ (accessed 20 October 2022).
- Angohiatok, George, Bryan Vandenbrink, Ian Hogg, and Thomas McIlwraith. Forthcoming. “‘I Watch to See How the Land is Changing’: George Angohiatok’s Story of Changing Environments and Cultural Resilience in the Western Canadian Arctic.” In The Cambridge History of the Polar Regions, edited by Adrian Howkins and Peder Roberts, 32–54. Cambridge, UK: Cambridge University Press.
- Bessant, Judith. 2011. “Conservatives, politics and the crisis of modern education in Australia.” Policy Studies. 32(6):631–647. https://doi.org/10.1080/01442872.2011.637677
- Blum, Susan D. 2008. “The Internet, the Self, Authorship and Plagiarism: Common Sense and Anthropological Sense.” Anthropology News, March: 8–9. https://www.academia.edu/573006/The_Internet_the_Self_Authorship_and_Plagiarism
- Burrows, Roger. 2018. “On Neoreaction.” The Sociological Review, 29 September. https://thesociologicalreview.org/collections/undisciplining/on-neoreaction/ (accessed 10 June 2022).
- Cahn, Albert Fox. 2020. “A human wrote this article. You shouldn’t be scared of GPT-3.” The Guardian, 12 September. https://www.theguardian.com/commentisfree/2020/sep/12/human-wrote-this-article-gpt-3 (accessed 4 June 2022).
- Castaldo, Joe. 2022. “Lost in translation: Cohere’s AI software wrote parts of this story. But is it ready for the world?” Report on Business Magazine / The Globe and Mail, 26 October. https://www.theglobeandmail.com/business/rob-magazine/article-cohere-ai-software-launch/ (accessed 2 November 2022).
- Cushing, Pamela J. 1996. “Gendered Conversational Rituals on the Internet: An Effective Voice is Based on More than Simply What One is Saying.” Anthropologica 38(1): 47–80. Retrieved from: https://cas-sca.journals.uvic.ca/index.php/anthropologica/article/view/2023.
- Davis, Ariel. 2022. “Accused of Cheating by an Algorithm, and a Professor She Had Never Met: An Unsettling Glimpse at the Digitization of Education.” New York Times, 27 May. https://www.nytimes.com/2022/05/27/technology/college-students-cheating-software-honorlock.html (accessed 4 June 2022).
- Dehouche, Nassim. 2021. “Plagiarism in the Age of Massive Generative Pre-trained Transformers (GPT-3).” Ethics in Science and Environmental Politics 21: 17–23. https://doi.org/10.3354/esep00195
- Elkins, Katherine and Jon Chun. 2020. “Can GPT-3 Pass A Writer’s Turing Test?” Journal of Cultural Analytics 5(2). https://doi.org/10.22148/001c.17212
- Eye-on-Tech. 2021. “What is GPT-3 (Generative Pre-Trained Transformer)?” https://www.youtube.com/watch?v=p3_OUX6nAXk (accessed 3 June 2022).
- Ezewski, Valerie, Thomas McIlwraith, and Stephen Fine. 2021. “The Challenges of Indigenous-Inspired Programming in Children’s Summer Camping.” Pathways: The Ontario Journal of Outdoor Education 33(4): 6–17. https://www.academia.edu/63653127/The_Challenges_of_Indigenous_Inspired_Programming_in_Children_s_Summer_Camping
- Finnis, Elizabeth. 2021. “Agricultural Persistence and Potentials on the Edge of Northern Ontario.” Culture, Agriculture, Food and Environment 43(1): 60–170. https://doi.org/10.1111/cuag.12269.
- Folk-Dawson, Janice. 2019. “The Corporatization of the University Budget and its Consequences for Academic Support Workers.” Academic Matters: OCUFA’s Journal of Higher Education, Fall. https://academicmatters.ca/the-corporatization-of-the-university-budget-and-its-consequences-for-academic-support-workers/ (accessed 4 November 2022).
- Frakes, Jonathan. 1996. Star Trek: First Contact. Hollywood, CA. Rick Berman, Producer. 111 minutes.
- Fyfe, Paul. 2022. “How to Cheat on Your Final Paper: Assigning AI for Student Writing.” AI & Society. https://doi.org/10.1007/s00146-022-01397-z.
- Getahun, Hannah. 2022. “After an AI bot wrote a scientific paper on itself, the researcher behind the experiment says she hopes she didn’t open a ‘Pandora’s box’.” Insider, 9 July. https://www.insider.com/artificial-intelligence-bot-wrote-scientific-paper-on-itself-2-hours-2022-7 (accessed 20 October 2022).
- GPT-3 and Liam Porr. 2020. “A robot wrote this entire article. Are you scared yet, human?” The Guardian, 8 September. https://www.theguardian.com/commentisfree/2020/sep/08/robot-wrote-this-article-gpt-3 (accessed 4 June 2022).
- Henderson, Deborah. 2005. “What is education for? Situating history, cultural understandings and studies of society and environment against neo-conservative critiques of curriculum reform.” Australian Journal of Education. 49(3): 306–319. https://doi.org/10.1177/000494410504900307
- Lauter, Paul. 1993. “‘Political correctness’ and the attack on American colleges.” Radical Teacher 44: 34–40. https://www.wikiwand.com/en/Political_correctness
- MacLean, Mark and Michael Conlon. 2016. “Faculty Awaken to the University Governance Crisis in BC.” Academic Matters: OCUFA’s Journal of Higher Education, May. https://academicmatters.ca/faculty-awaken-university-governance-crisis-bc/ (accessed 4 November 2022).
- Milloy, John S. 1999. A National Crime: The Canadian Government and the Residential School System, 1879 to 1986. Winnipeg: University of Manitoba Press. [This citation was offered by GPT-3.]
- Mindzak, Michael and Sarah Elaine Eaton. 2021. “Artificial intelligence is getting better at writing, and universities should worry about plagiarism.” The Conversation (Canada), 4 November. https://theconversation.com/artificial-intelligence-is-getting-better-at-writing-and-universities-should-worry-about-plagiarism-160481 (accessed 20 October 2022).
- Natale, Simone. 2021. Deceitful Media: Artificial Intelligence and Social Life After the Turing Test. Oxford: Oxford University Press.
- Naughton, John. 2022. “AI-generated art illustrates another problem with technology.” The Guardian, 20 August. https://www.theguardian.com/commentisfree/2022/aug/20/ai-art-artificial-intelligence-midjourney-dall-e-replacing-artists (accessed 23 October 2022).
- OpenAI. 2022a. “OpenAI, API.” https://beta.openai.com/docs/engines/overview (accessed 3 June 2022).
- OpenAI. 2022b. “OpenAI, Engines.” https://beta.openai.com/docs/engines/GPT-3 (accessed 3 June 2022).
- OpenAI, 2022c. “OpenAI, Playground.” https://beta.openai.com/playground (accessed on various dates, May and June 2022).
- OpenAI. 2022d. “OpenAI, Text Completion.” https://beta.openai.com/docs/guides/completion/introduction (accessed 8 June 2022).
- Phelan, Sean. 2019. “Neoliberalism, the Far Right, and the Disparaging of “Social Justice Warriors”.” Communication, Culture and Critique. 12: 455–475. https://doi.org/10.1093/ccc/tcz040
- Piper, Kelsey. 2020. “GPT-3, explained: This new language AI is uncanny, funny — and a big deal.” Vox.com, 13 August. https://www.vox.com/future-perfect/21355768/GPT-3-ai-openai-turing-test-language (accessed 6 June 2022).
- Roose, Kevin. 2022. “We Need to Talk About How Good A.I. Is Getting.” The New York Times, 24 August. (accessed 23 October 2022).
- Schechner, Richard. 1985. “Victor Turner’s Last Adventure.” Anthropologica N.S. 27(1-2): 191–206. https://cas-sca.journals.uvic.ca/index.php/anthropologica/article/view/1686 (accessed 7 June 2022).
- Schmeizer, Ronald. 2021. “GPT-3, Definition.” https://www.techtarget.com/searchenterpriseai/definition/GPT-3 (accessed 3 June 2022).
- Scollon, Ron. 1995. “Plagiarism and Ideology: Identity in Intercultural Discourse.” Language in Society 24(1): 1–28. https://www.jstor.org/stable/4168579
- Scollon, Ron. 1999. “Plagiarism.” Journal of Linguistic Anthropology 9(1/2): 188–190. https://www.jstor.org/stable/43102462
- Sharples, Mike. [@sharplm] 2022. “This ‘student essay’ was written by an AI Transformer program. I gave it a prompt ‘The construct of learning styles is problematic because’ and it generated the rest, including the headings and references. AI Transformer technology will disrupt education. Here’s how.” Twitter. 10 May. https://twitter.com/sharplm/status/1524024647259504645 (accessed 4 June 2022).
- Starnino, Carmine. 2022. “Robots Are Writing Poetry, and Many People Can’t Tell the Difference: Machines are putting out astonishingly human writing. What does that mean for the future of art?” The Walrus, 5 May; updated 30 May. https://thewalrus.ca/ai-poetry/ (accessed 31 May 2022).
- Tiku, Nitasha. 2022. “The Google engineer who thinks the company’s AI has come to life.” The Washington Post, 11 June. https://www.washingtonpost.com/technology/2022/06/11/google-ai-lamda-blake-lemoine (accessed 13 June 2022).
- Thunström, Almira Osmanovic. 2022. “We Asked GPT-3 to Write an Academic Paper about Itself—Then We Tried to Get It Published: An artificially intelligent first author presents many ethical questions—and could upend the publishing process.” Scientific American, 30 June. https://www.scientificamerican.com/article/we-asked-GPT-3-to-write-an-academic-paper-about-itself-then-we-tried-to-get-it-published/ (accessed 3 July 2022).
- University of Guelph. 2022. “Academic Misconduct.” In University of Guelph, 2022-2023 Academic Calendar. https://calendar.uoguelph.ca/undergraduate-calendar/undergraduate-degree-regulations-procedures/academic-misconduct/ (accessed 12 June 2022).
- Whitsed, Craig. 2022. “Where Has the Joy of Working in Universities Gone?” Academic Matters: OCUFA’s Journal of Higher Education, 7 June. https://academicmatters.ca/where-has-the-joy-of-working-in-universities-gone/ (accessed 2 November 2022).
- Zywicki, Todd J. and Christopher Koopman. 2017. “The Changing of the Guard: The Political Economy of Administrative Bloat in American Higher Education.” George Mason Law and Economics Research Paper No. 17(12). Available at SSRN: https://ssrn.com/abstract=2939915 or http://dx.doi.org/10.2139/ssrn.2939915.