Wednesday, February 12, 2025

Artificial Intelligence in the World of Publishing

Introduction

In the past several years, generative artificial intelligence programs have made remarkable gains in usability, accuracy, and quality. As programs such as ChatGPT and Microsoft Copilot enter common use, many writers now rely on AI to both create and edit their work. As writing professionals witness this progression, debate over quality control and confidentiality has intensified, and publishers have begun to integrate AI into their policies and processes.

Large language models (LLMs) are “deep learning models trained to understand and generate natural language” (Shen et al.). Generative AI programs draw on LLMs, trained on public data, to answer users’ questions in a simple chat format. What made the platform so accessible at ChatGPT’s 2022 premiere was that users could communicate with the AI in any of more than 40 natural languages (English, Spanish, French, etc.) rather than in programming code. Writers and editors can now use generative AI programs to create, edit, and analyze texts.
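
To illustrate what this chat format means in practice, the sketch below sends a plain-English editing request to an LLM through a programmatic interface. It is a minimal example, assuming the `openai` Python package (v1.x) and an `OPENAI_API_KEY` in the environment; the model name is illustrative and not drawn from the sources above.

```python
# Minimal sketch: asking an LLM for editing help in plain English.
# Assumes the openai Python package (v1.x) and OPENAI_API_KEY set in the
# environment; "gpt-4o-mini" is an illustrative model name.
from openai import OpenAI

client = OpenAI()  # reads the API key from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a careful copyeditor."},
        {"role": "user", "content": "Suggest a tighter version of this sentence: "
                                    "'The book that I am writing is one that is about the sea.'"},
    ],
)

# The model's reply comes back as ordinary natural-language text.
print(response.choices[0].message.content)
```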

Recent developments in generative artificial intelligence affect the publishing industry through AI's integration into writing, editing, and revising. This research examines how publishers, in both book and periodical publishing and scientific and professional research publishing, have incorporated AI into their policies, and what those policies imply.

Literature Review

Generative artificial intelligence, being a relatively new and controversial topic, has attracted little industry-specific research, including on publishing. Much analysis of AI's interaction with human writers is already dated, and, unsurprisingly given the subject matter, much of the recent content on the topic is itself AI-generated. For example, the most recent app analysis of Wordcraft, an AI writing-assistant program targeted at authors, was written entirely by Together AI. This research therefore focuses on the limitations and specifications publishers have placed on AI in their work, based on existing policies and human perspectives.

In 2022, about a month before the premiere of ChatGPT, Education and Information Technologies published “A Systematic Review of AI Technologies Used for Story Writing,” which synthesized existing story-writing AI programs, and the research on them, by type, approach, and role (Fang et al.). Existing research revolved primarily around pedagogical uses, since AI aids new writers with “various challenges when writing their digital stories,” such as creative strain, grammar questions, and technical hurdles. The review also found that “AI played the role of story co-creator/collaborator” when used by writers, much as career editors have traditionally been utilized (Fang et al.). The review's authors identified two primary categories of AI use in story writing: planning-based models and machine-learning models. Writers can either use AI to outline their writing through goal-directed approaches or use machine learning to complete or generate new content based on what is already written; the sketch below contrasts the two workflows.
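
To make the two categories concrete, here is a minimal sketch of each workflow. The `generate` helper is a placeholder for any text-generation call (such as the chat request shown earlier); none of these functions come from Fang et al., who reviewed programs rather than publishing code.

```python
# Sketch of the two story-writing paradigms identified by Fang et al.
# `generate` is a placeholder for any text-generation model call.

def generate(prompt: str) -> str:
    """Stand-in for a real model call (see the chat sketch above)."""
    return f"<model output for: {prompt[:40]}...>"

def planning_based(premise: str, goals: list[str]) -> str:
    """Planning-based model: draft a goal-directed outline first, then expand it."""
    outline = generate(
        f"Outline a story from this premise: {premise}. "
        f"The plot should reach these goals in order: {', '.join(goals)}."
    )
    return generate(f"Write the story scene by scene from this outline:\n{outline}")

def completion_based(draft_so_far: str) -> str:
    """Machine-learning model: continue or extend whatever is already written."""
    return generate(f"Continue this story in the same voice:\n{draft_so_far}")
```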

The Chartered Institute of Editing and Proofreading (CIEP) published a blog post interviewing multiple editors about what the future of AI might mean for their industry specifically. All four editors interviewed agreed that they “don’t believe AI will ever replace human editors” (“What might the Future”). There is debate in the editing community, however, over how much these programs will influence the industry. Some believe AI is too underdeveloped to ever aid editing in any major capacity beyond being a potential tool for “unconfident writers,” while others are optimistic that it will speed up proofreading, copyediting, and copywriting (“What might the Future”). On the editing side, the 2022 review found that many story generators were “not error-less and may make grammatical mistakes and incoherent texts,” a limitation that requires human oversight of generated work (Fang et al.). That CIEP editors are largely unworried about AI replacing their positions suggests a general confidence in the publishing community that humans will remain at the center of good writing and editing despite AI's improvements.

A larger concern that this research explores is the movement for “better protections for authors,” so that their work cannot be “exploited in datasets to train” LLMs or accessed before publication in a breach of confidentiality (“What might the Future”). ChatGPT is an especially popular and controversial AI system; it trains its responses “on a diverse set of internet texts (approximately 570 GB), including books, articles, and websites,” in order to give a human-like range of responses in its chat-structured feedback (Shen et al.). Because ChatGPT draws on such a wide range of sources, however, it can provide incorrect or even incoherent information; this “hallucination effect is a common issue among many natural language processing models” (Shen et al.). Additionally, ChatGPT uses user data to train its LLM and give personalized feedback, meaning that submitted information, like an unpublished manuscript, is archived and used in ChatGPT’s responses going forward.

Furthermore, scientific researchers voice concern that ChatGPT’s “outright fabrication of scientific work” and citations will produce AI-generated manuscripts containing false facts (Shen et al.). Researchers warn editors and peer reviewers to “be aware that the submission of an entirely fabricated article, substantiated by synthetic data, images, and incorrect or misattributed citation, is inevitable” (Shen et al.). Publishers of non-fiction literature and research are aware that AI is not a reliable source for fact-checking or for providing factual information and data.
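
Fabricated citations are one of the few AI failure modes that editors can screen for mechanically. As a rough illustration (my own sketch, not a workflow from Shen et al.), the snippet below checks whether each DOI in a reference list resolves against the public Crossref REST API. It assumes the `requests` package and network access, and a resolving DOI still does not prove the citation supports the claim attached to it.

```python
# Sketch: flag DOIs that do not resolve in Crossref as possible fabrications.
# Assumes the `requests` package and network access.
import requests

def doi_exists(doi: str) -> bool:
    """Return True if Crossref has a record for this DOI."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    return resp.status_code == 200

# The first DOI is the Shen et al. article cited in this paper; the second is invented.
for doi in ["10.1148/radiol.230163", "10.0000/not.a.real.doi"]:
    status = "found" if doi_exists(doi) else "no record (possible fabrication)"
    print(f"{doi}: {status}")
```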

A common alternative to ChatGPT, Microsoft Copilot, is a program that protects users’ security and scrubs its training data of nefarious and misleading content (“Your Business Outcomes”). Copilot performs a similar function to ChatGPT but is more business-oriented; accordingly, it prioritizes responsible AI and links sources to its responses. For the purposes of this research, Copilot represents developing technologies that address author privacy concerns and factual transparency.

Results and Analysis

The best way to analyze how publishers view and use AI is to examine the policies and statements currently accessible to the public. This information clarifies the extent to which publishers have taken AI into account with regard to their authors and editors. To obtain a broader view of current AI use in the publishing industry, I pulled editorial guidelines and statements on AI from 20 major publishing houses: 10 book and periodical publishers and 10 scientific and professional research publishers.

Book and Periodical Publishers

The first table covers 10 book and periodical publishers. These houses focus on longer-form literary fiction and non-fiction produced on a regular schedule, so data from these publishers reflects more creative, long-form writers.

| Publisher | Where They're Based | When the Policy Was Produced | Policy |
| --- | --- | --- | --- |
| Allen and Unwin | Australia | N/A | No statement |
| Chronicle Books | United States | N/A | No statement |
| Grove Atlantic | United States | N/A | No statement |
| Hachette Book Group | United Kingdom | April 2023 | Open to “responsible experimentation” for operational uses, but relies on human creativity |
| HarperCollins | United States | November 2024 | First major publisher to strike a deal selling authors' books to train an LLM, possibly Microsoft's |
| Pan Macmillan | United Kingdom | April 2023 | Allows ethical use, but prioritizes creators and transparency; explicitly protects writers' data |
| Penguin Random House | United Kingdom | August 2024 | Uses AI, but relies on human creativity; explicitly protects writers' data |
| Scholastic | United States | N/A | No statement |
| Simon & Schuster | United States | N/A | No statement |
| Wiley | United States | September 2023 | Requires human oversight of AI use and full transparency; AI trade deal announced earlier this year |


Of the 10 book and periodical publishers analyzed, 5 had yet to produce any public statement or policy change about AI use as of November 2024. Four of the remaining publishers provided mild statements and policy additions that said little about AI's limits, especially concerning editing. The stated policies and beliefs were generally geared towards writers and underlined by a fear of AI obtaining a greater presence in the writing than the actual author. For example, Hachette states that they “encourage responsible experimentation with AI,” but also emphasizes that the book publishing “industry relies on the creative talent of humans” (“HUK – Our Position on AI”). Publishers also require full transparency: if an author uses AI in the writing, revision, or editing process, they should disclose it to the publisher and to readers.

Penguin Random House was the first publisher to add AI policies to the copyright statements printed in its books, in 2024. These protect writers' material from data collection by generative AI systems, so that published works cannot be used to train LLMs. Editors' main concern when using AI to assist in editing unpublished works is the potential breach of an author's confidentiality; for that reason, publishers do not allow editors to upload their clients' manuscripts into AI software. However, according to Michael Yukna, a Global Blackbelt in modern work at Microsoft whom I interviewed, in systems like Copilot “confidentiality isn’t necessarily an issue. Copilot isn’t surfacing any data or information that a customer couldn’t have found themselves” on the internet. Copilot does not use customer data and information to train its LLM; according to Yukna, Copilot's training happens on a personal basis within an individual chat, to personalize feedback.
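
One practical middle ground, offered here as a hypothetical illustration rather than a policy from any publisher in this study, is for an editor to strip identifying details from an excerpt before submitting it to a general-purpose AI tool, so the tool never sees who wrote it.

```python
# Hypothetical illustration (not from any publisher policy cited above): an
# editor strips identifying details from an excerpt before pasting it into a
# general-purpose AI tool, limiting what the tool ever sees.
import re

def redact(text: str, names: list[str]) -> str:
    """Replace author-identifying strings with neutral placeholders."""
    for i, name in enumerate(names, start=1):
        text = re.sub(re.escape(name), f"[PERSON_{i}]", text, flags=re.IGNORECASE)
    # Crude email scrub; a real workflow would also cover addresses, titles, etc.
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)

excerpt = "Contact Jane Doe at jane@example.com about Chapter 3."
print(redact(excerpt, ["Jane Doe"]))
# -> Contact [PERSON_1] at [EMAIL] about Chapter 3.
```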

Conversely, HarperCollins and Wiley have both formed quiet AI licensing deals in which they sell rights to authorial content for training LLMs. According to leaked information from the HarperCollins deal, the agreement was reached only after the unconfirmed company “agreed to several protective terms, including a commitment to limit verbatim reproduction and an agreement not to train on pirated content” (Albanese). This exemplifies Michael Yukna’s perspective: at Microsoft, they “follow and adhere to very strict guidelines in terms of responsible AI” (“Your Business Outcomes”). Those guidelines include fairness, reliability and safety, privacy and security, and inclusiveness. Potential partnerships to either use or sell content to AI companies must be filtered through these kinds of protective measures. AI engineers like Yukna are optimistic that AI is a safe tool for editors so long as it is used responsibly.

Although several of the analyzed publishers have made no statements or policy changes regarding AI, that does not mean they are not already using it. Sara Kocek, founder and editor at YellowBird Editors and a published author, conveyed to me in an interview that, while YellowBird is still crafting its AI policies, she already uses the technology in her personal writing. Kocek explained that “a lot of editors in [her] network” fear “that the work [they] put into AI will somehow be used to train other AIs, and that would be a breach of privacy.” This is a valid concern, especially since the most common AI, ChatGPT, offers no protections for submitted content. Using an unvetted AI, even for copyediting and copywriting, and even for tasks like drafting emails or creating summaries, is dangerous without strict guidelines. Kocek agreed that “the question of whether it’s ok to [use AI on authorial clientele’s work] has not at large been answered in our industry yet, and until there’s a standard set about that,” she and others in the publishing industry do not, and should not, feel comfortable operating with AI.

As the largest publishers continue to produce new policies and licensing agreements with AI companies, they set a precedent for how this technology will be implemented across the industry. In book and periodical publishing, it is critical that boundaries be set, both on AI's function and on which programs are used, to prevent disruption of the system. Unregulated, AI in book and periodical publishing could quickly become a substitute for good authorship and detailed editing rather than a tool to improve writing. While deals like those HarperCollins and Wiley made are potentially dangerous for the integrity of author protection, they open the door to a clearer relationship with AI.

Scientific and Professional Research Publishing

This table examines 10 major scientific and professional research publishers. These differ considerably from book and periodical publishers, as they produce researched, factual data to advance professional fields of study. They focus more on proofreading and fact-checking, often considered highly objective elements of writing.

| Publisher | Where They're Based | When the Policy Was Produced | Policy |
| --- | --- | --- | --- |
| Cambridge University Press | United Kingdom | 2023 | Use AI, but be transparent and do not violate plagiarism policies |
| Clarivate | United Kingdom | April 2023 | Use with accountability and transparency; using data to train AI requires a requested license |
| Elsevier | The Netherlands | 2023 | Use with disclosure during the writing process; no co-authorship; editors may not use AI, to protect confidentiality and data |
| Frontiers | Switzerland | 2023 | Author accountability and transparency when using AI; no co-authorship |
| ORO Editions | United States | N/A | No statement |
| Public Library of Science | United States | 2023 | Author accountability and transparency |
| RELX Group | United Kingdom | August 2024 | Human oversight and accountability; strong data governance |
| Sage Publications | United States | 2023 | Welcomes AI use with full disclosure; editor use presents copyright issues |
| Springer Nature | United States | 2023 | Human accountability for generative work; explicitly protects data by restricting peer reviewers' AI use |
| Taylor and Francis Group | United Kingdom | March 2023 | Welcomes AI use with author accountability and full disclosure; editors' use of AI is a confidentiality infringement; AI trade deal announced earlier this year |
| World Scientific Publishing | Singapore | 2023 | Author accountability and transparency; editors decide if AI is permissible |

Of the 10 scientific and professional research publishers analyzed, only ORO Editions had produced no public statement or policy on AI as of November 2024; the rest issued guidance in 2023 or 2024. The pattern here is markedly more uniform than among the book and periodical publishers. Nearly every policy centers on author accountability and full disclosure of AI use in the writing process, and several, including Elsevier and Frontiers, explicitly bar AI from co-authorship. These publishers are also more direct about editorial use: Elsevier and the Taylor and Francis Group treat an editor's use of AI on a submitted manuscript as a breach of confidentiality, Springer Nature restricts peer reviewers' AI use to protect manuscript data, and World Scientific leaves the decision to its editors.

This emphasis is consistent with the warning from Shen et al. that fabricated, AI-generated submissions are “inevitable”: publishers of factual research must guard both the integrity of what they print and the confidentiality of what they review. Like Wiley among the trade publishers, Taylor and Francis has also announced an AI trade deal this year, suggesting that licensing agreements are emerging in research publishing just as they are in book and periodical publishing.

Conclusion

Based on my findings from analyzing publishers’ policies, Microsoft’s precedents, and individual editors’ habits, I conclude that publishers view AI as a tool best used to assist inexperienced and short-form writers. Inexperienced writers, such as researchers with expertise in their field but little writing experience, stand to gain the most from generative AI as a tool to improve the quality of their writing and communicate their findings more effectively. Meanwhile, publishing jobs associated with short-form writing, such as copywriting and copyediting, are at higher risk of replacement by generative AI because editors can use it to complete those tasks more efficiently.

Book and periodical publishers lack specificity in their AI policies. Several have simply stated opinions or stances on AI in their production but have not set out detailed criteria for what writers and editors can and cannot do. Any industry open to implementing AI should be clear about the limits of its use, to prevent leaked data and compromised writer integrity; that clarity should cover which AI programs to use as well as how and when to use them. Across the field, publishing professionals are optimistic about implementing AI in their work, but they are concerned about confidentiality, copyright, and human accountability. Programs like Copilot that prioritize responsible AI can address these concerns. Editors, as assistants to writers, are not in danger of their jobs becoming obsolete, because publishing experts continue to believe that “ultimately humans will always prefer to work with other humans” over AI systems (“What might the Future”).

Policies that do not specify what approved AI use looks like, and simply recommend some level of experimentation, ignore how easily such use slides into abuse, with authors no longer taking primary responsibility for their work. The line between using AI as a tool for writing and using it as a substitute for undesirable tasks is blurry, and the best way to clarify that boundary is with precise guidelines. As the technology advances and becomes more essential, boundaries are a necessity to protect the integrity of the publishing world. These boundaries should specify both which AI programs to use and in what instances writers and editors should use them.

As Kocek explained, “What makes writing good is that it reminds us what it is to be human, and right now, AI can’t do that.” Still, it would be a mistake to disregard such valuable technology out of fear of its complexity. There is a path to managing that fear, the same way writers moved from typewriters to computers. The specifics of that path are yet to be established, and the effects of these major changes on the industry will likely not be visible for years to come. Both developments, the specification of policies and the changes in written products, should be monitored and studied by professionals in the industry; based on the variation found in this research, the publishing field would benefit from coming together to standardize its AI policies.

Works Cited

“AI Policy.” Taylor & Francis, 20 Aug. 2024, taylorandfrancis.com/our-policies/ai-policy/. 

Albanese, Andrew, et al. “Agents, Authors Question HarperCollins AI Deal.” PublishersWeekly.com, PWxyz LLC, 19 Nov. 2024, www.publishersweekly.com/pw/by-topic/industry-news/publisher-news/article/96533-agents-authors-question-harpercollins-ai-deal.html. 

“Artificial Intelligence.” Clarivate, 1 Nov. 2024, clarivate.com/ai/. 

“Artificial Intelligence (AI).” Nature News, Nature Publishing Group, www.nature.com/nature-portfolio/editorial-policies/ai#ai-use-by-peer-reviewers. Accessed 10 Nov. 2024. 

Fang, Xiaoxuan, et al. “A Systematic Review of Artificial Intelligence Technologies Used for Story Writing - Education and Information Technologies.” SpringerLink, Springer US, 5 Apr. 2023, https://link.springer.com/article/10.1007/s10639-023-11741-5.

“Frontiers in Artificial Intelligence: About.” Frontiers, www.frontiersin.org/journals/artificial-intelligence/for-authors/author-guidelines. Accessed 10 Nov. 2024. 

“Generative Artificial Intelligence: A Balancing Act for Publishers.” Wiley Online Library, www.wiley.com/en-us/network/trending-stories/generative-artificial-intelligence-a-balancing-act-for-publishers. Accessed 10 Nov. 2024. 

“How We Are Thinking about AI at Pan Macmillan.” Pan Macmillan, Apr. 2024, www.panmacmillan.com/ai-at-pan-macmillan. 

“HUK – Our Position on AI.” Hachette UK, 7 Mar. 2024, www.hachette.co.uk/landing-page/huk-our-position-on-ai/. 

“International Journal on Artificial Intelligence Tools.” World Scientific, www.worldscientific.com/worldscinet/ijait. Accessed 10 Nov. 2024. 

“Penguin’s Approach to Generative Artificial Intelligence.” Penguin Books UK, 8 Aug. 2024, www.penguin.co.uk/articles/2024/08/penguins-approach-to-generative-artificial-intelligence. 

“Publishing Ethics Authorship and Contributorship Journals.” Cambridge Core, www.cambridge.org/core/services/publishing-ethics/authorship-and-contributorship-journals. Accessed 10 Nov. 2024. 

“Research Integrity and Publication Ethics.” PLOS, 2 Apr. 2024, plos.org/research-integrity-and-ethics/. 

“Responsible Artificial Intelligence Principles at RELX.” RELX, www.relx.com/~/media/Files/R/RELX-Group/documents/responsibility/download-center/relx-responsible-ai-principles.pdf. Accessed 10 Nov. 2024. 

Shen, Yiqiu, et al. “ChatGPT and Other Large Language Models Are Double-Edged Swords.” Radiological Society of North America, 26 Jan. 2023, pubs.rsna.org/doi/full/10.1148/radiol.230163. 

“The Use of Generative AI and AI-Assisted Technologies in the Editing Process for Elsevier.” Elsevier, www.elsevier.com/about/policies-and-standards/publishing-ethics-books/the-use-of-generative-ai-and-ai-assisted-technologies-in-the-editing-process. Accessed 10 Nov. 2024. 

Together AI. “Word Craft.” Lab Lab AI, Wordcraft, 20 Oct. 2024, lablab.ai/event/edge-runners-3-point-2/wordcraft/word-craft. 

“Using AI in Peer Review and Publishing.” SAGE Publications Inc, 2 Aug. 2023, us.sagepub.com/en-us/nam/using-ai-in-peer-review-and-publishing#pt3. 

“What Might the Future of AI Mean for Editors and Proofreaders?” CIEP Blog, 20 Oct. 2023, blog.ciep.uk/future-of-ai-for-editors/. 

“Your Business Outcomes. Our Services. #bettertogether.” Microsoft Adoption, 3 Nov. 2024, adoption.microsoft.com/. 


