Trends and Perspectives on ChatGPT and LLMs in Scientific Publication

2023-02-05

Our goal with this text is to outline the technological evolution, main capabilities, and implications of ChatGPT for scientific publication. It has been clear for several years that artificial intelligence (AI) is gaining the ability to generate fluent language, churning out sentences that are increasingly hard to distinguish from text written by people. Leading journals such as Nature have reported that some scientists are already using chatbots as research assistants: to help organize their thinking, generate feedback on their work, assist with writing code, and summarize research literature. ChatGPT generates convincing sentences by mimicking the statistical patterns of language in a huge database of text collated from the Internet. The bot is already disrupting sectors including academia: in particular, it is raising questions about the future of university essays and research production.

ChatGPT is a large language model (LLM), a machine-learning system that autonomously learns from data and can produce sophisticated and seemingly intelligent writing after training on a massive data set of text. It is the latest in a series of such models released by OpenAI, an AI company in San Francisco, California, and by other firms. ChatGPT has caused excitement and controversy because it is one of the first models that can convincingly converse with its users in English and other languages on a wide range of topics. It is free, easy to use, and continues to learn. ChatGPT, however, isn't the largest model in terms of the number of parameters. Other LLMs, such as PaLM-Coder from Google and MT-NLG from NVIDIA, should also be observed in this landscape.

Technically, a large language model (LLM) is a language model: a statistical representation of a language that tells us the likelihood that a given sequence (a word, phrase, or sentence) occurs in that language. Because of this capacity, language models can be used to make predictions about how a sentence might continue and, consequently, to generate text. Sophisticated language models, often based on neural networks and large text corpora, are very powerful because they can be used in a wide range of applications, such as translation or text recognition.
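
To make the idea of a "statistical representation of a language" concrete, the following is a minimal sketch of a bigram model in Python. It is a toy illustration under stated assumptions, not how ChatGPT works: real LLMs use neural networks trained on very large corpora, whereas this example simply counts which word follows which in three invented sentences. All function names and the corpus are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus, invented for this illustration only.
toy_corpus = [
    "the model generates text",
    "the model predicts words",
    "the model generates sentences",
]

def tokenize(sentence):
    """Split a sentence into lowercase words with start/end markers."""
    return ["<s>"] + sentence.lower().split() + ["</s>"]

def train_bigram_model(corpus):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = tokenize(sentence)
        for prev, curr in zip(tokens, tokens[1:]):
            counts[prev][curr] += 1
    return counts

def next_word_probability(counts, prev, curr):
    """Estimate P(curr | prev) as a relative frequency."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return followers[curr] / total if total else 0.0

def sequence_probability(counts, sentence):
    """Likelihood of a whole sentence: product of its bigram probabilities."""
    tokens = tokenize(sentence)
    prob = 1.0
    for prev, curr in zip(tokens, tokens[1:]):
        prob *= next_word_probability(counts, prev, curr)
    return prob

def most_likely_next(counts, prev):
    """Predict the most frequent continuation of a word, or None if unseen."""
    followers = counts.get(prev)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

bigram_counts = train_bigram_model(toy_corpus)
print(sequence_probability(bigram_counts, "the model generates text"))
print(most_likely_next(bigram_counts, "model"))
```

The same two operations, scoring how likely a sequence is and predicting how it might continue, are what allow a language model to generate text; an LLM differs in scale and in using a learned neural network rather than raw counts.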

In the future, LLMs are likely to be incorporated into text processing and editing tools, search engines, and programming tools. Therefore they might contribute to scientific work without authors necessarily being aware of the nature or magnitude of the contributions. This defies today’s binary definitions of authorship, plagiarism, and sources, in which someone is either an author, or not, and a source has either been used or not. Policies will have to adapt, but full transparency will always be key.

This technology has far-reaching consequences for science and society. Researchers and others have already used ChatGPT and other large language models to write essays and talks, summarize the literature, draft and improve papers, as well as identify research gaps and write computer code, including statistical analyses. Soon this technology will evolve to the point that it can design experiments, write and complete manuscripts, conduct peer reviews, and even support editorial decisions and quality control to accept or reject manuscripts.

Conversational AI is likely to revolutionize research practices and publishing, creating both opportunities and concerns. It might accelerate the innovation process, shorten time-to-publication and, by helping people to write fluently, make science more equitable and increase the diversity of scientific perspectives. However, it could also degrade the quality and transparency of research and fundamentally alter our autonomy as human researchers. Like all technologies, it has its own yin and yang. ChatGPT and other LLMs produce text that is convincing but often wrong, so their use can distort scientific facts and spread misinformation.

However, with time and learning, and with the incorporation of AI ethics principles and tools, the capabilities of such AI tools will greatly increase. That is why this type of technology should be closely observed, especially given the new capabilities it can achieve when integrated with other systems. In less than three weeks, Microsoft had already incorporated GPT-3.5, Codex, and DALL-E into its Azure OpenAI Service, which means integration of these technologies into business processes. GPT-4 promises even more capabilities.

Like the Springer Nature journals and other relevant journals, we think that the use of this technology is inevitable; therefore, banning it will not work. We must observe and analyse the impacts of the technology, understanding that, from time to time, technological adoption enables a reconfiguration that provides not automation but augmentation of human cognitive abilities. It is imperative that the research community engage in a debate about the implications of this potentially disruptive technology along several dimensions, from ethics and responsibility to equality of access to these technologies and tools.

Given this context, RBGI adopts in 2023 the position promoted by Nature concerning ChatGPT, based on the following fundamental principles:

Rules and Principles for Manuscript Submission in RBGI

No LLM tool will be accepted as a credited author on a research paper. That is because any attribution of authorship carries with it accountability for the work, and AI tools cannot take such responsibility.
Researchers using LLM tools should document this use in the methods or acknowledgments sections. If a paper does not include these sections, the introduction or another appropriate section should be used to document the use of the LLM: simply state that you used it and where, as you would for any other software.

Looking Forward

There are already clear authorship guidelines that mean ChatGPT shouldn’t be credited as a co-author, says Matt Hodgkinson, a research-integrity manager at the UK Research Integrity Office in London, speaking in a personal capacity. One guideline is that a co-author needs to make a “significant scholarly contribution” to the article — which might be possible with tools such as ChatGPT, he says. But it must also have the capacity to agree to be a co-author and to take responsibility for a study — or, at least, the part it contributed to. “It’s really that second part on which the idea of giving an AI tool co-authorship really hits a roadblock,” he says.

New things are coming in the scientific publication field, driven especially by Open Science principles and practices such as open peer review and Plaudit (through which other researchers can endorse your research after peer review). We are moving to a more crowd-powered and transparent approach, especially with blockchain. Therefore, by combining the crowd with AI to augment capabilities, research can be done in a more efficient, faster, and smarter way in order to provide better solutions for society, increasing its capacity for response and resilience.

RBGI aims to stand as an innovative journal, observing these movements and capturing them to redefine some paradigms, with better processes and tools for authors, reviewers, and even readers.

Thank you very much,

Ederson de Almeida Pedro - Founder Gautica | Ph.D. Candidate | Human-centered AI for OHS | RAIES Member

Mateus Panizzon - Editor-in-Chief of Brazilian Journal of Management & Innovation | RAIES Member

RAIES - Rede de Inteligência Artificial Ética e Segura