Why AI-Generated Text Sounds Wordy and Choppy

Something feels off about your new robot co-worker—besides the fact that your co-worker is a robot. This robot produces grammatically correct text at lightning speed. The writing seems natural, not robotic. It’s impressive, but is this text good and should you adopt it as your own?

Yes, we’re talking about new artificial intelligence technology like ChatGPT. Many companies are already incorporating it into their products or workplaces. But some people still aren’t sure what this technology even does. Is it a search engine? Is it a text editor? Is it a chat program? (Hint: its primary purpose is to generate text.) Generative AI is a hot topic, and it seems that everyone from news media to social media is hailing it as a new co-worker and the solution to every work problem.

But something is off. Of course, AI-generated text is often bland and forgettable—but it’s also wordy and choppy. If the goal of your writing is to be persuasive or to show your knowledge on a topic, this over-general and repetitive content won’t help you meet your goal.

In this article, we’ll start with a primer on what these generative AI tools do and how they work. Then we’ll explore how such seemingly clear text can be nonsense upon further inspection. Finally, we’ll take a deep dive into what makes AI-generated output feel slightly wrong and discuss how to fix it.

The easiest way for writers to create more precise and highly polished writing.

Start your free trial
  • No credit card needed to get started.
  • No more agonizing over your writing.

What Is Generative AI?

Large language models, or LLMs, are advanced AI systems capable of understanding and generating human-like text. You may be familiar with models like GPT, PaLM, or Dolly. For a primer, read “We need to talk about ChatGPT” by Damien Riehl.

Most consumers access those LLMs through a question-and-answer interface (which is why these tools are called chats). Examples of these tools are ChatGPT by OpenAI and Bard by Google. The question or command is called a prompt, and you, the user, state your prompt as if you were telling a human to do something. In response, the tool generates some output, which could be text, a graph, an image, computer code, or more. For this discussion, we’ll focus on text output.

The availability of generative AI tools is an exciting development because it is one of the first times that powerful new technology is available to the masses with no training required. While there are ideal prompts and prompt methodologies to use, even a bad prompt will yield a response. This makes it possible for anyone to try one of these tools without first learning to use one, training one, developing a dataset, or setting up project parameters. However, this instant access and low barrier to entry also raise questions about security and confidentiality, as well as misplaced blind confidence in the output.

How Generative AI Produces Text

To mimic natural-sounding writing, AI companies trained their LLMs based on text that humans actually wrote. According to a recent report by The Washington Post, generative AI companies have used writing widely available on the internet, especially content from online newspapers, published research papers, and government documents. In this scenario, the text and the structures embedded in it represent the data; the topics being written about are not necessarily important to these LLMs. But thinking and writing are inextricably intertwined, so any errors in writing, logic, or bias that a human writer might make will be reflected in the AI-generated output.

Using advanced computer algorithms (rules and instructions that people make to tell the tools what to do), LLMs create a mathematical model of ideas and concepts by ingesting text. Using this information, the LLMs statistically predict which words writers are most likely to use together in the context of the user’s command.

Because many of the sources that AI companies used to train their new tools were formal writing by educated native English speakers, AI writing sounds fluent and well-educated on the surface (word order, word choice). However, remember that these tools create writing based on word frequency and the likelihood that a given word will come after another word or that a given sentence will come after another sentence. Think of it like an advanced version of predictive text on your cell phone. (Of course, the software is more complicated than that!)
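
To make the predictive-text comparison concrete, here is a toy sketch in Python. It is not how ChatGPT or any real LLM works internally: it only counts which word most often follows another in a tiny made-up corpus and chains those guesses together. The corpus and the function names predict_next and generate are our own illustrative assumptions.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; real models are trained on vastly more text.
corpus = (
    "the forest is fragile . the forest is vast . "
    "the river is fragile . the forest is shrinking ."
).split()

# Count how often each word follows each other word (a bigram table).
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1


def predict_next(word):
    """Return the word that most often followed `word` in the corpus, or None."""
    candidates = following.get(word)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


def generate(start, length=4):
    """Greedily chain predictions, much like tapping the top phone suggestion."""
    words = [start]
    for _ in range(length - 1):
        next_word = predict_next(words[-1])
        if next_word is None:
            break
        words.append(next_word)
    return " ".join(words)
```

On this corpus, generate("the", 4) returns "the forest is fragile". The phrase looks fluent, yet it was chosen purely by frequency, with no check on whether it is true or fits the surrounding argument.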

One problem with the writing produced by these AI tools is that the next most common word or sentence is sometimes out of context or disjointed. To prevent this problem, these AI tools seem to have algorithms that put in extra transition words between sentences to make the sentences sound more connected. The goal is to help the writing flow better, but when the conjunction is wrong or unnecessary, the opposite happens. The writing becomes choppy and confusing. The content from generative AI tools is often a wordy and generic approximation of corporate writing. The result is that the writing sounds like any unconvincing corporate writer: oddly formal, overly repetitive, stilted, and bland. It’s even worse if the output is factually inaccurate (a common problem in AI-generated text).

Example: Writing an Essay on the Rainforest

Now let’s apply this knowledge to an example. You might enter a command like, “write an essay about the destruction of the world’s rainforests over the last 40 years.” You may try something more specific, like, “write a 1500-word college-level essay about the destruction of the world’s rainforests over the last 40 years.”

In response, ChatGPT might scan its data for the keywords in the command and see which words, sentences, and phrases usually come along with those keywords. The tool might also have algorithms to know how formal or informal words are so it can choose words based on the requests in the command (college-level, in our example).

If you remember from your high school English days, an essay needs supporting evidence and examples that prove your point. This seems like a perfect opportunity for generative AI, right? So, in response to this prompt, we get a five-paragraph essay that seems useful on its surface.

In its output, ChatGPT wrote three sentences with almost identical syntax in the three supporting evidence paragraphs:

  1. This destructive practice not only destroys the forest canopy but also disrupts the fragile ecosystem, jeopardizing the survival of countless plant and animal species.
  2. Mining activities, such as gold and bauxite extraction, not only destroy vast stretches of rainforest but also contaminate rivers and soil, causing irreparable damage to the ecosystem.
  3. This loss of biodiversity not only diminishes the intrinsic value of these ecosystems but also hampers potential medical advancements and disrupts intricate ecological relationships.

It quickly becomes clear that ChatGPT’s algorithms tell the AI tool to do something like this:

  1. Find the words that are most commonly used in paragraphs or essays that already include the terms rainforest, destruction, and last 40 years.
  2. Of those words, which ones are often the closest to the words example or instance? Use those words to write sentences about examples.
  3. To have fewer paragraphs, combine these sentences about the examples using the conjunctions not only and but also. (There is no way for the AI tool to know whether the examples are connected to each other in the real world; that’s probably why it combined medicinal plants with ecological relationships between species.)

Writing with such generic words and only one type of sentence structure is not strong or convincing. The paragraphs are also disorganized: some talk about the causes of deforestation, but others talk about the effects. A human brain is more likely to make each paragraph discuss a cause of deforestation and an effect produced by that cause. Each AI-generated sentence then lists two actions that these causes (or effects) do (one action after not only and one action after but also). The underlined information applies much more clearly to the second action (the information after but also) than to the first action; this is likely because ChatGPT is just predicting close words and doesn’t refer to or consider its own previous words very easily. This strange link to only half of the example happens because ChatGPT creates text using prediction, then relies on cohesive devices to link the output to give the impression of coherence.
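
You can confirm this kind of repetition mechanically. The sketch below is a hypothetical checker of our own (it is not part of ChatGPT or WordRake) that uses a regular expression to find the “not only … but also” frame in the three sentences quoted above and to split each sentence into its two actions.

```python
import re

sentences = [
    "This destructive practice not only destroys the forest canopy but also "
    "disrupts the fragile ecosystem, jeopardizing the survival of countless "
    "plant and animal species.",
    "Mining activities, such as gold and bauxite extraction, not only destroy "
    "vast stretches of rainforest but also contaminate rivers and soil, "
    "causing irreparable damage to the ecosystem.",
    "This loss of biodiversity not only diminishes the intrinsic value of "
    "these ecosystems but also hampers potential medical advancements and "
    "disrupts intricate ecological relationships.",
]

# Correlative frame: "not only <first action> but also <second action>".
NOT_ONLY_BUT_ALSO = re.compile(r"\bnot only\b(.+?)\bbut also\b(.+)", re.IGNORECASE)


def split_correlative(sentence):
    """Return (first_action, second_action) if the frame is present, else None."""
    match = NOT_ONLY_BUT_ALSO.search(sentence)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip(" .")


matches = [split_correlative(s) for s in sentences]
repeat_count = sum(m is not None for m in matches)
```

Here repeat_count is 3: every supporting sentence uses the same frame. Splitting out the two actions also makes it easier to ask whether they really belong together.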

Cohesion without Coherence: Choppy, Wordy AI Writing

If you use generative AI tools like ChatGPT frequently, you’ll notice the overuse of cohesive devices (defined below). AI tools rely on these words to create an appearance of coherence. The goal is to make the writing seem internally linked. When used sparingly, cohesive devices can be a valuable tool for writers. But in the case of AI-generated text, using these words doesn’t mean the text makes sense. Text is coherent when it makes sense and makes its point. Coherence requires judgment and reasoning, but AI tools cannot use judgment or reasoning, which means that they can’t distinguish between sentences or paragraphs in which a cohesive device would be good or bad.

Recognizing Cohesive Devices and Linking Words

English has many cohesive devices to make writing more, well…cohesive. These linking devices can be adverbs, prepositions, conjunctions, and pronouns. They help connect ideas in a text, show how ideas relate to each other, and create a sense of flow or logical progression.

Most writers sort cohesive devices into six categories:

  • Organizational Surface Signals: first(ly), second(ly), third(ly), next, finally
  • Conjunctions: therefore, consequently, as a result, on the other hand, in contrast, similarly, however, nonetheless, nevertheless, still, yet, because of this
  • Summative Transitions: in summary, in conclusion, overall, in other words, to sum up, to recap
  • Additive Transitions: also, furthermore, moreover, in addition, as well as
  • Prepositions: with, at, by, to, in, for, from, of, on, since
  • Pronouns: he, she, we, they, such (We’ve written about how to use pronouns to avoid repetition here.)

Each of these words carries different nuances, so it’s important to use the right words for the right purpose. While human writers understand the precise meaning of each word and make word choices accordingly, AI tools have classed these words into categories based on deduced formality and frequency, which can lead to strange or repetitive choices. For a deep dive into which cohesive devices should be used for which purposes, check out Purdue Online Writing Lab on transitional devices and the UNC Chapel Hill Writing Center on transitions.
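
If you want a rough sense of how heavily a draft leans on these words, a simple count by category can help. The sketch below is a minimal illustration under our own assumptions: the COHESIVE_DEVICES lists are abbreviated from the categories above, prepositions and pronouns are left out because they are too common to be informative, and count_devices is a hypothetical helper, not a WordRake feature.

```python
import re
from collections import Counter

# Abbreviated lists based on the categories above; extend as needed.
COHESIVE_DEVICES = {
    "organizational": ["first", "second", "third", "next", "finally"],
    "conjunction": ["therefore", "consequently", "as a result", "however", "nevertheless"],
    "summative": ["in summary", "in conclusion", "overall", "to sum up"],
    "additive": ["also", "furthermore", "moreover", "in addition"],
}


def count_devices(text):
    """Count whole-word, case-insensitive matches for each category."""
    counts = Counter()
    for category, phrases in COHESIVE_DEVICES.items():
        for phrase in phrases:
            pattern = r"\b" + re.escape(phrase) + r"\b"
            counts[category] += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


sample = (
    "Furthermore, logging also harms rivers. Moreover, mining harms soil. "
    "In addition, farming harms wildlife. In conclusion, forests are shrinking."
)
counts = count_devices(sample)
```

For this four-sentence sample, counts["additive"] is 4 and counts["summative"] is 1. A count cannot tell you whether any single transition is justified, but five linking words in four short sentences is a signal to reread.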

Risks of Overusing Cohesive Devices and Linking Words

Humans are skilled at adapting their language to the situation. In casual conversation or informal writing, a human would use pronouns, structural organization, and shared background knowledge to avoid repetition, which makes the text flow better and seem less wordy. But LLMs don’t have this type of contextual understanding, so they must rely on vocabulary to create it. While this approach may work in a short email, it’s overdone in an essay, which is why longer output can sound redundant, repetitive, and unnecessarily complicated.

Here are some of the ways overusing cohesive devices makes writing sound worse:

  • Repetitive: When you use the same words repeatedly, it sounds, well, repetitive. This can be especially noticeable with words like also, furthermore, and in addition. While repetition can be helpful, it can also be monotonous, boring, predictable, and redundant. Sophisticated writers strive to reduce redundancy, rather than add it.
  • Choppy: When you use too many cohesive devices, your writing may sound choppy and disjointed. Every link needs sentences to connect, so adding links means cutting sentences into unnaturally small pieces, which kills flow.
  • Confusing: When multiple transitions and linking words are piled up in a sentence, it’s harder to follow the ideas in the content, which can obscure meaning.
  • Formulaic: Using too many cohesive devices can make your writing sound overly formal—and formulaic—like it’s written for a textbook or a research paper.

How WordRake Addresses Cohesive Devices

Whether our clients are editing AI-generated writing or their own document, WordRake can help. In Brevity mode, WordRake edits focus on removing unnecessary cohesive devices or replacing long cohesive devices with shorter ones. In Simplicity mode, WordRake edits focus on changing formal or ambiguous cohesive devices to words and phrases that are more familiar to readers in accordance with the Plain Language Act. Here are just a few examples from WordRake’s Brevity mode:

Obvious Summary Statements: A sentence’s location at the end of a paragraph or the beginning of a new one, especially when combined with modals like should or must, is typically enough to show that the sentence is summarizing an opinion.

WordRake Edit Example:

Before: For the reasons stated above, these individuals were excluded from further analysis.
After: These individuals were excluded from further analysis.

WordRake also deletes similar expressions, such as:

      • For these reasons, […]
      • For those reasons alone, […]
      • Given the above arguments, […]

Wordy Contrasting Statements: Writers should signal contrast concisely. “Despite the aforementioned fact that” uses five words to say what “although” says in one.

WordRake Edit Example:

Before: Despite the aforementioned fact that the court would rule in Defendant’s favor, Plaintiff pursued its case.
After: Although the court would rule in Defendant’s favor, Plaintiff pursued its case.

WordRake also replaces similar expressions with although, such as:

      • Notwithstanding the idea that […]
      • In spite of the fact that […]
      • While it seems evident that […]
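
A replacement like this can be approximated with a short pattern-matching rule. The sketch below is our own simplified illustration, not WordRake’s actual algorithm: WORDY_CONTRAST is a hypothetical list built from the phrases above, and shorten_contrast is an assumed helper name. It swaps each phrase for “although” and keeps the capital letter when the phrase opens a sentence.

```python
import re

# Hypothetical phrase list built from the examples in this section.
WORDY_CONTRAST = [
    "despite the aforementioned fact that",
    "notwithstanding the idea that",
    "in spite of the fact that",
    "while it seems evident that",
]

_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(phrase) for phrase in WORDY_CONTRAST) + r")\b",
    re.IGNORECASE,
)


def _replacement(match):
    # Keep the capital letter if the wordy phrase started with one.
    return "Although" if match.group(0)[0].isupper() else "although"


def shorten_contrast(text):
    """Replace each wordy contrast phrase with 'although'."""
    return _PATTERN.sub(_replacement, text)
```

A blanket rule like this cannot judge whether “although” fits the sentence it lands in, so treat its output as a suggestion to review rather than a finished edit.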

Redundant Causal Links: Verbs like become and turn into already show that there was a change, and the information in the sentence shows why.

WordRake Edit Example:

Before: The boss achieved many successes, and her team became much more deferential to her as a result.
After: The boss achieved many successes, and her team became much more deferential to her.

WordRake also deletes similar expressions, such as:

      • for this reason
      • consequently

Redundant Referents: The definite article “the” already refers to a specific case that the writer has mentioned, so expressions like in question and at issue are unnecessary.

WordRake Edit Example:

Before: The case in question was decided in Defendant’s favor.
After: This case was decided in Defendant’s favor.

WordRake also deletes similar expressions, such as:

      • in question
      • at issue
      • in this particular complaint

Double Conjunctions: It’s redundant to use two conjunctions with the same function at the beginning of a sentence.

WordRake Edit Examples:

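Before: On the other hand, however, the album’s sound involves a surprising embrace of pop music.
After: However, the album’s sound involves a surprising embrace of pop music.

Before: However, despite these differences, there were many similarities across groups, suggesting more universal issues among the participants.
After: Despite these differences, there were many similarities across groups, suggesting more universal issues among the participants.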

WordRake also shortens or deletes similar expressions, such as:

      • but/yet/while
      • …nevertheless
      • On the other hand, although…
      • In contrast,
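
Doubled openers are also easy to catch with anchored patterns. The sketch below is a hypothetical illustration rather than WordRake’s implementation: DOUBLE_OPENERS holds only the two pairings from the examples above, and collapse_double_opener is an assumed helper that rewrites a sentence only when it begins with one of them.

```python
import re

# Hypothetical rules: (pattern for a doubled opener, shorter replacement).
DOUBLE_OPENERS = [
    (re.compile(r"^On the other hand, however,\s*"), "However, "),
    (re.compile(r"^However, despite\b"), "Despite"),
]


def collapse_double_opener(sentence):
    """Apply the first matching rule; return the sentence unchanged otherwise."""
    for pattern, replacement in DOUBLE_OPENERS:
        new_sentence, substitutions = pattern.subn(replacement, sentence, count=1)
        if substitutions:
            return new_sentence
    return sentence
```

Anchoring each pattern to the start of the sentence keeps the rule conservative: it leaves a lone “however” or “despite” elsewhere in the sentence untouched.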

Double Negation: It’s redundant to use two words that negate or limit a proposition within the same sentence, such as but and nonetheless.

WordRake Edit Examples:

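Before: The chemical reaction occurs most often at low pressures but is nonetheless conducted at high pressures sometimes.
After: The chemical reaction occurs most often at low pressures but is conducted at high pressures sometimes.

Before: The golfer did not win a single event that season but nevertheless finished the year ranked No. 2 in the world.
After: The golfer did not win a single event that season but finished the year ranked No. 2 in the world.

Before: The game was a little unfair because the other team had known about it longer than we had, but it was fun nonetheless.
After: The game was a little unfair because the other team had known about it longer than we had, but it was fun.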

WordRake also deletes similar expressions, such as:

      • albeit
      • despite


Conclusion

Generative AI tools like ChatGPT lack the human reasoning skills necessary to use cohesive devices correctly and at the right frequency, and even human writers make these same mistakes. Ideally, clarity comes from organization and the logical building of ideas: the order in which you present information and the formatting you use to improve readability will prevent the need for many cohesive devices, especially in short texts. As a general rule, if you keep the reader’s point of view in mind and lay out your writing with intuitive headings and white space, you can trust the reader to remember what you’ve said as they’re reading. Save the cohesive devices for complex ideas and long documents.

If you struggle with certain parts of writing, AI tools can help you get unstuck. But they can’t write chunks of text for you to adopt as your own without your editing or investigation. If you decide to use a generative AI tool to create content, you must edit the output. WordRake can make text more powerful, whether you wrote it yourself or used an AI tool to get started. Try WordRake free for 7 days.

About the Authors

Ivy B. Grey is the Chief Strategy & Growth Officer for WordRake. Before joining the team, she practiced bankruptcy law for ten years. In 2020, Ivy was recognized as an Influential Woman in Legal Tech by ILTA. She has also been recognized as a Fastcase 50 Honoree and included in the Women of Legal Tech list by the ABA Legal Technology Resource Center. Follow Ivy on Twitter @IvyBGrey or connect with her on LinkedIn.

Danielle Cosimo is a Language Usage Analyst for WordRake. Before joining the team, she was a translator and editor for non-native English speakers applying to degree programs in the United States and the UK. Danielle is formally trained in linguistics and has a certificate in computer programming. She is fluent in English, Portuguese, and Spanish. She applies her interdisciplinary knowledge to create WordRake’s editing algorithms.

Our Story

WordRake founder Gary Kinder has taught over 1,000 writing programs for AMLAW 100 firms, Fortune 500 companies, and government agencies. He’s also a New York Times bestselling author. As a writing expert and coach, Gary was inspired to create WordRake when he noticed a pattern in writing errors that he thought he could address with technology.

In 2012, Gary and his team of engineers created WordRake editing software to help writers produce clear, concise, and effective prose. It runs in Microsoft Word and Outlook, and its suggested changes appear in the familiar track-changes style. It saves time and gives confidence. Writing and editing have never been easier.