
The case for content professionals in the age of generative AI

Originally published on LinkedIn on Apr. 17, 2024

It’s an exciting time for many organizations. After struggling for years to navigate massive, siloed stores of information, they see a way forward with Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) systems, and generative AI. Finally, technology is available that can ingest the content from disparate sources, quickly understand it, and provide cogent answers to their users’ and employees’ far-reaching questions.


As they start implementing these solutions, however, organizations are running into an unexpected hurdle. LLMs often have the same problem as humans—they cannot find the right information. RAG systems help, but humans still need to train the system. People must sort good answers from bad to help technology return the right results. As teams investigate why the technology is returning suboptimal results, they find themselves forced to confront the massive backlog of data and content that they were trying to avoid.


In this article, I suggest that while technologies like LLMs, RAGs, and generative AI are extremely valuable to organizations trying to make hefty bodies of information easier to access and understand, they are only half the equation. To get the best return on investment (ROI), organizations must also invest in their content development teams.

The time is right for generative AI

Now is the right time for generative AI solutions. Study after study shows that workers are struggling with the amount of content they need to navigate, whether because of its sheer volume, because it is spread across too many sources, or because it lives in too many applications. For example:

  • A 2022 HBR study showed that workers spend approximately four hours a week toggling between work applications and reorienting themselves after each switch.


  • A 2023 Gartner survey indicated that 47% of digital workers struggle to find the information they need to perform their jobs effectively.


  • A 2024 study of Israeli government workers showed that 31% of workers increased the amount of time spent finding work-related information during the pandemic, primarily because of poor information management.


A well-designed generative AI solution, powered by an LLM (and often a RAG system), can help with all these issues. Users can drastically reduce the need to ping-pong between multiple sources and open disparate documents to find answers. Instead, they can go to one portal to answer their wide-ranging questions and explore the nuances of the subject through a conversational interface.
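
To make the mechanics concrete, here is a minimal sketch of the retrieve-then-generate pattern behind such a portal. The embedding model, toy corpus, and prompt format are illustrative stand-ins rather than any specific vendor’s stack; a production system would add a vector database and the actual LLM call.

import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small open embedding model

# In practice, the corpus is the organization's own documentation.
corpus = [
    "To reset your password, open Settings > Security and choose Reset.",
    "Invoices are generated on the first business day of each month.",
    "Support hours are 9am-6pm ET, Monday through Friday.",
]
corpus_vecs = embedder.encode(corpus, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k corpus passages most similar to the question."""
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    scores = corpus_vecs @ q_vec  # cosine similarity (vectors are normalized)
    return [corpus[i] for i in np.argsort(scores)[::-1][:k]]

def build_prompt(question: str) -> str:
    """Assemble the grounded prompt that is sent to the LLM."""
    context = "\n".join(retrieve(question))
    return ("Answer using only the context below. If the answer is not "
            f"there, say so.\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer:")

print(build_prompt("How do I reset my password?"))

The sketch also shows why content quality matters so much in what follows: the LLM answers from whatever passages retrieval hands it, so the answer can only be as good as the underlying corpus.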

Quality is all you need

As companies evaluate AI solutions, they are frequently surprised that they must evaluate the quality of their information as well. Although it may seem obvious, many teams might not fully grasp the relationship between their content quality and AI results. For instance, LLMs cannot surface information that is not there. And if the underlying corpus has a glut of low-quality content, it will take considerable work to make sure that the LLM doesn’t expose it to users.


A good deal of AI research seems to center on the idea that the valuable information is already there, and we only need to find it. For instance, in their 2023 Llama 2 paper, Touvron et al. tell us “quality is all you need” and describe how they set aside millions of third-party supervised fine-tuning (SFT) examples in favor of a much smaller set of high-quality examples, achieving better results with less data. While impressive, this process takes more effort than some companies want to take on.


This reality is reflected in ClearML’s 2023 report analyzing the costs of implementing generative AI solutions in enterprise environments. In a survey of 1,000 executives across the AI, machine learning, engineering, IT, and data science space, respondents expected just 13% of their budgets to go toward data preparation. (No category was included for generating missing data, and content isn’t addressed at all.)


In their 2024 paper, Li et al. propose an Instruction-Following Difficulty (IFD) metric for improving LLM data quality. Developers can use this metric to measure the quality of content automatically (reducing the need for human verification) and to reduce the amount of content that needs to be included in an LLM. The idea here is twofold: AI can automatically determine content quality, and then, using that determination, it can drastically narrow down how much content is included in the LLM.
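
In rough terms, IFD compares how well a model predicts an answer with and without its instruction: the average loss over the answer tokens when the instruction is present, divided by the average loss over the answer alone. The following is a minimal sketch of that ratio using Hugging Face transformers; the stand-in model and example pair are illustrative, not the authors’ exact pipeline.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; any causal LM illustrates the idea
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def answer_loss(prompt: str, answer: str) -> float:
    """Average cross-entropy over the answer tokens, optionally given a prompt."""
    answer_ids = tok(answer, return_tensors="pt").input_ids
    if prompt:
        prompt_ids = tok(prompt, return_tensors="pt").input_ids
        input_ids = torch.cat([prompt_ids, answer_ids], dim=1)
        labels = input_ids.clone()
        labels[:, : prompt_ids.shape[1]] = -100  # score only the answer tokens
    else:
        input_ids = answer_ids
        labels = input_ids.clone()
    with torch.no_grad():
        return model(input_ids=input_ids, labels=labels).loss.item()

def ifd(instruction: str, answer: str) -> float:
    """Instruction-Following Difficulty: loss(answer | instruction) / loss(answer)."""
    return answer_loss(instruction + "\n", answer) / answer_loss("", answer)

print(ifd("Name the capital of France.", "The capital of France is Paris."))

In the paper, high ratios mark the difficult, informative pairs worth keeping for fine-tuning, while ratios above 1 suggest the instruction isn’t helping the model at all.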


This model is probably much closer to what many companies are hoping for—an algorithm that can both find the needle in the haystack and set it aside for future efforts. However, it still assumes the relevant content is there, and we simply need the right algorithm to find it. 

The reality of existing content corpuses

In his 2018 book, Infonomics, Douglas B. Laney suggests that although most organizations create a huge amount of information, few actually value it. Executives do not prioritize content because information isn’t an explicit line item on their balance sheets. They cannot see the ROI.


Ironically, many of these organizations are overwhelmed with content specifically because they undervalue it. They have lax standards for who is allowed to create and post assets, and a dearth of people to review those assets for accuracy and keep them up to date. The result is a high volume of low-quality, poorly maintained information.


This type of thinking often creates a culture where content teams are underfunded, and content leaders are forced to make detrimental tradeoffs. They employ mitigation strategies that solve short-term problems but create long-term issues, as described below: 


  • Hire a small team of experienced writers. In this scenario, the writers need less training and can create higher-quality content. As the small team struggles to keep up with the pace of development, however, they are likely to introduce information gaps into the content corpus.


  • Hire a larger, less experienced team of writers. In this scenario, the content leader increases their chances of documenting all products and features by hiring more people. However, the less experienced team is more likely to introduce errors and inconsistencies into the content corpus. Additionally, more experienced writers may spend their time supervising junior writers instead of creating content.


  • Assign content development to SMEs. In this scenario, the content leader crowdsources information development to the user community, and then hires people to moderate the content. This approach drastically reduces the content team’s budget without offloading work to other teams within the company. However, organizations cannot count on users to document the entire product for them. Content quality is inconsistent and content gaps are common.


  • Create content with generative AI. In this last scenario, the content leader turns to generative AI to create technical content and hires humans for quality control. This approach drastically reduces the content team’s budget without offloading work to other teams or users. However, if the content is used as part of an LLM-based solution, it could lead to eventual model collapse, the degradation that occurs when models are trained on their own synthetic output.


Tradeoffs are not unique to content teams, and in well-run environments, content leaders use blended techniques to mitigate the risks. However, outside stakeholders might be surprised by the results. It’s entirely possible they had not thought through the downstream implications for RAG corpuses. With training, AI can overcome some of these issues, but not all.

Never waste a good crisis

Whereas some teams might be discouraged by this situation, others will look at it as an opportunity. KPMG reports that 80% of C-suite and business leaders believe generative AI is important to maintaining a competitive advantage and gaining market share. The same study shows leaders are looking to improve their ROI (as opposed to simply experimenting with the technology). Savvy content leaders should use this as an opportunity to directly associate their teams’ work with business-critical initiatives.


As content leaders look for these opportunities, they should specifically try to attach themselves to RAG-based projects. AI teams often improve the accuracy of off-the-shelf LLM solutions by cross-checking the answers against a corpus of organization-specific content and data (“the RAG”). These solutions then power conversational chatbots for use by customers and employees. Content teams are natural members of these projects since their work constitutes much of the RAG baseline.


As part of this process, development teams will probably scrutinize the content more deeply than they ever have before. RAG developers and test teams are likely to find outdated pages, inaccurate information, missing content, or conflicting information on different parts of the site. In addition to working with the appropriate stakeholders to fix these issues, content development teams should also use this opportunity to track their overall impact on the project and time spent supporting it. 

Communicate the value of content

As previously noted, the effort to improve the underlying content in the RAG might surprise those who are funding the project. The following metrics can help content leaders assess the impact of their teams’ efforts and potentially secure ongoing funding to maintain the quality of the generative AI results.


  • Employee efficiency metrics: A simple employee survey might be the easiest and most effective way to track employee efficiency. Consider asking employees how much time they spend looking for content (both before and after implementing the chatbot) and how much time they spend creating content. (The latter number might seem unintuitive, but employees often recreate existing information if they cannot find it.) 


  • Customer satisfaction metrics: While correlation does not equal causation, it’s worthwhile to compare generative AI chatbot results with customer satisfaction metrics, since the two have clear potential to influence each other. Good metrics to track include chatbot usage vs. site visits vs. customer satisfaction (to understand overall chatbot usage and its impact on customer satisfaction) and questions answered vs. source content (to understand how much any given team is contributing to chatbot results); a sketch of this last measure follows the list.
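
As a hypothetical illustration of that last measure, suppose the chatbot logs which retrieved documents grounded each answer, and each document maps to an owning team. The log format, file paths, and team names below are invented for the example:

from collections import Counter

chat_logs = [
    {"question": "How do I reset my password?", "sources": ["docs/security.md"]},
    {"question": "When are invoices sent?", "sources": ["docs/billing.md", "docs/faq.md"]},
    {"question": "What are support hours?", "sources": []},  # ungrounded answer
]
doc_owners = {
    "docs/security.md": "Identity content team",
    "docs/billing.md": "Billing content team",
    "docs/faq.md": "Support content team",
}

grounded = [log for log in chat_logs if log["sources"]]
print(f"Grounded answers: {len(grounded)} of {len(chat_logs)}")

# Count how often each team's content grounds an answer.
contributions = Counter(doc_owners[s] for log in grounded for s in log["sources"])
for team, count in contributions.most_common():
    print(f"{team}: cited in {count} answer(s)")

Even a tally this simple gives a content leader a defensible number for budget conversations: how many of the chatbot’s grounded answers trace back to their team’s pages.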

Conclusion

The technology behind LLMs, RAGs, and generative AI is changing quickly, and many content professionals fear these changes will make them redundant. For those who have worked in underfunded departments, it’s an easy conclusion to draw. I would suggest, however, that content professionals will remain relevant for the foreseeable future.


In corporate environments, LLMs, RAGs, and generative AI have the potential to radically improve employee efficiency and customer satisfaction. Users will be able to easily access the information they need and engage in conversations that help them understand it. But these solutions aren’t just built on technology. They also require a solid foundation of useful content, created and managed in large part by content professionals.
