The Ethics of AI in Regulated Life Sciences

Written on behalf of Vistatec by Kit Brown-Hoekstra, Founder and Principal at Comgenesis

Many, if not most, disruptive technologies outpace the legal and ethical frameworks used to ensure that such technologies are implemented in beneficial ways. This statement is undoubtedly true with artificial intelligence (AI), which has dominated recent discourse and continues to evolve rapidly.

In life sciences and other highly regulated industries, we must balance the desire for innovation and efficiency with our industry mandate to do no harm. We must be particularly thoughtful in implementing and using AI because mistakes in our industry can have life-altering consequences. We cannot currently depend solely on AI vendors to determine which guardrails are necessary and desirable, because our industry is often held to a higher standard of care than the tool vendors themselves.

As localization and content professionals, we are uniquely positioned to develop an industry-wide ethical framework and to ensure that this framework includes (at a minimum) these considerations:

  • Data security and privacy
  • Equity and inclusion
  • Transparency, authenticity, and truth-telling
  • Data curation and guardrails
  • Technical accuracy and proper attribution
  • Effective prompt design
  • Sustainability

Data Security and Privacy

Every week, it seems, yet another company has been hacked or has accidentally exposed sensitive data, leaving people with valid concerns about their privacy and data security and making them reluctant to share their data. These concerns pose a challenge for AI developers, as training and effectively using AI require vast amounts of well-curated data that could contain sensitive or proprietary information. These datasets must be protected from bad actors and kept compliant with local regulations while remaining accessible to the systems and people who need them.

Globally, privacy laws vary in their strictness and their interpretation of AI’s use in a medical setting:

  • European Union: The General Data Protection Regulation (GDPR) directly addresses the handling of personal information in automated decision-making, gives individuals the right to manage how their data is used, and requires transparency and accountability.
  • Japan: The Act on the Protection of Personal Information (APPI) regulates how systems handle personal data, requires consent for data processing, and requires organizations to enforce security measures to protect data.
  • China: The Cybersecurity Law is similar to the GDPR and APPI in its handling of personal data, though its primary focus is cybersecurity.
  • United States: The Health Insurance Portability and Accountability Act (HIPAA) predates the use of AI in healthcare but still addresses patient privacy in ways that affect how AI is used and managed.
  • Other: Many countries have varying degrees of protection and recognition of AI built into their laws.

In addition, many healthcare providers are already using AI to assist with differential diagnosis, patient education, decision support, and data analysis. Patients might even use generative AI, such as ChatGPT, to research symptoms or to learn more about their diagnosis. 

We need to consider both sides of these use cases: 1) supplying the AI systems with anonymized data that can improve outputs, and 2) specifying how an individual’s search history and patterns get stored and used by the AI so that patient confidentiality is maintained.
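
As a minimal sketch of the first side of that equation, the snippet below de-identifies free-text records before they are supplied to an AI system. The field names and patterns here are hypothetical, and a production pipeline would need validated de-identification tooling and human review to meet HIPAA or GDPR requirements.

```python
import re

# Hypothetical patterns; a real de-identification pipeline would need
# validated tooling and review to satisfy HIPAA or GDPR requirements.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "mrn": re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE),
}

def deidentify(text: str) -> str:
    """Replace common identifiers with placeholder tokens before the text
    is shared with an AI system."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text

record = "Patient MRN: 483920, seen 03/14/2024. Contact: jane.doe@example.com, 555-123-4567."
print(deidentify(record))
```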

Equity and Inclusion

Before the early 1990s, clinical trial data was heavily skewed toward white men, and in some cases, women, older adults, children, and other ethnic and racial groups were actively excluded (JAMA, 2001). In other cases, certain groups were experimented on without their knowledge or permission, the legacy of which continues to impact these groups’ willingness to participate in modern clinical trials (the Tuskegee Syphilis Experiment is one egregious example). While efforts are ongoing to create more equity and inclusion in clinical trials, this history means that the data used to train AIs on medical diagnosis, treatment, and outcomes are inherently biased.

As mentioned in our blog on decentralized clinical trials, such trials can help ensure more inclusive and equitable data collection, which in turn improves AI training and output.

Other efforts to improve equity and inclusion and to reduce bias are crucial and ongoing. These include the following:

  • Curating the datasets for diversity before using them to train the AI (something that language service providers do well); a simple audit of this kind is sketched after this list
  • Employing debiasing algorithms and adversarial training to reduce bias in the dataset
  • Using well-structured data and content tailored to the specific job you want the AI to do, and restricting the AI to particular tasks to reduce errors (something technical communication professionals do well)
  • Establishing policies to deprecate biased content and embedding those policies into the AI to act as guardrails
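
As a minimal sketch of the first item above, the snippet below audits a training dataset for demographic balance before it is used. The records, field names, and threshold are hypothetical; real curation would involve far richer demographic and clinical criteria.

```python
from collections import Counter

# Hypothetical patient records; in practice these fields would come from
# the curated training dataset itself.
records = [
    {"sex": "F", "age_group": "65+"},
    {"sex": "M", "age_group": "18-40"},
    {"sex": "M", "age_group": "41-64"},
    {"sex": "M", "age_group": "18-40"},
]

def audit_balance(records, field, threshold=0.2):
    """Report the share of each group and flag any group that falls below a
    minimum representation threshold (the threshold is arbitrary here)."""
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    for group, count in counts.items():
        share = count / total
        flag = "  <-- underrepresented" if share < threshold else ""
        print(f"{field}={group}: {share:.0%}{flag}")

audit_balance(records, "sex")
audit_balance(records, "age_group")
```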

Transparency, Authenticity, and Truth-telling

AI can be something of a black box, with little clarity on how it arrives at its decisions or predictions. Compounding the situation, relatively few people understand AI’s inner workings or even what data their system is being fed. For example, generative AI can “hallucinate,” producing seemingly plausible answers that are nonsensical or, more concerning, almost but not quite right. According to leading AI expert Melanie Mitchell of the Santa Fe Institute, quoted in Science News, “We don’t really have a sense of how they are doing this reasoning, [or] whether that reasoning is robust.”

Bad actors also use AI for nefarious purposes, such as “deep fakes.” It can be challenging to recognize a deep fake because the AI often uses elements of a real person’s image or voice and manipulates them into a fake scenario. This application of AI is concerning because of the consequences to society as a whole and to the victim, who could lose reputation or revenue, be falsely accused of a crime, or be blackmailed, threatened, or even killed.

On a more positive note, AI has enabled voice cloning to give someone their voice back if they’ve lost it to injury or disease. The actor Val Kilmer, who lost his voice to throat cancer, is a perfect example of this implementation.

In medical and life science organizations, such hallucinations and sleight of hand can be mitigated by putting guardrails on the AI, ensuring data security, defining and enforcing acceptable uses, and clearly identifying AI output.

Transparency and accountability for the AI’s outputs also mean explaining how AI makes decisions and establishing clear responsibilities for oversight and recourse if the AI fails or makes a mistake. Many privacy laws require that companies provide a way for people to opt out of the automated decision-support process.

Data Curation and Guardrails

AI is like an eager intern keen to assist you in your work. Just as you would probably not put an intern in charge of a complex project without guidance and specific instructions, you need to provide AI with tight supervision and explicit guidance. AI hasn’t yet reached the level of maturity where you can hand it a random list of project requirements and tasks and expect it to produce flawless results organized according to the project team’s needs. 

We need to set expectations for compliance, establish best practices and guidelines, and implement solutions that help us ensure that the data is fully curated before being given to the AI. We also need to embed policies and other guardrails into the AI to provide effective and accurate output. 

According to Phil Ritchie, Vistatec’s CTO, “A key part of all of this for medical and life science companies is using the right AI tool for the right activities and locking it down to only use your data so that you can better control the fidelity of the outputs.”
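
One way to put that advice into practice is a retrieval-gated setup, in which the system may only answer from a curated, approved corpus and must decline otherwise. The sketch below is a deliberately naive illustration of that idea, with hypothetical documents and a simple keyword-overlap retriever rather than any particular vendor’s product.

```python
# A naive keyword retriever over a curated, approved corpus. If no approved
# passage matches, the system declines to answer rather than letting the
# model improvise. The documents and scoring rule are hypothetical.
CURATED_DOCS = {
    "dosage-guide": "Administer 10 mg once daily with food, per the approved label.",
    "storage": "Store between 2 and 8 degrees Celsius; do not freeze.",
}

def retrieve(question: str):
    """Return the best-matching curated passage, or None if nothing overlaps."""
    q_terms = set(question.lower().split())
    best_id, best_score = None, 0
    for doc_id, text in CURATED_DOCS.items():
        score = len(q_terms & set(text.lower().split()))
        if score > best_score:
            best_id, best_score = doc_id, score
    return CURATED_DOCS[best_id] if best_id else None

def answer(question: str) -> str:
    passage = retrieve(question)
    if passage is None:
        return "No approved source found; escalate to a human reviewer."
    # In a real system, the passage would be passed to the model as the only
    # permitted context; here we simply return it with its provenance.
    return f"From curated source: {passage}"

print(answer("What is the approved daily dose?"))
print(answer("Is it safe during pregnancy?"))
```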

Technical Accuracy and Proper Attribution

When ChatGPT and DALL-E first became popular, educators, content creators, musicians, and artists expressed deep (and valid) concerns about copyright violations and plagiarism. Many AI developers used datasets containing copyrighted materials without attribution or compensation to the creators. Students started using ChatGPT to write their essays but often did not recognize when the AI went from synthesizing to plagiarizing. 

However, because of how AI consumes data, the reality is more complex. According to Lance Cummings, an associate professor at the University of North Carolina Wilmington, “The ethical debate should be more nuanced, focusing on how these companies acquire and continue to access this data. Knowing when to cite ChatGPT is just like citing our knowledge; it is more complex than just blanketing it all with plagiarism.”

For medical companies, which require a high degree of technical accuracy and proper attribution, avoiding these issues is vital. Fortunately, many companies are already doing some things right when it comes to supporting AI:

  • Using Simplified Technical English for source content and a controlled vocabulary with terminology management for localized content (a simple terminology check is sketched after this list)
  • Implementing structured authoring and component-based content management using DITA or other XML architectures
  • Developing robust semantic models and taxonomies in all the languages that they localize into
  • Documenting their content ecosystems, including localized content
  • Establishing clear workflows, roles and responsibilities, and distribution channels
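
As a minimal sketch of the terminology-management item above, the snippet below flags deprecated terms against a small approved termbase before content is handed to an AI system or a localization workflow. The termbase entries here are hypothetical.

```python
import re

# Hypothetical termbase: deprecated or informal terms mapped to the approved
# controlled-vocabulary equivalents.
TERMBASE = {
    "shot": "injection",
    "belly": "abdomen",
    "side effect": "adverse reaction",
}

def check_terminology(text: str):
    """Flag deprecated terms so content is consistent before it is fed to an
    AI system or sent for localization."""
    findings = []
    for deprecated, preferred in TERMBASE.items():
        for match in re.finditer(rf"\b{re.escape(deprecated)}\b", text, re.IGNORECASE):
            findings.append((match.group(), preferred, match.start()))
    return findings

draft = "After the shot, tell your physician about pain in your belly or any other side effect."
for found, preferred, pos in check_terminology(draft):
    print(f"Replace '{found}' with '{preferred}' (offset {pos})")
```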


Although AI has made significant progress, it is only as good as the information it receives. Therefore, it is crucial to provide clear and specific instructions that are well-structured, well-written, and technically accurate.

Effective Prompt Design

As David Cooperrider, the founder of Appreciative Inquiry, said, “We live in a world that our questions create.” The advent of AI requires us to ask better questions to guide the AI properly without introducing or amplifying bias, violating privacy, or manipulating our audience with disinformation. 

Designing effective and ethical prompts requires us to think deeply about what questions we ask and how we ask them. AI might produce entirely different responses to the same challenge depending on how the question is worded, and challenging it with differently worded prompts can help us learn its capabilities and limitations. Guidelines around specificity, clear and simple language, and structure can help us maximize those capabilities while limiting hallucinations and misinformation.
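
As a minimal sketch of those guidelines, the template below names the role, restricts the task to a supplied source, and states the expected structure of the answer. The wording and field names are illustrative only, not a vendor-specific prompt format.

```python
# A minimal, hypothetical prompt template: name the role, restrict the task
# and sources, and state the expected structure and length of the answer.
PROMPT_TEMPLATE = """You are assisting a medical writer.
Task: Summarize the source text below for a patient audience at a
6th-grade reading level.
Rules:
- Use only the information in the source text; do not add facts.
- If the source text does not answer the question, say so.
- Keep the summary under 120 words and use plain language.
Source text:
{source_text}
Question: {question}
"""

def build_prompt(source_text: str, question: str) -> str:
    return PROMPT_TEMPLATE.format(source_text=source_text, question=question)

print(build_prompt(
    source_text="The recommended dose is 10 mg once daily, taken with food.",
    question="How should the medicine be taken?",
))
```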

While effective AI development requires a multidisciplinary approach, linguists and content professionals bring critical skills in culture, context, communication, content strategy and analysis, and controlled language, helping ensure that AI technologies are not only technologically advanced but also socially responsible and ethically grounded.

Sustainability

AI places substantial energy demands on data centers, and awareness of this impact is growing. While energy use is not directly within the purview of localization and content professionals, an ethics framework is only complete if it considers this new technology’s environmental impact. Companies can and should investigate ways to mitigate energy consumption.

Conclusion

While technology itself is generally neutral from an ethical perspective, how we use it can significantly impact our quality of life at both a societal and an individual level. We get to choose the future we want to see: a society where AI and other technologies enhance our lives and enable everyone to live to their fullest potential.

An attitude of cautious optimism, a dash of healthy skepticism, and the establishment of effective, ethical guidelines can lead us to a more positive future. We are at the beginning of this journey, and success requires continuous multidisciplinary dialogue to create AI that works for the greater good.