Challenges for Evaluating AI and Practical Implications

Artificial intelligence (AI) promises to transform clinical decision-making. It has the potential to harness the vast amounts of genomic, biomarker, and phenotype data being generated across the health system, including from health records and delivery systems, to improve the safety and quality of care decisions.

Technological developments are outpacing our ability to predict the effects of AI on the practice of medicine, on the care patients receive, and on patients' lives. In the immediate future, we can expect AI to support clinical decisions with humans in the decision loop.

A recent paper by Associate Professor Magrabi and other members of the International Medical Informatics Association (IMIA) Working Group on Technology Assessment & Quality Development in Health Informatics and the European Federation for Medical Informatics (EFMI) Working Group for Assessment of Health Information Systems shares insights on the application of Artificial Intelligence in Clinical Decision Support.

Objectives

This paper draws attention to:

  1. key considerations for evaluating artificial intelligence (AI) enabled clinical decision support;
  2. challenges and practical implications of AI design, development, selection, use, and ongoing surveillance.

Method

A narrative review of existing research and evaluation approaches along with expert perspectives drawn from the IMIA Working Group on Technology Assessment and Quality Development in Health Informatics and the EFMI Working Group for Assessment of Health Information Systems.

Results

There is a rich history and tradition of evaluating AI in healthcare. While evaluators can learn from past efforts and build on best-practice evaluation frameworks and methodologies, questions remain about how to evaluate the safety and effectiveness of AI that dynamically harnesses vast amounts of genomic, biomarker, phenotype, electronic record, and care delivery data from across health systems. This paper first provides a historical perspective on the evaluation of AI in healthcare. It then examines key challenges of evaluating AI-enabled clinical decision support during design, development, selection, use, and ongoing surveillance. Practical aspects of evaluating AI in healthcare, including approaches to evaluation and indicators to monitor AI, are also discussed.

Conclusion

Commitment to rigorous initial and ongoing evaluation will be critical to ensuring the safe and effective integration of AI in complex sociotechnical settings. Specific enhancements that are required for the new generation of AI-enabled clinical decision support will emerge through practical application.

Read the full paper.

Associate Professor Farah Magrabi leads the Safety and Quality of Digital Health Systems Research Stream of the CRE in Digital Health. She is Co-chair of the IMIA Working Group on Technology Assessment & Quality Development in Health Informatics, and Co-chair of the working group on Safety, Quality and Ethics of the Australian Alliance for Artificial Intelligence in Health (AAAiH).