Talks/Videos

Lessons from the regulatory process for medical software for image analysis and AI (Dec 12, 2022)

This video is a recording of a seminar that was part of the National Institute on Aging (NIH/NIA) Artificial Intelligence Lecture Series. It was presented on December 12, 2022.

Abstract

Artificial intelligence/machine learning (AI/ML) algorithms are currently driving much of the new research in medical image analysis. The growth in the number of image analysis publications using such techniques has been exponential. Similarly, in the medical software world, we have seen an explosion in the number of FDA-cleared standalone medical software devices, known as SaMD (Software as a Medical Device), also largely fueled by AI/ML methods. Recent developments in AI/ML (primarily deep learning) have been surrounded by a large amount of hype and overpromise, a phenomenon that is common in the history of AI. One of the major problems we face is how to avoid overlearning or overtraining an algorithm on the relatively small training datasets available (as compared to what is used for non-medical applications). Researchers in the field are familiar with how an algorithm’s performance can deteriorate over time as it is applied to data from slightly different scanners (or even the same scanner after a minor software upgrade), both of which are fundamentally due to such overtraining. So, while there are many papers advertising exceptional performance, much of that performance is artificially inflated. The situation is analogous to the p-hacking (reproducibility) crisis seen in other areas of science. In this talk, I will review the medical software regulatory process and recent developments in the use of AI in medical image analysis, and present some thoughts as to how some of the procedures used in regulated medical software development (explicit quality procedures, risk classification, risk management, usability engineering, external validation) could be applied to AI/ML to allow this potentially game-changing technology to transform human health.

Bibliography

1. Papademetris X, Quraishi AN, Licholai GP. Introduction to Medical Software: Foundations for Digital Health, Devices and Diagnostics. Cambridge University Press; 2022. (Cambridge Texts in Biomedical Engineering). (See www.medsoftbook.com)

2. Papademetris X. Coursera Class: Introduction to Medical Software [Internet]. Coursera. 2021 [cited 2021 Oct 15]. Available from: https://www.coursera.org/learn/introduction-to-medical-software/

3. Dreyfus HL. Alchemy and Artificial Intelligence. Rand Corporation; 1965. Available from: https://www.rand.org/content/dam/rand/pubs/papers/2006/P3244.pdf

4. NEW NAVY DEVICE LEARNS BY DOING; Psychologist Shows Embryo of Computer Designed to Read and Grow Wiser [Internet]. New York Times. 1958 [cited 2022 Oct 16]. Available from: http://fastml.com/images/ai/new_navy_device_learns_by_doing.jpg

5. Creative Destruction Lab. Geoff Hinton: On Radiology [Internet]. Youtube; 2016 [cited 2023 Jan 9]. Available from: https://www.youtube.com/watch?v=2HMPRXstSvQ

6. International Organization for Standardization and International Electrotechnical Commission. ISO/IEC 22989:2022: Information technology — Artificial intelligence — Artificial intelligence concepts and terminology [Internet]. Geneva, CH; 2022 [cited 2023 Jan 9]. Report No.: 22989. Available from: https://www.iso.org/standard/74296.html

7. International Medical Device Regulators Forum (IMDRF): SaMD Working Group. Software as a Medical Device (SaMD): Application of Quality Management System [Internet]. 2015. Available from: http://www.imdrf.org/docs/imdrf/final/technical/imdrf-tech-151002-samd-qms.pdf

8. International Medical Device Regulators Forum (IMDRF): SaMD Working Group. Software as a Medical Device: Possible Framework for Risk Categorization and Corresponding Considerations [Internet]. 2014. Available from: http://www.imdrf.org/docs/imdrf/final/technical/imdrf-tech-140918-samd-framework-risk-categorization-141013.pdf

9. International Electrotechnical Commission (IEC). IEC 62304 Medical device software -- Software life cycle processes [Internet]. Geneva, CH; 2006. Available from: https://www.iso.org/standard/38421.html

10. Duda R, Hart P. Pattern Classification and Scene Analysis. New York: Wiley; 1973.

11. LeCun Y, Bengio Y, Hinton G. Deep Learning. Nature [Internet]. 2015;521(7553):436–44. Available from: https://doi.org/10.1038/nature14539

12. Goodfellow I, Bengio Y, Courville A. Deep Learning [Internet]. MIT Press; 2016. Available from: https://www.deeplearningbook.org/

13. Lorenz EN. The Essence of Chaos (Jessie and John Danz Lectures). 1st ed. University of Washington Press; 1995.

14. Ventana al Conocimiento (Knowledge Window). When Lorenz Discovered the Butterfly Effect [Internet]. OpenMind. 2015 [cited 2022 Jun 3]. Available from: https://www.bbvaopenmind.com/en/science/leading-figures/when-lorenz-discovered-the-butterfly-effect/

15. Zeidenberg J. AI algorithms tend to malfunction, a problem of ‘drift’, observers say. Canadian Healthcare Technology [Internet]. 2022 Sep;27(6):12. Available from: https://www.canhealth.com/wp-content/uploads/2022/09/Canadian-Healthcare-Technology-2022-06.pdf

16. Onofrey JA, Casetti-Dinescu DI, Lauritzen AD, Sarkar S, Venkataraman R, Fan RE, Sonn GA, Sprenkle PC, Staib LH, Papademetris X. Generalizable Multi-Site Training and Testing of Deep Neural Networks Using Image Normalization. Proc IEEE Int Symp Biomed Imaging [Internet]. 2019 Apr;2019:348–51. Available from: https://www.ncbi.nlm.nih.gov/pubmed/32874427

17. Wong A, Otles E, Donnelly JP, Krumm A, McCullough J, DeTroyer-Cooley O, Pestrue J, Phillips M, Konye J, Penoza C, Ghous M, Singh K. External validation of a widely implemented proprietary sepsis prediction model in hospitalized patients. JAMA Intern Med [Internet]. 2021 Aug 1 [cited 2022 Oct 18];181(8):1065–70. Available from: https://pubmed.ncbi.nlm.nih.gov/34152373/

18. Rosenblatt M, Dadashkarimi J, Scheinost D. Enhancement attacks in biomedical machine learning [Internet]. arXiv [stat.ML]. 2023. Available from: http://arxiv.org/abs/2301.01885

19. U.S. Food and Drug Administration (FDA): Center for Devices and Radiological Health. Software as a Medical Device (SAMD): Clinical Evaluation. Guidance for Industry and Food and Drug Administration Staff [Internet]. 2017. Available from: https://www.fda.gov/medical-devices/digital-health-center-excellence/software-medical-device-samd

20. International Organization for Standardization (ISO). ISO 14971:2019 Medical devices -- Application of risk management to medical devices [Internet]. Geneva, CH; 2019. Available from: https://www.iso.org/obp/ui/#!iso:std:72704:en

21. Geirhos R, Jacobsen JH, Michaelis C, Zemel R, Brendel W, Bethge M, Wichmann FA. Shortcut learning in deep neural networks. Nature Machine Intelligence [Internet]. 2020 Nov 10 [cited 2022 Sep 12];2(11):665–73. Available from: https://www.nature.com/articles/s42256-020-00257-z

22. Zech JR, Badgeley MA, Liu M, Costa AB, Titano JJ, Oermann EK. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: A cross-sectional study. PLoS Med [Internet]. 2018 Nov;15(11):e1002683. Available from: http://dx.doi.org/10.1371/journal.pmed.1002683

23. Saporta A, Gui X, Agrawal A, Pareek A, Truong SQH, Nguyen CDT, Ngo VD, Seekins J, Blankenberg FG, Ng AY, Lungren MP, Rajpurkar P. Benchmarking saliency methods for chest X-ray interpretation. Nature Machine Intelligence [Internet]. 2022 Oct 10 [cited 2022 Nov 2];4(10):867–78. Available from: https://www.nature.com/articles/s42256-022-00536-x