When applied to machine learning models and other artificial intelligence (AI) systems, the Gerchberg-Saxton (GS) algorithm distributes a dataset’s racial and ethnic biomedical information more equitably and uniformly, researchers reported in study findings published in the Journal of Healthcare Informatics Research. They demonstrated that the algorithm created parity across a group of patients spanning a broad range of population backgrounds.

For the study, researchers from the National Cancer Institute (NCI), Wake Forest School of Medicine, and the University of North Carolina used the Medical Information Mart for Intensive Care III (MIMIC-III), version 1.4, database to test the GS algorithm. They said that MIMIC-III has been widely used to study clinical machine learning and AI and that it is “known for its inherent selection biases and disparities.”

Initially, the researchers selected a sub-dataset of 13,980 patients from 36 different racial and ethnic groups, but for the purposes of their analysis, they reclassified the sample into 5 racial and ethnic groups based on patients’ self-reported common ancestral heritage: 

  • European American (n = 9,814)
  • African American (n = 1,690)
  • Eastern Asian American (n = 346)
  • Hispanic American (n = 641)
  • Other (i.e., unknown or not reported; n = 1,489)

“The MIMIC-III data has previously been shown to have a potential bias that can adversely impact the accuracy of predictive models,” the researchers wrote. “Therefore, our study demonstrates that we can mitigate racio-ethnic bias allowing for more equitable and unbiased estimates.”

The GS algorithm was originally developed to recover phase patterns in light and electron waves, the researchers explained, and it is most frequently used in holographic imaging to reconstruct or complete images with unknown patterns. It typically converges within 20 processing cycles.
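
The study repurposes this iterative scheme for tabular patient data; the paper’s adaptation is not reproduced here, but a minimal sketch of the classic phase-retrieval loop, assuming NumPy and two known amplitude patterns, shows the basic back-and-forth structure:

```python
import numpy as np

def gerchberg_saxton(source_amp, target_amp, n_iters=20):
    """Classic Gerchberg-Saxton phase retrieval (illustrative sketch).

    Alternates between two planes related by a Fourier transform,
    enforcing the known amplitude in each plane while keeping the
    evolving phase estimate.
    """
    rng = np.random.default_rng(0)
    # Start from a random phase guess in the source plane.
    field = source_amp * np.exp(1j * rng.uniform(0, 2 * np.pi, source_amp.shape))
    for _ in range(n_iters):
        # Propagate forward and impose the known target-plane amplitude.
        target_field = np.fft.fft2(field)
        target_field = target_amp * np.exp(1j * np.angle(target_field))
        # Propagate back and impose the known source-plane amplitude.
        field = np.fft.ifft2(target_field)
        field = source_amp * np.exp(1j * np.angle(field))
    return np.angle(field)  # recovered source-plane phase

# Hypothetical example: a uniform beam shaped onto a square target pattern.
source = np.ones((64, 64))
target = np.zeros((64, 64))
target[24:40, 24:40] = 1.0
recovered_phase = gerchberg_saxton(source, target, n_iters=20)
```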

For their study, the researchers applied the GS algorithm to the patient dataset for 50 cycles, emphasizing that they applied it before splitting the cohort into training, test, and validation datasets. They also used Shapley Additive Explanations (SHAP), a method that assigns an importance value to each feature in a machine learning model to estimate that feature’s impact on the model’s output, “to provide a more definitive and substantiated view on the effectiveness of the GS algorithm in mitigating racial bias. It is commonly used in machine learning to explain the output of a model by identifying the contribution of each feature to the predicted outcome,” the researchers said. Finally, they used the Demographic Parity and Error Rate Parity fairness constraints to evaluate the fairness of their AI model by assessing the bias in its outcomes.
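
The study’s evaluation code is not published with the article; as a rough illustration of what these two fairness criteria measure, one might compute per-group gaps as below, where the group labels and predictions are hypothetical (a gap near 0 indicates parity):

```python
import numpy as np

def demographic_parity_gap(y_pred, groups):
    """Largest difference in positive-prediction rate across groups."""
    rates = [y_pred[groups == g].mean() for g in np.unique(groups)]
    return max(rates) - min(rates)

def error_rate_parity_gap(y_true, y_pred, groups):
    """Largest difference in misclassification rate across groups."""
    errors = [(y_pred[groups == g] != y_true[groups == g]).mean()
              for g in np.unique(groups)]
    return max(errors) - min(errors)

# Hypothetical cohort with 5 group labels and random predictions.
rng = np.random.default_rng(1)
groups = rng.choice(["EA", "AA", "EAA", "HA", "Other"], size=1000)
y_true = rng.integers(0, 2, size=1000)
y_pred = rng.integers(0, 2, size=1000)
print(demographic_parity_gap(y_pred, groups))
print(error_rate_parity_gap(y_true, y_pred, groups))
```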

After applying the GS algorithm, the researchers observed two significant outcomes:

  • Greater parity across racial and ethnic groups, indicating a bias reduction 
  • Significant increase in overall prediction accuracy for the post–GS-trained model 

“The results of these analyses verify that a more uniform feature contribution indicates a more equitable training process,” the researchers concluded. “While further research is required to investigate the full capacity and performance of the GS algorithm in other settings and with other modalities to explore its full potential in medical applications, we believe that the implications of our study are significant and have the potential to advance current, ongoing efforts investigating bias mitigation.”
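
The quoted conclusion rests on comparing SHAP-style feature-contribution profiles across groups. As a hypothetical sketch of that idea (not the authors’ analysis): for a linear model with roughly independent features, SHAP values reduce to coef_j * (x_ij - mean_j), so per-group mean absolute contributions can be compared directly:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: 4 features, 3 group labels, a synthetic binary outcome.
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 4))
y = (X[:, 0] - X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(int)
groups = rng.choice(["EA", "AA", "Other"], size=500)

model = LogisticRegression().fit(X, y)
# Linear-SHAP approximation: contribution of feature j to sample i.
linear_shap = model.coef_[0] * (X - X.mean(axis=0))

# Similar per-feature profiles across groups would suggest the "uniform
# feature contribution" the researchers describe.
for g in np.unique(groups):
    profile = np.abs(linear_shap[groups == g]).mean(axis=0)
    print(g, np.round(profile, 3))
```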

Learn more about AI and its applications in oncology nursing and cancer care on the Oncology Nursing Podcast.