healthcare | VALIANT

Developing an expanded version of My Diabetes Care in English and Spanish: A design and formative usability study

waddelma — Wed, 17 Jun 2026 19:13:10 +0000

Rodriguez-Baron, Elsa B.; Rodriguez, Jorge A.; Samal, Lipika; Anders, Shilo; Beebe, Russell; Reale, Carrie; Elasy, Tom; Hackstadt, Amber J.; Yu, Zhihong; Mayberry, Lindsay; Nelson, Lyndsay A.; Rosenbloom, S. Trent; Wright, Adam; Nigg, Audriana; Martinez, William. (2026). Digital Health.

This study aimed to improve the My Diabetes Care (MDC) patient portal, a digital tool that helps people manage diabetes, by making the interface easier to use and adding more health information, such as weight, body mass index (BMI), and urine microalbumin, which is a marker used to check for kidney damage. The researchers also created a Spanish-language version designed to be more culturally appropriate and accessible for Spanish-speaking patients. To do this, they used a five-step design process and tested the prototype with 12 adults with type 2 diabetes from primary care clinics. Participants completed tasks in the portal and gave feedback through interviews and usability surveys. The team also worked with Spanish-speaking community consultants to help shape the language, design, and onboarding support for the Spanish version. After two rounds of testing, the portal was improved in terms of both layout and wording, and all participants were able to complete the tasks successfully in the final round. Usability scores were very high, suggesting that the system was easy to use. Overall, the study shows that involving users and community members in the design process can help create a diabetes self-management tool that is both practical and responsive to the needs of diverse patients.

Figure 2. Illustrative examples of selected design sprint activities. (a) Related “how might we” statements grouped into a category. (b) Design sketches with dot stickers to visually indicate individual preferences and collectively prioritize design components.

The environmental impact of diagnostic imaging: opportunities for pediatric radiologists

waddelma — Wed, 17 Jun 2026 18:38:47 +0000

Gross, Jonathan; Pruthi, Sumit; Omary, Reed A.; Snyder, Elizabeth J. (2026). Pediatric Radiology.

Healthcare is one of the major sources of greenhouse gas emissions worldwide, and medical imaging contributes to that impact because it uses a lot of energy. Although people have become more concerned about radiology’s role in climate change, most of what is known comes from adult imaging departments. This matters for pediatric radiology because children are especially vulnerable to the health effects of climate change, which can affect their wellbeing in many ways. The purpose of this manuscript is to review the main environmental impacts of radiology and point to areas where the field could become more sustainable.

Fig. 1

Patient-reported outcomes for monitoring substance use treatment: A systematic review of single-item measures

waddelma — Tue, 26 May 2026 21:15:22 +0000

Reese, Thomas J.; Tindle, Hilary A.; Bachmann, Justin; Wright, Adam; Ancker, Jessica S.; Audet, Carolyn M.; Shah, Mauli V.; Steitz, Bryan D.; Levin, Michael H.; Kast, Kristopher A.; Marcovitz, David; von Horn, Amanda; Kelley, A. Taylor; Bridges, John F. P. (2026). Addiction.

Measurement-based care is a structured way for clinicians to track treatment progress by using repeated, standardized check-ins to guide decisions. In substance use treatment, this approach can improve outcomes, but it is often difficult to use because many questionnaires are long. This review looked at whether very short patient-reported outcome measures, or PROMs, which ask patients to rate a single issue with one question, can work well in this setting. The researchers searched the medical literature for studies of single-item PROMs in adults receiving substance use treatment and evaluated how well these measures performed, including whether they measured what they were supposed to measure, gave consistent results, predicted outcomes, and detected change over time. They found 35 studies covering 68 single-item measures and more than 50,000 participants across nine clinical topics. Fifteen measures had enough evidence to be considered strong overall. The best-supported single-item measures were those assessing craving, readiness for treatment, and self-efficacy, meaning a person’s confidence in their ability to succeed. In some cases, these single-question measures worked as well as longer questionnaires. However, more than half of the measures did not yet have clear cutoffs for interpreting scores or enough evidence that they could reliably detect improvement or worsening over time. Overall, the review suggests that single-item PROMs can be practical and useful tools for routine monitoring in substance use treatment, especially when they have strong evidence and clear scoring thresholds, but some are better used alongside other measures rather than alone.

FIGURE 1

Coverage of single-item patient-reported outcome measure studies across substance type and clinical construct. Darker boxes indicate a higher number of studies.

Auditor models to suppress poor artificial intelligence predictions can improve human-artificial intelligence collaborative performance

waddelma — Thu, 26 Mar 2026 19:29:10 +0000

Katherine E. Brown; Jesse O. Wrenn; Nicholas J. Jackson; Michael R. Cauley; Benjamin X. Collins; Laurie L. Novak; Bradley A. Malin; Jessica S. Ancker (2026). Journal of the American Medical Informatics Association, 33(3), 621–631.

This study examines how machine learning (ML) systems, which are often used to support healthcare decisions, can sometimes produce unfair results, meaning their predictions may be less accurate for certain patient groups. A key concern is that clinicians may rely too heavily on these systems, which can unintentionally reinforce these biases. The researchers explored a strategy called ML suppression, which means selectively “silencing” or withholding certain AI predictions when they are likely to be unreliable, based on an auditing process. They also looked at whether incorporating uncertainty estimates (how confident the model is in its predictions) could help decide when to suppress outputs.
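The suppression idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the study's implementation: the function name `team_predictions`, the idea that the auditor emits a per-case risk score, the thresholds, and the simplification that a clinician who sees an AI prediction adopts it.

```python
from typing import List, Optional, Sequence


def team_predictions(
    clinician_probs: Sequence[float],
    ai_probs: Sequence[float],
    audit_risk: Sequence[float],
    risk_threshold: float = 0.5,
    ai_uncertainty: Optional[Sequence[float]] = None,
    max_uncertainty: float = 0.3,
) -> List[float]:
    """Return one prediction per case for a simulated clinician-AI team.

    Hypothetical rule: the AI prediction is shown (and adopted) unless the
    auditor's risk score, or optionally the AI's own uncertainty, is too high.
    When the AI output is suppressed, the clinician's estimate is used.
    """
    team = []
    for i, (clin, ai, risk) in enumerate(zip(clinician_probs, ai_probs, audit_risk)):
        suppress = risk >= risk_threshold
        if ai_uncertainty is not None and ai_uncertainty[i] > max_uncertainty:
            suppress = True  # uncertainty gives a second reason to withhold
        team.append(clin if suppress else ai)
    return team
```

In this sketch, passing `ai_uncertainty` can only add suppressions, which mirrors the idea of using uncertainty as an extra signal for when to withhold an output.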

Using large hospital datasets, the team simulated how clinicians and ML systems would work together to predict outcomes like death, ICU admission, or hospital readmission. They compared different scenarios, including when the AI performed better than clinicians and when it performed worse. They evaluated both accuracy (using a standard metric called AUC, which measures how well predictions distinguish outcomes) and fairness (measured by differences in error rates across groups).
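For readers unfamiliar with the two metrics, the sketch below shows one way they could be computed. It is illustrative only: the summary does not say which error rate the authors compared across groups, so the use of the false negative rate here is an assumption, as are the function names.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case scores above a random negative one."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


def false_negative_rate(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Share of truly positive cases that were predicted negative."""
    preds_for_positives = [p for y, p in zip(labels, predictions) if y == 1]
    if not preds_for_positives:
        raise ValueError("no positive cases in this group")
    return sum(1 for p in preds_for_positives if p == 0) / len(preds_for_positives)


def error_rate_gap(labels, predictions, groups) -> float:
    """Largest minus smallest false negative rate across groups (assumed fairness gap)."""
    rates = []
    for g in set(groups):
        idx = [i for i, x in enumerate(groups) if x == g]
        rates.append(
            false_negative_rate([labels[i] for i in idx], [predictions[i] for i in idx])
        )
    return max(rates) - min(rates)
```

A gap of 0 would mean the chosen error rate is identical across groups; larger values indicate that one group's cases are missed more often than another's.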

The results showed that when the AI model performed better than clinicians, using suppression improved overall performance without making fairness worse. When clinicians performed better, relying on human judgment alone was often as fair or fairer than using suppressed AI predictions. Importantly, adding uncertainty information helped improve results further by better identifying when AI predictions should be ignored. Overall, the study suggests that carefully filtering out low-quality AI predictions can improve both the effectiveness and fairness of human–AI collaboration in healthcare.

Figure 1.

Schematic indicating the collaboration scenario with and without suppression.³⁴

Clinician Needs and Requirements for a Decision Aid Navigator: Qualitative Study

waddelma — Wed, 28 Jan 2026 17:17:08 +0000

Morse, Brad; Reale, Carrie; Nguyen, An T.; Latella, Erin; Bauguess, Hannah D.; Anders, Shilo H.; Roberts, Pamela S.; SooHoo, Spencer L.; El-Kareh, Robert E.; Soares, Andrey; & Schilling, Lisa M. (2025). JMIR Human Factors, 12, e69756.

Decision aids are tools that help patients and clinicians make healthcare decisions together, improving patient knowledge, reducing regret, and encouraging meaningful discussion. However, many clinicians do not use these tools because of time constraints, difficulty matching aids to patient needs, the need to leave the electronic health record (EHR) to access them, and manual data entry.

This study explored clinician needs to design an EHR-integrated app called DEAN (Decision Aid Navigator), built on the SMART on FHIR platform. DEAN identifies decision aids relevant to a patient’s conditions, current treatments, and demographics, and helps document shared decision-making discussions.

Researchers interviewed 13 clinicians from four academic medical centers while showing a prototype of DEAN. Analysis of the interviews revealed three key needs: (1) streamlined functionality to reduce workflow burden, (2) clinician skills to use the app and decision aids effectively, and (3) trust that the app suggests pre-vetted decision aids. Clinicians agreed that EHR integration was essential for adoption.

The study concludes that improving tools like DEAN and integrating them into the EHR can help clinicians use decision aids more efficiently, supporting shared decision-making and potentially increasing patient-centered care.

Figure 1. The 5 rights of clinical decision support (adapted from []).

Evaluating cell AI foundation models in kidney pathology with human-in-the-loop enrichment

waddelma — Fri, 19 Dec 2025 16:47:48 +0000

Guo, J., Lu, S., Cui, C., Deng, R., Yao, T., Tao, Z., Lin, Y., Lionts, M., Liu, Q., Xiong, J., Wang, Y., Zhao, S., Chang, C. E., Wilkes, M., Fogo, A. B., Yin, M., Yang, H., & Huo, Y. (2025). Communications Medicine, 5(1), 495.

Large artificial intelligence foundation models are becoming important tools in healthcare, including digital pathology, where they help analyze medical images. Many of these models have been trained to handle complex tasks such as diagnosing diseases or measuring tissue features using very large and diverse datasets. However, it is less clear how well they perform on more focused tasks, such as identifying and outlining cell nuclei within images from a single organ like the kidney. This study examines how well current cell foundation models perform on this task and explores practical ways to improve them.

To do this, the researchers assembled a large dataset of 2,542 kidney whole slide images collected from multiple medical centers, covering different kidney diseases and even different species. They evaluated three widely used, state-of-the-art cell foundation models—Cellpose, StarDist, and CellViT—for their ability to segment cell nuclei. To improve performance without requiring extensive, time-consuming pixel-level annotations from experts, the team introduced a “human-in-the-loop” approach. This method combines predictions from multiple models to create higher-quality training labels and then refines a subset of difficult cases with corrections from pathologists. The models were fine-tuned using this enriched dataset, and their segmentation accuracy was carefully measured.
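One simple way to picture combining predictions from several models is a pixel-wise majority vote, with low-agreement regions flagged for expert correction. The sketch below is an illustration under that assumption only; the summary does not specify how the authors merged model outputs or selected hard cases, and the function names are hypothetical.

```python
from typing import List

Mask = List[List[int]]  # binary mask as rows of 0/1 pixels


def majority_vote(masks: List[Mask]) -> Mask:
    """Label a pixel as nucleus (1) when more than half of the models agree."""
    n_models = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [1 if 2 * sum(m[r][c] for m in masks) > n_models else 0 for c in range(cols)]
        for r in range(rows)
    ]


def disagreement(masks: List[Mask]) -> float:
    """Fraction of pixels on which the models are not unanimous."""
    n_models = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    mixed = sum(
        1
        for r in range(rows)
        for c in range(cols)
        if 0 < sum(m[r][c] for m in masks) < n_models
    )
    return mixed / (rows * cols)
```

Under this assumption, regions with a high `disagreement` value would be natural candidates to send to a pathologist, while the voted mask could serve as an automatic training label elsewhere.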

The results show that accurately segmenting cell nuclei in kidney pathology remains challenging and benefits from models that are more specifically tailored to this organ. Among the three models, CellViT showed the best initial performance, with an F1 score of 0.78. After fine-tuning with the improved training data, all models performed better, with StarDist reaching the highest F1 score of 0.82. Importantly, combining automatically generated labels from foundation models with a smaller set of pathologist-corrected “hard” image regions consistently improved performance across all models.
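As a reminder of what the F1 score measures, the sketch below computes it from counts of matched and unmatched detections. The counts in the usage note are hypothetical and chosen only to reproduce a value of 0.82; how predicted nuclei are matched to ground-truth nuclei is not described in this summary.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall, written in its count form."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        return 0.0  # nothing predicted and nothing to find
    return 2 * true_positives / denominator
```

For example, with 82 correctly detected nuclei, 18 spurious detections, and 18 missed nuclei (hypothetical numbers), `f1_score(82, 18, 18)` gives 0.82.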

Overall, this study provides a clear benchmark for evaluating and improving cell AI foundation models in real-world pathology settings. It also demonstrates that high-quality nuclei segmentation can be achieved with much less expert annotation, supporting more efficient and scalable workflows in clinical pathology without sacrificing accuracy.

Fig. 1: Overall framework.

The upper panel (a–c) illustrates the diverse evaluation dataset consisting of 2542 kidney WSIs. a shows the number of kidney WSIs in publicly available cell nuclei datasets versus our evaluation dataset, which exceeds existing datasets by a large margin. b depicts the diverse data sources included in our dataset. c indicates that these WSIs were stained using Hematoxylin and Eosin (H&E), Periodic acid–Schiff methenamine (PASM), and Periodic acid–Schiff (PAS). Performance: Kidney cell nuclei instance segmentation was performed using three SOTA cell foundation models: Cellpose, StarDist, and CellViT. Model performance was evaluated based on qualitative human feedback for each prediction mask. Data Enrichment: A human-in-the-loop (HITL) design integrates prediction masks from performance evaluation into the model’s continual learning process, reducing reliance on pixel-level human annotation.

Human-centered design of an artificial intelligence monitoring system: the Vanderbilt Algorithmovigilance Monitoring and Operations System

waddelma — Sun, 23 Nov 2025 16:58:07 +0000

Salwei, Megan E., Davis, Sharon E., Reale, Carrie, Novak, Laurie Lovett, Walsh, Colin G., Beebe, Russ, Nelson, Scott D., Sundrani, Sameer, Rose, Susannah L., Wright, Adam T., Ripperger, Michael A., Shave, Peter, & Embi, Peter J. (2025). JAMIA Open, 8(5), ooaf136.

As artificial intelligence (AI) becomes more common in healthcare, there is growing awareness that these systems need continuous oversight after they are put into use, a process known as algorithmovigilance. However, few tools exist to help hospitals consistently monitor and manage the performance of AI across their entire organization. In this study, we worked to understand what end users need from such a system while designing a new monitoring platform called the Vanderbilt Algorithmovigilance Monitoring and Operations System (VAMOS). To do this, we brought together a multidisciplinary team at Vanderbilt Medical Center and held nine participatory design sessions with clinicians, leaders, and technical experts to create early prototypes. After developing a working version, we conducted eight additional interviews to gather feedback and used rapid qualitative analysis to refine the design. A multidisciplinary heuristic evaluation then helped identify more ways to improve the system. Through this human-centered, iterative process, we identified the key features an AI monitoring system must include, such as specific data displays, performance dashboards, expandable “accordion” summaries, and model-specific pages that meet the needs of a wide range of users. We also outlined general design principles for long-term AI monitoring, highlighting the challenge of supporting teams spread across the health system as they track performance issues and respond to signs of algorithm deterioration. Ultimately, VAMOS is intended to help healthcare organizations monitor AI tools in a systematic and proactive way, with the goal of improving care quality and ensuring patient safety.

Figure 1.

Overview of human-centered design process to develop VAMOS.

AI-Driven Clinical Decision Support to Reduce Hospital-Acquired Venous Thromboembolism: A Trial Protocol

waddelma — Thu, 23 Oct 2025 19:20:17 +0000

Walsh, Colin G.; Long, Yufei; Novak, Laurie Lovett; Salwei, Megan E.; Tillman, Benjamin F.; French, Benjamin C.; Mixon, Amanda S.; Law, Michelle E.; Franklin, Jacob; Embi, Peter J. (2025). JAMA Network Open, 8(10), e2535137.

Hospital-acquired venous thromboembolism (HA-VTE), or blood clots that develop in the veins during or after a hospital stay, remains one of the leading preventable causes of death among hospitalized adults in the United States. Although many models have been created to predict which patients are most at risk, none have clearly proven to be more effective than others, and it is still uncertain whether these models actually improve doctors’ decisions about preventive treatment. Testing these systems in both urban and rural hospitals may help determine how well they work across different healthcare environments.

This study is a randomized clinical trial designed to test whether an artificial intelligence (AI)–based clinical decision support (CDS) tool can reduce the number of HA-VTE cases among adult hospital patients. The trial will be conducted by Vanderbilt Medical Center from October 2025 through September 2027 and will include adults aged 18 and older who are hospitalized in medical, surgical, or intensive care units and are at high risk for blood clots, but who do not currently have a clot or a condition that rules out preventive treatment. Participants will be drawn from Vanderbilt Adult Hospital in Nashville and three partner hospitals serving rural communities in Middle Tennessee.

Within the hospital’s electronic health record system, patients will be randomly assigned to receive either AI-supported care, which uses an alert system to prompt clinicians about clot prevention, or standard care based on traditional risk assessment tools. The main goal of the study is to determine whether the AI tool reduces the number of hospital-acquired blood clots. Additional measures will include hospital length of stay, readmission rates, safety outcomes, and bleeding events.

This study will be one of the first to examine whether an AI-driven decision support system can safely and effectively lower the risk of hospital-acquired blood clots without increasing side effects. It will also assess whether the same AI model performs equally well in both urban and rural hospitals. The results and supporting data will be shared publicly through peer-reviewed publications and ClinicalTrials.gov.

Figure 1. Intervention OurPractice Advisories Logic

BPA indicates best practice advisory; CDS, clinical decision support; DVT, deep vein thrombosis; VTE-AI, Venous Thromboembolism Using Artificial Intelligence.

Assessing the clinical utility of biomarkers using the intervention probability curve (IPC)

waddelma — Thu, 23 Oct 2025 19:05:51 +0000

Paez, Rafael; Rowe, Dianna J.; Deppen, Stephen A.; Grogan, Eric L.; Kaizer, Alexander M.; Bornhop, Darryl J.; Kussrow, Amanda K.; Barón, Anna E.; Maldonado, Fabien; Kammer, Michael N. (2025). Cancer Biomarkers, 42(1), CBM230054.

Before new medical tests, or biomarkers, are used in clinics, it is important to understand how useful they are for guiding patient care. One way to do this is to see how a test might change which patients are assigned to different treatment groups, but traditional methods have some limitations. To address this, researchers developed the intervention probability curve (IPC), which models how likely a doctor is to choose a particular treatment based on a patient’s estimated risk of disease.

In this study, the IPC was used to evaluate a new biomarker for suspected lung cancer, using data from the National Lung Screening Trial. The analysis estimated how the biomarker would affect decisions about interventions, such as biopsies or surgeries. The results suggested that 8% of patients with non-cancerous nodules could avoid unnecessary invasive procedures, while patients with actual cancer nodules would almost always still receive appropriate care (only 0.1% change).

Compared with traditional methods, the IPC provides a more detailed and continuous view of how a biomarker could influence clinical decisions. This approach shows that the IPC can be a valuable tool for assessing the potential impact of new biomarkers before they are implemented in everyday clinical practice.
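The core idea of an intervention probability curve can be sketched as a function that maps estimated disease risk to the probability of intervening, then compares that probability before and after a biomarker updates the risk estimate. The logistic shape, the `midpoint` and `steepness` parameters, and the function names below are all assumptions made for illustration; this summary does not describe the curve the authors actually fitted.

```python
import math
from typing import Sequence


def intervention_probability(risk: float, midpoint: float = 0.5, steepness: float = 10.0) -> float:
    """Hypothetical smooth curve: probability of intervening rises with estimated risk."""
    return 1.0 / (1.0 + math.exp(-steepness * (risk - midpoint)))


def mean_change_in_intervention(pre_risks: Sequence[float], post_risks: Sequence[float]) -> float:
    """Average change in intervention probability after the biomarker updates each risk."""
    changes = [
        intervention_probability(post) - intervention_probability(pre)
        for pre, post in zip(pre_risks, post_risks)
    ]
    return sum(changes) / len(changes)
```

In this sketch, a negative mean change among patients with benign nodules would correspond to fewer expected invasive procedures, while a mean change near zero among patients with cancer would indicate that appropriate care is largely preserved.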

Figure 1.

Population-based assessment of changes in intervention probability. While the means of the distributions are similar, the spread of the distributions shows that the change in probability is more tightly clustered around zero in the cancer population than in the benign population.

Bedtime sliding scale insulin is unnecessary for hospitalized patients with bedtime glucose < 300 mg/dL: A nudge-based quasi-experiment

waddelma — Fri, 26 Sep 2025 19:56:15 +0000

Flory, James H., Vertosick, Emily Ann, Kuperman, Gilad J., Ancker, Jessica S., Kim, Scott Y.H., Fitzpatrick, Christine, Gould, Kimberly, Weiss, Everett, & Vickers, Andrew J. (2025). Diabetes Research and Clinical Practice, 228, 112428.

This study looked at how bedtime rapid-acting insulin is used in hospitalized patients with moderately high blood sugar, particularly in populations like cancer patients where previous research may not apply. Researchers changed the standard insulin order so that rapid-acting insulin at bedtime would only be automatically suggested for glucose levels of 300 mg/dL or higher. About half of the providers used this new order set over a two-month period, allowing comparison with the original approach. Among 458 patients, the new order set led to a 91% increase in the use of a less-aggressive insulin plan and lowered average morning glucose by 16 mg/dL. These findings suggest that rapid-acting bedtime insulin is not needed for glucose levels below 300 mg/dL and highlight that simple changes to order sets can be used to run effective, low-cost clinical trials without disrupting usual patient care.

Fig. 1. Study schema.