Patient Prioritization in RPM Using Multi-Armed Bandits

Overview

In Remote Patient Monitoring (RPM), Chronic Care Management (CCM), and Remote Therapeutic Monitoring (RTM), clinician time is the most valuable—and the most constrained—resource. We built this project to answer a simple but frustrating question that nearly every clinical team faces: once you’ve handled the critical issues, how do you spend your remaining time in a way that is efficient, compliant, and financially sustainable?

Let me be absolutely clear: this project does not prioritize based on revenue at the expense of patient care. Critical alerts, symptoms, and emergencies are handled first by a separate triage algorithm built for safety, not billing. That’s always the priority. But after those have been handled, we’re left with a queue of lower-urgency patients—all deserving of care, but all competing for limited time.

So the question becomes: which of these patients should I call today? Who gets time? Who gets a reminder? And how do I ensure we’re not wasting effort while still providing meaningful care?

The Problem

RPM programs are structured around strict time-based billing codes. For example:

CPT 99457: The first 20 minutes of interactive care per patient per calendar month.
CPT 99458: Add-on code for each additional 20-minute block.

But the 20-minute block is absolute. If you log 19 minutes with a patient in a month, you cannot bill 99457. And if you go over 20 but don’t reach 40, the extra minutes do not earn the 99458 add-on, so from a billing standpoint that time is stranded. So the challenge is aligning care delivery with billing thresholds without treating care delivery as just billing.
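The threshold arithmetic can be made concrete with a minimal Python sketch. The function names are illustrative only and not part of any billing system; the one fact it relies on is the 20-minute block size described above.

```python
BLOCK_MINUTES = 20  # one billable block: 99457 first, then each 99458 add-on


def billable_blocks(minutes_logged: int) -> int:
    """Number of complete 20-minute blocks logged this month."""
    return minutes_logged // BLOCK_MINUTES


def unbilled_minutes(minutes_logged: int) -> int:
    """Minutes logged past the last complete block (not billable yet)."""
    return minutes_logged % BLOCK_MINUTES


def minutes_to_next_block(minutes_logged: int) -> int:
    """Minutes still needed to complete the next 20-minute block."""
    return BLOCK_MINUTES - unbilled_minutes(minutes_logged)


for logged in (19, 20, 27, 40):
    print(logged, billable_blocks(logged),
          unbilled_minutes(logged), minutes_to_next_block(logged))
```

At 19 minutes nothing is billable and one more minute completes the first block; at 27 minutes one block is billable and 7 minutes sit unbilled until the patient reaches 40.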

Before this system, most clinics made prioritization decisions manually. Clinicians would scan dashboards, try to remember which patients had been contacted recently, and make subjective decisions about who to call next. It was inconsistent, unscalable, and cognitively exhausting.

Small clinics especially struggled. They can’t afford optimization staff, complex data teams, or RPA contracts. But they deserve better tools.

What I Built

To solve this, I modeled the patient prioritization process as a variation of the Multi-Armed Bandit (MAB) problem from decision theory. Each patient is treated as an "arm," with the reward being a successfully billed CPT code.

We wanted a system that would:

Explore low-engagement patients early in the month, to ensure no one is overlooked.
Exploit patients nearing the 20-minute threshold later in the month, to ensure that work already invested doesn’t go to waste.
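One way to get this explore-early, exploit-late behavior is to let the exploration probability shrink as the month progresses. The sketch below assumes a simple linear decay; the function name, the decay shape, and the start and end values (0.5 and 0.05) are hypothetical and not taken from the production system.

```python
def epsilon_for_day(day: int, days_in_month: int = 30,
                    eps_start: float = 0.5, eps_end: float = 0.05) -> float:
    """Exploration probability for a 1-indexed day of the month.

    Assumption: linear decay from eps_start on day 1 to eps_end on the
    last day. Any decreasing schedule would serve the same purpose.
    """
    if not 1 <= day <= days_in_month:
        raise ValueError("day must fall within the month")
    if days_in_month == 1:
        return eps_end
    progress = (day - 1) / (days_in_month - 1)  # 0.0 on day 1, 1.0 on the last day
    return eps_start + (eps_end - eps_start) * progress


for day in (1, 15, 30):
    print(day, round(epsilon_for_day(day), 3))
```

Early in the month roughly half of the picks would be exploratory under these assumed values; by the end of the month nearly all picks go to patients closest to a billing threshold.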

How It Works (Technically)

Here’s the breakdown of what happens each day:

Input Preprocessing: For each patient, we calculate:

Minutes of care logged so far
Proximity to the next billable CPT threshold (e.g., how close are they to 20 minutes?)
Frequency and recency of prior contact

Expected Engagement Curve: We model the patient’s month as a curve, assuming at least 2 minutes of interaction per day across a 30-day window. This helps us distribute effort rationally across time.
Farness and Proximity Metrics: We calculate how far a patient is from a 20-minute billing block and how close they are to being eligible for additional care.
Priority Scoring with Epsilon-Greedy:

With probability epsilon (which varies over the month), we select patients for exploration.
Otherwise, we prioritize those with the highest proximity-to-revenue ratio, meaning the patients closest to completing their next billable block.

Capping and Safeguards:

We cap the total minutes any one patient can receive to avoid overinvestment.
We apply audit flags if the patient is already over-served, preventing billing risk.

Daily Queue Generation:

The system outputs a ranked list of patients for the day, aligned to CPT goals and resource availability.
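The daily steps above can be sketched in a few dozen lines of Python. This is an illustration under stated assumptions, not the production code: the `Patient` fields, the `daily_queue` function, the 60-minute cap, the recency tie-break, and the choice to explore only among patients behind the expected curve are all hypothetical. The 20-minute block and the 2-minutes-per-day expectation come from the description above.

```python
import random
from dataclasses import dataclass

BLOCK_MINUTES = 20     # billable block size from the CPT rules
EXPECTED_PER_DAY = 2   # expected-engagement assumption from the text
MONTHLY_CAP = 60       # hypothetical per-patient cap on minutes


@dataclass
class Patient:
    patient_id: str
    minutes_logged: int      # interactive minutes so far this month
    days_since_contact: int  # recency of the last contact


def proximity(p: Patient) -> float:
    """Fraction of the current 20-minute block already completed."""
    return (p.minutes_logged % BLOCK_MINUTES) / BLOCK_MINUTES


def behind_schedule(p: Patient, day: int) -> int:
    """Minutes the patient trails the expected 2-minutes-per-day curve."""
    return max(0, EXPECTED_PER_DAY * day - p.minutes_logged)


def daily_queue(patients, day, epsilon, slots, rng=None):
    """Return (ranked patient ids, audit-flagged ids) for one day."""
    rng = rng or random.Random()
    # Safeguard: patients at or over the cap are flagged, not queued.
    flagged = [p.patient_id for p in patients if p.minutes_logged >= MONTHLY_CAP]
    pool = [p for p in patients if p.minutes_logged < MONTHLY_CAP]
    queue = []
    while pool and len(queue) < slots:
        if rng.random() < epsilon:
            # Explore: random pick among patients behind the expected curve,
            # falling back to the whole pool if nobody is behind.
            behind = [p for p in pool if behind_schedule(p, day) > 0]
            pick = rng.choice(behind or pool)
        else:
            # Exploit: closest to the next billable block, then longest wait.
            pick = max(pool, key=lambda p: (proximity(p), p.days_since_contact))
        pool.remove(pick)
        queue.append(pick.patient_id)
    return queue, flagged


patients = [
    Patient("A", 18, 3),
    Patient("B", 5, 12),
    Patient("C", 60, 1),
    Patient("D", 39, 2),
]
queue, flagged = daily_queue(patients, day=15, epsilon=0.0, slots=2)
print(queue, flagged)  # ['D', 'A'] ['C']
```

With `epsilon=0.0` the queue is fully deterministic: patient D (39 minutes, one minute short of the second block) ranks first, then A (18 minutes), while C is flagged for already sitting at the assumed cap. In practice `epsilon` would vary with the day of the month, as described in the scoring step.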

Results

Time utilization improved dramatically. Clinics monitored more patients without hiring more staff.
Completion rates for 99457/99458 increased, with fewer patients stranded at 17 or 18 minutes.
The cognitive load for clinical teams dropped. Instead of thinking about who, they could focus on how to help.

One client went from manually auditing 100 patient charts a day to letting the algorithm generate a smart queue—and recovered over $80K in billable time in the first two months.

What Makes This Different

This isn’t an LLM. This isn’t RPA. This is structured, rules-based AI built for the messiness of real-world clinical operations. It doesn’t hallucinate. It doesn’t guess. It follows rules that are easy to inspect, explain, and justify during an audit.

And importantly, it’s built for the clinics that get ignored: the five-provider team in rural Arkansas, the FQHC scraping by on grants, the CCM startup trying to support independent doctors.

These clinics don’t have time to waste. This system ensures they don’t.

Conclusion

There is a right way to balance time, care, and compliance—and it starts by respecting clinical judgment while removing the burden of decision-making noise. By using a bandit model with a day-weighted epsilon-greedy strategy, we built a system that adapts over time, works with billing rules, and helps providers do more with less.

We didn’t cut corners. We automated them. And then we used that time to make care better.

This is what it looks like to build practical, transparent AI for healthcare. And it’s already saving providers time, money, and stress—without sacrificing a single patient’s care.