Transcript
John: Welcome to Advanced Topics in Medical Computer Vision. Today's lecture is on a paper titled 'PSScreen: Partially Supervised Multiple Retinal Disease Screening' from researchers at the University of Oulu.
John: We've seen a lot of recent work trying to apply large foundation models to medicine, like 'RET-CLIP', which leverages vast unlabeled data. This paper, however, takes a more targeted approach. It asks whether we can build a highly effective, specialized model without needing perfectly complete datasets. Yes, Noah?
Noah: Excuse me, Professor. Could you clarify what 'partially supervised' means in this context? Does it just mean semi-supervised, with a mix of labeled and unlabeled data?
John: That's a great clarifying question. It's a bit more complex here. It's not just about labeled versus unlabeled. Imagine you have datasets from three different hospitals. Hospital A labels for glaucoma and diabetic retinopathy. Hospital B only labels for cataracts. And Hospital C labels for glaucoma and myopia. No single dataset is complete. PSScreen is designed to learn from this collection of 'partially' labeled datasets, where for any given image, some disease labels are known, but others are explicitly absent.
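John: To make that concrete, here's a toy sketch of what those label matrices look like. The hospital names and disease set are just for illustration, and the convention of marking unknown labels with -1 is ours, not necessarily the paper's.

```python
import numpy as np

# Columns: [glaucoma, diabetic retinopathy, cataract, myopia]; -1 = unknown.
hospital_A = np.array([[1, 0, -1, -1],    # labels glaucoma and DR only
                       [0, 1, -1, -1]])
hospital_B = np.array([[-1, -1, 1, -1],   # labels cataract only
                       [-1, -1, 0, -1]])
hospital_C = np.array([[1, -1, -1, 0],    # labels glaucoma and myopia only
                       [0, -1, -1, 1]])

# A partially supervised loss only scores entries whose label is known:
known_mask_A = hospital_A != -1           # True where ground truth exists
```

John: Notice that no single matrix has every column filled in, yet together the three cover all four diseases. That's the setting PSScreen is built for.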
Noah: So the main challenge isn't just missing labels, but also that the data comes from different sources.
John: Precisely. That's the core contribution. The paper tackles two fundamental problems of real-world medical data simultaneously: the 'label absent' issue and the 'domain shift' issue. Data from different clinics looks different due to varying cameras, lighting, and patient populations. Most existing partially supervised methods assume data comes from the same distribution, which is a major limitation PSScreen aims to overcome. To do this, it proposes a two-stream network architecture.
Noah: A two-stream network? Why two?
John: One stream is deterministic, and the other is probabilistic. Think of the deterministic stream as a standard, confident learner. It trains on the ground-truth labels that are actually available. Its job is to learn the explicit, task-relevant features for disease classification as accurately as possible from the data it's given.
Noah: Okay, so that's the baseline learner. What does the probabilistic stream do?
John: The probabilistic stream is designed for generalization. It incorporates a module that intentionally injects uncertainty into the feature learning process. By modeling feature statistics as distributions rather than fixed values, it learns to be robust against the variations, or domain shifts, it sees across the different datasets. This makes it much better at handling images from a new, unseen hospital.
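John: As a rough skeleton, you can picture it like this. This is a minimal sketch under our own assumptions; in particular, whether the two streams share a backbone and how the heads are shaped are illustrative choices, not details taken from the paper.

```python
import torch
import torch.nn as nn

class TwoStreamScreen(nn.Module):
    """Illustrative two-stream model: a deterministic head on the raw
    features and a probabilistic head on uncertainty-perturbed features."""
    def __init__(self, backbone: nn.Module, feat_dim: int,
                 num_classes: int, uncertainty: nn.Module):
        super().__init__()
        self.backbone = backbone          # shared feature extractor (assumption)
        self.uncertainty = uncertainty    # uncertainty-injection module (next sketch)
        self.det_head = nn.Linear(feat_dim, num_classes)
        self.prob_head = nn.Linear(feat_dim, num_classes)

    def forward(self, x: torch.Tensor):
        feat = self.backbone(x)                     # (B, C, H, W) feature maps
        prob_feat = self.uncertainty(feat)          # perturbed copy for the probabilistic stream
        gap = lambda f: f.mean(dim=[2, 3])          # global average pooling
        return self.det_head(gap(feat)), self.prob_head(gap(prob_feat))
```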
Noah: How does it inject that uncertainty? Is it like a more complex version of dropout?
John: It's more structured than dropout. The paper uses something called a DSU block. It calculates channel-wise means and variances for features, models them as Gaussian distributions, and then samples from these distributions to modify the features. The idea is to simulate the kinds of shifts you'd see between domains, forcing the model to learn features that are invariant to these changes. The two streams don't work in isolation, though. They're connected through several clever loss functions.
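John: In code, a DSU-style perturbation might look roughly like this. Again, this is a sketch of the idea rather than the authors' implementation; the mixing probability and epsilon are placeholder values.

```python
import torch
import torch.nn as nn

class DSUBlock(nn.Module):
    """Sketch of DSU-style uncertainty injection: treat each sample's
    channel-wise mean and std as Gaussian random variables, sample
    perturbed statistics, and re-normalize the features with them."""
    def __init__(self, p: float = 0.5, eps: float = 1e-6):
        super().__init__()
        self.p = p          # chance of perturbing a given batch (assumption)
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.training or torch.rand(1).item() > self.p:
            return x
        mu = x.mean(dim=[2, 3], keepdim=True)                  # (B, C, 1, 1)
        sig = (x.var(dim=[2, 3], keepdim=True) + self.eps).sqrt()

        # Uncertainty of the statistics themselves, estimated over the batch.
        sig_mu = (mu.var(dim=0, keepdim=True) + self.eps).sqrt()
        sig_sig = (sig.var(dim=0, keepdim=True) + self.eps).sqrt()

        # Sample shifted statistics and re-normalize to mimic a domain shift.
        beta = mu + torch.randn_like(mu) * sig_mu
        gamma = sig + torch.randn_like(sig) * sig_sig
        return (x - mu) / sig * gamma + beta
```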
Noah: So how do they talk to each other?
John: There are three key mechanisms. First is feature distillation, which encourages the probabilistic features to align with the deterministic ones, ensuring the model doesn't lose its discriminative power while generalizing. Second is self-distillation, where the deterministic stream acts as a 'teacher' for the 'student' probabilistic stream on known classes. But the most interesting part is the third mechanism: pseudo-label consistency for the unknown classes.
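John: Before we get to that third mechanism, here is roughly what the two distillation terms might look like in code. The L2 alignment and sigmoid soft targets are common defaults we're assuming for illustration, not necessarily the paper's exact distances.

```python
import torch
import torch.nn.functional as F

def feature_distill(det_feat: torch.Tensor, prob_feat: torch.Tensor):
    """Pull probabilistic features toward the (detached) deterministic ones,
    so generalization doesn't erode discriminative power."""
    return F.mse_loss(prob_feat, det_feat.detach())

def self_distill(det_logits, prob_logits, known_mask):
    """Teacher (deterministic) -> student (probabilistic) on known classes,
    here as a BCE between the teacher's soft targets and the student."""
    with torch.no_grad():
        soft = torch.sigmoid(det_logits)        # teacher's soft predictions
    loss = F.binary_cross_entropy_with_logits(prob_logits, soft, reduction="none")
    m = known_mask.float()                      # only supervise known classes
    return (loss * m).sum() / m.sum().clamp(min=1.0)
```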
Noah: Wait, I'm a bit skeptical about pseudo-labeling. Isn't there a high risk of the model just reinforcing its own mistakes?
John: That's a valid concern and a common failure mode. PSScreen mitigates this risk by being selective. The 'teacher' deterministic stream only generates a pseudo-label for an unknown class if its prediction is extremely confident, say, above a 95% threshold. This high-confidence pseudo-label is then used to train the probabilistic 'student' stream. By using one stream to supervise the other, and only with high-certainty examples, it reduces the risk of error propagation while still making use of the unlabeled parts of the data.
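John: Here's a sketch of that gating logic. The sigmoid multi-label setup, the function shape, and the exact threshold handling are our assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def pseudo_label_loss(teacher_logits, student_logits, known_mask, tau=0.95):
    """Confidence-gated pseudo-labeling on UNKNOWN classes only: the
    deterministic teacher supervises the probabilistic student wherever
    its sigmoid confidence clears the threshold in either direction."""
    with torch.no_grad():
        probs = torch.sigmoid(teacher_logits)             # multi-label screening
        confident = (probs > tau) | (probs < 1 - tau)     # sure positive or negative
        pseudo = (probs > 0.5).float()                    # hard pseudo-labels
        mask = (confident & ~known_mask).float()          # unknown AND confident
    loss = F.binary_cross_entropy_with_logits(
        student_logits, pseudo, reduction="none")
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)
```

John: The no_grad block matters here: gradients flow only into the student, so the teacher can't drift toward its own pseudo-labels.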
Noah: That makes sense. It's a form of knowledge transfer. It seems this architecture is quite powerful for this specific problem. Do you see this shifting how we approach medical AI model development?
John: I believe so. The primary implication is that it provides a path to building robust, deployable models without waiting for perfectly curated, fully-annotated, single-source datasets, which are practically impossible to get at scale. It moves the field towards using data as it exists in the real world—fragmented and incomplete. The results are telling: PSScreen significantly outperformed other partially supervised methods on out-of-domain data. It even outperformed large foundation models on a zero-shot task.
Noah: So you're saying a specialized model trained on messy data can beat a massive generalist model?
John: For this specific task, yes. It demonstrates that a thoughtful architecture designed for the specific challenges of the data can outperform a more generic approach. And this framework isn't necessarily limited to ophthalmology. The core principles—the deterministic-probabilistic streams, the cross-stream distillation, and the confident pseudo-labeling—could be adapted to other multi-domain, multi-label problems in radiology or pathology.
John: So to wrap up, the key takeaway here is that PSScreen provides a pragmatic blueprint for leveraging real-world, imperfect medical data. Instead of seeing domain shifts and missing labels as obstacles to be cleaned away, it incorporates them as central parts of the training process, using uncertainty modeling and knowledge distillation to build a more robust and generalizable model. This is a crucial step towards creating AI tools that are actually useful in a clinical setting.
John: Thanks for listening. If you have any further questions, ask our AI assistant or drop a comment.