DDXPlus
Distributional diagnosis over a closed set of 49 pathologies. The model reads a patient case — demographics, then a transcript of yes/no and categorical symptom and antecedent questions — and must output a probability distribution over the 49 conditions. Unlike the other domains, the gold target is a genuine distribution: DDXPlus ships a per-case differential with calibrated probabilities, not a single right answer.
Dataset entry
Demographics:
Age: 80
Sex: M
Symptoms:
Have you been coughing up blood? Yes
Do you have pain somewhere, related to your reason for consulting? Yes
Characterize your pain: a knife stroke
Do you feel pain somewhere? posterior chest wall(L)
How intense is the pain? 1
Does the pain radiate to another location? nowhere
How precisely is the pain located? 6
How fast did the pain appear? 4
Are you experiencing shortness of breath or difficulty breathing in a significant way? Yes
Do you have a cough that produces colored or more abundant sputum than usual? Yes
Do you have a fever (either felt or measured)? Yes
Have you had chills or shivers? Yes
Do you have skin lesions or redness related to your condition? Yes
What color is the rash? pink
Do your lesions peel off? Y
Is the rash swollen? 1
Where is the affected region located? back of the neck, thoracic spine, buttock(R), buttock(L), flank(R)
How intense is the pain caused by the rash? 4
Is the lesion larger than 1cm? N
How severe is the itching? 1
Have you noticed any new fatigue or general malaise? Yes
Do you have a cough? Yes
Antecedents:
Do you drink alcohol excessively? Yes
Do you have heart failure? Yes
Have you ever had pneumonia? Yes
Do you have COPD? Yes
Do you have asthma / used a bronchodilator? Yes
Have you had surgery within the last month? Yes
Have you traveled out of the country in the last 4 weeks? N
Is your BMI less than 18.5? Yes
Representations
The same 49-condition target can be written in three output formats. The examples below are real model rollouts.
Verbalized hypothesis list. Named conditions with weights and a residual bucket.
<hypotheses>
<h1>Acute dystonic reactions</h1><w1>0.70</w1>
<h2>Myasthenia gravis</h2><w2>0.15</w2>
<h3>Guillain-Barré syndrome</h3><w3>0.10</w3>
<w_other>0.05</w_other>
</hypotheses>
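A minimal sketch of how a hypothesis list like the one above could be expanded into a full distribution over the closed condition set. The function name `parse_hypotheses`, the uniform spreading of the residual over unnamed conditions, and the handling of out-of-set names are assumptions made for illustration, not the harness's documented behavior; the five-condition list stands in for the real 49.

```python
import re

def parse_hypotheses(text, conditions):
    """Expand a <hypotheses> block into a normalized distribution over `conditions`."""
    names = dict(re.findall(r"<h(\d+)>(.*?)</h\1>", text))    # index -> condition name
    weights = dict(re.findall(r"<w(\d+)>(.*?)</w\1>", text))  # index -> weight string
    other = re.search(r"<w_other>(.*?)</w_other>", text)
    residual = float(other.group(1)) if other else 0.0

    probs = {c: 0.0 for c in conditions}
    for idx, name in names.items():
        w = float(weights.get(idx, 0.0))
        if name in probs:
            probs[name] += w
        else:
            # ASSUMPTION: mass on a name outside the closed set joins the residual.
            residual += w

    named = set(names.values())
    unnamed = [c for c in conditions if c not in named]
    # ASSUMPTION: the residual bucket is spread uniformly over unnamed conditions.
    for c in unnamed:
        probs[c] += residual / len(unnamed)

    total = sum(probs.values())
    if total <= 0:
        raise ValueError("hypothesis list carries no probability mass")
    return {c: p / total for c, p in probs.items()}

rollout = """<hypotheses>
<h1>Acute dystonic reactions</h1><w1>0.70</w1>
<h2>Myasthenia gravis</h2><w2>0.15</w2>
<h3>Guillain-Barré syndrome</h3><w3>0.10</w3>
<w_other>0.05</w_other>
</hypotheses>"""

# Toy closed set: the three named conditions plus two placeholders.
conditions = ["Acute dystonic reactions", "Myasthenia gravis",
              "Guillain-Barré syndrome", "Placeholder A", "Placeholder B"]
dist = parse_hypotheses(rollout, conditions)
```

With the real 49-condition list, the 0.05 residual would be split across the 46 unnamed conditions under the same assumption.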
Class-hierarchy DSL. Latent classes with priors, per-class disease weights (d) and symptom likelihoods (s+); the posterior over the 49 is computed in closed form.
<spec>
u 0.05
c cardiac_emergency=0.6
d D36=0.45
d D37=0.3
d D47=0.2
d D10=0.15
s+ chest_pain=0.8
s+ palpitations=0.9
s+ shortness_of_breath=0.7
s+ fear_of_dying=0.85
c psychiatric_emergency=0.4
d D33=0.4
d D08=0.3
d D23=0.2
s+ anxiety=0.8
s+ numbness=0.7
s+ detachment=0.7
</spec>
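One plausible reading of the closed-form computation is sketched below. The semantics are assumptions, since the text does not spell them out: `u` is taken as the weight of a uniform component over all 49 diseases, each `c` line as a class prior, `d` weights as normalized within their class, and `s+` values as the probability of a symptom being present given the class. The `default_lik` fallback for symptoms a class does not list, the `D00` to `D48` labels, and the function names are hypothetical.

```python
def parse_spec(text):
    """Parse a <spec> block into (u, classes); d and s+ lines attach to the latest c line."""
    u, classes = 0.0, []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0].startswith("<"):
            continue  # skip blank lines and the <spec> / </spec> tags
        tag, arg = parts[0], parts[1]
        if tag == "u":
            u = float(arg)
            continue
        key, value = arg.split("=")
        if tag == "c":
            classes.append({"name": key, "prior": float(value), "d": {}, "s+": {}})
        elif tag in ("d", "s+"):
            classes[-1][tag][key] = float(value)
    return u, classes

def disease_posterior(u, classes, observed, disease_ids, default_lik=0.1):
    """Posterior over diseases given the set of observed (present) symptoms."""
    # Class score: prior times the likelihood of every observed symptom.
    scores = []
    for c in classes:
        score = c["prior"]
        for s in observed:
            score *= c["s+"].get(s, default_lik)  # ASSUMPTION: fallback likelihood
        scores.append(score)
    z = sum(scores)

    # ASSUMPTION: u is the mass of a uniform background over all diseases.
    post = {d: u / len(disease_ids) for d in disease_ids}
    for c, score in zip(classes, scores):
        d_total = sum(c["d"].values())  # weights need not sum to 1, so normalize
        for d, w in c["d"].items():
            post[d] += (1 - u) * (score / z) * (w / d_total)
    return post

spec = """<spec>
u 0.05
c cardiac_emergency=0.6
d D36=0.45
d D37=0.3
d D47=0.2
d D10=0.15
s+ chest_pain=0.8
s+ palpitations=0.9
s+ shortness_of_breath=0.7
s+ fear_of_dying=0.85
c psychiatric_emergency=0.4
d D33=0.4
d D08=0.3
d D23=0.2
s+ anxiety=0.8
s+ numbness=0.7
s+ detachment=0.7
</spec>"""

disease_ids = [f"D{i:02d}" for i in range(49)]  # hypothetical labels D00..D48
u, classes = parse_spec(spec)
post = disease_posterior(u, classes, {"chest_pain", "palpitations"}, disease_ids)
top = max(post, key=post.get)
```

Under these assumptions the posterior is a finite mixture, so no sampling is needed: every disease receives the uniform floor, and the diseases listed under the best-matching class take most of the remaining mass.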
Probabilistic program. The model writes the evidence function; the harness runs inference for the posterior over diseases.
def evidence(dx):
    # 54M: hemoptysis, dyspnea, productive cough, COPD, cystic fibrosis, prior pneumonia
    pyro.sample("respiratory_symptoms", dist.Bernoulli(
        per_disease({0:.9, 11:.85, 13:.8, 34:.75, 44:.7, 37:.65}, default=.45)[dx]), obs=ONE)
    pyro.sample("chronic_lung_history", dist.Bernoulli(
        per_disease({0:.7, 11:.65, 13:.6, 34:.55, 37:.5}, default=.35)[dx]), obs=ONE)
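Because the latent variable is a single discrete disease index, the posterior for a program like this can be computed by exact enumeration. The sketch below does so in plain Python, without Pyro, as an illustration of what the harness's inference step might amount to. It assumes a uniform prior over the 49 diseases, that `per_disease` fills unlisted indices with `default`, and that `obs=ONE` means each Bernoulli was observed as 1; none of these are confirmed by the text.

```python
N_DISEASES = 49

def per_disease(table, default):
    """Hypothetical stand-in for the harness helper: one probability per disease index."""
    return [table.get(i, default) for i in range(N_DISEASES)]

# Likelihood tables copied from the two pyro.sample statements in the rollout above.
LIKELIHOODS = [
    per_disease({0: .9, 11: .85, 13: .8, 34: .75, 44: .7, 37: .65}, default=.45),
    per_disease({0: .7, 11: .65, 13: .6, 34: .55, 37: .5}, default=.35),
]

def enumerate_posterior(likelihoods, prior=None):
    """P(dx | evidence) is proportional to prior(dx) times the product of P(obs_k = 1 | dx)."""
    if prior is None:
        prior = [1.0 / N_DISEASES] * N_DISEASES  # ASSUMPTION: uniform prior
    joint = list(prior)
    for lik in likelihoods:
        joint = [j * l for j, l in zip(joint, lik)]
    z = sum(joint)
    return [j / z for j in joint]

posterior = enumerate_posterior(LIKELIHOODS)
top_index = max(range(N_DISEASES), key=posterior.__getitem__)
```

Each additional evidence statement multiplies in one more likelihood vector, so the posterior sharpens as the model writes more of them.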