Which should I take for self-assessment?

Either works. PHQ-9 is faster. If you're going to follow up with a clinician about the result, PHQ-9 maps more directly to what they'll want to know. If you're doing deeper self-reflection, SDS covers more ground.

Yes, you can take both. Taking both gives you two data points. If they agree (both elevated or both normal), the signal is robust. If they disagree substantially, the gap either reflects real differences between the tests (one capturing something the other misses) or is measurement noise; consult a clinician to sort it out.

Is BDI (Beck Depression Inventory) better than either?

Different rather than better. The BDI is a 21-item depression scale with strong validation and heavier cognitive-symptom weighting than PHQ-9 or SDS. BDI-II (revised 1996) is widely used in research and clinical settings. It is a commercial instrument and requires a license. For free self-assessment, PHQ-9 or SDS is easier to access; BDI remains the gold standard in many clinical research contexts.

Best single source for comparative validation data?

For PHQ-9 specifically: Kroenke's papers, starting with the 2001 validation paper in the Journal of General Internal Medicine. For comparative data: Nezu, Ronan, Meadows, and McClure's Practitioner's Guide to Empirically Based Measures of Depression (2000), which covers all major instruments.

PHQ-9 vs. SDS: Which Depression Screen Is Right for Which Purpose - PsyZenLab

PHQ-9 structure

The PHQ-9 asks about nine symptoms over the past 2 weeks, each rated 0–3:

1. Little interest or pleasure in doing things
2. Feeling down, depressed, or hopeless
3. Trouble falling or staying asleep, or sleeping too much
4. Feeling tired or having little energy
5. Poor appetite or overeating
6. Feeling bad about yourself
7. Trouble concentrating
8. Moving or speaking slowly; or being restless/fidgety
9. Thoughts that you would be better off dead or hurting yourself

Total score range: 0–27. Cutoffs:

- 0–4: minimal
- 5–9: mild
- 10–14: moderate
- 15–19: moderately severe
- 20–27: severe

The PHQ-9 maps almost 1:1 to DSM-5 major depressive disorder criteria. This is deliberate: the PHQ-9 was designed to assess DSM criteria directly.
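
The scoring rule above is simple enough to sketch in a few lines. This is a minimal illustration, not PsyZenLab's implementation; the function name `score_phq9` and the `PHQ9_BANDS` table are hypothetical names chosen for this example, and the bands are copied from the cutoffs listed above.

```python
from typing import Sequence

# Severity bands from the cutoffs listed above: (upper bound, label).
PHQ9_BANDS = [
    (4, "minimal"),
    (9, "mild"),
    (14, "moderate"),
    (19, "moderately severe"),
    (27, "severe"),
]


def score_phq9(responses: Sequence[int]) -> tuple[int, str]:
    """Sum nine 0-3 item responses and return (total, severity label)."""
    if len(responses) != 9:
        raise ValueError("PHQ-9 expects exactly 9 responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be an integer from 0 to 3")
    total = sum(responses)
    for upper, label in PHQ9_BANDS:
        if total <= upper:
            return total, label
    raise AssertionError("unreachable: total is always within 0-27")


print(score_phq9([1, 2, 1, 2, 0, 1, 1, 0, 0]))  # (8, 'mild')
```

The label is only a screening band; it says nothing about diagnosis, and a total near a boundary (9 versus 10, say) should not be read as a sharp change.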

SDS structure (brief recap)

The SDS has 20 items covering affective (feeling sad), psychological (concentration, fatigue), somatic (sleep, appetite, heart rate, digestion), and cognitive (guilt, suicidal thoughts) symptoms. The raw score converts to an index of 25–100. Cutoffs:

- Below 50: normal
- 50–59: mild
- 60–69: moderate
- 70 and above: severe

See sds-depression-interpretation article for detail.

Key contrast with PHQ-9: SDS has more somatic items (8 out of 20 = 40% of items) and distributes its symptom coverage more broadly. PHQ-9 is more compact and tightly DSM-matched.
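
The raw-to-index conversion can be sketched as follows. This assumes the standard Zung conversion (index = raw total × 1.25, so a raw range of 20–80 becomes 25–100), which the text above does not spell out, and it assumes the 20 item scores have already been reverse-scored where the questionnaire requires it. The names `sds_index` and `sds_band` are hypothetical.

```python
from typing import Sequence


def sds_index(item_scores: Sequence[int]) -> float:
    """Convert 20 already-scored SDS items (1-4 each) to the 25-100 index.

    Assumes the standard Zung conversion: index = raw total * 1.25.
    Reverse-scored items must be flipped before calling this function.
    """
    if len(item_scores) != 20:
        raise ValueError("SDS expects exactly 20 item scores")
    if any(s not in (1, 2, 3, 4) for s in item_scores):
        raise ValueError("each item score must be an integer from 1 to 4")
    raw = sum(item_scores)  # raw total ranges from 20 to 80
    return raw * 1.25


def sds_band(index: float) -> str:
    """Map an SDS index to the bands listed above."""
    if index < 50:
        return "normal"
    if index < 60:
        return "mild"
    if index < 70:
        return "moderate"
    return "severe"


idx = sds_index([2] * 20)
print(idx, sds_band(idx))  # 50.0 mild
```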

Direct comparison

| Dimension | PHQ-9 | SDS |
| --- | --- | --- |
| Items | 9 | 20 |
| Time to complete | ~3 minutes | ~8 minutes |
| Response scale | 0–3 (4 options) | 1–4 (4 options) |
| Score range | 0–27 | Index 25–100 |
| DSM mapping | Direct | Indirect |
| Somatic weighting | Moderate (items 3, 4, 5, 8) | Heavy (8/20 items) |
| Clinical adoption | Very widespread | Moderate, declining in US |
| Cross-cultural use | Wide, many validated translations | Very wide, longer history |
| EMR integration | Common | Less common |
| Cost | Free | Free (public domain) |

When PHQ-9 is better

PHQ-9 is the better fit in most contemporary clinical contexts:

- **Primary care screening**: brevity matters; PHQ-9 fits into 3 minutes of a 15-minute appointment, where SDS would take 8 or more
- **Mental health treatment monitoring**: weekly or biweekly re-administration tracks response to treatment; PHQ-9's shorter format supports routine monitoring
- **Integration with DSM-5 diagnosis**: PHQ-9 scores map directly to diagnostic criteria, supporting clinician workflow
- **Telehealth / electronic intake**: PHQ-9 is standard in most electronic patient-portal systems
- **Research use**: PHQ-9 has become the default depression screening instrument in contemporary clinical research

When SDS is better

- **Populations with somatic-heavy presentations**: cultures or individuals where depression shows up as physical symptoms (body aches, digestive issues, fatigue without clearly depressed mood). SDS's 8 somatic items capture this better than PHQ-9's four.
- **Screening where broader symptom coverage matters**: comprehensive assessment rather than quick triage. SDS covers more symptom space.
- **Specific research applications where longitudinal comparison to historical data matters**: SDS has a 60-year literature; PHQ-9's history goes back only to 2001.
- **Self-administered depth**: for someone doing serious self-assessment rather than triage, SDS's broader questioning generates more reflective material.

Important shared limitations

Both screens share failure modes:

- **Not diagnostic**: both are screens, not diagnostic instruments. A positive score flags a need for clinical evaluation; it does not constitute a diagnosis.
- **Base-rate vulnerable**: in low-prevalence populations (general community screening), both produce meaningful false-positive rates. See base-rate-neglect-tests article.
- **Miss specific presentations**: atypical depression (mood reactive to events), melancholic depression, and depression with psychotic features are incompletely captured by both.
- **Cultural variation**: both require appropriate translation and validation for non-Western populations. See cultural-validity-tests article.
- **Self-report limitations**: respondents with dismissive-avoidant attachment, alexithymia, or active defensive patterns may under-report symptoms on both.
- **Suicide item handling**: both include suicide-related items. Any endorsement of suicidal ideation warrants direct clinical attention regardless of total score. PsyZenLab's implementation of both screens provides crisis resource information when these items are endorsed.
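
The base-rate point can be made concrete with a short calculation of positive predictive value (the share of positive screens that are true cases). This is a sketch under stated assumptions: the sensitivity and specificity of 0.88 and the two prevalence figures are illustrative values chosen for this example, not figures from this article, and `positive_predictive_value` is a hypothetical helper name.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Return P(condition present | positive screen) via Bayes' rule."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("no positive results are possible with these inputs")
    return true_pos / (true_pos + false_pos)


# Illustrative (assumed) accuracy: sensitivity = specificity = 0.88.
for prevalence in (0.05, 0.30):
    ppv = positive_predictive_value(0.88, 0.88, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.2f}")
# prevalence 5%: PPV 0.28
# prevalence 30%: PPV 0.76
```

With the same assumed accuracy, most positives at 5% prevalence are false positives, while most positives at 30% prevalence are true cases. The screen has not changed; only the population has.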


Key Takeaways