The test-retest reliability problem
The most cited criticism of MBTI is test-retest reliability. Multiple studies, most prominently David Pittenger's influential reviews (1993 in Review of Educational Research, 2005 in Consulting Psychology Journal), have found that as many as 50% of people who take the MBTI receive a different 4-letter type when retested after a short interval (typically 5 weeks to several months). This is a serious problem. A personality instrument that assigns you to INTJ today and INFJ next month cannot be measuring a stable trait reliably.

However, the retest variation is not random. It is concentrated at the dichotomy cutoffs. Someone who scores strongly Introverted (I at 85% preference) rarely retests as Extraverted. Someone who scores marginally Introverted (I at 52%) frequently retests as Extraverted, because their underlying trait score sits near the cutoff and a small amount of measurement noise flips the category. This is a specific form of the dichotomization problem discussed below. The underlying preferences may be reasonably stable; it is the forced binary categorization that is unstable.
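The cutoff effect can be made concrete with a toy model. The sketch below is not taken from any MBTI scoring manual: it assumes a 0 to 100 preference scale with the cutoff at 50, normally distributed measurement noise that is independent across sittings and dichotomies, and an arbitrary noise standard deviation of 8 points. The function names are hypothetical.

```python
import math


def prob_above_cutoff(true_score, noise_sd, cutoff=50.0):
    """P(observed > cutoff) when observed = true_score + Normal(0, noise_sd)."""
    z = (true_score - cutoff) / noise_sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def flip_probability(true_score, noise_sd, cutoff=50.0):
    """P(two independent sittings give opposite letters on one dichotomy)."""
    p = prob_above_cutoff(true_score, noise_sd, cutoff)
    return 2.0 * p * (1.0 - p)


def type_change_probability(true_scores, noise_sd):
    """P(at least one dichotomy flips), assuming independent noise on each."""
    p_all_stable = 1.0
    for score in true_scores:
        p_all_stable *= 1.0 - flip_probability(score, noise_sd)
    return 1.0 - p_all_stable


NOISE_SD = 8.0  # assumed measurement noise, in preference-score points

for score in (52, 65, 85):
    print(f"true score {score}: flip chance {flip_probability(score, NOISE_SD):.3f}")

# A hypothetical person with one marginal preference and three clearer ones.
profile = (52, 65, 70, 85)
print(f"4-letter type changes: {type_change_probability(profile, NOISE_SD):.3f}")
```

Under these assumed numbers a score of 52 changes letter in close to half of retests, while a score of 85 essentially never does. A single marginal dichotomy is therefore enough to make the full 4-letter code unstable even when the other three preferences are clear.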
Construct validity against the Big Five
When MBTI is compared to the Big Five, the most empirically robust personality taxonomy, the four MBTI dimensions correlate with Big Five factors, but not perfectly. The figures below are based on McCrae and Costa's 1989 study (replicated several times since), with each MBTI scale scored toward its second pole (I, N, F, P):

- E/I correlates with Extraversion: r ≈ -0.74 (strongest correspondence; negative because higher scores mean more Introverted)
- S/N correlates with Openness: r ≈ 0.72
- T/F correlates with Agreeableness: r ≈ 0.44 (weakest)
- J/P correlates with Conscientiousness: r ≈ -0.49 (negative because higher scores mean more Perceiving)

Note that MBTI has no measure equivalent to Neuroticism, the Big Five factor most reliably associated with mental health outcomes. This is a significant gap for clinical use.

The weaker correlations (T/F and J/P) suggest these dimensions may be measuring multiple things that don't separate cleanly. In particular, T/F conflates cognitive style (logical vs. values-based decision-making) with social orientation (agreeable vs. competitive), which are empirically separable.

Bottom line on construct validity: MBTI captures real trait dimensions, but with less precision than the Big Five and with some muddy conflations.
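One way to read these correlations is through shared variance, which is simply r squared. The short sketch below uses the magnitudes quoted above (signs are dropped, since squaring removes them); the dictionary and function names are illustrative only.

```python
# Correlation magnitudes as quoted in the text (McCrae and Costa, 1989).
CORRELATIONS = {
    "E/I vs Extraversion": 0.74,
    "S/N vs Openness": 0.72,
    "T/F vs Agreeableness": 0.44,
    "J/P vs Conscientiousness": 0.49,
}


def shared_variance(r):
    """Proportion of variance two measures share, given their correlation r."""
    return r * r


for pair, r in CORRELATIONS.items():
    print(f"{pair}: |r| = {r:.2f}, shared variance = {shared_variance(r):.0%}")
```

Read this way, E/I and S/N each share a little over half of their variance with the matching Big Five factor, while T/F and J/P share roughly a fifth to a quarter. That leaves most of the variance in those two scales unexplained by the matching factor.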
The dichotomization problem
MBTI's most methodologically questionable choice is treating continuous traits as binary categories. MBTI has no way to say "you're moderately introverted": you are I or E.

Empirically, the underlying preferences are not bimodal. Distributions of preference scores are unimodal, meaning most people fall somewhere in the middle of each dimension rather than clustering at the poles. Forcing a binary classification onto a unimodal distribution produces frequent misclassification near the center and exaggerates the differences between adjacent types.

Take INTJ and INFJ. In function-stack terms they share dominant Ni and inferior Se, and differ only in the auxiliary (Te vs. Fe) and tertiary (Fi vs. Ti) functions. Someone with a 52% T preference is classified as INTJ and someone with a 48% T preference as INFJ, yet the two are more similar to each other than either is to a strong-preference INTJ or INFJ. MBTI nonetheless treats them as different types.

This is not just a theoretical concern. Practical decisions based on MBTI type (career recommendations, team composition) can easily be wrong for near-cutoff individuals whose underlying traits do not robustly match their assigned type.
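The following sketch illustrates both halves of the argument under stated assumptions: preference scores on a 0 to 100 scale with the cutoff at 50, and a population distribution that is normal with mean 50 and standard deviation 15. Those parameters are invented for illustration, not estimated from MBTI data, and the helper names are hypothetical.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def fraction_near_cutoff(band, mean=50.0, sd=15.0, cutoff=50.0):
    """Share of the population scoring within +/- band points of the cutoff."""
    return normal_cdf(cutoff + band, mean, sd) - normal_cdf(cutoff - band, mean, sd)


def letter(score, cutoff=50.0, high="T", low="F"):
    """Binary category assigned to a continuous preference score."""
    return high if score >= cutoff else low


people = {"marginal T": 52, "marginal F": 48, "strong T": 85}
for name, score in people.items():
    print(f"{name}: score {score} -> {letter(score)}")

# Continuous distance vs. categorical agreement.
print("marginal T vs marginal F gap:", abs(people["marginal T"] - people["marginal F"]))
print("marginal T vs strong T gap:  ", abs(people["marginal T"] - people["strong T"]))
print(f"population within 5 points of cutoff: {fraction_near_cutoff(5):.0%}")
```

The two marginal scorers sit 4 points apart but receive different letters, while the two T scorers sit 33 points apart and receive the same one. With a unimodal distribution centered on the cutoff, a sizable share of people land in that ambiguous band.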
Where MBTI is defensibly useful
Despite these problems, MBTI has genuine utility in specific contexts:

- **Self-reflection**: the 16 type descriptions are memorable, evocative, and reasonably accurate at the prototype level. Reading your type's description often produces a useful "yes, that's me" recognition, even when the specific category assignment has error bars.
- **Communication scaffolding**: in teams, relationships, and families, MBTI vocabulary provides a non-judgmental way to discuss differences. "I'm an ENFP and you're an ISTJ; we approach planning completely differently" opens conversations that wouldn't happen otherwise.
- **Entry to Jungian cognitive-function theory**: MBTI is a flawed instrument for accessing Jungian theory, but it is the most common entry point. The underlying cognitive-function theory (Ni, Ne, Se, Si, Ti, Te, Fi, Fe) is independently interesting and clinically useful even when the MBTI categorization is imprecise.
- **Rough orientation for meditation method selection**: as discussed in other articles in this blog, type-based method recommendations are coarse but not useless. The error bars on MBTI are smaller than the differences between method families (kōan vs. shikantaza vs. mettā).
Where MBTI should not be used
- **Hiring decisions**: this is where MBTI has been most thoroughly criticized, and rightly so. The test-retest unreliability and dichotomization problems make MBTI an unsound basis for employment decisions. Multiple organizations have published position statements against MBTI in hiring (the Myers-Briggs Company itself discourages this use). Big Five and conscientiousness-focused instruments are better for predicting job performance.
- **Relationship compatibility matching**: the evidence that any specific type combinations produce better or worse relationship outcomes is weak. Attachment style is a far stronger predictor of relationship outcome than MBTI type.
- **Clinical psychological assessment**: for clinical diagnosis or significant treatment planning, MBTI should be used alongside empirically validated instruments, not as primary data.
- **Fine-grained distinctions**: the difference between adjacent types (INTJ vs. INFJ, ISTP vs. ISTJ) should not carry significant decision weight. Use MBTI for the broader temperament grouping (NT, NF, SJ, SP) and treat within-temperament distinctions as low-confidence. Both example pairs happen to straddle a temperament boundary, so when the deciding letter is near its cutoff, even the temperament assignment deserves caution.
How PsyZenLab handles this
We provide MBTI as one of several personality instruments, with explicit disclaimers about its methodological limitations. Our internal recommendation logic uses MBTI at the temperament level (NT/NF/SJ/SP) for meditation method fit, not at the 16-type level for fine distinctions.

For users wanting more rigorous personality data, we offer the Big Five (NEO-PI-FF adaptation) and a Jungian cognitive-function test that avoids the dichotomization problem by reporting function strength on a continuous scale rather than forcing a 4-letter type code.

The honest position: MBTI is useful, limited, and commonly misused. Using it well means using it lightly: as rough orientation, not as identity or prediction.
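For readers unfamiliar with the temperament grouping, here is a minimal sketch of how a 4-letter code collapses to NT, NF, SJ, or SP. It is an illustration of the standard grouping named above, not PsyZenLab's actual recommendation code, and the function name `temperament` is hypothetical.

```python
VALID_LETTERS = ("EI", "SN", "TF", "JP")  # allowed letters at each position


def temperament(type_code):
    """Collapse a 4-letter type code into NT, NF, SJ, or SP."""
    code = type_code.strip().upper()
    if len(code) != 4 or any(ch not in pair for ch, pair in zip(code, VALID_LETTERS)):
        raise ValueError(f"not a valid 4-letter type code: {type_code!r}")
    _, sn, tf, jp = code
    # Intuitive types are grouped by T/F; sensing types are grouped by J/P.
    return sn + (tf if sn == "N" else jp)


for code in ("INTJ", "ENTP", "INFJ", "ISTJ", "ISTP"):
    print(code, "->", temperament(code))
```

Collapsing to four groups ignores E/I entirely and uses only two of the remaining three letters, so fewer near-cutoff dichotomies can change the result. A marginal score on a letter that does decide the group can still flip it.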
