Why interval drills alone don't make you hear music — and what does
The scientific case against isolated interval drilling and for functional, scale-degree-based ear training.
Most people who pick up an ear-training app for the first time are handed the same exercise: two notes play, you guess the interval. Repeat. For decades this has been the canonical “ear training” task in conservatories, textbooks, and apps. There is now a strong empirical case that, on its own, it is not what builds a musical ear.
The problem: pitch is heard tonally, not as raw distance
When you listen to a melody in a song, you do not perceive the second note as “a major third above the first.” You perceive it as, for example, the third of the key: a specific functional sensation, colored by everything else happening in that key. Decades of work by Carol Krumhansl and colleagues on the tonal hierarchy established that listeners, even those without formal musical training, rate notes against an internal scaffold of stability built from the cadences and scale material they have just heard [1]. After a major-mode context, the tonic receives the highest stability rating, then the dominant, then the mediant, then the rest of the diatonic notes, then non-scale notes [1].
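To make that ordering concrete, here is a small sketch that ranks the twelve pitch classes by their major-key probe-tone ratings. The numbers are the major-key profile as it is widely reproduced from Krumhansl and Kessler (1982); treat them as transcribed values and check them against [1] before relying on them. The names `KK_MAJOR_PROFILE`, `DEGREE_NAMES`, and `ranked_by_stability` are illustrative, not taken from any library.

```python
# Major-key probe-tone ratings (on a 1-7 scale), indexed by semitones above the tonic.
# Values as commonly reproduced from Krumhansl & Kessler (1982); verify against the source.
KK_MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                    2.52, 5.19, 2.39, 3.66, 2.29, 2.88]

# Semitone offsets that belong to the major scale, with their degree names.
DEGREE_NAMES = {
    0: "tonic (1)", 2: "supertonic (2)", 4: "mediant (3)",
    5: "subdominant (4)", 7: "dominant (5)", 9: "submediant (6)",
    11: "leading tone (7)",
}


def ranked_by_stability(profile):
    """Return (semitones_above_tonic, label, rating) sorted from most to least stable."""
    order = sorted(range(12), key=lambda s: profile[s], reverse=True)
    return [(s, DEGREE_NAMES.get(s, "non-scale"), profile[s]) for s in order]


if __name__ == "__main__":
    for semitones, label, rating in ranked_by_stability(KK_MAJOR_PROFILE):
        print(f"{semitones:2d}  {label:<16} {rating:.2f}")
```

With these values the list opens with tonic, dominant, and mediant, and every diatonic note ranks above every non-scale note, which is the ordering described above.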
What this means in practice: the same major third, say C to E, does not sound the same to a trained listener in every key. E can be the third of C major, the fifth of A minor, or a chromatic neighbor in B♭ major. It is the same physical interval; it has different functional identities. The ear that learns music learns those identities, not the abstract distance.
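The C-to-E example can be spelled out in a few lines of code. This is a minimal sketch, assuming ASCII note names (`#` for sharp, `b` for flat) and the natural minor scale for minor keys; `pitch_class` and `degree_in_key` are hypothetical helper names used only for illustration.

```python
# Illustrative sketch: label a pitch by its function (scale degree) in a key.
NOTE_TO_PC = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]
NATURAL_MINOR_STEPS = [0, 2, 3, 5, 7, 8, 10]  # assumption: natural minor only


def pitch_class(name: str) -> int:
    """Convert a note name such as 'Bb' or 'F#' to a pitch class 0-11."""
    pc = NOTE_TO_PC[name[0]]
    for accidental in name[1:]:
        if accidental == "#":
            pc += 1
        elif accidental == "b":
            pc -= 1
        else:
            raise ValueError(f"unknown accidental: {accidental!r}")
    return pc % 12


def degree_in_key(note: str, tonic: str, mode: str = "major") -> str:
    """Return the scale degree '1'..'7', or 'chromatic' if the note is outside the scale."""
    steps = MAJOR_STEPS if mode == "major" else NATURAL_MINOR_STEPS
    offset = (pitch_class(note) - pitch_class(tonic)) % 12
    if offset in steps:
        return str(steps.index(offset) + 1)
    return "chromatic"


if __name__ == "__main__":
    # The same two pitches, C and E, heard in three different keys.
    for tonic, mode in [("C", "major"), ("A", "minor"), ("Bb", "major")]:
        c = degree_in_key("C", tonic, mode)
        e = degree_in_key("E", tonic, mode)
        print(f"{tonic} {mode}: C is {c}, E is {e}")
```

The distance between the two pitches is four semitones in every case; only the labels change (1 and 3 in C major, 3 and 5 in A minor, 2 and a chromatic note in B♭ major).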
What the research suggests
A growing pedagogical consensus, drawing on tonal-hierarchy work and on practitioner literature (Bruce Arnold, Berklee functional aural-skills tracks, the Use Your Ear network), holds that scale-degree (functional) ear training transfers to musical hearing more efficiently than isolated interval training [2]. Interval recognition is necessary — you need the raw discrimination — but it is not sufficient.
Aural-skills research within higher music education reinforces the same point in a different direction. Pomerleau-Turcotte and colleagues showed that high achievement in multi-part aural dictation — a holistic task that demands the listener track multiple voices in tonal context — was the best predictor of high achievement in other aural-skill areas, including sight-singing [3]. Holistic, contextual hearing tracks together; isolated drilling does not bootstrap it.
Why isolated interval drilling produces frustrating plateaus
Two failure modes are common.
The “mnemonic ceiling.” Many learners use song mnemonics, such as Here Comes the Bride for an ascending fourth or Star Wars for an ascending fifth. This works under clean test conditions and fails as soon as the interval appears inside a real piece, because the memorized song is no longer the surrounding context [2]. The mnemonic is a stand-in for context; once context is supplied by the music itself, the mnemonic becomes interference rather than help.
No transfer to harmony. A learner who has drilled all twelve melodic intervals to 95% accuracy still often cannot identify a vi chord by ear, because the chord is not heard as “a stack of intervals” — it is heard as a function in the key. Without functional training, the interval skill does not generalize upward.
What functional training actually trains
In a scale-degree exercise, the app first establishes the key — typically by playing a cadence or the scale — and then plays a single note. The learner names that note’s degree (1, 2, 3, 4, 5, 6, 7, or, in solfège, do/re/mi/fa/sol/la/ti). The cognitive task is fundamentally different: you are matching the heard pitch against an internal map of the key you were just given.
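As a rough sketch of how such a question could be put together, the snippet below builds one scale-degree item as MIDI note numbers. Everything here is an assumption made for illustration: the I-IV-V-I cadence in root position, the restriction to major keys, the tonic range, and the names `Question`, `triad`, and `make_question`. It does not describe how any particular app implements the exercise.

```python
import random
from dataclasses import dataclass

MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]
SOLFEGE = ["do", "re", "mi", "fa", "sol", "la", "ti"]


@dataclass
class Question:
    tonic_midi: int       # MIDI number of the tonic (60 = middle C)
    cadence: list         # chords played first, each a list of MIDI numbers
    target_midi: int      # the single note the learner must identify
    answer_degree: int    # 1-7
    answer_solfege: str   # do/re/mi/...


def triad(tonic_midi: int, degree: int) -> list:
    """Root-position diatonic triad on a 1-based degree of the major scale."""
    notes = []
    for k in (0, 2, 4):                       # root, third, fifth
        idx = degree - 1 + k
        notes.append(tonic_midi + MAJOR_STEPS[idx % 7] + 12 * (idx // 7))
    return notes


def make_question(rng: random.Random) -> Question:
    """Establish a key with a cadence, then pick one scale degree to ask about."""
    tonic = rng.randint(55, 66)               # assumed range: G3 to F#4
    cadence = [triad(tonic, d) for d in (1, 4, 5, 1)]   # assumed I-IV-V-I
    degree = rng.randint(1, 7)
    target = tonic + MAJOR_STEPS[degree - 1]
    return Question(tonic, cadence, target, degree, SOLFEGE[degree - 1])


if __name__ == "__main__":
    q = make_question(random.Random())
    print("Play cadence:", q.cadence)
    print("Then play:", q.target_midi)
    print("Answer:", q.answer_degree, q.answer_solfege)
```

Note that the question never contains a second melodic note to measure against. The only reference the learner has is the key established by the cadence, which is the point of the exercise.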
This is the same skill the trained listener uses to follow a melody, anticipate a resolution, sing along in the right key, or transcribe a tune. It is the skill that makes a trained ear feel “musical” rather than “calculating.”
The case for both, in the right order
The strongest position the evidence supports is not “intervals are useless” — it is “intervals belong inside a tonal frame.” Begin with same/different and higher/lower discrimination. Add tonality (find-the-tonic, stable-vs-unstable). Train scale degrees against an established key. Then layer interval recognition on top, ideally with intervals presented as movements between scale degrees rather than between random pitches. Finally, scale to chords, progressions, and dictation [4].
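One way to picture "intervals as movements between scale degrees" is to list which degree-to-degree moves in a major key produce a given interval. The sketch below does this for ascending moves within one octave; the function names are hypothetical, and the single label "tritone" is a simplification that does not distinguish an augmented fourth from a diminished fifth.

```python
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]
INTERVAL_NAMES = {
    1: "minor second", 2: "major second", 3: "minor third", 4: "major third",
    5: "perfect fourth", 6: "tritone", 7: "perfect fifth", 8: "minor sixth",
    9: "major sixth", 10: "minor seventh", 11: "major seventh", 12: "octave",
}


def ascending_interval(from_degree: int, to_degree: int) -> str:
    """Interval from one major-scale degree up to the next occurrence of another."""
    low = MAJOR_STEPS[from_degree - 1]
    high = MAJOR_STEPS[to_degree - 1]
    if high <= low:          # wrap into the next octave (same degree gives an octave)
        high += 12
    return INTERVAL_NAMES[high - low]


def moves_spanning(interval_name: str) -> list:
    """All ascending (from_degree, to_degree) moves that produce the named interval."""
    return [
        (a, b)
        for a in range(1, 8)
        for b in range(1, 8)
        if ascending_interval(a, b) == interval_name
    ]


if __name__ == "__main__":
    print(moves_spanning("major third"))
    print(moves_spanning("tritone"))
```

In a major key the major third occurs as 1 to 3, 4 to 6, and 5 to 7. An interval drill built from this list asks for the same distance three times, but each item arrives with a different functional feel, which is what the ordering above is meant to train.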
This is the order Fifths uses, and it is not arbitrary. It reflects what is known about how a tonal ear is actually constructed.
Practical takeaway
If you are a self-directed learner working with any ear-training tool — Fifths, EarMaster, Tonedear, the Functional Ear Trainer, or your own Anki deck — the highest-leverage adjustment you can make is to ensure every exercise is presented inside an established key. Either your tool plays a cadence or scale before each question, or you sing one yourself before listening. The cost is a few seconds per question. The benefit, over months of practice, is the difference between drilling test items and actually building the ear that hears music.
References
[1] Krumhansl, C. L. (1990). Cognitive Foundations of Musical Pitch. Oxford University Press. See also: Krumhansl, C. L., & Kessler, E. J. (1982). Tracing the dynamic changes in perceived tonal organization in a spatial representation of musical keys. Psychological Review, 89(4), 334–368. https://doi.org/10.1037/0033-295X.89.4.334
[2] Use Your Ear. Solfege-based ear training: why does it work? https://www.useyourear.com/blog/solfege-based-ear-training. See also the practitioner discussion at https://tonedear.com/ear-training/functional-solfege-scale-degrees.
[3] Pomerleau-Turcotte, J., Moreno Sala, M. T., Dubé, F., & Vachon, F. (2022). Experiential and Cognitive Predictors of Sight-Singing Performance in Music Higher Education. Journal of Research in Music Education, 70(3). https://doi.org/10.1177/00224294211049425. PMC: https://pmc.ncbi.nlm.nih.gov/articles/PMC9242514/
[4] Cleland, K. D., & Dobrea-Grindahl, M. (2021). Developing Musicianship through Aural Skills: A Holistic Approach to Sight Singing and Ear Training (2nd ed.). Routledge. https://www.routledge.com/9780367030773