Scott Alexander has a post up arguing that we may be underestimating the effect size of antidepressants. But there’s another reason to think we’re misestimating the effect, one that cuts to the heart of how we measure depression and a lot of other things.
I like the level of detail you're going into here, but I think the pattern of effect sizes we see for many conditions, not just depression, indicates some sort of general, conceptual problem with how we're measuring certain kinds of treatment outcomes. I came across this when I was researching my post on naltrexone (https://notpeerreviewed.wordpress.com/2021/05/10/can-we-take-the-devil-out-of-the-bottle-evidence-and-personal-experience-with-naltrexone-for-alcohol-abuse/). Most of those studies used seemingly cardinal rather than ordinal outcomes and still showed effect sizes that don't seem to reflect the experiences people report, and I suspect we would see this for many treatments and conditions. I have to admit I don't have a good sense of what the answer might be.
What matters is not whether moving from 2 to 3 on a sub-item is as big a change as moving from 3 to 4. What matters is whether each step contributes to the same extent to the sum (and what it represents). This is a subtle but important distinction. Relatedly, you should think more about psychometrics, validity, and reliability, and how they apply to this situation.