I like the level of detail you're going into here, but I think the pattern of effect sizes we see for many conditions, not just depression, points to a general, conceptual problem with how we measure certain kinds of treatment outcomes. I came across this when I was researching my post on naltrexone (https://notpeerreviewed.wordpress.com/2021/05/10/can-we-take-the-devil-out-of-the-bottle-evidence-and-personal-experience-with-naltrexone-for-alcohol-abuse/). Most of those studies used seemingly cardinal rather than ordinal outcomes and still showed effect sizes that don't seem to reflect the experiences people report, and I suspect we would see the same for many treatments and conditions. I have to admit I don't have a good sense of what the answer might be.
What matters is not whether moving from 2 to 3 is as big a change as moving from 3 to 4 on a sub-item. What matters is whether each step contributes to the same extent to the sum (and what that sum represents). This is a subtle but important distinction. Relatedly, you should think more about psychometrics, validity and reliability, and how they apply to this situation.
As I comment in the essay, it is possible that non-linear item scores add up to a total score that is linear in depression, but I find it unlikely, particularly because the apparent convexity seems to hold across the items. As you're aware, scoring is by simple addition.
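To make the point about addition concrete, here is a rough sketch rather than anything drawn from real scale data. It assumes each item score is some function of an underlying severity between 0 and 1. The item curves, the `severity` grid, and the helper names are all hypothetical and chosen only for illustration. The idea is that a sum of convex curves is itself convex, so a linear total would need some items to bend the other way and cancel the curvature.

```python
import math

def second_differences(values):
    """Discrete second differences on an evenly spaced grid.
    All positive suggests a convex curve; all near zero suggests a linear one."""
    return [values[i - 1] - 2 * values[i] + values[i + 1]
            for i in range(1, len(values) - 1)]

def total_score(items, s):
    """Simple additive scoring: the total is the plain sum of the item scores."""
    return sum(item(s) for item in items)

# Hypothetical latent severity, evenly spaced from 0 to 1.
severity = [i / 10 for i in range(11)]

# Assumption: every item is convex in severity (illustrative shapes only).
convex_items = [lambda s: s ** 2, lambda s: s ** 3, lambda s: math.expm1(s)]

# Assumption: one convex and one concave item whose curvatures cancel exactly,
# since s**2 + (2*s - s**2) = 2*s.
mixed_items = [lambda s: s ** 2, lambda s: 2 * s - s ** 2]

convex_total = [total_score(convex_items, s) for s in severity]
mixed_total = [total_score(mixed_items, s) for s in severity]

# The all-convex total keeps its curvature; the mixed total comes out linear.
total_is_convex = all(d > 0 for d in second_differences(convex_total))
mixed_is_linear = all(abs(d) < 1e-9 for d in second_differences(mixed_total))

print(total_is_convex, mixed_is_linear)
```

Under these assumptions, both flags should come out true: the total is linear only in the second case, where the items curve in opposite directions. That is the scenario I find unlikely if the convexity really does hold across the items.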