Here's a linear regression that uses 5 parameters (6 including an intercept) to capture more than 5/6 of the variance while using up less than half of the degrees of freedom. This should be viewed as descriptive dimension reduction, i.e. an attempt to describe the data well with reasonably few parameters. When I'm doing something like inference or forecasting I have a stronger prejudice towards using fewer parameters and preserving more degrees of freedom (or at least regularizing the fit somehow), but this is sufficiently pseudoscientific that I feel I might as well include the interaction terms that seem to pull their weight:

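| term | estimate | robust.se | t |
|---|---|---|---|
| intercept | 63437.5 | 1489.285 | 42.60 |
| E | 1375.0 | 2183.031 | 0.63 |
| F | -1875.0 | 2183.031 | -0.86 |
| J | 6100.0 | 1815.196 | 3.36 |
| EJ | 7450.0 | 2944.699 | 2.53 |
| FJ | -5450.0 | 2944.699 | -1.85 |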

To try to get useful data reduction, I have refused to use any cubic (or higher) interaction term; including the first- and second-order terms that I've left out only explains 20% of the variance that is left in this fit, so I feel the clarity of parsimony here is more valuable than a slightly better fit. The first thing to note is that I've dropped the S/N axis entirely; it doesn't do much. Also, the P types have very little variance (a standard deviation of $2700 versus $6100 for the J types); they're largely in the low sixties.

The J types are a bit more interesting. Their mean is very close to $70,000, but EJs make about $9000 a year more than IJs, and TJs make about $7000 more than FJs. The IFJs are right in the middle of the P's; it's the other J types that do well.
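The arithmetic behind these figures can be checked against the regression output above. The sketch below assumes a coding that the post doesn't state: J is a 0/1 indicator, while E and F are centered at ±0.5 (E = +0.5 for extraverts, F = +0.5 for feelers). I've inferred that coding only because it reproduces the numbers quoted here, and the function name `predicted_income` is made up for illustration.

```python
# Coefficients copied from the regression output above.
COEF = {"intercept": 63437.5, "E": 1375.0, "F": -1875.0,
        "J": 6100.0, "EJ": 7450.0, "FJ": -5450.0}

def predicted_income(type_code):
    """Fitted household income for a three-letter code such as 'ETJ'.

    ASSUMED coding (not stated in the post): J is 0/1; E and F are +/-0.5.
    The S/N letter is omitted because the fit drops that axis.
    """
    e = 0.5 if type_code[0] == "E" else -0.5
    f = 0.5 if type_code[1] == "F" else -0.5
    j = 1.0 if type_code[2] == "J" else 0.0
    return (COEF["intercept"] + COEF["E"] * e + COEF["F"] * f
            + COEF["J"] * j + COEF["EJ"] * e * j + COEF["FJ"] * f * j)

TYPES = [a + b + c for a in "EI" for b in "TF" for c in "JP"]
FITTED = {t: predicted_income(t) for t in TYPES}

for t in TYPES:
    print(t, round(FITTED[t]))
```

Under that assumed coding, the E-versus-I gap among J types is the E coefficient plus the EJ coefficient (1375 + 7450 = 8825), and the T-versus-F gap is 1875 + 5450 = 7325, which line up with the "about $9000" and "about $7000" above.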

The linked article suggests some problems with this. Some of the things it raises as problems don't really bother me, but it doesn't mention that these are *household* incomes, which means that you're conflating income effects, family size effects, and effects from the affinities of people of one personality type for spouses of other personality types.

## 1 comment:

In fact, 63000+(8400*E+6800*T)*J captures 80% of the variance, though even on a purely descriptive basis dropping the linear terms while keeping the interaction terms gives me the heebie-jeebies a bit.
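For concreteness, here is that simplified formula as a small sketch. It assumes E, T and J are 0/1 indicators, which the formula doesn't say outright; the function name `simple_income` is hypothetical.

```python
def simple_income(type_code):
    """Evaluate 63000 + (8400*E + 6800*T)*J for a code such as 'ETJ'.

    ASSUMPTION: E, T and J are 0/1 indicators (1 for E, T, J; 0 for I, F, P).
    """
    e = 1 if type_code[0] == "E" else 0
    t = 1 if type_code[1] == "T" else 0
    j = 1 if type_code[2] == "J" else 0
    return 63000 + (8400 * e + 6800 * t) * j

for code in ["ETJ", "EFJ", "ITJ", "IFJ", "ETP"]:
    print(code, simple_income(code))
```

With that coding every P type, and the IFJs, sit at 63000, and only the other J types move above it.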
