Here is a critique which also mentions Shalizi’s comment, and why he is missing the point:
“Jake, this is a good review and I agree with many of your major conclusions. However, your summary of the literature on g has several problems.
[g-factor] is predicated on the notion that performance across different cognitive batteries tends to be positively correlated
A quibble — the positive correlation between performance on different test items is not just a notion but an empirical observation that has been supported by millions of data points over the last century. More on this below.
Psychological tests for g-factor use principal component analysis — a way of identifying different factors in data sets that involve mixtures of effects.
Factor analysis, not PCA, is the method used by psychometricians. They are similar in principle but not in application.
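To make the distinction concrete, here is a minimal sketch in plain Python. It is my own illustration, not anything taken from the cited reviews. It builds a correlation matrix from a hypothetical one-factor model with made-up loadings, then extracts both the first principal component and a one-factor principal-axis solution. The helper names (`top_eigen`, `pca_loadings`, `paf_loadings`) and the loading values are assumptions chosen for illustration; real psychometric work would use a dedicated statistics package.

```python
import math

def top_eigen(matrix, iterations=500):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector.
    value = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n))
    return value, vec

def pca_loadings(corr):
    """First principal component loadings: decomposes the full matrix, unique variance included."""
    value, vec = top_eigen(corr)
    return [math.sqrt(value) * v for v in vec]

def paf_loadings(corr, rounds=200):
    """One-factor principal-axis factoring: the diagonal holds estimated communalities."""
    n = len(corr)
    # Starting guess: each variable's largest absolute correlation with another variable.
    communalities = [max(abs(corr[i][j]) for j in range(n) if j != i) for i in range(n)]
    loadings = []
    for _ in range(rounds):
        reduced = [[communalities[i] if i == j else corr[i][j] for j in range(n)]
                   for i in range(n)]
        value, vec = top_eigen(reduced)
        loadings = [math.sqrt(value) * v for v in vec]
        communalities = [l * l for l in loadings]
    return loadings

# Hypothetical loadings of four tests on a single common factor (assumed values).
true_loadings = [0.8, 0.7, 0.6, 0.5]
corr = [[1.0 if i == j else a * b for j, b in enumerate(true_loadings)]
        for i, a in enumerate(true_loadings)]

print("generating loadings:", true_loadings)
print("PCA loadings:       ", [round(x, 3) for x in pca_loadings(corr)])
print("factor loadings:    ", [round(x, 3) for x in paf_loadings(corr)])
```

Under these assumptions the factor solution should land very close to the generating loadings, while the PCA loadings come out larger, because PCA also absorbs each test's unique variance. That is the sense in which the two methods are similar in principle but differ in application.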
g-factor is very controversial.
Not among intelligence researchers.
In this review, we emphasize intelligence in the sense of reasoning and novel problem-solving ability (BOX 1). Also called FLUID INTELLIGENCE (Gf), it is related to analytical intelligence¹. Intelligence in this sense is not at all controversial…
ref.1
[These authors go on to explain that in their view Gf and g are one and the same.]
From another review:
Here (as in later sections) much of our discussion is devoted to the dominant psychometric approach, which has not only inspired the most research and attracted the most attention (up to this time) but is by far the most widely used in practical settings.
ref.2
This was published over a decade ago. The psychometric approach has continued to attract the most research and attention and is still by far the most widely used.
The second and broader critique of this work is whether the tests that we have for “intelligence” measure something useful in the brain.
There’s wide agreement that the tests measure something useful about human behavior:
In summary, intelligence test scores predict a wide range of social outcomes with varying degrees of success. Correlations are highest for school achievement, where they account for about a quarter of the variance. They are somewhat lower for job performance, and very low for negatively valued outcomes such as criminality. In general, intelligence tests measure only some of the many personal characteristics that are relevant to life in contemporary America. Those characteristics are never the only influence on outcomes, though in the case of school performance they may well be the strongest.
ref.2
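For readers unused to the "variance accounted for" phrasing: the share of variance explained by a single linear predictor is the square of the correlation, so "about a quarter of the variance" corresponds to a correlation of roughly 0.5. A small sketch of that arithmetic follows; the function names are my own, chosen for illustration.

```python
import math

def variance_explained(correlation):
    """Share of outcome variance accounted for by one linear predictor (r squared)."""
    return correlation ** 2

def correlation_for_variance(share):
    """Size of the correlation implied by a given share of explained variance."""
    return math.sqrt(share)

# "About a quarter of the variance" for school achievement:
print(correlation_for_variance(0.25))
print(variance_explained(0.5))
```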
A more standard criticism of g:
while the g-based factor hierarchy is the most widely accepted current view of the structure of abilities, some theorists regard it as misleading (Ceci, 1990).
ref.2
that is:
One view is that the general factor (g) is largely responsible for better performance on various measures⁴⁰,⁸⁵. A contrary view accepts the empirical, factor-analytic result, but interprets it as reflecting multiple abilities each with corresponding mechanisms¹⁴¹. In principle, factor analysis cannot distinguish between these two theories, whereas biological methods potentially could¹⁰,²²,³⁶. Other perspectives recognize the voluminous evidence for positive correlations between tasks and subfactors, but hold that practical, creative¹⁴² and social or emotion-related⁷³ abilities are also essential ingredients in successful adaptation that are not assessed in typical intelligence tests. Further, estimates of individual competence, as inferred from test performance, can be influenced by remarkably subtle situational factors, the power and pervasiveness of which are typically underestimated²,¹³⁶,¹³⁷,¹⁴³.
ref.1
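The claim that factor analysis cannot separate the two views can be illustrated with a toy calculation. The sketch below is my own construction under stated assumptions, not something from the review: six independent, unit-variance abilities, and four hypothetical tests that each sum three of them. No ability is used by every test, yet every pair of tests correlates positively and equally, which is exactly the pattern a single common factor would produce. The test names and ability assignments are invented for illustration.

```python
import math
from itertools import combinations

# Hypothetical setup: six independent, unit-variance abilities (numbered 0..5).
# Each test score is the plain sum of the abilities it draws on.
tests = {
    "A": {0, 1, 2},
    "B": {2, 3, 4},
    "C": {4, 5, 0},
    "D": {1, 3, 5},
}

def score_correlation(first, second):
    """Correlation of two sum scores built from independent unit-variance parts."""
    shared = len(first & second)  # covariance equals the number of shared abilities
    return shared / math.sqrt(len(first) * len(second))

correlations = {
    (a, b): score_correlation(tests[a], tests[b])
    for a, b in combinations(tests, 2)
}
shared_by_all = set.intersection(*tests.values())

for pair, r in correlations.items():
    print(pair, round(r, 3))
print("abilities shared by every test:", shared_by_all)
```

In this setup the correlation matrix alone gives no way to tell "one general ability" from "many overlapping abilities", which is why the quoted passage points to biological methods as a possible tiebreaker.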
The concepts of IQ and g-factor have been questioned by several authors. Stephen Jay Gould actually wrote a whole book — The Mismeasure of Man — trying to debunk the assumption that intelligence can be measured in a single number. (For a more recent and excellent critique, I recommend this article by Cosma Shalizi.) The common theme among many of these critiques is that the tests for intelligence conflate numerous separable brain processes into a single number. As a consequence, 1) you aren’t sure what you are measuring, 2) you can’t associate what you are measuring with a particular region (the output may be the result of an emergent process of several regions), and 3) you may be eliding significant differences in performance across individuals that you would recognize with a better test.
You give too much credit to Gould and Shalizi. Their primary criticisms are far less reasonable than the points you make.
The main thrust of their arguments is that test data do not statistically support a g-factor. Gould’s argument is statistically incompetent (for a statistician’s critique see Measuring Intelligence: Facts and Fallacies by David J. Bartholomew, 2004). Shalizi’s criticism is incredibly sophisticated, but likewise incorrect. In a nutshell, Shalizi is trying to argue around the positive correlations between test batteries. If those correlations didn’t exist, his argument would be meaningful. However, as I noted above, these intercorrelations are one of the best documented patterns in the social sciences.
significant differences in performance across individuals that you would recognize with a better test.
It’s possibly not well known that enormous efforts have gone into trying to make tests that have practical validity for life outcomes yet do not mostly measure g. See for example the works of Gardner and Sternberg. The current consensus is that their efforts have failed. A notable exception might be measures of personality.
Conclusion:
Ultimately, we need to use biological measures such as cortical volume to determine what g really is. One possible approach is to combine chronometric measurements (e.g. reaction time) with brain imaging studies. Genetically informed study designs have a role to play here too.
One reply on “Cosma Shalizi on IQ”
References:
[1] http://www.loni.ucla.edu/~thompson/PDF/nrn0604-GrayThompson.pdf
[2] http://www.gifted.uconn.edu/siegle/research/Correlation/Intelligence.pdf”