Physicists and chemists learned a long time ago that you have to standardize measurements and methodology if you’re going to discover truths that nobody has found before. But many social scientists act otherwise, and it’s probably a major reason why so many of our “findings” in the journals cannot be replicated. When it comes to psychological tests, researchers often slice and dice them so they can pack more measures in a survey. The thinking seems to be, “It’s better to measure more things somewhat poorly than to measure fewer things better.” This strategy does give you a better chance to find something, anything, but you’re fishing, not hunting.
All things considered, I believe the RWA Scale gives researchers a pretty good way to measure the authoritarian follower personality. Not everyone agrees, but almost everyone who uses it chops it up and uses just part of the scale. I have wailed against this practice, but it does no good. Investigators pick their own favorite items from the test and assume these will measure whatever the whole test would measure. Often, it seems, they pick items they think will confirm a hypothesis, which is not exactly resolutely pursuing the truth whatever it may be. Some of the shortened scales are not even balanced against response sets. And while picking items that most obviously tap, say, authoritarian aggression will increase the chances of finding that “authoritarians” are hostile in some other way, all you’ve really shown is that people who have one set of hostile thoughts also have others. To make the claimed connection to authoritarianism, you have to measure the whole trait as it is defined, or else you’re an illusionist, fooling yourself. And even when the subset of items selected does cover the named construct, a highly shortened scale such as the 2-, 4-, and 6-item versions used will have markedly lower reliability, and hence less validity, and hence less power to measure the thing, than the real scale will.
I have instead treasured measuring traits well and truly. Toward that end I have spent time over the decades improving my scales by testing new items that might reveal the trait better. This research had a hidden benefit: As you develop items that zing closer to the bullseye of the construct you are trying to measure, you can pile up as good a score with fewer items than you needed before. Bruce Hunsberger and I did this in 2004 with a 12-item Religious Fundamentalism Scale that had all the attractive psychometric properties and empirical connections of the original 20-item version (The International Journal for the Psychology of Religion, 2004, 14, 47-54). I then published a shortened 20-item RWA Scale in 2006 which replaced the 30-item versions that I had been using (The Authoritarians, available on this website, pp. 10-11). This, however, made no more difference to the scale whackers than my deal with Satan to lock them all in a room with novice banjo players if they continued to tramp all over my tests (Ibid., pp. 35-36). But in case some people are willing to measure right-wing authoritarianism as it has been conceived and defined, here is a balanced shorter RWA Scale that appears to do the job as well as the 20-item one.
Method
The Fall 2019 Monmouth Poll. This new version is based on a recent Morning Consult poll that majestically drew nation-wide samples from eight countries, rather than the usual collection of (Canadian) students who wandered into my Intro Psych course (and their parents, who found themselves psychologically corralled when I wanted to sample a different population). This study in turn arose thanks to a survey of American voters conducted by the Monmouth Poll for John Dean and me in the Fall of 2019. Following a pilot study using New Jersey residents, Monmouth acquired the names, email addresses, and political preferences of a large number of registered American voters. These people were invited in October/November 2019 to participate in a poll on “a variety of issues.” The Monmouth sample was not drawn to represent the American public. For starters, unregistered citizens were excluded. It was a very self-selected sample of adults with known political affiliations. And we overrecruited Republicans a bit because they were the people we were trying to understand. (See Appendices V and VI in the paperback edition of John W. Dean and Bob Altemeyer, Authoritarian Nightmare, 2020, Brooklyn, Melville House Publishing, for the entire survey and its principal findings.)
The April 2021 Morning Consult Poll. Both the pilot survey and the Fall 2019 Monmouth Poll produced some mind-blowing results, and when these appeared in the summer of 2020 with the publication of Authoritarian Nightmare, Morning Consult decided to repeat the study and extend it to other countries. Thus at the end of April 2021 more representative samples of 1000 American, Canadian, British, Australian, German, French, Italian, and Spanish citizens answered the RWA Scale and other questions that had been in the Monmouth survey.
Decision intelligence company Morning Consult uses online polls to reach a large group of citizens in the countries it surveys and to ask them about the many issues it studies. These “panels” are composed of individuals who opt in to take online surveys and who, on their own, may not be representative of the country as a whole. However, Morning Consult ensures that it interviews a representative sample of adults in each country based on gender and age, and it further seeks accurate results by weighting some people’s answers more than others so that the sample is demographically representative of adults in the country based on gender, age, education, and region. In general, better-educated, older men tended to answer Morning Consult’s survey of authoritarianism in the eight countries (as happens with many polls). Correcting for these deviations from national parameters by giving different “weight” to answers provided by various groups produced essentially the same overall results. So I shall report only the unweighted responses in all the studies described in this essay, because that’s what actually happened.
Two other things. The Morning Consult survey is, in my opinion, the best study ever done on authoritarianism. Nothing comes close to its range, representativeness, and opportunity for international comparisons. Which leads to the second observation. The Morning Consult’s greatest finding may be that range and representativeness are not as limiting as we have perhaps imagined. Most of the previous research with the RWA Scale was done in one city in the middle of Canada on two populations: University students and their noble time-sacrificing parents. Other groups, such as American and Canadian lawmakers, and citizens of the Soviet Union, produced a strong record of cross-replication. But I know of only one previous national survey that used the complete RWA Scale, done in 2005 in the United States by Mick McWilliams and Jeremy Keil. It found what I had been finding over the years in my little shop on the Canadian prairie. So also did the Morning Consult polls, with one exception I shall point out. If a pot of gold awaits other researchers at the end of this rainbow of research arcing over the past 50 years, it is that the limited window on the world provided by one’s targets of opportunity may not be all that limiting.
How Did the RWA Scale Do in the Monmouth and Morning Consult Surveys?
The Monmouth Fall 2019 poll began with some standard polling questions to acclimate the respondent to the task. For example, “Would you say things in this country are going in the right direction, or have they gotten off on the wrong track?” Then the 20-item RWA Scale appeared, preceded by a non-scored item about protest marches to give the subject a bit of practice using a -4 to +4 response scale. Then the rest of the survey followed. The poll asked 131 questions and took, on average, 30 minutes to complete.
The 190 interitem correlations on the RWA Scale in the Monmouth Poll averaged a record-busting .516, an overwhelming richness of internal consistency that gave the 20-item summed score a Cronbach alpha coefficient of .955, and the test scores a signal-to-noise ratio of 19:1.[1] (This steroid injection of psychometric rigor gave the expression, “significant at the .05 level” a wonderful new meaning.) These results had been preceded by similarly mind-blowing outcomes in the pilot survey of New Jersey residents (N=478, mean intercorrelation=.476; alpha=.948). McWilliams and Keil in contrast had “only” gotten a mean interitem correlation of .23 and an alpha of .90 with a 30-item RWA Scale. But it was hard to argue with the replicated results found by Monmouth.
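As a quick check on this arithmetic, the relationship among the number of items, the mean interitem correlation, and alpha (the formula given in Footnote 1) can be sketched in a few lines of Python. The function names are illustrative only and do not come from any published scoring software; the inputs are simply the figures reported above.

```python
def n_item_pairs(n_items: int) -> int:
    """Number of distinct interitem correlations among n_items items."""
    return n_items * (n_items - 1) // 2


def alpha_from_mean_r(n_items: int, mean_r: float) -> float:
    """Alpha implied by the item count and the mean interitem correlation
    (the formula in Footnote 1)."""
    return (n_items * mean_r) / ((n_items - 1) * mean_r + 1)


if __name__ == "__main__":
    print(n_item_pairs(20))  # pairs of items on a 20-item scale
    reported = [
        ("Monmouth 2019", 20, 0.516),
        ("New Jersey pilot", 20, 0.476),
        ("McWilliams and Keil 2005", 30, 0.23),
    ]
    for label, n_items, mean_r in reported:
        print(f"{label}: alpha = {alpha_from_mean_r(n_items, mean_r):.3f}")
```

Run as a script, this reproduces the 190 item pairs and alphas of about .955, .948, and .90 for the three studies. It also shows why the 30-item McWilliams and Keil scale could reach .90 with a much lower mean interitem correlation: length compensates for weaker item connections.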
The Morning Consult survey in turn asked 172 questions of its American sample, and responses to the RWA Scale reverted to more normal psychometric levels, with a mean interitem correlation equal to .299 and an alpha of .896. Why then were the Monmouth results off the chart? They probably benefitted from differences in the samples drawn, and by events during the nineteen months between the surveys. Monmouth’s sample of registered voters who cast ballots was probably more interested in politics than Morning Consult’s wider dip into the American public, with lots of firmly committed Democrats and Republicans. When a large part of a sample has its ideological ducks in order, it pumps up the internal consistency among the RWA Scale responses. Especially in a polarized society where so many ducks are assembled in serried ranks in opposite camps on so many issues. A wide range of answers with lots of extreme scores lets a relevant test really shine. Also, an election had put Joe Biden in the White House, and while authoritarian followers are more likely than most to say in general that we should support the authorities, they feel much less inclined to do so when they dislike the authority. But probably most important of all, January 6, 2021 had happened by April 2021 (trust me on this), and we know that some Trump supporters became non-supporters (at least for a while) after the attack on the Capitol. Items about a “mighty leader who will do what has to be done” probably looked a little different to some after Trump dispatched his mob to stop the electoral certification.
Morning Consult Results in Other Countries
The RWA Scale was developed to measure authoritarianism in North America, and it has performed well in Canadian and American research from the 1970s on. It did so again in the Morning Consult Canadian sample, as it had in the American survey, with a mean interitem correlation of .317 yielding an alpha equaling .903.
Over the years scattered results (mainly in Europe) showed that the test “worked” surprisingly well in some other countries despite cultural differences and translation problems. In the Morning Consult survey of the United Kingdom the items intercorrelated .269 on the average producing an alpha of .882. In the fourth English-speaking country polled by Morning Consult, Australia, the numbers came in at .291 and .892. So the four predominantly “Anglo” samples showed a similar level of international consistency in internal consistency (from .269 to .317) in their responses to the RWA Scale, with corresponding near-uniformity in alpha reliability: .882 to .903.
As one might expect, the results proved weaker in Italy (.211; .843), Spain (.206; .840), Germany (.181; .818), and France (.126; .747). The French results were outstanding—badly so. This was mainly because, stereotypes notwithstanding, the French responses showed much more yea-saying than anyone else’s.[2] But in those other seven countries, the psychological trait of being an authoritarian follower appears to be the same cat with the same stripes—as it had previously in various other places. And that quite amazes me.
Shortening the RWA Scale
When you are trying to understand behavior by using a personality test, you have to know what the scale is actually scaling and you want to know how trustworthy its scores are. On a test which purports to be measuring one thing, such as the tendency to be an authoritarian follower, the best psychometric indicator of how well it is assessing that one thing is the consistency of responses to its various items. And one can gauge the reliability of its summed score with Cronbach’s alpha coefficient. These are the statistics I just reported from the Morning Consult survey. However, there is a trade-off between these two features which becomes especially important when you are trying to shorten a test. As you shuck the items on the scale that have the weakest connection to the rest of the measure, the reliability of the summed scores will take a hit simply because you have fewer indications of the trait. (See Footnote 1.) You have to rob Peter to pay Paul.
The Morning Consult surveys give us an opportunity to find an RWA Scale where Peter and Paul can both make a living. Combining the highly similar results from the four predominantly English-speaking countries gives one a hefty collection of 4,000 citizens who represent the adult populations in these four democracies to an admirable degree. Table 1 shows the key psychometric properties of the 20-item RWA Scale used in these surveys, and those same properties for an 18-item version when you drop the protrait item and the contrait item that have the weakest correlations with the rest of the test. The mean interitem correlation naturally goes up, from .298 to .313, since you have set aside the items with the weakest connections, and happily the alpha barely goes down, from .896 to .892, indicating the improved internal consistency nearly compensated for the 10% reduction in the size of the test. So the 18-item version works as well as the 20-item test. But it hardly will satisfy researchers who will risk an eternity of banjo music in order to use a short test.
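One pruning step of the kind just described might be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis actually run on the survey data: "weakest correlation with the rest of the test" is taken to mean the correlation between an item and the sum of the other retained items, the helper names are hypothetical, and the demonstration data are synthetic. The check on item content mentioned in Footnote 3 is not modeled.

```python
import random


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def item_rest_r(scores, item, kept):
    """Correlation of one item with the sum of the other retained items."""
    rest = [sum(row[j] for j in kept if j != item) for row in scores]
    return pearson([row[item] for row in scores], rest)


def drop_weakest_pair(scores, protraits, contraits):
    """Remove the weakest protrait and the weakest contrait item.

    `scores` holds one row per respondent, with contrait items already
    reverse-keyed so that higher always means more authoritarian.
    Dropping one item of each kind keeps the scale balanced.
    """
    kept = protraits + contraits
    weak_pro = min(protraits, key=lambda i: item_rest_r(scores, i, kept))
    weak_con = min(contraits, key=lambda i: item_rest_r(scores, i, kept))
    return ([i for i in protraits if i != weak_pro],
            [i for i in contraits if i != weak_con])


def make_demo_scores(n_people=500, seed=1):
    """Synthetic data (NOT survey data): items 2 and 4 load weakly."""
    rng = random.Random(seed)
    loadings = [0.8, 0.8, 0.1, 0.8, 0.1, 0.8]
    rows = []
    for _ in range(n_people):
        trait = rng.gauss(0, 1)
        rows.append([w * trait + (1 - w * w) ** 0.5 * rng.gauss(0, 1)
                     for w in loadings])
    return rows


if __name__ == "__main__":
    demo = make_demo_scores()
    print(drop_weakest_pair(demo, [0, 1, 2], [3, 4, 5]))
```

On the synthetic data, the step should discard the two deliberately weak items (indices 2 and 4) and leave two protraits and two contraits. Repeating the step on the surviving items gives the successively shorter balanced scales.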
Table 1
Mean Interitem Correlations, Reliabilities, and Empirical Validities of RWA Scales of Different Lengths Drawn from the Morning Consult Surveys of English-Speaking Countries (N=4,000)
No. of Items | Mean Interitem r | Alpha Coefficient | Corr. with Prejudice | Corr. with Fundamentalism | Corr. with “Stomp Out” |
20 | .298 | .896 | .705 | .700 | .457 |
18 | .313 | .892 | .702 | .697 | .449 |
16 | .322 | .885 | .707 | .695 | .448 |
14 | .331 | .875 | .704 | .707 | .443 |
12 | .340 | .861 | .698 | .714 | .428 |
10 | .345 | .841 | .697 | .711 | .417 |
8 | .350 | .811 | .682 | .719 | .403 |
6 | .370 | .779 | .675 | .726 | .374 |
4 | .397 | .725 | .646 | .690 | .309 |
2 | .423 | .594 | .576 | .696 | .259 |
Table 1 goes on to show what happens when we peel off the next weakest-connecting protrait and contrait items from the 20-item test. The 16-statement measure also looks OK, but still will probably be unsatisfying to the test pruners. You can see what happens as we continue whittling down the test, step-by-step, to a quite worthless two-item version that has an alpha reliability of only .594.[3] So where is the “Goldilocks Zone”? Going from eight items to ten costs just .005 in mean interitem correlation but the increase in the number of items raises alpha .030. Going from ten to twelve items costs the same .005 in interitem connections, but raises the alpha less because you get diminishing returns from increasing the length of a test. So if we are determined to make the test smaller, we get the best psychometric deal for both Peter and Paul with ten items.[4]
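The trade-off can be read straight off Table 1. The short Python sketch below copies the mean interitem correlations and alphas from the table and prints what each two-item step gains and costs. The variable and function names are illustrative assumptions, and alphas recomputed with the Footnote 1 formula differ slightly from the tabled values because the tabled correlations are rounded.

```python
# Mean interitem r and alpha by scale length, copied from Table 1.
TABLE_1 = {
    20: (0.298, 0.896), 18: (0.313, 0.892), 16: (0.322, 0.885),
    14: (0.331, 0.875), 12: (0.340, 0.861), 10: (0.345, 0.841),
    8: (0.350, 0.811), 6: (0.370, 0.779), 4: (0.397, 0.725),
    2: (0.423, 0.594),
}


def alpha_from_mean_r(n_items, mean_r):
    """Alpha implied by item count and mean interitem r (Footnote 1)."""
    return (n_items * mean_r) / ((n_items - 1) * mean_r + 1)


def step_changes(shorter, longer):
    """Change in mean r and in reported alpha when lengthening the scale."""
    r_short, a_short = TABLE_1[shorter]
    r_long, a_long = TABLE_1[longer]
    return round(r_long - r_short, 3), round(a_long - a_short, 3)


if __name__ == "__main__":
    for shorter in range(2, 20, 2):
        longer = shorter + 2
        d_r, d_alpha = step_changes(shorter, longer)
        recomputed = alpha_from_mean_r(longer, TABLE_1[longer][0])
        print(f"{shorter:>2} -> {longer:>2}: r {d_r:+.3f}, "
              f"alpha {d_alpha:+.3f} (recomputed alpha {recomputed:.3f})")
```

The eight-to-ten step shows a gain of .030 in alpha for a .005 loss in mean interitem correlation, while ten-to-twelve yields only .020 for the same loss, which is the diminishing return described above.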
The last three columns of Table 1[5] show the correlations between the various RWA Scales and three other measures that were included in all of the Morning Consult surveys, beginning with responses to a 24-item measure of prejudice. These scores proved the best predictor of support for Donald Trump in the Monmouth survey, with a spectacular correlation of .812. The second shaded column shows correlations with the 12-item Religious Fundamentalism Scale mentioned earlier. I have argued that such fundamentalism captures the authoritarian response to the religious impulse. The third index of having an authoritarian orientation was provided by reactions to the “Nazi cheer book statement” composed some years ago, which I have used to measure people’s readiness to persecute others when their chosen leaders give them the word. It goes, “Once our government leaders and the authorities condemn the dangerous elements in our society, it will be the duty of every patriotic citizen to help stomp out the rot that is poisoning our country from within.” Most Trump supporters in the Monmouth sample agreed with this. It is just a single item, but it captures the “lock and load,” “stand by and stand ready” spirit of authoritarian followers.
Scanning down the shaded columns in Table 1 reveals that the empirical relationship of the various RWA Scales generally increased as the test got longer. But it was a bumpy ride for fundamentalism which was sensitive to how often religious themes were present in the items being considered. Overall, halving the RWA Scale from twenty to ten items had very little effect on these empirical correlations. Prejudice went from .705 to .697, fundamentalism actually improved (.700 to .711), and the correlation with the “Nazi cheer” dropped the most, from .457 to .417, a change of little consequence.[6]
Some researchers may look at Table 1 and say, “We were right. It doesn’t matter how many items (beyond four) you use to measure a trait.” But the mock scales represented in Table 1 were all balanced against response sets, and they were chosen because they were the best intercorrelating items on the test at each stage, and they represented the defining elements of authoritarianism as well as the 20-statement version did, once we got to ten items.
We have to be aware however of a hidden danger in using ten items to measure something when you can use more. If any item “goes bad,” that will have a greater effect on summed scores on a shorter scale. Items do become dated. The poorest interconnecting statement on the 20-item RWA Scale is, “A ‘woman’s place’ should be wherever she wants to be. The days when women are submissive to their husbands and social conventions belong strictly in the past.” The item still arouses disagreement in authoritarian followers, but not as much as it used to. Most of the ten items on the RWA Scale shown in Table 2, which I consider equivalent to the 20-item version, have been on the RWA Scale for decades.[7] One is frankly amazed at their longevity as well as their international relevance. But they will not remain relevant forever anywhere.
Table 2
The Recommended Ten-Item RWA Scale
- Our country desperately needs a mighty leader who will do what has to be done to destroy the radical new ways and sinfulness that are ruining us.
- Gays and lesbians are just as healthy and moral as anybody else. *
- The “old-fashioned ways” and the “old-fashioned values” still show the best way to live.
- Atheists and others who have rebelled against the established religions are no doubt every bit as good and virtuous as those who attend church regularly. *
- God’s laws about abortion, pornography and marriage must be strictly followed before it is too late, and those who break them must be strongly punished.
- What our country really needs is a strong, determined leader who will crush evil and take us back to our true path.
- You have to admire those who challenged the law and the majority’s view by protesting for women’s abortion rights, for animal rights, or to abolish school prayer.*
- Homosexuals and feminists should be praised for being brave enough to defy “traditional family values.” *
- The only way our country can get through the crisis ahead is to get back to our traditional values, put some tough leaders in power, and silence the troublemakers spreading bad ideas.
- Everyone should have their own lifestyle, religious beliefs, and sexual preferences, even if it makes them different from everyone else. *
Subjects are asked to respond to each statement on a -4 to +4 basis, with “Neutral” or “No opinion” scored a 0. Responses to protrait items (Nos. 1, 3, 5, 6 and 9) are converted to a 1-9 scale for scoring, with -4 (“Very strongly disagree”) being scored as 1, -3 (“Strongly disagree”) as 2, -2 (“Moderately disagree”) as 3, -1 (“Slightly disagree”) as 4, a 0 or no response as 5, +1 (“Slightly agree”) as 6, +2 (“Moderately agree”) as 7, +3 (“Strongly agree”) as 8, and +4 (“Very strongly agree”) as 9. The keying is reversed for the contrait items (Nos. 2, 4, 7, 8 and 10, marked with an asterisk). For example, “Very strongly disagree” is scored a 9 and “Very strongly agree” is scored a 1. Summed scores on the scale can thus range from 10 to 90.
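The scoring rules above can be expressed as a small Python sketch. The function and constant names are hypothetical, and treating a skipped item (None) the same as a 0 is my reading of “no response” in the instructions above; the item numbers follow the order of Table 2.

```python
from typing import Dict, Optional

PROTRAIT_ITEMS = (1, 3, 5, 6, 9)
CONTRAIT_ITEMS = (2, 4, 7, 8, 10)  # asterisked items in Table 2


def score_item(item_no: int, response: Optional[int]) -> int:
    """Convert one -4..+4 answer to its keyed 1-9 score.

    A missing answer (None) is treated like a 0, i.e. scored 5.
    """
    if item_no not in PROTRAIT_ITEMS + CONTRAIT_ITEMS:
        raise ValueError(f"unknown item number: {item_no}")
    if response is None:
        response = 0
    if response not in range(-4, 5):
        raise ValueError(f"response must be an integer from -4 to +4: {response}")
    raw = response + 5  # -4..+4 becomes 1..9
    # Contrait items are reverse-keyed: 1 becomes 9, 9 becomes 1.
    return raw if item_no in PROTRAIT_ITEMS else 10 - raw


def score_scale(responses: Dict[int, Optional[int]]) -> int:
    """Sum the keyed scores over all ten items (range 10 to 90)."""
    return sum(score_item(n, responses.get(n))
               for n in PROTRAIT_ITEMS + CONTRAIT_ITEMS)


if __name__ == "__main__":
    # Someone who very strongly agrees with every protrait item and
    # very strongly disagrees with every contrait item.
    highest = {n: 4 for n in PROTRAIT_ITEMS}
    highest.update({n: -4 for n in CONTRAIT_ITEMS})
    print(score_scale(highest))
```

A respondent who answers 0 (or nothing) to every statement lands at the midpoint of 50, and the example respondent in the script reaches the maximum of 90.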
A Two-for-One Bonus: A Short Racial Prejudice Scale
The Morning Consult subjects also responded to 24 items tapping various prejudices. These were taken directly from the 2019 Monmouth survey and some of the items, such as those dealing with Hispanics, would have little relevance in other countries. Furthermore, prominent victims of discrimination elsewhere, such as aboriginal First Nations peoples in Canada and Pakistanis in the United Kingdom, were not mentioned. So the Morning Consult data do not tell us everything they could about prejudice in the countries involved. But an analysis of the responses from the four English-speaking nations that followed the same procedures used to distill the short RWA Scale produced the 12-item Prejudice Scale given in Table 3.
These items, whose wording I have sometimes modified a bit to promote relevance outside the USA, mainly pick up sentiments of white supremacy. Which does seem to be in the news nowadays. Prejudice against Jews, Hindus and Buddhists, and Latin Americans correlated almost as well with these statements in the Morning Consult study, but not well-enough to elbow their way onto this short measure of prejudice.[8] Additional items might be added to those in Table 3 to tap prejudice against groups not mentioned—provided they correlate well enough with the rest of the scale and the test remains balanced.[9]
The sum of these twelve items had a mean interitem correlation of .414 in the combined English-speaking sample (N=4,000), producing an alpha of .895. They correlated .665 with scores on the 10-item RWA Scale.[10]
Table 3
A Twelve-Item Racial Prejudice Scale
- There are entirely too many people from the wrong places getting into our country now.
- We should tear down the walls that keep people from different cultures away from us, rather than build new ones. *
- Racial minorities have had it good for years in our country because of all the government programs that help them get ahead of white people.
- The more diverse our country becomes, with different people with different religions and heritages from everywhere else in the world, the stronger it will be.*
- Muslims bring a valuable new element to our society and should be welcomed.*
- White people are the major victims of discrimination in our country. The government is on everybody else’s side but theirs.
- Instead of complaining and protesting all the time, colored people like Blacks should be grateful for how good they have it here compared to where they came from.
- It is good to live in a country where there are so many minorities present, such as the groups who came from Asia, Africa, and the Caribbean.*
- Certain races of people clearly do NOT have the natural intelligence and “get up and go” of the white race.
- “Colored people” continue to get less than their fair share of our country’s wealth because of discrimination.*
- Black people are just naturally more violent than white people.
- Most minorities on welfare would rather work, but they can’t get jobs that pay a living wage.*
Footnotes
[1] The mean interitem correlation (rii) on a test and the number of items (n) determine the alpha coefficient of reliability: α = (n × rii) / [(n − 1) × rii + 1].
[2] Balancing a test with equal numbers of protraits and contraits does not prevent yea-saying—or nay-saying. But it keeps these response sets from seriously contaminating the summed scores on the scale, which is vitally important because yea-saying in particular is so prevalent and will easily produce phantom relationships with other measures contaminated with yea-saying.
[3] I did keep an eye on the content of the resulting items to make sure their collection gave voice to the three elements whose covariation has been my definition of right-wing authoritarianism: authoritarian submission, authoritarian aggression, and conventionalism. But this required only one intervention in a process that was otherwise dictated entirely by each item’s connection with the other 19 items on the test.
[4] I was tempted to go with the 12-item version despite this, because I’ve tried to guard against the widespread tendency in the research literature to race to the bottom when it comes to test reliability. But a pleasant sense of destiny arises in the realization that the RWA Scale has now gone from 30 to 20 to 10 items.
[5] The Australian sample posted a mean interitem correlation among these ten items of .326, yielding an alpha of .830. The same figures for the 1000 Canadians were .375 and .857. The notably yea-saying French sample produced the lowest outcomes: .166 and .667. The numbers came in at .284 and .799 for Germany, .240 and .759 for Italy, .248 and .767 for Spain, .304 and .813 for the U.K., and .349 and .844 for the USA. The figures for the combined American and Canadian samples, drawn from the population whose authoritarianism the RWA Scale was intended to measure, were .371 and .855, producing a signal-to-noise ratio of 5.9 to 1. Which would be really crummy on your TV, but tisn’t that bad for a balanced 10-item measure of a deep psychological trait.
[6] In the Australian sample, the 20-item RWA Scale correlated .679 with prejudice, .670 with fundamentalism, and .464 with “Stomp out the Rot.” The 10-item version put up correlations of .679 with prejudice, .690 with fundamentalism, and .424 with stomping out the rot. The same figures, respectively, in the Canadian sample were .690, .711, and .480; .688, .727, and .437. In France, where the RWA Scale had poor reliability, the corresponding numbers were .594, .481, and .359; .615, .504, and .327. The German sample produced the following connections: .576, .636, and .415; .611, .640, and .364. Italians posted .670, .683, and .412; .680, .665, and .397. Their Mediterranean neighbors in Spain produced .666, .660, and .364; .671, .664, and .344. The United Kingdom figures were .724, .620, and .422; .704, .624, and .382. The American sample gave up correlations of .721, .751, and .459 on the 20-item version of the RWA Scale, and .709, .745, and .423 on the 10-item version.
On both the 20-item and 10-item scales the United States sample proved appreciably more authoritarian than anyone else. Who do you think was the least authoritarian sample? (Hint: This country had significant previous experience with dictatorships, and its population was widely thought to be “naturally authoritarian.”)
[7] “A woman’s place” and “Atheists and others” were on the original RWA Scale first used in 1973. For many years I included other original items to see how authoritarianism levels had changed over time. The “Continuing 12” always connected with each other and could predict authoritarian behaviors. But later statements, developed as I gained more insight into authoritarianism, coalesced better and so nearly all of the old-timers were eventually reworded or retired.
[8] Right-wing authoritarians dislike many more groups than those named on the prejudice scale. I have called them “equal opportunity bigots” because they prejudge so many people so negatively—including groups they could not possibly have had any personal contact with. Almost everyone lands in their “Out-Group.” When their ethnocentrism and resulting prejudices are pointed out, however, they accuse the observers of “name-calling” and dividing the country. But no one comes close to them in doing these things.
[9] Protrait items are easier to write, and so one could substitute a local issue for Item 11, which has the lowest item-whole correlation among the protraits with the other eleven statements on the scale. For example, in Canada Item 11 might go something like, “Nobody owes Canada’s First Nations Indians a damn thing. Their lives are a mess simply because they made them that way.”
[10] The same indices by country were: Australia: .406, .892, .653; Canada: .454, .909, .655; France: .304, .841, .574; Germany: .380, .881, .564; Italy: .390, .885, .652; Spain: .306, .843, .638; United Kingdom: .431, .901, .678; USA: .362, .872, .670. France posted the highest score across the twelve prejudice items, and Canada had the lowest. The correlation between scores on the original 20-item RWA Scale and the 24-item Prejudice Scale in the four English-speaking countries was .705.