Beginning or amateur constructors of high-range tests for mental abilities, as well as a disturbing number of more experienced psychometricians, are characterized by a collection of persistent errors that result from giving in to otherwise understandable temptations and human weaknesses. A list of these aberrations, with a brief comment on each, follows. More detailed explanations of why these behaviours are bad can be found in, for instance, Recommendations for conducting high-range intelligence tests.
Why allowing retests is bad is explained excellently in the aforementioned article. The temptations toward allowing retests are generally the following:
Each one of these apparent benefits is blown away by the disadvantages of retesting.
Homogeneous tests tend to have less generality than heterogeneous tests, and with regard to I.Q. tests that implies lower validity and more room for one's score to be off in either direction, compared to one's true g level. Homogeneous tests are also less robust and more vulnerable to score inflation. They are the prime target of cheaters, answer publishers, answer sellers, and other such scum. Unethical people tend to have uneven aptitude profiles (which does not logically imply that people with uneven aptitude profiles are necessarily unethical, just as "a cow is an animal but not all animals are cows"), and therefore choose one-sided tests to cheat on; they lack the talent for tackling a heterogeneous test. Temptations toward offering homogeneous tests are:
Some candidates tend to ask questions about the tests, their contents, even about particular test items. Answering those to the candidate's satisfaction will always put that candidate at an advantage over other candidates, and must therefore never be done. The temptation toward answering lies in "empathy", in wanting to "help", and in not wanting to be seen as a "Dutch uncle", as a stiff, dull, strict, unresponsive person. Almost no one is unempathic enough to be strict in these matters, as is betrayed so mercilessly by the questions and remarks I myself often receive from candidates, whose words all too often reveal that other test designers give them the information they desire about particular test items, allow them retests, et cetera. The curse of "empathy", and the tremendous fear of being seen as unsympathetic or pedantic, render very few suitable as testers.
The reasons for not doing this, and the correct way to treat the phenomenon, are explained elsewhere. The temptation toward this habit comes from four sources: from wanting to respect other people's different ways of looking at a problem but lacking the reasoning ability to see that certain answers are logically wrong; from not recognizing that the alternative answer is a pareidolic or apophenic delusion; from being afraid to judge what is right and wrong; and from personal sympathies for particular persons, including oneself. As to the second source, such an answer is like the patterns one may see in random noise, clouds, and so on. Candidates may be very confident that such a pattern is the intended solution, and thus, with the best possible intentions, trick the scorer into accepting it as a valid answer; but in reality, that pattern exists only in the paranoid-critical mind of the candidate. This is also why it is impossible to create tests that do not have pareidolic, apophenic "alternative answers"; those are not a property of the test but only of the candidate's mind. As to the last source: yes, some test scorers partly or wholly take the tests they themselves score, score the tests they themselves take, and decide which of their own alternative answers are counted right and even to which norms their scores correspond; in fact this behaviour is not atypical for claimants of "the world's highest I.Q.".
Beginning test constructors in particular tend to be shocked or worried when they see candidates score disappointingly lower than they (the candidates) expect, and sometimes want to soothe the disbelief and frustration of the candidate by explaining what the intended answers are, so that the candidate will recognize them as better than the given answers and thus be able to accept that the latter were wrong. Why this is a mistake is explained in the aforementioned article Recommendations for conducting high-range intelligence tests; the key phrase is "motivation for secrecy".
When a test item is moved to a different environment - another test with other, fewer or more problems - its statistical behaviour changes. For instance, items that were originally accompanied by similar ones (possibly of varying difficulty) will become harder when some of those accompanying items are no longer there; this is probably because those other items served as "examples" in some way, or formed a gentle slope.
A typical situation is one in which a new, shorter test is created by selecting items from an earlier, larger test. In the new test, the attention of the candidate will be less dispersed because of the lower number of items, and the items will on the whole tend to become easier, although a few may become harder. Using data from the earlier test to arrive at norms for the new test may therefore result in norms that are too high. This is ironic, as the new test was of course created to achieve greater psychometric soundness, for instance by selecting items by their individual statistical properties, or by leaving out items to which the answers have leaked out. It follows that the scores of new candidates taking the shorter test are not comparable with the (hypothetical) scores of candidates who took the original test and whose scores were converted to scores on the new test (which was a subset of the longer test).
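The effect on norms can be made concrete with a small worked example. The sketch below is purely illustrative: the reference scores, the five-item subset, and the assumption that a candidate solves exactly one more item in the shorter test are all hypothetical figures chosen for the demonstration, not measured data. It shows how a norm derived from the longer test's data overstates the standing of a candidate who takes the items in their new, easier environment.

```python
from bisect import bisect_left

# HYPOTHETICAL data: scores (0-5) that ten candidates of the original, longer
# test obtained on the five items later reused in the shorter test.
OLD_SUBSET_SCORES = [0, 1, 1, 2, 2, 2, 3, 3, 4, 5]

# ASSUMPTION for illustration only: in the shorter test the same items are on
# the whole easier, so a candidate solves one item more than in the long test.
ASSUMED_SHIFT = 1
MAX_SCORE = 5


def percentile_from_old_data(raw_score, reference=OLD_SUBSET_SCORES):
    """Percentage of reference candidates scoring strictly below raw_score."""
    ordered = sorted(reference)
    return 100.0 * bisect_left(ordered, raw_score) / len(ordered)


def short_test_score(long_context_score):
    """Score the same candidate would obtain in the shorter test (assumed)."""
    return min(long_context_score + ASSUMED_SHIFT, MAX_SCORE)


# A candidate who would have solved 3 of the five items inside the long test.
standing_in_long_test = percentile_from_old_data(3)
# The same candidate takes the short test and is normed with the old data.
reported_standing = percentile_from_old_data(short_test_score(3))

print(f"standing on the old data:       {standing_in_long_test:.0f}")
print(f"standing reported by old norms: {reported_standing:.0f}")
```

Under these assumed figures the same candidate is placed at 60 when the items are embedded in the longer test and at 80 when the shorter test is normed with the old data; the size of the gap depends entirely on the assumed shift and carries no empirical weight.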
Removing items whose answers have leaked out or been published makes it all too easy for evil persons to destroy the work of a test creator. It is equivalent to a shop owner handing over the money by default to any robber, without resistance. It is therefore an extremely bad approach that inevitably leads to the end of that test creator's career. It is giving in to terror.
It must, on the other hand, be stressed that in those cases it is not the test items that have been "compromised", as some horribly say, but the culprits: the answer publishers, the answer leakers, the cheaters have been compromised, by themselves, and for good. For good work can never be "compromised" by worthless piles of faeces that possess no ability of their own and therefore seek fulfilment in life through vandalizing the work of others.
The only right approach is to track them down, keep records of who they are and what they have done, and call them to account, no matter how long it takes. They must be thoroughly aware that never in their lives will they be safe again until they have redeemed their shameful debt, and that with every second the answers leaked by them remain published, their inescapable suffering when caught, no matter where or when, grows exponentially, and the whip will come down harder and more often.