In the last issue of The Tattler, many readers were concerned (if not surprised) by the alarming statistics on the state of students’ mental health at IHS. No one would dare challenge the central notion of the article: mental health is a dire issue that our communities should address. However, the reality of that situation is not the subject of this article. The techniques used to gather the data behind such reports are just as worthy of our collective critique, if not more so in the grand scheme of things.
AP Statistics students know that the methods used to gather a set of data are crucial to the reliability of the final results. If a sample is not representative of the population, then the collected data is nearly useless for any serious analysis. After all, the purpose of conducting such studies is to identify trends in a sample that can be generalized to an entire population. Therefore, eliminating any factors that may make the sample less representative of the population is an indispensable part of the data collection process. Factors that systematically skew the results of a survey towards a particular response are known as “bias.” For example, convenience bias occurs when one selects a sample out of sheer convenience for oneself, such as a salesman asking the ten people nearest to him for survey responses. Results collected through such means are, as one might guess, not representative of the population as a whole. What if the salesman were at a concert, where every individual in attendance has similar musical tastes? What if he happened to be near his family, who all share a history and experiences? There are countless ways in which poor sampling techniques can yield results that are unusable in any significant sense.
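For the statistically curious, the salesman's predicament can be sketched in a few lines of Python. This is only an illustration under invented assumptions: the town of 1,000 people, the ratings of 9 and 3, and the function name compare_samples are all hypothetical and do not come from any real survey.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def compare_samples(seed=1):
    """Compare a convenience sample with a simple random sample."""
    rng = random.Random(seed)

    # Hypothetical town (invented numbers): 100 concertgoers rate the
    # band 9 out of 10, while the other 900 residents rate it 3.
    concertgoers = [9] * 100
    everyone_else = [3] * 900
    population = concertgoers + everyone_else

    # Convenience sample: the salesman is standing at the concert, so
    # the ten people nearest to him are all concertgoers.
    convenience = concertgoers[:10]

    # Simple random sample: ten people drawn from the whole town,
    # each resident equally likely to be chosen.
    random_sample = rng.sample(population, 10)

    return mean(population), mean(convenience), mean(random_sample)


population_mean, convenience_mean, random_mean = compare_samples()
print(f"Whole town:         {population_mean:.1f}")
print(f"Convenience sample: {convenience_mean:.1f}")
print(f"Random sample:      {random_mean:.1f}")
```

In this made-up town, the convenience sample reports an average rating of 9.0 no matter how many times the survey is repeated, while the town as a whole averages 3.6. The random sample will vary from draw to draw, but it at least has a chance of including the people who stayed home.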
In the case of the mental health article, as with many surveys conducted via email, non-response bias is the main culprit. Non-response bias occurs when subjects have the option of simply not responding, and those who do respond differ systematically from those who do not. Call-in surveys are an excellent example of how voluntary participation can yield less-than-informative results, as the most passionate viewers are inherently more motivated to respond to questions pertaining to their topics of interest. One look at one of these surveys would lead an unsuspecting viewer to think that America is even more polarized than it actually is, as those with the loudest voices are the only ones inclined to speak out.
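The same effect can be simulated. The sketch below is a rough illustration, not a model of the IHS survey: the population of 10,000, the 0-to-10 "concern" score, and the assumption that people with higher scores are more likely to answer an email are all hypothetical choices made for the example.

```python
import random


def simulate_survey(population_size=10_000, seed=0):
    """Simulate an optional survey where response depends on the answer."""
    rng = random.Random(seed)

    # Hypothetical population: each person has a "concern" score
    # spread evenly between 0 and 10.
    population = [rng.uniform(0, 10) for _ in range(population_size)]

    # Assumption: the chance of replying rises with the score, from
    # 5% for a score of 0 up to 45% for a score of 10.
    responders = [
        score for score in population
        if rng.random() < 0.05 + 0.04 * score
    ]

    true_mean = sum(population) / len(population)
    responder_mean = sum(responders) / len(responders)
    return true_mean, responder_mean, len(responders)


true_mean, responder_mean, n_responders = simulate_survey()
print(f"Average score, everyone:        {true_mean:.2f}")
print(f"Average score, responders only: {responder_mean:.2f}")
print(f"Number of responders:           {n_responders}")
```

Under these assumptions, only a minority of the population replies, and the average among responders lands noticeably above the average for everyone. Nothing about any individual answer is dishonest; the distortion comes entirely from who chose to answer.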
None of this is to detract from the genuinely altruistic motives of those who happen to rely on faulty statistical techniques. Outside of a formal statistics course, few people get the chance to encounter the nuances and pitfalls of data collection. However, the role of statistics in our everyday lives grows by the day, as Big Data is one of the fastest-growing and most controversial industries of the twenty-first century. Thus, equipping oneself with a thorough knowledge of statistics is, and will continue to be, an invaluable part of life in this Brave New World.