Data Quality
Potentially significant sources of bias
Biases that may influence Ohpynez's data are listed below. The list is not comprehensive, and it does not explain each bias type in full detail.
Awareness and accessibility
The opinions of individuals who are unaware of this website, or unable to access it, are not represented.
Self-selection
Ohpynez does not prompt, promote or otherwise suggest questions to individual users. Individuals must actively seek out each question and those who respond to newly created questions may need to revisit the question when more answers are available to review. The act of choosing to use this website is also a form of self-selection bias.
Leading questions
For accurate and representative data, it is important that the question being asked does not attempt to skew respondents' answers toward a desired outcome. Ohpynez does not actively edit, censor or deprecate poorly written questions. You are therefore strongly encouraged to review the question carefully before viewing its data.
Multiple responses
It is conceivable that some individuals may create multiple accounts in an attempt to bias results. Most online polls can be similarly affected by individuals looking to influence the results, and this is a potential risk of using this medium.
Oversampling and undersampling
The number of respondents in certain demographic or categorical groups in the sample may not reflect their true proportion of the population. Statistical methods exist to alleviate these problems, but because Ohpynez does not collect demographic data, they cannot be applied.
Self-reporting
It is possible for participants who desire a specific result to achieve it by adjusting how they express their opinion and/or respond to others' opinions. Since Ohpynez is anonymous, the effect of this bias should be minimal.
Other factors influencing data quality
Listed below are a few issues specific to Ohpynez's data that you should be aware of.
Answer order
While answers are selected at random for each interaction, the number of answers available for review affects the probability that a given answer is reviewed: the more answers there are, the lower the probability for each one. Essentially this means that the earlier an answer is submitted, the more interactions it is likely to have accumulated. The effect is negligible and disappears once all answers have been reviewed by all participants.
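The effect described above can be illustrated with a small calculation. This is only a sketch under a hypothetical model, not a description of how Ohpynez schedules reviews: it assumes answers arrive one at a time and that exactly one review, chosen uniformly at random among the answers available so far, takes place after each arrival. The function name expected_reviews is made up for this example.

```python
from fractions import Fraction


def expected_reviews(num_answers: int) -> list[Fraction]:
    """Expected review count per answer, in order of submission.

    Hypothetical model: answers arrive one at a time and, after each
    arrival, exactly one review picks uniformly among the answers
    available so far.
    """
    expected = [Fraction(0)] * num_answers
    for available in range(1, num_answers + 1):
        # Each of the `available` answers is picked with probability 1/available.
        for i in range(available):
            expected[i] += Fraction(1, available)
    return expected


for position, value in enumerate(expected_reviews(5), start=1):
    print(f"answer {position}: {float(value):.3f} expected reviews")
```

Under these assumptions the first answer submitted has the highest expected review count and the last has the lowest, even though every individual selection is uniform.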
Interaction counts
The number of interactions may vary drastically between participants. The interaction count indicates the number of opportunities an opinion has had to move. Since the initial value of all opinions is zero, there will be a cluster of data points near zero consisting of users with very low interaction counts. Requiring a minimum number of interactions will help alleviate this issue.
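A minimum-interaction filter of the kind suggested above might look like the following sketch. The Opinion record, its field names, and the threshold of 10 are assumptions made for illustration; they do not reflect Ohpynez's actual data format or any recommended cutoff.

```python
from dataclasses import dataclass


@dataclass
class Opinion:
    # Hypothetical record layout; field names are illustrative only.
    participant: str
    value: float
    interactions: int


def with_minimum_interactions(opinions, minimum):
    """Keep only opinions that have had at least `minimum` interactions."""
    return [o for o in opinions if o.interactions >= minimum]


sample = [
    Opinion("a", 0.0, 0),
    Opinion("b", 0.1, 1),
    Opinion("c", -1.4, 12),
    Opinion("d", 2.3, 30),
]
kept = with_minimum_interactions(sample, minimum=10)
print([o.participant for o in kept])
```

Opinions that have barely moved from the initial value of zero are excluded, which removes the cluster near zero at the cost of a smaller sample.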
Interaction order
Each interaction between individuals changes the values attributed to their opinions. The amount of change depends on each opinion's value prior to the interaction. As a result, changing only the order in which a single participant's interactions occur will change the numerical value of their opinion. An opinion's value is expected to self-correct with additional interactions, although the most recent interactions may have a more pronounced impact. Minor discrepancies are usually unimportant because the data is ordinal: what matters is the relative ordering of opinions, not their exact values.
While it is possible to determine a range of values for each opinion by modeling every possible order in which the interactions could have occurred, it is impractical to do so. It is equally unreasonable to attempt to control for the actions of multiple participants (e.g. when and how they interact with others' opinions).
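To make the order dependence and the cost of enumerating orders concrete, here is a minimal sketch. The update rule is hypothetical and is not Ohpynez's actual formula: it assumes each interaction has an outcome of +1 or -1 and moves the opinion a fixed fraction (RATE) of the way toward that outcome, which is enough to make the change depend on the prior value.

```python
from itertools import permutations
from math import factorial

RATE = 0.5  # hypothetical step size, not an Ohpynez parameter


def apply_interactions(outcomes, start=0.0):
    """Fold a sequence of outcomes (+1 or -1) into a single opinion value.

    Hypothetical rule: each interaction moves the value a fixed fraction
    of the way toward the outcome, so the change depends on the prior value.
    """
    value = start
    for outcome in outcomes:
        value += RATE * (outcome - value)
    return value


# The same two interactions in a different order give different values.
print(apply_interactions([1, -1]), apply_interactions([-1, 1]))

# Enumerating every order is feasible for 5 interactions (120 orders)...
outcomes = [1, 1, -1, 1, -1]
finals = {apply_interactions(order) for order in permutations(outcomes)}
print(f"range over all orders: {min(finals):.4f} to {max(finals):.4f}")

# ...but the number of orders grows factorially with the interaction count.
print(f"orders to check for 20 interactions: {factorial(20):,}")
```

Even in this toy model, a single participant with 20 interactions has more than two quintillion possible orderings, before accounting for the other participants whose opinions change at the same time.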