Link To TurkPrime

How to run successful experiments and get the most out of Amazon's Mechanical Turk

Showing posts with label demographics. Show all posts

Friday, December 1, 2017

Are MTurk workers who they say they are?

The internet has a reputation as a place where people can hide behind anonymity and present themselves as very different from who they actually are. Is this a problem on Mechanical Turk? Is the self-reported information provided by Mechanical Turk workers reliable? These important questions have been addressed with several different methods. Researchers have examined (a) the consistency of responses to the same questions over time and across studies, and (b) the validity of responses, or the degree to which responses reflect the truth about participants. It turns out that there are certain situations in which MTurk workers are likely to lie, but in almost all cases they are who they say they are.


Consistency over time/Reliability:
One way to measure the truthfulness of responses is to examine how individual workers respond to the same questions at different times. One study used data collected from over 80,000 MTurk workers to examine the reliability of reported demographic information over time. This large study found that participants were overwhelmingly consistent when reporting demographic variables across different studies: gender identification was 98.9% consistent, race 98.2% consistent, and birth year 96.2% consistent. The slightly lower figure for birth year was largely due to technical issues rather than untruthful Turkers (Rosenzweig, Robinson, & Litman, 2017).
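As an illustration of what per-worker consistency can mean, the sketch below scores each worker by the share of their responses that match their own most frequent answer. This is only one plausible definition, not necessarily the one used in the cited study, and the worker IDs and responses are made up for the example.

```python
from collections import Counter, defaultdict

def consistency_by_worker(responses):
    """Return {worker_id: (most_common_answer, share_of_responses_matching_it)}.

    `responses` is an iterable of (worker_id, answer) pairs, one pair per
    study in which the worker answered the question. Hypothetical format.
    """
    answers = defaultdict(list)
    for worker_id, answer in responses:
        answers[worker_id].append(answer)

    scores = {}
    for worker_id, given in answers.items():
        modal_answer, modal_count = Counter(given).most_common(1)[0]
        scores[worker_id] = (modal_answer, modal_count / len(given))
    return scores

# Made-up responses to a gender question across three studies.
responses = [
    ("W1", "female"), ("W1", "female"), ("W1", "female"),
    ("W2", "male"), ("W2", "female"), ("W2", "male"),
]
scores = consistency_by_worker(responses)
for worker_id, (answer, score) in scores.items():
    print(worker_id, answer, f"{score:.0%}")
```

A worker who always gives the same answer scores 100%, while a worker who switches answers between studies scores lower.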


Validity
Various forms of validity have been examined in data collected through MTurk, with results showing that the data are by and large valid. We will focus on convergent validity, which refers to how a measure is correlated with other measures of known related constructs. Convergent validity of self-reported information at a group level can be established by examining whether workers are providing logically consistent information. Data collected on TurkPrime show that associations between variables are consistent with what is found in the general population. For example, older Mechanical Turk workers tend to be more religious and more conservative, a pattern that is consistent with the general US population. The reported number of children correlates strongly with age and family status, as do divorce rates. Self-reported time-of-day preference is correlated with the time of day that workers are actually active, which is in turn correlated with a cluster of clinical, personality, and behavioral variables previously reported in the literature in studies of the general population (unpublished data).
Similar consistent patterns have been observed in health information collected from Mechanical Turk workers. TurkPrime profiled over 10,000 Mechanical Turk workers on over 50 questions relating to physical health, and a factor analysis revealed that symptoms clustered around underlying conditions in the expected way. For example, hypertension, high cholesterol, and diabetes formed a single factor. This factor, interpreted as metabolic syndrome, correlated with other variables such as age and gender in the expected way: the rate of metabolic syndrome increased with age and was higher among men. BMI also correlated with self-reported exercise (see also Litman et al., 2015).
Other examples include the finding that rates of chronic illness are significantly higher among smokers than among non-smokers and are strongly associated with BMI, with both higher-than-average and lower-than-average BMI being predictive of chronic illness.


Video tools currently in beta testing at TurkPrime are starting to be used to verify participants’ reported demographic characteristics, such as gender and race, and the presence of a second person for dyadic research. Promising initial results indicate that participants are highly truthful.


When Participants are Likely to Lie
Research has also examined the reliability of data collected when selection criteria were listed as a prerequisite for entering a study (e.g., “only open to males”). The data show that when participants have an incentive to be untruthful, such as when they can take a lucrative study only if they identify as members of a particular demographic group, they lie (Chandler & Paolacci, 2017; Rosenzweig et al., 2017). For example, in a study whose HIT title said it was open “for men only”, 44% of the participants who entered had previously consistently reported their gender as “female”.


Best Practices
When researchers want to selectively recruit participants on MTurk, they have several options. Some researchers recruit a specific demographic group by stating the selection criteria in a study open to all workers and relying on workers to tell the truth when opting in. Based on the data, this is a mistake that will lead untruthful participants to opt in. There are, however, several ways to selectively recruit participants who are who they say they are. One option is to use the MTurk qualifications system with characteristics already verified by MTurk, or to use TurkPrime’s qualification system. Another option is to run a study open to all workers, ask a series of initial demographic questions, and allow only participants who match the desired demographic criteria to proceed to the next round of the study, while paying even those who did not match for their time. You can create worker groups on TurkPrime based on such pre-screenings, which can help you track and subsequently recruit participants who match your criteria of interest.
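The pre-screening option can be sketched as a simple routing step. The code below is a minimal illustration with hypothetical question names, answer values, and worker IDs; it does not use any MTurk or TurkPrime API. The criteria live only in the researcher's code and are never shown to workers, so there is no incentive to misreport, and every worker who answered the screener is paid.

```python
def passes_prescreen(answers, criteria):
    """True if every screening answer falls in the allowed set for that question."""
    return all(answers.get(question) in allowed
               for question, allowed in criteria.items())

def route_participants(prescreen_answers, criteria):
    """Split workers into those who continue and those who stop after the screener."""
    eligible, screened_out = [], []
    for worker_id, answers in prescreen_answers.items():
        if passes_prescreen(answers, criteria):
            eligible.append(worker_id)
        else:
            screened_out.append(worker_id)
    return eligible, screened_out

# Hypothetical criteria: men aged 40 to 59. Workers never see these.
criteria = {"gender": {"male"}, "age_group": {"40-49", "50-59"}}

# Made-up screener responses.
prescreen_answers = {
    "W1": {"gender": "male", "age_group": "40-49"},
    "W2": {"gender": "female", "age_group": "40-49"},
    "W3": {"gender": "male", "age_group": "18-29"},
}

eligible, screened_out = route_participants(prescreen_answers, criteria)

# Everyone who completed the screener is paid, whether or not they matched.
workers_to_pay = eligible + screened_out
```

The `eligible` list is what you would turn into a worker group for the follow-up study.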


References:
Rosenzweig, C., Robinson, J., & Litman, L. (2017, January). Are They Who They Say They Are?: Reliability and Validity of Web-Based Participants’ Self-Reported Demographic Information. Poster presented at the 18th Society for Personality and Social Psychology Annual Convention, San Antonio, TX.

Monday, November 20, 2017

Strengths and Limitations of Mechanical Turk


Hundreds of academic papers are published each year using data collected through Mechanical Turk. Researchers have gravitated to Mechanical Turk primarily because it provides high-quality data quickly and affordably. While Mechanical Turk has revolutionized data collection, it is by no means a perfect platform. Some of its major strengths and limitations are summarized below.
Strengths
A source of quick and affordable data
Thousands of participants are looking for tasks on Mechanical Turk throughout the day, and can take your task with the click of a button. You can run a 10 minute survey with 100 participants for $1 each, and have all your data within the hour.
Data are reliable
Researchers have examined data quality on MTurk and have found that, by and large, the data are reliable, with participants performing on tasks in ways similar to more traditional samples. MTurk has a useful reputation mechanism, in which researchers can approve or reject the work of each worker on a given study. The reputation of each worker is based on the number of times their work was approved or rejected. Many researchers follow the standard practice of using data only from workers who have an approval rating of at least 95%, thereby further ensuring high-quality data collection.
Participant pool is more representative compared to traditional subject pools
Traditional subject pools used in social science research are often samples that are convenient for researchers to obtain, such as undergraduates at a local university. Mechanical Turk has been shown to be more diverse, with participants who are closer to the U.S. population in terms of gender, age, race, education, and employment.
Limitations
There are two kinds of potential limitations on MTurk: technical limitations and more fundamental limitations of the platform. Many of the technical limitations of MTurk have been resolved through scripts written by researchers or through platforms such as TurkPrime, which help researchers do things they were not previously able to do on MTurk, including:
  • Exclude participants from a study based on participation in a previous study
  • Conduct longitudinal research
  • Make sure larger studies do not stall out after the first 500 to 1000 Workers
  • Communicate with many Workers at a time
There are, however, several more fundamental limitations to data collection on MTurk:
Small population
There are about 100,000 Mechanical Turk workers who participate in academic studies each year. In any one month, about 25,000 unique Mechanical Turk workers participate in online studies. These 25,000 workers complete close to 600,000 assignments per month, and the more active workers complete hundreds of studies each month. The natural consequence of a small worker population is that participants are continuously recycled across research labs. This creates a problem of ‘non-naivete’: most participants on Mechanical Turk have been exposed to common experimental manipulations, and this can affect their performance. Although the effects of this exposure have not been fully examined, recent research indicates that it may be reducing the effect sizes of experimental manipulations, compromising both data quality and the effectiveness of those manipulations.

Diversity

Although Mechanical Turk workers are significantly more diverse than the undergraduate subject pool, the Mechanical Turk population is significantly less diverse than the general US population. MTurk workers are significantly less politically diverse, more highly educated, younger, and less religious than the US population. This makes it harder to generalize findings from MTurk samples to the population as a whole.

Limited selective recruitment

Mechanical Turk has basic mechanisms to selectively recruit workers who have already been profiled. To accomplish this, Mechanical Turk runs profiling HITs that are continuously available to workers. However, Mechanical Turk is structured in such a way that it is much more difficult to recruit people based on characteristics that have not been profiled. For this reason, while rudimentary selective recruitment mechanisms exist, there are significant limitations on the ability to recruit specific segments of workers.


Solutions
TurkPrime offers researchers more specific selective recruitment options and has features in development to help researchers target participants who are less active and therefore more naive to common experimental manipulations and survey measures. TurkPrime also offers access to PrimePanels, which provides over 10 million participants who are more diverse and can be selectively recruited.


References:


Peer, E., Vosgerau, J., & Acquisti, A. (2014). Reputation as a sufficient condition for data quality on Amazon Mechanical Turk. Behavior Research Methods, 46(4), 1023-1031.

Thursday, September 8, 2016

MTurk Panels on Your Own Requester Account

Studies with Panels for just $0.15 - $0.75 / complete

Now you can run Mechanical Turk studies using your own Requester account and specify over two dozen demographic traits! The traits include gender, ethnicity, age, marital status, and sexual orientation. But it does not stop there! The available options also include occupation, medical and health history, cell phone use, and much more.


The cost ranges from $0.15 to $0.75 per completed assignment. For example, if you run a study with a panel of 100 White males aged 40 and under at a cost of $0.42 per complete, the TurkPrime Panel Fee is $0.42 * 100 = $42.00. The panel fee is determined by the incidence rate of your particular panel, so harder-to-reach demographics cost more, but the fee is capped at a maximum of $0.75 per complete.
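The fee arithmetic in the example can be reproduced with a few lines of Python. This is a sketch only: the per-complete rate is whatever TurkPrime quotes for your panel, the function name is hypothetical, and the result covers the panel fee alone, not worker payments or other charges.

```python
from decimal import Decimal

MAX_FEE_PER_COMPLETE = Decimal("0.75")  # cap stated in the post

def panel_fee(fee_per_complete, completes):
    """Total panel fee: per-complete rate (capped at $0.75) times completed assignments."""
    rate = min(Decimal(str(fee_per_complete)), MAX_FEE_PER_COMPLETE)
    return rate * completes

# The example from the text: 100 completes at $0.42 each.
print(panel_fee("0.42", 100))  # 42.00
```

Using `Decimal` rather than floats keeps the currency arithmetic exact.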

In addition, TurkPrime displays the feasibility of running the study to completion. This is not a guarantee that workers will take your study, since MTurk workers ultimately decide whether to accept and complete your study based on many factors, including worker payment, requester rating on TurkOpticon, and the clarity of your study.

If you want to be certain your study will run to completion, we recommend using one of the TurkPrime Lab Services, either Prime Panels or MTurk Panels, where TurkPrime manages all user interaction, reaches out to workers to complete your study, and guarantees that the study will run to completion.

Studies with MTurk Panels will display the panel traits in the study dashboard along with a tag marking it as a panel study.


Tuesday, April 12, 2016

Demographic Consistency

Demographic Consistency Over Time on Mechanical Turk

Problem:

Ever wonder whether workers are being honest with you when they answer a survey? Or, if you specify that your study should be taken only by women, whether some workers take the study even though they are not women?



Solution:


Now, all studies that have the Demographic Question feature enabled will receive 2 additional columns in the CSV download for each Worker:
  1. gender
  2. gender consistency score
The gender column specifies the worker's gender based on the worker's responses to that question in previous surveys. Each worker who completes a study may have taken TurkPrime studies many times before. In each study where the Demographic Question is enabled, TurkPrime asks Workers to optionally share some demographic information, such as gender, so they can qualify for HITs that require those demographics. (Through the TurkPrime Panels Lab Service, researchers run many studies available only to workers who satisfy specific demographic requirements.) It is the responses to these questions that are shared with you, the TurkPrime users.

The gender consistency score specifies how consistently the worker gave the answer shown in the gender column. For example, if the gender column reads female and the GenderConsistency value is 66%, the MTurk Worker reported being female in only 66% of their responses. If the field is blank, TurkPrime has not received a response from that Worker when asked to identify their gender.

A score below 100% may indicate an innocuous error on the Worker's part, multiple workers sharing an MTurk account, or negligence on the part of the worker. In any event, that information is now shared with TurkPrime users.
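One way to use these two columns is to flag rows for review before analysis. The sketch below is illustrative only: the column headers (`WorkerId`, `gender`, `GenderConsistency`), the percent-string format, and the 90% threshold are assumptions, so check them against your actual CSV download and pick a cutoff that suits your study.

```python
import csv
import io

# Made-up rows standing in for a CSV download; headers and format are assumed.
SAMPLE_CSV = """WorkerId,gender,GenderConsistency
A1,female,100%
A2,female,66%
A3,,
"""

def parse_score(raw):
    """Convert a value such as '66%' to 66.0; return None for a blank field."""
    raw = raw.strip().rstrip("%")
    return float(raw) if raw else None

def split_by_consistency(csv_text, required_gender, min_score=90.0):
    """Return (keep, review) lists of worker IDs for a study restricted by gender."""
    keep, review = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        score = parse_score(row["GenderConsistency"])
        matches = row["gender"] == required_gender
        if matches and score is not None and score >= min_score:
            keep.append(row["WorkerId"])
        else:
            # Mismatched gender, low consistency, or no prior response on record.
            review.append(row["WorkerId"])
    return keep, review

keep, review = split_by_consistency(SAMPLE_CSV, required_gender="female")
print("keep:", keep)
print("review:", review)
```

Workers in the review list are not necessarily dishonest; as noted above, a low or missing score can have innocuous causes.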

Wednesday, May 13, 2015

Maximizing HIT Participation


Problem:

How can you increase Worker participation rates on Amazon Mechanical Turk HITs and speed up the completion of a HIT? This is particularly an issue with HITs that require a large number of participants or that have Qualifications limiting the number of qualified Workers.

Solution:

By monitoring the participation rates of hundreds of HITs we have observed the following patterns that increase participation significantly:

Thursday, March 12, 2015

The New New Demographics on Mechanical Turk: Is there Still a Gender Gap?



Overview

Seventy-five Mechanical Turk studies conducted with US-based Workers in 2013 and 2014 were reviewed. From a total of 32,595 Workers, 15,324 (47%) were female.
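The reported share can be checked directly from the two counts. The sketch below also adds a rough normal-approximation 95% interval, which treats the Workers as a simple random sample; that is an assumption made for illustration and is not part of the original report.

```python
import math

def share_with_interval(count, total, z=1.96):
    """Proportion plus a normal-approximation interval (z=1.96 for about 95%)."""
    share = count / total
    standard_error = math.sqrt(share * (1 - share) / total)
    return share, share - z * standard_error, share + z * standard_error

# Counts reported in the overview.
share, low, high = share_with_interval(15324, 32595)
print(f"{share:.1%} female (approx. 95% interval {low:.1%} to {high:.1%})")
```

With a sample this large the interval is narrow, so the rounded figure of 47% is stable.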

Background

It’s been a while since the last update on the demographics of Mechanical Turk Workers, so we thought it was time for a new look. The current consensus seems to be that MTurk Workers are primarily female. For example, Panos Ipeirotis' blog reports that US-based Workers are 65% female. MTurk is always changing, and this report presents data from 75 studies conducted over the last two years.

Friday, February 20, 2015

Why use TurkPrime Panels?



MTurk Requesters are often interested in studying specific groups of people. For example, a researcher may be interested in men over 40, Republicans, people who are concerned about the cleanliness of sponges, or cancer survivors. TurkPrime Panels uses various techniques that make the process of acquiring specific MTurk samples faster and cheaper. We can virtually guarantee that we will obtain panels more economically than most Requesters can on their own. Additionally, we can get the panels a lot faster and eliminate the considerable amount of manual work required to obtain them.