
Effective Mechanical Turk: The TurkPrime Blog

How to run successful experiments and get the most out of Amazon's Mechanical Turk

Wednesday, August 15, 2018

TurkPrime Tools to Help Combat Responses from Suspicious Geolocations

Last week, the research community was struck with concern that “bots” were contaminating data collection on Amazon’s Mechanical Turk (MTurk). We wrote about the issue and conducted our own preliminary investigation into the problem using the TurkPrime database. In this blog, we introduce two new tools TurkPrime is launching to help researchers combat suspicious activity on MTurk and reiterate some of the important takeaways from this conversation so far.

TurkPrime’s Tools to Deal with Suspicious Activity

As we announced last week, we’ve created two new tools to help researchers fight fraud in their data collection:

1. Block Suspicious Geolocations
2. Block Duplicate Geolocations

The Block Suspicious Geolocations tool is a Free Feature that allows researchers to block submissions from a list of suspicious geolocations. In our investigation last week, we identified several geolocations that were responsible for a majority of duplicate submissions. Our Block Suspicious Geolocations tool will prevent any MTurk Worker from submitting a HIT from these locations. As mentioned in last week's blog, once we removed these locations from our analyses, the rate of duplicate submissions from the same geolocation across studies this summer fell to 1.7%, a number well within the range of what we've identified as normal across the life of our platform. The screenshot below shows our new Block Suspicious Geolocations tool, found in Tab 6 "Worker Requirements" when you design a study.
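TurkPrime applies this block automatically during data collection, so no setup is needed beyond checking the option. For researchers who want to run a comparable screen over data they have already collected, a minimal post-hoc sketch might look like the following (the column names and the blocklist coordinates are hypothetical; IP-based geolocation is coarse, so coordinates are rounded before matching):

```python
import pandas as pd

# Hypothetical blocklist of suspicious geolocations as (lat, lon) pairs,
# rounded to three decimal places to mimic the coarse precision of
# IP-based geolocation. The coordinates shown are illustrative
# (approximately Niagara Square, Buffalo, NY).
SUSPICIOUS_GEOLOCATIONS = {
    (42.886, -78.878),
}

def drop_suspicious(df: pd.DataFrame) -> pd.DataFrame:
    """Remove submissions whose rounded geolocation is on the blocklist."""
    coords = pd.Series(
        list(zip(df["latitude"].round(3), df["longitude"].round(3))),
        index=df.index,
    )
    return df[~coords.isin(SUSPICIOUS_GEOLOCATIONS)]
```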

Our second tool, the Block Duplicate Geolocations tool, is a Pro Feature that allows researchers to block multiple submissions from any geolocation. The Block Duplicate Geolocations tool casts a much wider net than the Block Suspicious Geolocations tool and should ensure that responses collected in any one survey come from a more distributed set of locations. By restricting the number of submissions from each geolocation, researchers can be more confident that the responses they collect are coming from unique participants. When using this tool, data collection may be a little slower, especially if the target sample is concentrated in a small geographic area (e.g., one particular state). The screenshot below shows our new Block Duplicate Geolocations Tool, found in Tab 8 "Pro Features" when you design a study.
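The Pro Feature enforces this limit while the study runs; as a rough post-hoc equivalent, the same idea can be sketched as keeping only the first N submissions per rounded geolocation (again, column names are hypothetical):

```python
import pandas as pd

def cap_per_geolocation(df: pd.DataFrame, max_per_location: int = 1) -> pd.DataFrame:
    """Keep at most `max_per_location` submissions per rounded geolocation."""
    # Build a grouping key from rounded coordinates, e.g. "42.886,-78.878".
    key = (
        df["latitude"].round(3).astype(str)
        + ","
        + df["longitude"].round(3).astype(str)
    )
    # cumcount() numbers each row within its geolocation group (0, 1, 2, ...),
    # so the mask keeps only the first `max_per_location` rows per location.
    return df[df.groupby(key).cumcount() < max_per_location]
```

Setting `max_per_location` to 1 mirrors the strictest setting, at the cost of occasionally dropping legitimate workers who share an internet service provider.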


Moving Forward

Understanding what has caused the recent increase in low quality responses on MTurk and the corresponding increase in submissions from the same geolocation is a matter of ongoing research. As we learn more details we will share them with the research community and continue to develop tools that ensure the highest quality of research data.

More immediately, we have identified a list of worker IDs that have repeatedly been associated with suspicious geolocations. In addition to the tools described above, we will create an internal exclusion list based on the worker IDs of suspicious accounts over the next several days. This exclusion list will create an additional layer of protection on our system by blocking worker accounts that have a high likelihood of being involved in fraud. We will write another blog to provide more detail about this issue in the coming days. In the meantime, however, researchers already have two powerful tools for eliminating fraud in their data collection. These tools should increase researchers’ confidence that they are obtaining genuine responses from unique workers.


Friday, August 10, 2018

Concerns about Bots on Mechanical Turk: Problems and Solutions


Data quality on online platforms
When researchers collect data online, it’s natural to be concerned about data quality. Participants aren’t in the lab, so researchers can’t see who is taking their survey, what those participants are doing while answering questions, or whether participants are who they say they are. Not knowing is unsettling.

Recently, the research community has been consumed with concern that workers on Amazon's Mechanical Turk (MTurk) are cheating requesters by faking their location or using "bots" to submit surveys. These concerns originated with, and have been driven by, reports from researchers of more nonsensical and low-quality responses in recent studies conducted on MTurk. In these studies, researchers have noticed that several low-quality responses are pinned to the same geolocation. In this blog, we'd like to add some context to the conversation, share the findings from our internal inquiry, and inform researchers what TurkPrime is doing to address the issue.

Concern about Bots
The recent concern about bots appears to have begun on Tuesday, August 7th, 2018, when a researcher asked the PsychMap Facebook group if anyone had experienced an increase in low-quality data. In just the third response to that thread, another researcher suggested, "maybe a machine?" Soon, other researchers were reporting an increase in nonsense responses and low-quality data, although at least a few reported no increase in junk responses to their studies. The primary piece of evidence causing researchers to suspect bots was that most of the low-quality responses were tagged to the same geolocations, a few places in particular: Niagara Square in Buffalo, NY; a lake in Kansas; and a forest in Venezuela. What's more, many respondents from these geolocations provided suspicious responses to open-ended questions, often answering with "GOOD STUDY" or "NICE."

Although this activity raises concerns, the conversation, so far, has overlooked some important points. Most critically, while it is clear some researchers have unfortunately collected several bad responses, the research community does not yet know how widespread this problem is. Diagnosing the issue requires knowing how many studies don’t fit the pattern, as well as how many do.

Scope of problem
At TurkPrime, we track the geolocation of all surveys submitted in studies run on our platform. In the last 24 hours, we have worked to determine whether there is a growing problem of multiple submissions from the same geolocation. In reviewing over 100,000 studies that have been launched on TurkPrime, we see that the rate of submissions from duplicate geolocations typically ranged from less than 1% to 2.5% within a study, a number that could be explained by people submitting surveys from the same building, office, internet service provider, or even the same city. Geolocations are not precise, an issue we will discuss in more detail in a future blog post.

Based on this analysis, we set 2.5% as the threshold for detecting suspicious activity. Over 97% of studies have not reached this threshold, showing that the overwhelming majority of studies have not been affected by data coming from the same geolocation.
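For readers who want to apply the same check to their own data, here is a minimal sketch of the computation described above (the column names and the per-study data structure are hypothetical, and TurkPrime's internal implementation may differ):

```python
import pandas as pd

DUPLICATE_RATE_THRESHOLD = 0.025  # the 2.5% cutoff described above

def duplicate_rate(study: pd.DataFrame) -> float:
    """Fraction of a study's submissions that share a rounded geolocation
    with at least one other submission in the same study."""
    key = (
        study["latitude"].round(3).astype(str)
        + ","
        + study["longitude"].round(3).astype(str)
    )
    # duplicated(keep=False) marks every member of a repeated geolocation,
    # so the mean is the share of submissions from duplicate geolocations.
    return key.duplicated(keep=False).mean()

def flag_suspicious_studies(studies: dict) -> list:
    """Return the IDs of studies whose duplicate rate exceeds the cutoff.

    `studies` maps study_id -> DataFrame of that study's submissions.
    """
    return [sid for sid, df in studies.items()
            if duplicate_rate(df) > DUPLICATE_RATE_THRESHOLD]
```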

However, when we look at the rate of duplicate submissions based on geolocation over time, we see that in March of this year the percentage of duplicate submissions began edging up. Clearly, this is a problem, but a problem that has emerged only recently.

What TurkPrime is Doing
At TurkPrime, we are developing tools that will help researchers combat suspicious activity. We have identified that all suspicious activity is coming from a relatively small number of sources. We have additionally confirmed that blocking those sources completely eliminates the problem. In fact, once the suspicious locations were removed, we saw that the rate of duplicate submissions had actually dropped over the summer, to just 1.7% in July 2018.

What you can do to eliminate the problem
In the coming days, we will launch a Free feature that allows researchers to block workers from suspicious geolocations, excluding submissions from those locations in their data collection. We will also launch a Pro feature that allows researchers to block multiple submissions from the same geolocation within a study. This feature will cast a wider net and may block well-intentioned workers using the same internet service provider or working in the same library. This tool will give researchers greater confidence that they are not receiving submissions from anyone using the same location to submit junk responses.

Conclusions
Our data, and the work of multiple researchers, show there has been a recent increase in the number of low-quality responses submitted on Mechanical Turk. Data from the TurkPrime database show that the vast majority of all studies, and the vast majority of recent studies, have never been affected by the current concern about bots. What we still don't know is whether these responses are coming from bots or from foreign workers using a VPN to disguise their location and submit surveys intended to sample US workers. Either way, in the coming days TurkPrime will release tools that allow researchers to block workers from suspicious locations and to decide how narrowly they would like to set the exclusion criteria. Concerns about bots and low-quality data on MTurk are not new. But at TurkPrime we will continue to look for ways to ensure quality data and to make conducting online research easier for researchers.

Thursday, January 11, 2018

MicroBatch is now a Pro Feature

TurkPrime is announcing a change in our pricing for the MicroBatch feature. MicroBatch is now included as a Pro feature, with a fee of 2 cents + 5% per complete. This fee also provides users with access to all other Pro features at no additional charge. This change is necessary so that we can continue to provide the highest quality service and tools that our users expect.


To explain the new pricing structure further, see the following example:


Study Example: # of HITs: 100; Payment per HIT: $1

Without MicroBatch:
Payment to Workers = $100
MTurk Fees (40%) = $40
TurkPrime Fees (0%) = $0
Total Study Cost = $140

With MicroBatch:
Payment to Workers = $100
MTurk Fees (20%) = $20
TurkPrime Fees (2 cents + 5% per complete) = $7
Total Study Cost = $127

Savings with MicroBatch: $13 (about 9% of the total cost)
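The arithmetic behind the example is straightforward; the following sketch reproduces it (fee percentages as in the example above; this is an illustration, not TurkPrime's billing code):

```python
def study_cost(n_hits: int, reward: float, microbatch: bool) -> float:
    """Total study cost under the fee schedule in the example above.

    Assumes the 40% MTurk fee applies to standard large batches and the
    20% fee applies when MicroBatch keeps each batch below 10 assignments.
    """
    worker_pay = n_hits * reward
    if microbatch:
        mturk_fee = 0.20 * worker_pay
        # TurkPrime fee: 2 cents per complete + 5% of worker pay
        turkprime_fee = n_hits * 0.02 + 0.05 * worker_pay
    else:
        mturk_fee = 0.40 * worker_pay
        turkprime_fee = 0.0
    return worker_pay + mturk_fee + turkprime_fee

print(study_cost(100, 1.00, microbatch=False))  # 140.0
print(study_cost(100, 1.00, microbatch=True))   # 127.0
```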

Since its inception three years ago, TurkPrime has been dedicated to enhancing the quality and usability of online research with the goal of empowering researchers. The team at TurkPrime remains committed to this mission. As more researchers come to use our platform and the online sampling world continues to evolve, we have taken on more staff so that we can continue to provide individualized customer support to each of our users and develop new tools to expand and improve functionality.

We greatly appreciate our relationship with our users, and know that this change will allow us to continue providing high quality service and equipping the research community with the tools and features they need. We also know that for some users this change was unexpected. If you have a funded grant with funds allocated to participant recruitment based on our previous pricing, please contact us so that we can honor our previous pricing. We look forward to your feedback, and to assisting you in getting the most out of your research and TurkPrime.

Sunday, December 24, 2017

We are Hiring: Project manager/research assistant position (full time)

TurkPrime
Queens, NY
Project Manager/Research Assistant


Background: TurkPrime is a web-based platform for online participant recruitment for the social and behavioral sciences, market research, and medical research. TurkPrime was launched in May 2015 and currently serves over 6,000 subscribers from universities and corporate institutions around the world. Over 3,000 studies are conducted on TurkPrime each month.


The TurkPrime Toolkit is a set of cutting-edge research tools allowing for flexible online study management and data collection using multiple platforms, including Mechanical Turk. TurkPrime manages a panel of over 150,000 Mechanical Turk respondents, and partners with multiple additional sample providers through API integration to achieve a global reach of over 20 million respondents. The combination of robust research tools and an extensively profiled participant pool allows TurkPrime clients to conduct high quality research flexibly and effectively.


TurkPrime is also actively engaged in academic research focusing on, but not limited to, online data collection methodology (see publications 1-5 below). The research team at TurkPrime works closely with our software development team to make sure that TurkPrime's system and research practices are grounded in solid empirical research, and that the features we provide are beneficial to a wide range of researchers. TurkPrime's research team has a track record of peer-reviewed publications that address issues such as the contribution of its software to research design and data quality, assessment and improvement of data quality on Mechanical Turk, and selective recruitment from Mechanical Turk and other platforms.


We are looking for a full-time project manager/research assistant with extremely strong communication, organizational, and writing skills to manage projects, help clients with technical questions, collect data for online projects, and help write peer-reviewed papers, white papers, and blog entries.


Responsibilities: The key responsibilities for the position include: client management including responding to TurkPrime users’ questions, managing complex client projects (e.g. intensive longitudinal, diary, dyadic, and video interview studies), writing a weekly blog, study design and data collection, supporting the writing of peer-reviewed and white papers, maintaining project documentation, managing project data and materials, and quality control.


Skills: Excellent writing and communication skills; Exceptional organization and attention to detail; Ability to use web communication and documentation software effectively; Team-oriented; Very strong work ethic; Multi-tasking; Self-starter and industrious; Adaptability to rapidly changing demands in a high-performance workplace; Background in scientific methodology (a B.A. or higher is required for the position). Experience with conducting online research and knowledge of online research software (Qualtrics, Millisecond, MTurk, etc.) is a plus. Data analysis skills are a plus. Interest in publishing peer-reviewed papers is a plus.


Notes: TurkPrime is based in Kew Gardens Hills, NY. Initially, the project manager would be expected to be in the Kew Gardens Hills office 4 days per week. Over time, a more flexible schedule will be considered. TurkPrime is an equal opportunity employer and strongly encourages applications from members of groups underrepresented in science and technology industries.  


Applying: Please send a resume and a letter of interest to leib.litman@turkprime.com.   


Representative Recent Publications
1.   Litman, L., & Robinson, J. Online Research on Mechanical Turk and Other Platforms. Sage Publications, Innovations in Methodology Series (367 pages). Scheduled to be published in 2018.
2.   Litman, L., Robinson, J., & Rosenzweig, C. (2015). The relationship between motivation, monetary compensation, and data quality among US and India-based workers on Mechanical Turk. Behavior Research Methods, 47(2), 519-528.
3.   Litman, L., Robinson, J., & Abberbock, T. (2016). TurkPrime.com: A versatile crowdsourcing data acquisition platform for the behavioral sciences. Behavior Research Methods, 1-10.
4.   Litman, L., Williams, M. T., Rosen, Z., Weinberger-Litman, S. L., & Robinson, J. (2017). Racial Disparities in Cleanliness Attitudes Mediate Consumer Attitudes toward Cleaning Products: A Serial Mediation Model. Journal of Racial and Ethnic Health Disparities, 1-9. (Developed methods relating to selective recruitment on Mechanical Turk).
5.   Litman, L., Robinson, J., Weinberger-Litman, S. L., & Finkelstein, R. (2017). Both Intrinsic and Extrinsic Religious Orientation are Positively Associated with Attitudes Toward Cleanliness: Exploring Multiple Routes from Godliness to Cleanliness. Journal of Religion and Health, 1-12. (Developed methods relating to selective recruitment on Mechanical Turk).

Friday, December 22, 2017

Conducting research: How online research tools can help

When researchers learn about conducting research online, it can sometimes be difficult to understand how all the available tools can actually be applied to a project. This post is about how specific research ideas can be carried out using the kinds of features available for online research. Online research tools can make it much simpler to recruit balanced samples of individuals who are hard to find and to sample selectively using more traditional methods.

On TurkPrime, you can selectively recruit participants based on many variables that may be of interest to you, ranging from age, gender, and race to political orientation, religion, employment, physical health symptoms, and personality (see them all here). This feature can be immensely useful in targeting specific populations. If you don't want to use this feature, or want to screen for participant characteristics that aren't on our full list, you can still find specific populations by running your own screening studies to identify the kind of participants you want to target in subsequent studies (to learn more about how to do this, see the "best practices" section here, as well as this post here).

As an example of how this feature may be useful, let us imagine a religion researcher who is interested in how religiosity and religious orientation (Extrinsic vs. Intrinsic orientations) predict many behaviors and decisions in life. For this researcher to carry out studies comparing different levels of religiosity, she needs to find a sample that is not religiously homogeneous. This researcher may even want to recruit a sample that is balanced, with a third of the sample being highly religious, a third being somewhat religious, and a third being not religious at all. Once a sample like this is recruited, all sorts of interesting questions about how religion impacts behaviors can be tested.

A study just like this was recently published by members of the TurkPrime staff in the Journal of Religion and Health (Litman, Robinson, Weinberger-Litman, & Finkelstein, 2017). In this study, TurkPrime's Panel feature was used to selectively recruit three groups of approximately 150 participants with different relationships to religion: some very religious, others less religious, and a third group who were not religious at all. The researchers found that attitudes toward cleanliness were significantly predicted by religiosity and religious orientation, even when most covariates of attitudes toward cleanliness were included in a regression model. This research has important implications for the relationship between religion and health.

Imagine for a moment how difficult it would have been to conduct this research without online sampling methods. Perhaps researchers would have tried to reach out to students on their college campus, either through a college database of students or by recruiting people in person. These researchers might hope that by reaching out to great numbers of students they would find enough to fill these three categories of religiosity. If they couldn't find enough religious students this way, they might need to seek out religious groups on campus and reach out to them directly. Additional time might be spent sitting with each student while they completed a survey. This would require multiple research assistants and a lot of time and perseverance. Still, it would result in data from a convenience sample that is highly skewed toward the young, more liberal students found on most campuses. If this research had been conducted on MTurk without the use of TurkPrime tools, it also would have been very hard to get a sample that includes religious individuals, as MTurk has a much higher rate of atheism (around 40%) than the U.S. population.

Features available in online research have a significant impact on the kinds of studies that can be carried out, enhancing the methods available to researchers and making one aspect of their job a whole lot easier. There are many more examples of researchers using TurkPrime to carry out complex projects.

Friday, December 15, 2017

New Feature: Exclude Highly Active Workers

Some workers on MTurk are extremely active and take the majority of posted HITs. This can lead to many issues, some of which are outlined in our previous post. Although MTurk has over 100,000 workers who take surveys each year, and around 25,000 who take surveys each month, you are much more likely to recruit the highly active workers who take a majority of HITs. About 1,000 workers (1% of workers) take 21% of all HITs. About 10,000 workers (10% of workers) take 74% of all HITs.
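For the curious, concentration figures like these can be computed from a simple table of per-worker HIT counts. A minimal sketch, assuming a hypothetical Series mapping worker IDs to HIT counts:

```python
import pandas as pd

def top_share(hits_per_worker: pd.Series, top_fraction: float) -> float:
    """Fraction of all HITs taken by the most active `top_fraction` of workers.

    `hits_per_worker` maps worker ID -> number of HITs taken (hypothetical data).
    """
    ranked = hits_per_worker.sort_values(ascending=False)
    n_top = max(1, int(len(ranked) * top_fraction))
    return ranked.iloc[:n_top].sum() / ranked.sum()

# For an activity distribution like the one described above, one would see
# roughly top_share(counts, 0.01) ~= 0.21 and top_share(counts, 0.10) ~= 0.74.
```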

TurkPrime now has a feature that allows researchers to exclude the most active workers, so that you can collect data from less experienced workers who are less likely to have previously taken part in research similar to your own. Below is a screenshot of the "Naivete (Exclude most active Workers)" feature. You can use the dropdown menu to select what percentage of the most active workers you would like to exclude.

Friday, December 8, 2017

Best recruitment practices: working with issues of non-naivete on MTurk

It is important to consider how many highly experienced workers there are on Mechanical Turk. As discussed in previous posts, the pool of active workers numbers in the thousands, but it is far from inexhaustible. A small group of workers take a very large number of the HITs posted to MTurk; these workers are very experienced and have seen the measures commonly used in the social and behavioral sciences. Research has shown that repeatedly exposing participants to the same measures can have negative effects on data collection: changing the way workers perform, creating treatment effects, giving participants insight into the purpose of some studies, and in some cases impacting the effect sizes of experimental manipulations. This issue is referred to as non-naivete (Chandler, 2014; Chandler, 2016).

The current standard approaches to recruitment on MTurk actually compound this problem. When recruiting workers on Mechanical Turk, requesters have the ability to selectively recruit workers based on specific criteria, such as the number of HITs previously approved and the worker's approval rating: the percentage of a worker's completed HITs that were approved. A commonly used standard is to select workers who have approval ratings above 95% (see Peer et al., 2014). This is not quite enough on its own, however, because MTurk's system assigns a 100% approval rating to all workers who have completed between 1 and 100 HITs, regardless of how many were actually approved. Once workers complete 100 HITs, their approval rating accurately reflects the number of HITs they were approved for. It is therefore recommended, and common practice, to recruit only workers who have approval ratings above 95% and who have completed at least 100 HITs. To address this issue, whenever researchers use the approval rating as part of their qualifications for a study, the TurkPrime system adds by default the qualification that workers must have previously completed at least 100 HITs (researchers of course retain manual control of this).
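For researchers who post HITs through the MTurk API directly rather than through TurkPrime, these two criteria map onto Amazon's built-in qualification types. A minimal sketch using boto3 (note that MTurk's NumberHITsApproved qualification counts approved HITs, which only approximates the "completed at least 100 HITs" criterion):

```python
import boto3

mturk = boto3.client("mturk", region_name="us-east-1")

# Amazon's built-in (system) qualification types:
#   000000000000000000L0 -> PercentAssignmentsApproved (approval rating)
#   00000000000000000040 -> NumberHITsApproved
qualification_requirements = [
    {
        "QualificationTypeId": "000000000000000000L0",
        "Comparator": "GreaterThan",
        "IntegerValues": [95],   # approval rating > 95%
        "ActionsGuarded": "DiscoverPreviewAndAccept",
    },
    {
        "QualificationTypeId": "00000000000000000040",
        "Comparator": "GreaterThanOrEqualTo",
        "IntegerValues": [100],  # at least 100 approved HITs
        "ActionsGuarded": "DiscoverPreviewAndAccept",
    },
]

# Pass as QualificationRequirements=qualification_requirements
# in a call to mturk.create_hit(...).
```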

By selectively recruiting workers with a high approval rating and a high number of previously completed HITs, a requester can have increased confidence that workers in their sample can be trusted to follow instructions and pay attention to tasks. Indeed, many researchers choose to recruit participants who have high approval ratings and have completed a high number of previous studies. The approval rating system is unique in that it is a constant motivating factor that makes workers pay attention to each task, and it helps researchers collect high quality data. However, it leads to the exclusion of workers who have completed few HITs, even if they may be good providers of data who haven't yet had the chance to "prove" it. Relying only on workers with high approval ratings has a further negative effect: it is a selection criterion that recruits only the more experienced workers, who are therefore less naive to measures used on the MTurk platform, bringing the issue of non-naivete to the fore.

Solutions
TurkPrime is introducing a new tool that allows requesters to exclude workers who are extremely active, making it possible to selectively recruit workers who are not overly active and are more naive to commonly used measures. We believe this will have a great positive impact on data collection if researchers choose to utilize it. Another option for researchers is to use Prime Panels, whose workers are more naive to commonly used measures because of the size of the platform and its primary use for market research surveys, which typically have very different data collection goals and use different tools than those used on MTurk.

References

Peer, E., Vosgerau, J., & Acquisti, A. (2014). Reputation as a sufficient condition for data quality on Amazon Mechanical Turk. Behavior Research Methods, 46(4), 1023-1031.