Using Amazon’s MTURK for Multiple Waves of Data Collection: Part 2 (Qualifications, Follow-up Survey Invitations and Tracking Responses between Surveys)
Now for the good stuff: follow-ups with MTURK!
After completing your first survey, you have your set of workers to follow up with for one or more additional waves of data collection. Here’s how to make it work:
- First, you are going to need R (free, open-source software, available from CRAN) and the MTurkR package (also free and open source; created by Thomas Leeper) to carry out mass mailings to MTURK workers (to send your follow-up invitations) using the methods I recommend here. Download both as needed from the given links. The MTurkR package provides access to MTURK’s Requester API. Feel free to pay for Amazon’s own services instead, but Thomas’s R package works very well, and he is quite helpful and accessible.
- Once you’re all set up with MTurkR and Amazon, you are almost ready to go live with your follow-up HIT(s) and contact workers. But first, you need to go into MTURK and set up your follow-up HIT(s) so that only those who completed your first survey can participate in the follow-up. To do this, you need to create a Qualification Type on MTURK. Click the “Manage” tab, then “Qualification Types,” and then “Create New Qualification Type,” as in the screenshot below:
- Create a qualification type called “Follow-up Survey(s)” and assign it to all the Workers from your first survey by downloading a Worker CSV file from Amazon and assigning your unique qualification type to each of those workers. This can be done by clicking the “Manage” tab, then “Workers,” then “Download CSV.” In the “UPDATE-Follow-up Survey(s)” column, enter the number “99” for all workers that you want to take follow-up surveys. (A programmatic alternative using MTurkR is sketched just after the wizard transcript below.)
- Now, for each follow-up HIT, click the “Advanced” link on the HIT creation screen and, under “Specify ALL the qualifications Workers must meet to work on your HITs:”, select “Follow-up Survey(s),” “Equal to,” and “99.” Also select “Yes” under “Only Workers who qualify to do my HITs can preview my HITs.” This ensures that only those who completed your first survey can see and complete your follow-ups. My setup, using the qualification name “Survey Two,” is shown in the screenshot below:
- You can now go live with your follow-up HIT and move on to contact the Workers to complete the next survey on MTURK using R and the MTurkR package.
- To prepare for using MTurkR, gather your initial set of Amazon WorkerIDs by downloading the results from those who completed your first survey. Click the “Manage” tab, then “Manage HITs Individually,” then “Download Results,” and again “Download Results,” as in the screenshot below.
- You will need to select the column of WorkerIDs from this file and copy the entries so you can paste them into the MTurkR program. I recommend filtering to the workers you approved in the first round so you do not invite problematic workers to participate in subsequent surveys.
- Now it’s time to power up R and load Leeper’s package, which will allow you to contact your workers for follow-up. I provide some thoughts on what should be included in your communication after the syntax shown in this step. The exchange with the MTurkR wizard is reproduced below, with the selections you should enter shown at each prompt; a non-interactive alternative is sketched after the transcript. Refer to Leeper’s MTurkR-specific instructions for further guidance if needed.
Example using MTurkR Version 0.5.52
To get the latest version run the code below:
if (!require("devtools")) {
  install.packages("devtools")
  library("devtools")
}
install_github("leeper/MTurkR")
Load MTurkR (the lib.loc argument is only needed if MTurkR is installed in a non-default library location):
library("MTurkR", lib.loc = "~/R/win-library/3.0")
wizard.simple()
MTurkR Wizard loading…
Retrieve your AWS access keys from https://aws-portal.amazon.com/gp/aws/securityCredentials
AWS/MTurk Access Key ID: [enter yours]
AWS/MTurk Secret Access Key: [enter yours]
Use Sandbox? (Y/N): N
MTurkR Operations
1: Check Account Balance 2: Check Sufficient Funds 3: Create HIT
4: Check HIT Status 5: Get Assignment(s) 6: Extend HIT
7: Expire HIT 8: Approve Assignment(s) 9: Reject Assignment(s)
10: Grant Bonus(es) 11: Contact Worker(s) 12: Block Worker(s)
13: Unblock Worker(s) 14: Manage Qualifications 15: Requester Statistics
16: Worker Statistics 17: Open MTurk RUI Pages 18: Load MTurkR Log File/Entries
19: Exit
Selection: 11
Contact one worker or multiple workers?
1: Single Worker
2: Multiple Workers
Selection: 2
Email Subject Line: MTURK Survey…[enter your subject]
Email body text: Hello,… [enter your body text]
NOTE: Use \n to indicate new lines. You cannot enter new lines by pressing “enter.” Ensure your text is all on one line or you will get errors!
How many workers to notify: 1000 [or however many you plan to contact]
NOTE: The line directly above may be obsolete in future versions.
Enter each WorkerID on its own line
NOTE: You should copy and paste all your WorkerIDs directly from the spreadsheet you downloaded in the “Download Results” step earlier.
1: WorkerID
2: WorkerID
3: WorkerID
[… for each WorkerID]
Your final output should read [after a bunch of repetitious statements]:
Valid
1 TRUE
2 TRUE
3 TRUE
[… for each WorkerID contacted]
Then it’s just:
MTurkR Operations
1: Check Account Balance 2: Check Sufficient Funds 3: Create HIT
4: Check HIT Status 5: Get Assignment(s) 6: Extend HIT
7: Expire HIT 8: Approve Assignment(s) 9: Reject Assignment(s)
10: Grant Bonus(es) 11: Contact Worker(s) 12: Block Worker(s)
13: Unblock Worker(s) 14: Manage Qualifications 15: Requester Statistics
16: Worker Statistics 17: Open MTurk RUI Pages 18: Load MTurkR Log File/Entries
19: Exit
Selection: 19
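If you would rather handle the qualification step from R instead of the CSV upload described above, MTurkR exposes these operations directly (the wizard’s “Manage Qualifications” option wraps them). Here is a minimal sketch, not a drop-in script: the results file name and its column names are assumptions you should match against your own download, the qualification value mirrors the “99” used earlier, and you should confirm the exact argument names and credentials mechanism for your MTurkR version (older versions use credentials(); see ?CreateQualificationType and ?AssignQualification).

library("MTurkR")

# Set requester credentials (the wizard asks for these interactively).
credentials(c("YOUR_AWS_ACCESS_KEY_ID", "YOUR_AWS_SECRET_ACCESS_KEY"))

# Read the results file you downloaded for the first survey HIT.
# File and column names are placeholders; check them against your own CSV.
wave1 <- read.csv("wave1_results.csv", stringsAsFactors = FALSE)

# Keep only workers whose first-wave assignments you approved.
approved_ids <- unique(wave1$WorkerId[wave1$AssignmentStatus == "Approved"])

# Create the follow-up qualification type (or reuse the ID of one created in the RUI).
qual <- CreateQualificationType(
  name = "Follow-up Survey(s)",
  description = "Completed the first survey; eligible for follow-up HITs.",
  status = "Active")

# Assign the qualification with the value 99 that the follow-up HIT requires.
AssignQualification(
  qual = qual$QualificationTypeId,
  workers = approved_ids,
  value = "99")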
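Likewise, the wizard’s “Contact Worker(s)” option wraps ContactWorker(), so the batch invitation can be sent non-interactively once you are comfortable with the package. Another minimal sketch, reusing the approved_ids vector from the previous sketch: the subject line and body are placeholders for your own wording, the message stays a single string with \n line breaks (as the note in the transcript warns), and you should check ?ContactWorker in case arguments differ in your MTurkR version.

# Placeholder subject and body; keep the message as one string and use \n for line breaks.
invite_subject <- "Follow-up survey now available on MTURK"
invite_body <- paste0(
  "Hello,\n\n",
  "Thank you for completing our first survey. The follow-up HIT is now open.\n",
  "Search MTURK for the HIT's unique name and make sure you are logged in with ",
  "the same WorkerID you used for the first survey.\n\n",
  "Thank you!")

# Send the same message to every eligible worker (requests are sent in batches).
ContactWorker(
  subjects = invite_subject,
  msgs = invite_body,
  workers = approved_ids,
  batch = TRUE)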
- I recommend sending out an initial invitation to complete the survey and then two reminders: one at the midpoint of the survey window and one within the final 14 hours of the survey (we had a one-week completion window). You should provide a link that workers can copy and paste into their browser which directly searches for your HIT (I was unable to use a direct link to the HIT because of the way the MTURK site handles links). Your HIT should therefore have a unique name that you can search for and find independent of all other HITs. Once you’ve performed such a search, copy the URL from your browser’s address bar and place it in your email so workers can easily find the follow-up survey and complete it. I recommend reminding them that they must be logged in to MTURK with the same WorkerID as in the first survey in order to qualify for the HIT.
- In your communication, I also recommend emphasizing that workers need to keep their browser window open while completing the survey. Something like: “Open the survey in a new window to ensure the HIT stays open for you and you can submit your random number to the HIT on survey completion. To do this, right-click on the survey link and select ‘open link in a new window.’”
- Once you’ve collected all the waves of your surveys, you’ll need to go through and match up the results by WorkerID. The WorkerIDs can be associated with your surveys either by matching the random number generated at the end of each survey and reported to MTURK, or by explicitly collecting WorkerIDs in each of your surveys. I use SAS data steps to carry out the merging of surveys by WorkerID; a minimal R sketch of the same merge follows below.
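If you are already working in R for MTurkR, the same match can be done with base R’s merge() instead of SAS. Below is a minimal sketch, assuming each wave has been exported to a CSV that contains a WorkerId column (however you associated the IDs with responses); the file and column names are placeholders to adapt to your own data.

# Hypothetical file names; replace with your own wave exports.
wave1 <- read.csv("wave1_results.csv", stringsAsFactors = FALSE)
wave2 <- read.csv("wave2_results.csv", stringsAsFactors = FALSE)

# Keep only workers who appear in both waves; suffixes distinguish repeated variables.
panel <- merge(wave1, wave2, by = "WorkerId", suffixes = c(".t1", ".t2"))

# Quick check: how many first-wave workers returned for the follow-up?
nrow(panel)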
That should do it! Through Part 1 and Part 2 of this series, you have learned how to host surveys independent of MTURK and carry out waves of data collection over time by setting up qualifications for Workers and batch contacting them with Amazon’s API and the MTurkR package.
Hello,
I’m not really certain how we can be sure of the accuracy of the data that participants provide for us… Is there another way of checking its consistency besides the qualification test? Does MTurk do anything in that area as well, on a per-HIT basis?
Thanks in advance for providing an answer, if any.
Marko
Good question, Marko. There are other quality control procedures. You can also use MTURK to filter for workers with specific approval ratings. All said, the jury is still out on the quality of MTURK data. Ultimately there are limitations with any sample, and MTURK seems to work well for some purposes. See the following for a couple of published perspectives:
Barger, P., Behrend, T. S., Sharek, D. J., & Sinar, E. F. (2011). I-O and the Crowd: Frequently Asked Questions About Using Mechanical Turk for Research. TIP: The Industrial-Organizational Psychologist, 49(2), 11–17.
Landers, R. N., & Behrend, T. S. (2014). An Inconvenient Truth: Arbitrary Distinctions Between Organizational, Mechanical Turk, and Other Convenience Samples. Industrial and Organizational Psychology: Perspectives on Science and Practice. Retrieved from http://www.siop.org/journal/Article_1.aspx
You can implement quality control items and manipulation checks in your data (as in other survey research). The quality of the data has been discussed in a few articles (see the references above for a couple of examples). I’m not sure I follow your last question. Could you expand on it?