Using Amazon’s MTURK for Multiple Waves of Data Collection: Part 1 (External Survey Hosting and Random Number Generators)
Amazon’s MTURK is a promising way to collect data in the social sciences. The sample (as compared to other convenience samples), timeliness, ease, and cost are all strong positive attributes of the service (Landers & Behrend, 2014). However, there is no convenient built-in method to collect multiple waves of data over time from the same respondents, which rules out an otherwise helpful way to reduce common method bias (Podsakoff, MacKenzie, Lee, & Podsakoff, 2003).
After personally carrying out two waves of data collection (with a one-week time lag) for 1,000 respondents (MTURK Workers) with surveys hosted independently of MTURK, I would like to share the steps and methods I used to track workers between the surveys and encourage participation in the follow-up (Part 2). First, though (Part 1), I lay the groundwork for those using external survey hosting.
- To start, create a separate HIT on MTURK for each wave of data collection you plan to carry out. The HIT should contain:
- A starting link to your independently hosted survey (e.g. Qualtrics)
- A single text box for data entry
- Instructions (e.g. “Click the link to be redirected to our survey. Please LEAVE this browser window open, as you will be asked to paste a code provided in the survey in the box below.”)
- Be sure to emphasize keeping the browser window open, as in the example provided above.
- This can all be easily accomplished through MTURK’s “Survey Link” template.
- Create separate surveys for each round of data collection using your respective survey service.
- Have the survey display a randomly generated number at the end, produced by a random number generator. Qualtrics provides one through its own web services, and I recommend it. Tyler Burleigh offers a nice walkthrough here (and it will guide you with most other survey services as well). Survey Gizmo mentions an internal random number generator feature in this help document. Feel free to give that a try if Qualtrics is not an option for you.
- As a safeguard, in case a problem occurs with your random number generator, I recommend asking for each respondent’s WorkerID on the same page as the random number generator output or directly before it (if possible and ethical, make this a required response in your survey).
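To make concrete what the random number generator contributes, here is a minimal sketch in Python of producing numeric completion codes. It is not the Qualtrics web service or Tyler Burleigh's method. The function names and the seven-digit length are assumptions made for illustration only.

```python
import secrets


def make_completion_code(digits=7):
    """Return a random numeric code with exactly `digits` digits.

    Hypothetical helper: the code length is an assumption, not a
    requirement of MTURK or of any survey service.
    """
    if digits < 1:
        raise ValueError("digits must be at least 1")
    low = 10 ** (digits - 1)   # smallest number with `digits` digits
    high = 10 ** digits        # one past the largest such number
    # secrets.randbelow(n) returns an integer in the range [0, n)
    return str(low + secrets.randbelow(high - low))


def make_unique_codes(count, digits=7):
    """Return `count` distinct codes, e.g. one per expected respondent."""
    if count > 9 * 10 ** (digits - 1):
        raise ValueError("not enough distinct codes of that length")
    codes = set()
    while len(codes) < count:
        codes.add(make_completion_code(digits))
    return sorted(codes)
```

Longer codes make it harder for a worker to guess a valid-looking number, and recording each code alongside the WorkerID lets you check later that the pair matches.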
- Use MTURK’s CSV file download/upload method to “mass-approve” HITs.
- Click the “Manage” tab, then “Manage HITs Individually,” then “Download Results,” and again “Download Results” as in the screenshot below.
- This provides you with a CSV file you can open in Excel or any type of text editor.
- Working from the spreadsheet you created earlier with Tyler’s guidance, match the WorkerIDs lacking a legitimate verification code against your downloaded CSV file. Use Excel’s fill function to approve all HITs (with an “x” in the “approve” column), then hand-select those without legitimate verification codes for rejection and re-upload the file. Alternatively, you could write an Excel formula to do this for you.
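As an alternative to an Excel formula, the matching step above can be sketched in Python. This is a rough sketch under stated assumptions, not a definitive implementation: it assumes the downloaded file has `WorkerId`, `Approve`, and `Reject` columns and that the pasted code sits in a column named `Answer.surveycode`, so check the header row of your own file. The function names and the rejection message are hypothetical.

```python
import csv
import io


def mark_assignments(mturk_rows, survey_codes,
                     worker_col="WorkerId", code_col="Answer.surveycode"):
    """Fill in Approve/Reject for each row of the MTURK results.

    survey_codes maps WorkerID -> the code your survey showed that worker.
    Column names are assumptions; adjust them to match your CSV header.
    """
    marked = []
    for row in mturk_rows:
        row = dict(row)  # copy so the caller's rows are left untouched
        worker = (row.get(worker_col) or "").strip()
        entered = (row.get(code_col) or "").strip()
        expected = survey_codes.get(worker)
        if expected is not None and entered == expected:
            row["Approve"], row["Reject"] = "x", ""
        else:
            row["Approve"] = ""
            row["Reject"] = ("Verification code missing or not matching "
                             "our survey records.")
        marked.append(row)
    return marked


def mark_csv(mturk_csv_text, survey_codes):
    """Read MTURK results as CSV text and return marked CSV text."""
    reader = csv.DictReader(io.StringIO(mturk_csv_text))
    rows = mark_assignments(list(reader), survey_codes)
    fieldnames = list(reader.fieldnames or [])
    for col in ("Approve", "Reject"):
        if col not in fieldnames:
            fieldnames.append(col)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Because a rejection affects a worker's record, it is worth reviewing the rows marked for rejection by hand before re-uploading, in the same spirit as the hand-selection described above.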
- That takes care of the externally hosted aspect of your surveys and the use of verification codes to ensure data integrity. The next part (2) explains how to follow up with workers for subsequent surveys and match data between your survey waves.