Experiences of Crowd-Sourced Testing

I’ve touched on crowd-sourced testing before, and in this post I’ll go into some detail about its limitations and the scenarios where it works best. I’ll cover both the positive and negative aspects of crowd-sourcing QA.

This is based on around 20 test runs over a period of several months. The product was in an advanced beta stage, and the testers were not instructed to perform specific tasks, but were instead given various areas of the application’s functionality to test out.

The main bonus of using crowd-sourced testers is that the volume of testing and the number of testers you have access to are far greater than we had internally. The combinations of OS, browser and hardware were also a better approximation of our target market. Crowd-sourcing QA also means you are not necessarily bound by geographic or language limitations, so you can access a large number of testers with different locales and languages. This is especially important if your application is localized and/or needs to handle various language inputs (Asian and right-to-left languages in particular proved quite difficult to accommodate).

The Numbers
For each run, we invited around 200 testers from various locations around the world. Of these, around 35-40 accepted the invitation and tested the application. The bulk of the testers were located in the US, with India and Russia next, followed by European countries and a smattering of other locations.
The tester selection was based on the rankings provided by the platform. Each tester has a ranking determined by the number and quality of the bugs found in their previous testing (with other applications, not only ours).

What About the Time?
Here is another advantage of crowd-sourced QA. The vast majority of issues are reported within the first few hours. Since testers get paid based on performance, and duplicate bugs are not accepted, they have an incentive to “get there first” and report as many issues as possible. It is incredibly useful to have this overview of the real-world state of your application within a single business day. The geographic spread of the testers also means that you are not limited to a 9-5 schedule.

What About the Quality?
Here, crowd-sourcing wasn’t as good as I’d hoped. A lot of the issues reported were minor or cosmetic. Furthermore, many issues were not described as thoroughly as I’d have liked, so they needed to be verified, or the tester had to be messaged to provide more detail and/or the files they used. Many testers, however, dug really deep into the application and submitted excellent issues with complete steps for reproducing them. It is also common for testers to provide a screencast of the actual bug happening.

So What Are the Bad Things?
Luckily, not many. The main drawback is that after the first day of testing, there is a large drop-off (around 90%) in the number of times the application is used. We included tracking in our application in order to gauge usage, and it was quite surprising to see such a massive drop-off in application starts. This is possibly due to the diminishing returns of spending time finding issues, and to the testers being invited to other applications to test. I would recommend that anyone using crowd-sourced testers include some tracking logic in their application to get a feel for how thoroughly it has been worked over.
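As an illustration, that tracking logic can be as simple as firing a small HTTP event at startup. Here is a minimal sketch in C++ using libcurl; the endpoint URL, event names and JSON format are all assumptions for illustration, not what we actually shipped:

```cpp
// Minimal usage-tracking sketch: report an event to a hypothetical
// collection endpoint. Requires libcurl (link with -lcurl).
#include <curl/curl.h>
#include <string>

void reportEvent(const std::string& event, const std::string& buildId) {
  CURL* curl = curl_easy_init();
  if (!curl) return;  // fail silently; tracking must never break the app

  std::string body =
      "{\"event\":\"" + event + "\",\"build\":\"" + buildId + "\"}";
  struct curl_slist* headers =
      curl_slist_append(nullptr, "Content-Type: application/json");

  curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/track");  // hypothetical endpoint
  curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
  curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body.c_str());
  curl_easy_setopt(curl, CURLOPT_TIMEOUT, 5L);  // don't hang the app on a slow network
  curl_easy_perform(curl);  // fire-and-forget; errors deliberately ignored here

  curl_slist_free_all(headers);
  curl_easy_cleanup(curl);
}

int main() {
  curl_global_init(CURL_GLOBAL_DEFAULT);
  reportEvent("app_start", "beta-42");  // call once at startup to count launches
  // ... rest of the application ...
  curl_global_cleanup();
  return 0;
}
```

Counting these app-start events per day is exactly the kind of data that exposed the drop-off described above.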


Another issue is that there may be unreproducible issues which only happen on a particular tester’s machine. In an internal or contracted QA scenario, the machine can be observed and debugging tools installed in order to troubleshoot the application. This is impossible with crowd-sourced testers. For this scenario I would recommend including some kind of crash reporting and instructing the testers to submit the resulting dumps to your dev team. Although this is useful, it is more work for the dev team and more overhead for them to analyse. Tools like BugSplat, Google Breakpad, and Microsoft’s Winqual are all useful in this regard.
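For crash reporting, Google Breakpad’s client API is straightforward to wire in. The following is a minimal sketch based on Breakpad’s standard Linux client example; the dump directory is an assumption, and in a real build you would upload the generated minidump (or ask the tester to attach it to their bug report) rather than just print its path:

```cpp
// Minimal Breakpad integration sketch (Linux client API).
#include <cstdio>
#include "client/linux/handler/exception_handler.h"

// Called after a minidump has been written; descriptor.path() is the dump file.
static bool dumpCallback(const google_breakpad::MinidumpDescriptor& descriptor,
                         void* context, bool succeeded) {
  // In a real application, queue descriptor.path() for upload, or tell the
  // tester where to find it so they can attach it to the bug report.
  std::printf("Minidump written to: %s\n", descriptor.path());
  return succeeded;
}

int main(int argc, char* argv[]) {
  // "/tmp" is an assumed dump directory; use a writable app-specific folder.
  google_breakpad::MinidumpDescriptor descriptor("/tmp");
  google_breakpad::ExceptionHandler handler(descriptor, /*filter=*/nullptr,
                                            dumpCallback, /*context=*/nullptr,
                                            /*install_handler=*/true,
                                            /*server_fd=*/-1);
  // ... rest of the application; any crash now produces a minidump ...
  return 0;
}
```

The minidumps are small enough for testers to attach to their reports, and your dev team can symbolicate them offline against the matching build.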

Conclusions
In general, crowd-sourcing some of your QA is a great idea. The challenge is to find the right time and the right things to test. It is not a substitute for dedicated internal QA, but it can greatly reduce their workload, and it can be extremely useful, especially at the tail end of the project, close to release. The best results I have achieved were with an advanced beta build with a limited number of known issues, and by directing the testers to specific functional areas rather than letting them loose on the whole application (although this may depend on the size of the application in question). There will also need to be a dedicated crowd-source QA manager to review the submitted issues and adjust the scope if necessary.

Proper planning is paramount, especially when setting out the scope of the testing and the types of bugs that you will accept. Usage tracking and crash/error reporting mechanisms built into the application are equally essential if you want to get any kind of metrics out of the process (these can be stripped out of the release version of the application).