How to take multiple factors into account when evaluating a Game with a Purpose? How is player behaviour or participation influenced by different incentives? How does player engagement impact their accuracy in solving tasks? In this paper, we present a detailed investigation of multiple factors affecting the evaluation of a GWAP and we show how they impact on the achieved results. We inform our study with the experimental assessment of a GWAP designed to solve a multinomial classification task.