The performance of objective image quality assessment (IQA) models has been evaluated primarily by comparing model predictions to human judgments. Perceptual datasets (e.g., LIVE and TID2013) gathered for this purpose provide useful benchmarks for improving IQA methods, but their heavy use creates a risk of overfitting. Here, we perform a large-scale comparison of perceptual IQA models in terms of their use as objectives for the optimization of image processing algorithms. Specifically, we evaluate eleven full-reference IQA models by using them as objective functions to train deep neural networks for four low-level vision tasks: denoising, deblurring, super-resolution, and compression. Extensive subjective testing on the optimized images allows us to rank the competing models in terms of their perceptual performance, elucidate their relative advantages and disadvantages for these tasks, and propose a set of desirable properties for incorporation into future IQA models.
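The core idea of using a full-reference quality model as a training objective can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and is not taken from the study: mean squared error stands in for an IQA model, a 1-D piecewise-constant signal stands in for an image, a one-parameter blend with a moving average stands in for a deep network, and gradients are approximated by finite differences rather than backpropagation. The names `denoise`, `smooth`, and `train` are hypothetical.

```python
import random


def mse(x, y):
    """Stand-in full-reference quality measure (lower is better)."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def smooth(x):
    """3-tap moving average with edge replication."""
    n = len(x)
    return [(x[max(i - 1, 0)] + x[i] + x[min(i + 1, n - 1)]) / 3.0
            for i in range(n)]


def denoise(noisy, w):
    """Hypothetical one-parameter 'network': blend input with its smoothed copy."""
    s = smooth(noisy)
    return [(1.0 - w) * a + w * b for a, b in zip(noisy, s)]


def train(reference, noisy, loss_fn, steps=200, lr=0.5, eps=1e-4):
    """Gradient descent on w, using loss_fn(output, reference) as the objective."""
    w = 0.0
    for _ in range(steps):
        # Central finite-difference estimate of d(loss)/dw.
        up = loss_fn(denoise(noisy, w + eps), reference)
        down = loss_fn(denoise(noisy, w - eps), reference)
        w -= lr * (up - down) / (2.0 * eps)
    return w


rng = random.Random(0)
reference = [float((i // 8) % 2) for i in range(64)]   # piecewise-constant signal
noisy = [r + rng.gauss(0.0, 0.2) for r in reference]   # additive Gaussian noise

w = train(reference, noisy, mse)
loss_before = mse(noisy, reference)
loss_after = mse(denoise(noisy, w), reference)
print(f"w={w:.3f}  loss before={loss_before:.4f}  after={loss_after:.4f}")
```

Swapping `mse` for another function with the same `(output, reference)` signature changes what the optimizer rewards, which is the kind of substitution the comparison relies on; judging which optimized outputs actually look better still requires the subjective testing described above.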