This paper addresses the problem of acquiring high-quality photographs with handheld smartphone cameras in low-light conditions. We propose an approach based on capturing pairs of short- and long-exposure images in rapid succession and fusing them into a single high-quality photograph. Unlike existing methods, we take advantage of both images simultaneously and perform joint denoising and deblurring using a convolutional neural network. The network is trained using a combination of real and simulated data. To that end, we introduce a novel approach for generating realistic short-long exposure image pairs. The evaluation shows that the method produces good images in extremely challenging conditions and outperforms existing denoising and deblurring methods. Furthermore, it enables exposure fusion even in the presence of motion blur.