In this paper, we analyze if cascade usage of the context encoder with increasing input can improve the results of the inpainting. For this purpose, we train context encoder for 64x64 pixels images in a standard way and use its resized output to fill in the missing input region of the 128x128 context encoder, both in training and evaluation phase. As the result, the inpainting is visibly more plausible. In order to thoroughly verify the results, we introduce normalized squared-distortion, a measure for quantitative inpainting evaluation, and we provide its mathematical explanation. This is the first attempt to formalize the inpainting measure, which is based on the properties of latent feature representation, instead of L2 reconstruction loss.