We introduce CellSegmenter, a structured deep generative model and an amortized inference framework for unsupervised representation learning and instance segmentation. The proposed inference algorithm is convolutional and parallelized, uses no recurrent mechanisms, and resolves object-object occlusion while treating distant, non-occluding objects independently. This leads to extremely fast training and allows extrapolation to an arbitrary number of instances. We further introduce a transparent posterior regularization strategy that encourages scene reconstructions with the fewest localized objects and a low-complexity background. We evaluate our method on a challenging synthetic multi-MNIST dataset with a structured background and achieve nearly perfect accuracy after only a few hundred training epochs. Finally, we show segmentation results on a cell nuclei imaging dataset, demonstrating that our method provides high-quality segmentations while handling realistic use cases involving a large number of instances.