Text data present in multimedia contains useful information for automatic annotation and indexing. The extracted information is used for recognition of the overlay or scene text in a given video or image. The extracted text can then be used for retrieving videos and images. In this paper, we first discuss the different techniques for text extraction from images and videos. We then review techniques for indexing and retrieval of images and videos using the extracted text.