Learning Effective Binary Visual Representations with Deep Networks

Jianxin Wu, Jian-Hao Luo

Although traditionally binary visual representations are mainly designed to reduce computational and storage costs in the image retrieval research, this paper argues that binary visual representations can be applied to large scale recognition and detection problems in addition to hashing in retrieval. Furthermore, the binary nature may make it generalize better than its real-valued counterparts. Existing binary hashing methods are either two-stage or hinging on loss term regularization or saturated functions, hence converge slowly and only emit soft binary values. This paper proposes Approximately Binary Clamping (ABC), which is non-saturating, end-to-end trainable, with fast convergence and can output true binary visual representations. ABC achieves comparable accuracy in ImageNet classification as its real-valued counterpart, and even generalizes better in object detection. On benchmark image retrieval datasets, ABC also outperforms existing hashing methods.

Knowledge Graph

arrow_drop_up

Comments

Sign up or login to leave a comment