In this paper, robust detection, tracking and geometry estimation methods are developed and combined into a system for time-difference estimation, microphone localization and estimation of sound source movement. No assumptions are made on the 3D locations of the microphones or sound sources. The system is capable of tracking continuously moving sound sources in a reverberant environment. The multi-path components are explicitly tracked and used in the geometry estimation steps. The system is based on matching between pairs of channels using GCC-PHAT. Instead of taking a single maximum at each time instant from each such pair, we select the four strongest local maxima. This produces a set of hypotheses to work with in the subsequent steps, where consistency constraints between the channels and time-continuity constraints are exploited. The paper demonstrates how such detections can be used to estimate microphone positions, sound source movement and room geometry. The methods are tested and verified using real data from several reverberant environments. The evaluation demonstrates accuracy on the order of a few millimeters.
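The detection step described above, GCC-PHAT between a pair of channels followed by selection of the four strongest local maxima, can be illustrated with a minimal sketch. This is not the implementation used in the paper: the function names `gcc_phat`, `strongest_local_maxima` and `demo`, the lag window, the regularization constant and the synthetic two-path signal are all assumptions made for illustration only.

```python
import numpy as np


def gcc_phat(x, y, max_lag):
    """GCC-PHAT of two channels for integer lags in [-max_lag, max_lag].

    A positive lag means that x is delayed relative to y.
    """
    n = len(x) + len(y)                      # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)
    cross = cross / (np.abs(cross) + 1e-12)  # phase transform: keep phase only
    cc = np.fft.irfft(cross, n=n)
    # Reorder so that the lags run from -max_lag to +max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    lags = np.arange(-max_lag, max_lag + 1)
    return lags, cc


def strongest_local_maxima(lags, cc, k=4):
    """Return up to k (lag, value) pairs, strongest local maximum first."""
    interior = np.arange(1, len(cc) - 1)
    is_peak = (cc[interior] > cc[interior - 1]) & (cc[interior] >= cc[interior + 1])
    peak_idx = interior[is_peak]
    order = np.argsort(cc[peak_idx])[::-1][:k]
    return [(int(lags[i]), float(cc[i])) for i in peak_idx[order]]


def demo():
    """Synthetic example (assumed data): direct path at 5 samples, echo at 12."""
    rng = np.random.default_rng(0)
    s = rng.standard_normal(4096)
    y = s                                    # reference channel
    x = np.zeros(len(s) + 12)                # second channel: direct path plus one echo
    x[5:5 + len(s)] += s
    x[12:12 + len(s)] += 0.6 * s
    lags, cc = gcc_phat(x, y, max_lag=20)
    return strongest_local_maxima(lags, cc, k=4)


if __name__ == "__main__":
    for lag, value in demo():
        print(lag, round(value, 3))
```

Keeping several candidates per channel pair, rather than only the global maximum, retains the multi-path delays as hypotheses that later consistency and time-continuity checks can accept or reject.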