While prior work on context-based music recommendation has focused on a fixed set of contexts (e.g., walking, driving, jogging), we propose to use multiple sensors and external data sources to describe momentary (ephemeral) context in a rich way, with a very large number of possible states (e.g., jogging fast through downtown Sydney in heavy rain at night while tired and angry). Our approach addresses two problems that current methods face: 1) a limited ability to infer context from missing or faulty sensor data; and 2) an inability to use contextual information to support discovery of novel content.
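To make the notion of a rich, high-dimensional context concrete, the sketch below shows one possible encoding of a momentary listening context as a record of sensor-derived and externally sourced attributes. The field names and value sets are illustrative assumptions, not part of our system; optional fields model missing or faulty sensor readings.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, illustrative context record: each field is one contextual
# dimension derived from a sensor or an external data source. None marks
# a dimension that could not be observed (missing or faulty sensor data).
@dataclass
class MomentaryContext:
    activity: Optional[str] = None       # accelerometer: "walking", "jogging", ...
    pace: Optional[str] = None           # derived speed: "slow", "moderate", "fast"
    location_type: Optional[str] = None  # GPS + map data: "downtown", "park", ...
    city: Optional[str] = None           # reverse geocoding: "Sydney", ...
    weather: Optional[str] = None        # weather service: "clear", "heavy rain", ...
    time_of_day: Optional[str] = None    # clock: "morning", "afternoon", "night"
    fatigue: Optional[str] = None        # e.g., heart-rate sensor: "rested", "tired"
    mood: Optional[str] = None           # e.g., self-report: "calm", "angry", ...

# The example context from the text, with every dimension observed:
ctx = MomentaryContext(
    activity="jogging", pace="fast", location_type="downtown",
    city="Sydney", weather="heavy rain", time_of_day="night",
    fatigue="tired", mood="angry",
)
```

Even with only a handful of values per dimension, the cross-product of dimensions yields thousands of distinct states, which is why a fixed, enumerated context taxonomy does not scale to momentary context of this kind.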