Virtual Reality is successfully used to treat people with common phobias. A new challenge is to develop Virtual Reality Exposure Training for social skills. Virtual actors in such systems have to show appropriate social behavior, including emotions, gaze, and interpersonal distance. This behavior must be realistic and generated in real time. Current approaches consist of four steps: 1) trainee social signal detection, 2) cognitive-affective interpretation, 3) determination of the appropriate bodily responses, and 4) actuation. The "cognitive" detour of such approaches does not match the directness of human bodily reflexes and causes delays and unrealistic responses. Instead, we propose virtual reflexes as concurrent sensory-motor processes to control virtual actors. Here we present a virtual reflexes architecture, explain how emotion and cognitive modulation are embedded, detail its workings, and give an example description of an aggression training application.
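The contrast between a four-step cognitive pipeline and a direct sensory-motor reflex can be sketched in code. The following is a minimal illustrative sketch, not the paper's actual architecture: all names, signals, and parameter values (a proximity sensor, a comfort distance, an arousal gain) are hypothetical. It shows the key idea that a reflex maps sensed input directly to motor output on every tick, with emotional state acting only as a modulating gain rather than as an interpretation step inside the loop.

```python
# Hypothetical sketch of a virtual reflex as a direct sensory-motor
# mapping. Emotion (here, an "arousal" gain) modulates the strength of
# the response but does not sit in the sensing-to-actuation path.
# All names and values are illustrative, not taken from the paper.

def proximity_reflex(trainee_distance, comfort_distance=1.0, arousal=1.0):
    """Return a step-back velocity when the trainee comes too close.

    The sensed distance is mapped directly to a motor command; the
    emotional state (arousal) only scales the response strength.
    """
    intrusion = max(0.0, comfort_distance - trainee_distance)
    return arousal * intrusion  # retreat speed: direct and immediate

def run_reflex_loop(sensor_readings, arousal=1.0):
    """Run the reflex each tick, independently of any slower
    cognitive-affective interpretation process."""
    return [proximity_reflex(d, arousal=arousal) for d in sensor_readings]

# Example: the trainee approaches the virtual actor from 1.5 m to 0.5 m;
# the actor begins retreating as soon as the comfort distance is violated.
readings = [1.5, 1.2, 0.9, 0.5]
velocities = run_reflex_loop(readings, arousal=2.0)
print(velocities)
```

In a full system, several such reflex processes would run concurrently, while slower cognitive and affective processes adjust their gains, which is the modulation the abstract refers to.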