Spiking neural networks (SNNs) enable power-efficient implementations due to their sparse, spike-based coding scheme. This paper develops a bio-inspired SNN that uses unsupervised learning to extract discriminative features from speech signals, which can subsequently be used in a classifier. The architecture consists of a spiking convolutional/pooling layer followed by a fully connected spiking layer for feature discovery. The convolutional layer of leaky, integrate-and-fire (LIF) neurons represents primary acoustic features. The fully connected layer is equipped with a probabilistic spike-timing-dependent plasticity learning rule. This layer represents the discriminative features through probabilistic, LIF neurons. To assess the discriminative power of the learned features, they are used in a hidden Markov model (HMM) for spoken digit recognition. The experimental results show performance above 96% that compares favorably with popular statistical feature extraction methods. Our results provide a novel demonstration of unsupervised feature acquisition in an SNN.