An algorithm is described that adaptively learns a non-linear mutation distribution. It works by training a denoising autoencoder (DA) online at each generation of a genetic algorithm to reconstruct a slowly decaying memory of the best genotypes so far. A compressed hidden layer forces the autoencoder to learn hidden features in the training set that can be used to accelerate search on novel problems with similar structure. Its output neurons define a probability distribution that we sample from to produce offspring solutions. The algorithm outperforms a canonical genetic algorithm on several combinatorial optimisation problems, e.g. multidimensional 0/1 knapsack problem, MAXSAT, HIFF, and on parameter optimisation problems, e.g. Rastrigin and Rosenbrock functions.