We propose and analyze a compact, non-volatile nanomagnetic (all-spin) non-binary matrix multiplier that performs the multiply-and-accumulate (MAC) operation using two magnetic tunnel junctions: one activated by strain, which acts as the multiplier, and one activated by spin-orbit torque pulses, which acts as a domain wall synapse and serves as the accumulator. It has two advantages over the usual crossbar-based non-binary matrix multiplier. First, while the crossbar architecture requires N^2 devices to multiply two matrices, ours requires only two devices regardless of the value of N. Second, while the energy dissipation in the crossbar architecture scales as N^2, in our construct it scales as N. Each MAC operation can be performed in ~5 ns, and the maximum energy dissipated per operation is ~60N aJ. The construct can therefore serve as a hardware accelerator for machine learning and artificial intelligence tasks, which often involve the multiplication of large matrices. Its non-volatility allows the matrix multiplier to be embedded in powerful non-von-Neumann architectures. It also allows computing to be done at the edge, reducing the need to access the cloud and thereby making artificial intelligence more resilient against cyberattacks.
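The scaling claims above can be made concrete with a small numerical sketch. It is illustrative only and not a model of the device physics: the function names are hypothetical, the serial scheduling of MAC steps on a single multiplier/accumulator pair is an assumption about how two devices would be time-multiplexed, and the only figures taken from the text are the N^2 versus two device counts and the ~60N aJ upper bound on energy per operation.

```python
# Illustrative sketch only; names and scheduling are assumptions, not the authors' method.

PROPOSED_DEVICE_COUNT = 2   # one multiplier MTJ plus one accumulator MTJ (from the text)
ENERGY_COEFF_AJ = 60.0      # coefficient of the "~60N aJ" upper bound (from the text)


def crossbar_device_count(n: int) -> int:
    """Device count for a crossbar multiplier, using the N^2 scaling stated in the text."""
    return n * n


def max_energy_per_operation_aj(n: int) -> float:
    """Upper bound on energy per operation in attojoules, using the ~60N aJ figure."""
    return ENERGY_COEFF_AJ * n


def matmul_by_serial_mac(a, b):
    """Multiply two matrices with one multiply step and one running accumulator.

    Assumption: the single multiplier/accumulator pair is reused for every
    MAC step in sequence. Returns the product and the number of MAC steps.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    result = [[0.0] * cols for _ in range(rows)]
    mac_count = 0
    for i in range(rows):
        for j in range(cols):
            acc = 0.0  # accumulator state, reset for each output element
            for k in range(inner):
                acc += a[i][k] * b[k][j]  # one multiply-and-accumulate step
                mac_count += 1
            result[i][j] = acc
    return result, mac_count


if __name__ == "__main__":
    n = 64
    print("crossbar devices:", crossbar_device_count(n))
    print("proposed devices:", PROPOSED_DEVICE_COUNT)
    print("max energy per operation (aJ):", max_energy_per_operation_aj(n))
    product, steps = matmul_by_serial_mac([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print("product:", product, "MAC steps:", steps)
```

The sketch deliberately stops short of estimating total latency or total energy for a full matrix product, since the text gives only the per-MAC time (~5 ns) and the per-operation energy bound, not how operations are scheduled.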