The issue of discrete probability estimation for samples of small size is addressed in this study. The maximum likelihood method often suffers over-fitting when insufficient data is available. Although the Bayesian approach can avoid over-fitting by using prior distributions, it still has problems with objective analysis. In response to these drawbacks, a new theoretical framework based on thermodynamics, where energy and temperature are introduced, was developed. Entropy and likelihood are placed at the center of this method. The key principle of inference for probability mass functions is the minimum free energy, which is shown to unify the two principles of maximum likelihood and maximum entropy. Our method can robustly estimate probability functions from small size data.