Energy forecasting plays a vital role in smart grid (SG) systems, supporting applications such as demand-side management, load shedding, and optimum dispatch. Achieving efficient forecasting with the least possible prediction error is one of the main challenges facing the grid today, given the uncertainty and granularity of SG data. This paper presents a comprehensive, application-oriented review of state-of-the-art forecasting methods for SG systems, covering the different models and architectures. Traditional statistical and machine learning-based forecasting methods are extensively investigated in terms of their applicability to energy forecasting. In addition, the significance of hybrid methods and data pre-processing techniques for improving forecasting accuracy is highlighted. A comparative case study using the Victorian electricity consumption benchmark and American Electric Power (AEP) datasets is conducted to analyze the performance of different forecasting methods. The analysis demonstrates higher accuracy for the recurrent neural network (RNN) and long short-term memory (LSTM) methods when sample sizes are larger and hyperparameters are appropriately tuned. Furthermore, hybrid methods such as CNN-LSTM, which combines a convolutional neural network (CNN) with an LSTM, are highly effective at handling long sequences in energy data.