Evaluating the clinical similarities between pairwise patients is a fundamental problem in healthcare informatics. A proper patient similarity measure enables various downstream applications, such as cohort study and treatment comparative effectiveness research. One major carrier for conducting patient similarity research is Electronic Health Records(EHRs), which are usually heterogeneous, longitudinal, and sparse. Though existing studies on learning patient similarity from EHRs have shown being useful in solving real clinical problems, their applicability is limited due to the lack of medical interpretations. Moreover, most previous methods assume a vector-based representation for patients, which typically requires aggregation of medical events over a certain time period. As a consequence, temporal information will be lost. In this paper, we propose a patient similarity evaluation framework based on the temporal matching of longitudinal patient EHRs. Two efficient methods are presented, unsupervised and supervised, both of which preserve the temporal properties in EHRs. The supervised scheme takes a convolutional neural network architecture and learns an optimal representation of patient clinical records with medical concept embedding. The empirical results on real-world clinical data demonstrate substantial improvement over the baselines. We make our code and sample data available for further study.