Two improvements are proposed for the OpenCL-based implementation of the social field pedestrian model. On the algorithmic side, a divide-and-conquer method is devised to overcome the depletion of global memory when the fields are large. This is important for studying finer pedestrian walking behavior, which usually requires larger fields. On the computational side, the OpenCL heterogeneous framework is studied thoroughly, and the factors that may affect numerical efficiency are evaluated with regard to the previously proposed social field model. These factors include the use of local memory and the deliberate padding of data structures to avoid bank conflicts, among others. Numerical experiments show that numerical efficiency is raised further: compared with the CPU model and the previous GPU model, the current GPU model is up to 71.56 and 13.3 times faster, respectively, making it better suited to serve as a core engine for the analysis of super-large-scale crowds.