We determine the expected error by smoothing the data locally, and then optimize the shape of the kernel smoother to minimize that error. Because the optimal estimator depends on the unknown function, our scheme automatically adjusts to it: by self-consistently adjusting the kernel smoother, the total estimator adapts to the data. Goodness-of-fit estimators select a kernel halfwidth by minimizing a function of the halfwidth that is based on the average square residual fit error, $ASR(h)$. A penalty term is included to adjust for using the same data both to estimate the function and to evaluate the mean square error. Goodness-of-fit estimators are relatively simple to implement, but the minimum of the goodness-of-fit functional tends to be sensitive to small perturbations. To remedy this sensitivity, we fit the mean square error to a two-parameter model prior to determining the optimal halfwidth. Plug-in derivative estimators estimate the second derivative of the unknown function in an initial step, and then substitute this estimate into the asymptotic formula for the optimal halfwidth.
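
As a rough illustration of the goodness-of-fit step described above, the following sketch selects a halfwidth by minimizing a penalized $ASR(h)$ over a grid. Everything specific here is an assumption made for illustration and is not taken from the text: the Epanechnikov kernel, the Nadaraya-Watson form of the smoother, the generalized cross-validation penalty $ASR(h)/(1 - \mathrm{tr}\,S_h/n)^2$, the synthetic test signal, and the function names `smoother_matrix`, `asr`, `penalized_asr`, and `select_halfwidth`. It covers only the basic penalized minimization, not the two-parameter fit or the plug-in step.

```python
import numpy as np


def smoother_matrix(x, h):
    """Nadaraya-Watson weights for an Epanechnikov kernel of halfwidth h.

    Row i holds the weights that produce the fitted value at x[i].
    The kernel choice is an illustrative assumption.
    """
    u = (x[:, None] - x[None, :]) / h
    w = np.where(np.abs(u) < 1.0, 0.75 * (1.0 - u**2), 0.0)
    return w / w.sum(axis=1, keepdims=True)


def asr(y, fitted):
    """Average square residual fit error."""
    return float(np.mean((y - fitted) ** 2))


def penalized_asr(x, y, h):
    """ASR(h) inflated by a generalized cross-validation penalty.

    The penalty compensates for using the same data to fit and to
    evaluate; GCV is one common choice, assumed here for illustration.
    """
    s = smoother_matrix(x, h)
    dof = np.trace(s)  # effective number of parameters of the smoother
    return asr(y, s @ y) / (1.0 - dof / len(y)) ** 2


def select_halfwidth(x, y, grid):
    """Return the grid halfwidth with the smallest penalized ASR, plus all scores."""
    scores = np.array([penalized_asr(x, y, h) for h in grid])
    return float(grid[int(np.argmin(scores))]), scores


# Hypothetical synthetic data: a smooth signal plus Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2.0 * np.pi * x) + 0.3 * rng.standard_normal(x.size)

# The grid starts well above the sample spacing so that every point has
# neighbours inside the kernel support and the penalty stays finite.
grid = np.linspace(0.02, 0.5, 25)
h_best, scores = select_halfwidth(x, y, grid)
```

Plotting `scores` against `grid` for several noise realizations is a simple way to see the sensitivity noted above: the penalized curve is often flat near its minimum, so small perturbations of the data can move the selected halfwidth noticeably.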