Best Prediction of Lactation Yield

By Paul VanRaden and John B. Cole

A cow is milked two or more times each day during her lactation, but often only a few of those milkings are weighed or sampled. Mathematical formulas can be used to estimate the milk and component yields that weren't measured. For each test day, the cow's 24-hour yield is estimated by the dairy records processing center. Then, beginning in February 1999, AIPL combined these measured yields and estimates of all other daily yields into a lactation yield using a method called best prediction.

For the previous 30 years, the test interval method (TIM) was used to calculate lactation records. Nonmeasured yields were estimated by simple linear interpolation (connecting the dots) between measured yields. Yields at the peak of lactation, before the first test, or after the last test were estimated from tables of Shook factors or projection factors. These methods worked well with monthly weighing and sampling but weren't designed for the wider variety of test plans now used.
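The interpolation step of TIM can be sketched as follows. This is a minimal illustration, not the production method: the function names and the example test days and yields are hypothetical, and the sketch covers only the days between the first and last test (the Shook and projection factors used outside that range are not modeled).

```python
def tim_daily_yields(test_days, test_yields):
    """Linearly interpolate a yield for every day from the first to the last test.

    test_days   -- days in milk on which the cow was tested (all distinct)
    test_yields -- measured 24-hour yields on those days
    Returns a dict mapping each day to its measured or interpolated yield.
    """
    if len(test_days) != len(test_yields) or len(test_days) < 2:
        raise ValueError("need at least two tests, one yield per test day")
    if len(set(test_days)) != len(test_days):
        raise ValueError("test days must be distinct")
    pairs = sorted(zip(test_days, test_yields))
    daily = {}
    # "Connect the dots" between each pair of successive tests.
    for (d0, y0), (d1, y1) in zip(pairs, pairs[1:]):
        for day in range(d0, d1 + 1):
            fraction = (day - d0) / (d1 - d0)
            daily[day] = y0 + fraction * (y1 - y0)
    return daily


def tim_interval_total(test_days, test_yields):
    """Total yield accumulated from the first test day through the last."""
    return sum(tim_daily_yields(test_days, test_yields).values())


# Hypothetical cow tested on days 10, 40, and 70 (yields in kg).
example = tim_daily_yields([10, 40, 70], [30.0, 40.0, 34.0])
midpoint_yield = example[25]  # halfway between the first two tests: 35.0
```

Because every nonmeasured day depends only on the two nearest tests, the sketch makes plain why the method suits evenly spaced monthly tests better than irregular test plans.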

The term "best prediction" was defined by C.R. Henderson to describe the analysis of data with known averages, variances, and covariances and a normal distribution. Much earlier, L.N. Hazel called the same procedure "selection index." Nonmeasured yields can be estimated by best prediction from their covariances with measured yields and the variances of measured yields. Herd averages are treated as known even though they are themselves estimated. This method is simpler to compute than a test-day model but still provides accurate lactation records for a variety of test plans.
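Under these assumptions, the prediction for a nonmeasured day is its mean plus a weighted sum of the measured yields' deviations from their means, with weights obtained by solving V w = c, where V holds the (co)variances of the measured yields and c their covariances with the nonmeasured yield. The sketch below illustrates that calculation in plain Python. The function names, means, and (co)variances are made up for illustration and are not the values used by AIPL.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def best_prediction(mean_target, means_measured, observed, var_measured, cov_target):
    """Predict one nonmeasured yield from measured yields.

    mean_target    -- assumed mean of the nonmeasured yield
    means_measured -- assumed means of the measured yields
    observed       -- the measured yields themselves
    var_measured   -- (co)variance matrix V of the measured yields
    cov_target     -- covariances c between the target and each measured yield
    """
    weights = solve_linear(var_measured, cov_target)  # w = V^-1 c
    deviations = [y - mu for y, mu in zip(observed, means_measured)]
    return mean_target + sum(w * d for w, d in zip(weights, deviations))


# Hypothetical numbers: two test days, one nonmeasured day between them.
V = [[9.0, 4.5], [4.5, 9.0]]
c = [6.0, 6.0]
prediction = best_prediction(29.0, [30.0, 28.0], [33.0, 30.0], V, c)
```

In this example the cow is above average on both test days, so the predicted yield for the nonmeasured day is also above its mean, but by less than the observed deviations because the covariances are smaller than the variances.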

Correlations between any two daily yields were estimated from test-day data for 500,000 lactations that included milk, fat, and protein yields, and somatic cell score. Instead of storing the million or so individual correlations, smooth functions were developed to approximate the correlations. First- and later-lactation curves also were plotted from this data set. Herd averages are provided to AIPL each month by the dairy records processing centers.

Variation is affected by the test plan. Measurement errors are decreased if several daily yields are averaged, as in labor-efficient records (LER). Errors are increased in a.m.-p.m. testing because only a fraction (2/3, 1/2, or 1/3) of the cow's daily yield is weighed or sampled. Errors also are assumed to be larger if measurements are reported by the cow's owner instead of a supervisor because the owner might profit from inaccurate data, whereas supervisors are trained to provide accurate data. Finally, variation of lactation yield is affected by the number and pattern of tests within the lactation.

Best prediction provides improved estimates of lactation yield for a wide range of data. The accuracy of each estimate is also provided and is called a "data collection rating." Lactation curves can also be graphed from the observed yields and the estimated yields for all other days of lactation. All of these computations can be performed at breed associations and dairy records processing centers, and perhaps on on-farm computers, in addition to the national database. With best prediction, farmers will have more choice in data collection.

Long Lactations

A new version of best prediction (BP) was implemented in January 2009. It includes support for lactations longer than 305 d, new correlations within and between traits, improved reference lactation curves, and BP of daily yields. The original implementation of BP was limited to 305-d lactations, although longer lactations can be accommodated by calculating breed- and lactation-specific standard lactation curves longer than 365 d and estimating covariances among test days for days in milk (DIM) > 365.

Additionally, lactation curves originally were constructed using the test interval method (Sargent et al., 1968), which uses simple linear interpolation and therefore incorrectly assumes that yields change at a constant rate in the interval between successive test days. Dematawewa et al. (2007) fit a number of lactation curves to test-day data from long lactations for milk (M), fat (F), and protein (P) and concluded that Wood's (1967) curves best described yield out to 999 d. Standard curves are now calculated using the "smooth" Wood's curves, although the test interval curves are still available in the software.

However, new curves for somatic cell score (SCS), as well as curves describing the standard deviations (SD) of M, F, P, and SCS, were needed for a complete implementation of smooth curves. The SD of M, F, and P are modeled using Wood's curves. The mean and SD of SCS are modeled using a function first described by Morant and Gnanasakthy (1989) and applied to SCS data by Rodriguez-Zas et al. (2000). Curves describing the mean and SD of all traits were developed for each of the major U.S. dairy breeds (Ayrshire, Brown Swiss, Guernsey, Jersey, Holstein, Milking Shorthorn) for first- and later-lactation cows, providing breed-, parity-, and trait-specific reference curves.
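For readers unfamiliar with Wood's curve, it is commonly written as y(t) = a * t^b * exp(-c * t), where t is days in milk. The sketch below evaluates that form and sums it over a lactation. The parameter values are hypothetical, chosen only to give a plausible shape; they are not the breed- and parity-specific values estimated for BP, and the function names are illustrative.

```python
import math


def wood_yield(dim, a, b, c):
    """Daily yield on a given day in milk from Wood's curve a * t**b * exp(-c * t)."""
    return a * dim ** b * math.exp(-c * dim)


def wood_peak_dim(b, c):
    """Day in milk at which the curve peaks (where the derivative is zero)."""
    return b / c


def wood_cumulative_yield(a, b, c, last_dim):
    """Sum of daily yields from day 1 through last_dim, inclusive."""
    return sum(wood_yield(t, a, b, c) for t in range(1, last_dim + 1))


# Hypothetical parameters (not estimates from the source data).
A, B, C = 20.0, 0.2, 0.004
peak_day = wood_peak_dim(B, C)                        # about day 50
yield_305 = wood_cumulative_yield(A, B, C, 305)
yield_999 = wood_cumulative_yield(A, B, C, 999)
```

Unlike interpolation between tests, the curve is smooth and is defined for any day in milk, which is what allows the same function to serve as a reference for lactations well beyond 305 d.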

Cole et al. (2007) also estimated correlations among test-day yields using a simplified model that included an identity matrix (I) to model daily measurement error and an autoregressive matrix (E) to account for biological change. Autoregressive parameters (r) describe how similar adjacent test days are expected to be and were estimated separately for first and later parities; values for M, F, and P were slightly larger than previous estimates (Norman et al., 1999) because of the inclusion of the identity matrix. Parameters had not previously been calculated for SCS. The matrix of correlations within traits (B) was calculated as a weighted function of I and E, and separate functions were used to model the yield traits and SCS. M, F, and P are mutually correlated, while SCS is assumed to be uncorrelated with M, F, and P.
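The structure of B can be illustrated with a short sketch. The text states only that B is a weighted function of I and E, so the specific form used here, B = w * I + (1 - w) * E with E[i][j] = r^|d_i - d_j| for test days d_i and d_j, is an assumption made for illustration. The weight w, the value of r, and the function names are likewise hypothetical.

```python
def autoregressive_matrix(days, r):
    """E: correlation decays as r raised to the number of days between two tests."""
    return [[r ** abs(di - dj) for dj in days] for di in days]


def within_trait_correlations(days, r, identity_weight):
    """Assumed form B = w * I + (1 - w) * E for the given test days."""
    e = autoregressive_matrix(days, r)
    n = len(days)
    b = []
    for i in range(n):
        row = []
        for j in range(n):
            identity_term = 1.0 if i == j else 0.0
            row.append(identity_weight * identity_term
                       + (1.0 - identity_weight) * e[i][j])
        b.append(row)
    return b


# Hypothetical test days and parameters.
B_matrix = within_trait_correlations([10, 11, 40], r=0.99, identity_weight=0.1)
```

With this form the diagonal stays at 1, while even adjacent days correlate at less than r because part of each day's variation is attributed to measurement error. That is consistent with the text's remark that including the identity matrix led to slightly larger estimates of r.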


Cole, J.B., Null, D.J., and VanRaden, P.M. 2008. Best prediction of yields for long lactations. J. Dairy Sci. (Accepted)

Cole, J.B., and VanRaden, P.M. 2007. A Manual for Use of BESTPRED: A Program for Estimation of Lactation Yield and Persistency Using Best Prediction. Available: Accessed: 3 October, 2008.

Cole, J.B., VanRaden, P.M., and Dematawewa, C.M.B. 2007. Estimation of yields for long lactations using best prediction. J. Dairy Sci. 90(Suppl. 1):421 (abstr. 558).

Norman, H.D., VanRaden, P.M., Wright, J.R., and Clay, J.S. 1999. Comparison of test interval and best prediction methods for estimation of lactation yield from monthly, a.m.-p.m., and trimonthly testing. J. Dairy Sci. 82(2):438-444.

VanRaden, P.M. 1997. Lactation yields and accuracies computed from test day yields and (co)variances by best prediction. J. Dairy Sci. 80(11):3015-3022.