Livestock Research for Rural Development 21 (4) 2009
An essential aspect of understanding any natural system is the ability to acquire knowledge through experience and to adapt to new situations. This study investigates the use of backpropagation Artificial Neural Networks (ANN), an artificial intelligence technique, to model and predict daughter first lactation milk yield in recorded dairy cattle herds in Kenya. Such prediction is a prerequisite to selection, which ultimately leads to optimal breeding strategies and increased annual genetic progress.
Data consisting of 6095 lactation records of Kenyan Holstein-Friesian cows from 76 officially milk-recorded herds, representing 445 sires, 1956 dams and 2267 daughters and collected over the period 1988-2005, were used to predict the first lactation performance of female offspring from recorded genetic traits of their parents using neural networks (NN). Weka and MATLAB software were used for the NN analyses, while the SAS (2003) and Derivative-Free Restricted Maximum Likelihood (DFREML) (Meyer 1989) computer packages were used for the statistical analyses.
Different ANN architectures were modeled, and the best-performing number of hidden layers, number of neurons and training algorithm were retained. The performance of the ANN model in simulating daughter performance was compared with that of the industry-default linear regression (LR) model. The best NN model had one hidden layer with 8 hidden nodes and a tangent sigmoid transfer function for the hidden layer.
The correlation coefficients between the observed and the estimated daughter milk yield were generally high (>0.80) for both estimation methods. Including sire information resulted in more accurate predictions by the neural network, as shown by a reduced root mean square error. Generally, more accurate predictions were obtained by the neural network approach than by linear regression. This suggests that a nonlinear relationship exists among the feature variables in the data and that it is learned by the hidden layer of the NN. Thus, prediction tests show that the ANN models used in this study have the potential to predict daughter first lactation milk yield from parent performance variables.
Keywords: artificial neural networks, dairy cattle, prediction
Dramatic improvements in yields of animal protein are crucial in meeting the ever-increasing food needs of the world (Delgado et al 2001). Selecting animals for breeding that excel in growth, egg, meat, milk or wool production; exhibit increased disease resistance; or have other desirable traits has revolutionized poultry, livestock, and fish production. It is important for animal geneticists to identify and maintain economically profitable animal genotypes (and genes) and to integrate genotype interaction with on-farm production and environmental pressures that affect the genetic potential of food animals.
The main objectives for designing breeding programs are the increased rates of genetic progress and the reduced rate of inbreeding (Dekkers and Gibson 1998). In order to promote genetic improvement in the domestic species, animal geneticists’ ultimate goal is to design breeding programs which optimize selection and mating strategies, under the best population structure.
Whereas in the developed countries there has been marked improvement in livestock production, in developing countries improvements in livestock production have generally been inadequate. One of the principal limiting factors has been the lack of genetically improved animals, a reflection of ineffective breeding programs, if any.
Kenya has made improvements in its livestock production, with the dairy industry employing over 50% of the agricultural labor force (directly and indirectly) and contributing about 10% of gross domestic product (GDP). There are about 3 million dairy cattle producing nearly 2.3 billion litres of milk annually (Mosi and Inyangala 2004). With increased demand for dairy products and high population pressure (Delgado et al 2001), the continuing importance of this sub-sector in the Kenyan economy depends on productivity increases through breeding programs and more efficient management practices. It is postulated that the sub-sector has the potential to contribute up to 30% of the total Kenyan GDP (Mutungi 2004).
Creating an effective artificial insemination program for the Kenyan dairy industry is both a prediction problem and an optimization problem. In the former, the objective is to predict the performance of the offspring from some of the genetic traits of their parents; in the latter, it is to select sires and dams and to allocate specific matings among the selected animals so as to maximize the rate of genetic progress while minimizing inbreeding.
Breeding programs are based primarily on milk yield (MY), so accurate measurement or prediction of MY is essential for the economy of the dairy industry. Milk yield is also affected by factors including season, lactation number and environmental conditions, and these factors need to be determined and evaluated. The mathematical models that have been used in dairy science to predict MY have limitations: some consider only linear relations between the inputs and the desired output, and others assume an a priori relationship between the input and output variables that is not necessarily true. For example, breeding values (BV) have been calculated using the Best Linear Unbiased Predictor (BLUP), a statistical procedure that assumes a linear relationship between traits. Animal breeding data contain a lot of noise (unexplained factors) from the environment, and hence linearity assumptions might be untenable.
Where relationships are nonlinear, artificial neural networks (ANN) have been found to be tolerant of both the noise and the ambiguity in data that result from environmental influences (Widrow et al 1994, Finn et al 1996). They have been used for prediction in agriculture. For example, Peacock et al (2007) applied ANN in the areas of plant protection and biosecurity, and Adamczyk et al (2005) predicted bulls’ slaughter value. Wade and Lacroix (1994) investigated the role of artificial neural networks in animal breeding, whereas Banos et al (2003) used a machine learning algorithm for the analysis of national genetic evaluation results.
An Artificial Neural Network is a mathematical model that learns nonlinear relationships in datasets. It is a system loosely modeled on the human brain. Neural networks (NN) are powerful techniques for solving many real-world problems. They have the ability to learn from experience in order to improve their performance and to adapt themselves to changes in the environment. In addition, they are able to deal with incomplete and noisy data, and can be very effective in situations where it is not possible to define the rules or steps that lead to the solution of a problem. They typically consist of many simple processing units, which are wired together in a complex communication network. This structure loosely resembles the physical workings of the human brain and leads to a type of computer system that is rather good at a range of complex tasks. Figure 1 shows a simple processing element having n weights, {w1, w2, …, wn}, and Figure 2 illustrates a typical ANN with 5 inputs, 1 hidden layer with 5 processing elements and 1 output node.
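The processing element and the 5-5-1 layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the zero biases and the random weights are hypothetical, and the hyperbolic tangent is used as the activation simply because it is the transfer function the study later retains.

```python
import math
import random

def processing_element(inputs, weights, bias, activation=math.tanh):
    """One processing element: activation(sum_i w_i * x_i + bias)."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(net)

def forward(inputs, hidden_layer, output_unit):
    """Feed the inputs through one hidden layer and a single output node."""
    hidden = [processing_element(inputs, w, b) for w, b in hidden_layer]
    w_out, b_out = output_unit
    return processing_element(hidden, w_out, b_out)

# Hypothetical 5-5-1 network with small random weights (illustration only).
rng = random.Random(0)
hidden_layer = [([rng.uniform(-0.5, 0.5) for _ in range(5)], 0.0) for _ in range(5)]
output_unit = ([rng.uniform(-0.5, 0.5) for _ in range(5)], 0.0)
y = forward([0.1, 0.4, -0.2, 0.7, 0.0], hidden_layer, output_unit)
```

Each hidden unit sees all five inputs, and the output node sees the five hidden activations, which is the fully connected arrangement of Figure 2.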


In this study we used a supervised learning strategy, in which the network learns from training examples whose desired outputs are known. Neural networks have two distinct phases of operation: training and testing. Typically, a number of key design parameters need to be chosen before training the network, such as: (i) system architecture (topology); (ii) training algorithm; and (iii) number of training cycles (epochs).
During the learning phase, the network learns by adjusting the weights so as to be able to correctly predict or classify the output target of a given set of input samples. With supervised learning, the network is able to learn from the input and the error (the difference between the output and the desired response). One distinguishing characteristic of ANN is their adaptability, which requires a unique information flow design depicted in Figure 3.


The performance feedback loop utilizes a cost function to provide a measure of deviation between the calculated output and the desired output. This performance feedback is utilized directly to adapt the parameters, weights and biases, so that the system output improves with respect to the desired goal.
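The feedback loop can be illustrated with the simplest possible case: a single linear unit whose weights and bias are adjusted in proportion to the error. The sketch below is an assumption-laden toy, not the network used in the study; the function names, learning rate and data are all hypothetical, and mean squared error stands in for the cost function.

```python
def mse(targets, outputs):
    """Cost function: mean squared deviation between desired and calculated output."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)

def train_linear_unit(samples, targets, lr=0.1, epochs=200):
    """Adapt the weights and bias of one linear unit using the error as feedback."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    history = []
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            out = sum(w * xi for w, xi in zip(weights, x)) + bias
            err = t - out  # desired response minus calculated output
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
            bias += lr * err
        outputs = [sum(w * xi for w, xi in zip(weights, x)) + bias for x in samples]
        history.append(mse(targets, outputs))
    return weights, bias, history

# Toy data generated from y = 2*x1 - x2 + 0.5 (hypothetical, not from the study).
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
targets = [2 * a - b + 0.5 for a, b in samples]
weights, bias, history = train_linear_unit(samples, targets)
```

The recorded cost falls from epoch to epoch as the parameters move towards the values that generated the data; backpropagation extends the same idea to units in hidden layers.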
Once trained, the neural network is able to recognize similarities when presented with a new input pattern, resulting in a predicted output pattern. It is important to note that unlike in the training phase, the network parameters remain unchanged during the testing phase.
There are also various classes of NN models, depending on problem type (classification, clustering or prediction), the structure of the model and the model-building algorithm. This study focuses on the feedforward backpropagation (BP) NN used for prediction and classification problems. BP is the most popular method of training multilayer feedforward networks (Rumelhart et al 1986).
NN modeling should be a good approach for combining several input variables to predict MY. Dynamic mathematical models such as NN could also support management decisions on cows and breeding heifers. For heifers, the generation interval is reduced because there is no need to wait until the heifer is mated, calves and completes a lactation before making a breeding decision. The aim of this paper was to discuss the development and optimization of different neural network models to obtain the best model configuration for the prediction of daughter first lactation milk yield.
Kenya’s cattle population is estimated at 12.5 million head, of which approximately 28% (3.5 million) form the national grade dairy herd, comprising the high-grade European dairy breeds and their crosses (Wakhungu 2001; MLFD 2003). Of the grade dairy herd, only about 17,000 dairy cows were officially milk recorded by 2004 (DRSK 2004).
The main dairy breeds in Kenya include the Holstein Friesian, Ayrshire, Guernsey, Jersey and their crosses with the unimproved native Zebu, Boran and Sahiwal. The breeds are evenly distributed in the central highlands, Rift Valley, Western and parts of Nyanza and Coast provinces.
In this study only Holstein-Friesian cattle data were used because this breed comprises a high proportion of the exotic dairy animals raised in Kenya. Large and medium-scale farms rear approximately 24% of the Holstein-Friesian dairy cows in the country and produce most of the milk sold in the main urban centers (Ojango and Pollott 2001, Wakhungu 2001).
In this study 6095 lactation records from 2267 Holstein-Friesian cows officially milk recorded with the DRSK in the period 1988-2005 were available for the analyses.
Data for this study were obtained from cow files maintained at the DRSK in Nakuru, which is the organization responsible for the official milk recording in Kenya.
Each record contained the following information: herd identification, individual cow identification, cow's date of birth (day-month-year), cow's calving dates (day-month-year), lactation milk yield (kg), lactation length (days), parity, sire and dam.
The data were preprocessed to remove all inconsistencies, e.g. animals without a known sire and dam. For the breeding value analysis the data were coded, with sires coded lower than the dams and the dams lower than the cows. A pedigree file was then created.
Parity classes were defined as 1st to 6th, with the 6th and subsequent parities grouped into the sixth class because of the small numbers of observations and because pairwise comparison of parity records showed no significant difference between these parities.
Season effect was categorized into four classes based on rainfall amounts: dry (January-March), long rains (April-June), intermediate rains (July-September) and short rains (October-December).
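The parity and season coding described above is simple enough to express directly. The following sketch assumes parity is an integer of at least 1 and that the calving month is given as a number from 1 to 12; the function and constant names are hypothetical.

```python
SEASONS = ("dry", "long rains", "intermediate rains", "short rains")

def parity_class(parity):
    """Group the 6th and all later parities into a single sixth class."""
    if parity < 1:
        raise ValueError("parity must be at least 1")
    return min(parity, 6)

def season_class(calving_month):
    """Map a calving month (1-12) to one of the four rainfall-based seasons."""
    if not 1 <= calving_month <= 12:
        raise ValueError("calving month must be in 1..12")
    return SEASONS[(calving_month - 1) // 3]
```

For example, a cow calving in May in her eighth lactation would be coded as parity class 6 and season "long rains".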
Neural network development
The methodology consists of statistical analysis of the data and development of the neural network model. Weka (Waikato Environment for Knowledge Analysis, available from http://www.cs.waikato.ac.nz/ml/weka/) (Weka 2005) and MATLAB (The Language of Technical Computing) (MATLAB 2002) software were used for the NN analyses, while the SAS (SAS 2003) and DFREML (Derivative-Free Restricted Maximum Likelihood) (Meyer 1989) computer packages were used for the statistical analyses.
The development followed five steps:
Step 1: Preliminary analyses of the data
Various graphs and scatter plots of the data were examined to find out descriptive variation patterns for variables within the data.
Step 2: Statistical analyses using commercially available statistical software
Multivariate analyses of the data were performed with the Statistical Analysis Software (SAS 2003) for Windows to determine the fixed effects and random effects. Multiple regression analyses were performed to derive regression equations.
Breeding value derivation
The BLUP-DFREML (Meyer 1989) procedure, with an animal model including relationships, was used to predict the sire breeding values for milk yield. The milk yield records were analyzed using the following model:
Y=Xb + Zu + e (1)
Where:
Y = a vector of observation of Milk Yield (MY),
X = a known incidence matrix accounting for the fixed effects (herd, year-season, parity),
b = the unknown vector of fixed effects,
Z = a known incidence matrix associating animal effects to the vector of observation Y,
u = the vector of individual animal breeding values and
e = the vector of residual random error terms.
The breeding values for milk yields were estimated for each sire using the DFREML animal model procedure.
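The paper does not write out how model (1) is solved. As a hedged reminder of the standard approach, BLUP solutions for b and u under an animal model are usually obtained from Henderson's mixed model equations. The symbols A (the additive relationship matrix built from the pedigree file) and λ (the ratio of residual to additive genetic variance) are not defined in the original text and are introduced here as assumptions, with var(u) = Aσ²ᵤ and var(e) = Iσ²ₑ.

```latex
\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + \lambda A^{-1} \end{bmatrix}
\begin{bmatrix} \hat{b} \\ \hat{u} \end{bmatrix}
=
\begin{bmatrix} X'Y \\ Z'Y \end{bmatrix},
\qquad
\lambda = \frac{\sigma^2_e}{\sigma^2_u}
```

The elements of the solution vector for u that correspond to sires are the sire breeding values referred to above.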
Feature sets
Progeny performance is predicted from the traits of the sire, the dam and the environment. From the statistical analyses, different feature sets judged to contribute to the dependent variable, first lactation milk yield, were identified. The minimum number of attributes that gave reasonable predictive accuracy was identified by checking the root mean square error (RMSE) and the coefficient of determination (R²) from the ANOVA outputs. The three feature sets are shown in Table 1. Feature set 3 was analyzed with both the neural network and linear regression, the latter being used as the baseline.
Table 1. Neural network inputs

| Feature set | Network input features |
|---|---|
| 1 | Herd mean milk yield excluding first lactation*, dam second lactation milk yield, BV milk yield |
| 2 | Herd mean milk yield excluding first lactation, BV milk yield |
| 3 | Herd mean milk yield excluding first lactation, dam second lactation milk yield |

* The first lactation milk yield is always low compared to subsequent parities as the cow is still growing, so it is under more environmental influence.
Step 3: Neural network model construction
The backpropagation training algorithm was employed to predict daughter milk yield for the first lactation. The Multilayer Perceptron (MLP) is a layered feedforward network typically trained with the backpropagation learning algorithm. MLPs have been proven to be universal approximators (Reed and Marks 1998), capable of implementing any given function through the use of various nonlinear transfer functions. Two hidden layer transfer functions, logsig and tansig, were tested; the latter gave better accuracy, so only this one is described here. The hyperbolic tangent function compresses a unit’s net input into an activation value in the range [-1, 1].
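For reference, the two transfer functions named above can be written as follows. The definitions follow the usual MATLAB conventions for `tansig` and `logsig`; the Python wrappers themselves are only an illustrative sketch.

```python
import math

def tansig(n):
    """Hyperbolic tangent sigmoid: squashes a net input n into (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0

def logsig(n):
    """Logistic sigmoid: squashes a net input n into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))
```

`tansig` is mathematically identical to `math.tanh`; unlike `logsig`, its output is centred on zero.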
Step 4: Network optimization
The network architecture was optimized by selecting the best number of hidden layers and nodes per layer. An MLP with one hidden layer modeled the daughter first lactation milk yield with the best accuracy. It has been demonstrated that at most two hidden layers are sufficient to solve any problem (Haykin 1999).
To avoid over-optimistic and biased results, the data were randomly split: 50%, 75% or 90% of the data were used for training, while the rest were used for the validation and testing sets. Mean Squared Error (MSE) and the correlation coefficient were among the performance functions used to assess and improve the generalization performance of the feedforward neural network (FFNN).
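A random split and the two performance measures can be sketched as below. This is a minimal illustration with hypothetical function names and a fixed seed; the study itself used Weka and MATLAB, and how the held-out part was divided between validation and testing is not stated, so the sketch leaves it as a single held-out set.

```python
import math
import random

def split_records(records, train_fraction, seed=0):
    """Shuffle a copy of the records and split it into training and held-out parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    return cov / (so * sp)

# Record indices stand in for the 6095 lactation records.
train, held_out = split_records(range(6095), 0.75)
```

Evaluating `rmse` and `pearson_r` only on the held-out records is what guards against over-optimistic estimates.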
The performance of the best MLP was compared with that of linear regression. In all the above analyses the full set of 6095 records was used. An additional run was carried out on a random subsample of 1000 records from the training set, with the full set used for prediction.
Attribute coding
The way data are presented to neural networks is important, as it may lead to an improvement in the learning process of the ANN (Stein 1993). To study the effect of sorted and unsorted data on ANN training, two ANNs were simulated: ANN1, for which the 6095 training records were sorted and categorized into four production levels (Table 2), and ANN2, for which the records were unsorted. For the sorted data, level 1 was for records of less than 3754 kg per lactation, level 2 for production between 3754 kg and 5230 kg, level 3 for production between 5230 kg and 6798 kg, and level 4 for production of over 6798 kg of milk.
Table 2. Classification of milk yield for ANN1 training

| Milk yield, kg | Level |
|---|---|
| <3754 | 1 |
| 3754-5230 | 2 |
| 5230-6798 | 3 |
| >6798 | 4 |

ANN1: artificial neural network prediction for the sorted data
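The level coding of Table 2 can be sketched as a small function. The table leaves the treatment of the shared boundary at 5230 kg open, so assigning exactly 5230 kg to level 3 is an assumption of this sketch; the function name and the example yields are likewise hypothetical.

```python
def production_level(milk_yield_kg):
    """Assign a lactation milk yield (kg) to one of the four production levels."""
    if milk_yield_kg < 3754:
        return 1
    if milk_yield_kg < 5230:   # assumption: exactly 5230 kg falls in level 3
        return 2
    if milk_yield_kg <= 6798:
        return 3
    return 4

# Hypothetical yields, sorted by level and then by yield before training (ANN1).
yields = [6100, 3500, 7200, 4800]
sorted_records = sorted(yields, key=lambda y: (production_level(y), y))
```

For ANN2 the records would simply be presented in their original, unsorted order.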
Step 5: Multiple tests run on the selected model
After optimization, a number of runs were performed on the selected ANN.
Of the explanatory variables (fixed effects), only parity was significant (P<0.01). As expected, the data showed an increase in milk yield up to the third parity before declining (Figure 4). This is in agreement with other studies (Rege 1991, Njubi et al 1992, Ojango et al 2000). Season was not significant.


Because first-parity yield is lower than that of later parities, the herd average and dam milk yields used as inputs excluded first-parity milk yield. Parity and season of calving were weakly correlated with milk yield (0.038 and 0.022 respectively), hence they were excluded from the feature sets (Table 1). Feature set 1 had the ideal number of variables.
The target mean milk yield per lactation was 5082 kg (standard deviation, SD, 2144 kg), the predicted milk yield was 5383 kg (SD 1756 kg) and the breeding value (BV) for milk yield was 23 kg (SD 467 kg).
The regression equation was:
MY = 49556 + 2.50 EBV + 38.25 Parity - 5.86 Herd + 15.51 SeasonC + 25.42 YearC + 12.24 Lactlen (R-square = 52.45%)
Where:
MY is milk yield,
EBV is estimated breeding value,
Parity is the number of calvings,
Herd is the herd number,
SeasonC and YearC are the season and year of calving respectively, and
Lactlen is the duration of a lactation in days.
The p-values for the estimated coefficients of EBV, Parity, Herd, YearC and Lactlen were all 0.00, indicating that these variables were significantly related to milk yield (MY). The p-value for SeasonC was >0.05, indicating that season of calving was not significantly related to milk yield at the 0.05 level.
The R-square value obtained was 52.45%, which is fairly low and suggests that the relationship between the predictor and response variables is not linear. Recursive modeling by stepwise regression was performed by eliminating some of the input variables; R-square fell drastically, particularly when EBV was eliminated. To see whether the model could be improved, a power transformation was applied. The log transformation did not improve the results, reinforcing the view that the relationship between the predictor and response variables is not linear.
The breeding value (BV), defined as the animal’s individual value as a genetic parent, is very important because in a breeding program we endeavor to select the animals with the best breeding values. If we consistently select the animals with the best breeding values to be parents, we will maximize the rate of genetic change in the herd. The mean BV has generally remained stagnant over the years (Figure 5), a reflection of inefficient selection methodology and mating strategy in spite of constant importation of proven sire semen (Rege and Mosi 1989; Ojango and Pollott 2001).


Computational search-based non-parametric modeling techniques such as neural networks (NNs) make no prior assumptions and are capable of ignoring the trend in environmental factors, e.g. age and season (Widrow et al 1994, Finn et al 1996). Identifying bull dams earlier will reduce the generation interval and increase selection intensity, and hence give higher genetic gain. Currently the National Dairy Cattle Breeding Programme takes 5 to 7 years to prove sires (Wakhungu and Baptist 1992; Wakhungu 2001). With the decision support system, genetic evaluations will be available in real time and the results can be used during the selection of bull dams.
The milk yield series predicted by linear regression does not fit the observed milk yield series well (Figure 6); the Pearson correlation coefficient is 0.724. This suggested some nonlinearity in the data, and hence a neural network (NN) was applied.

Figure 6. Predicted and Observed Milk Yield 
MLPs with 1, 2 and 3 hidden layers and varying numbers of neurons per layer were analyzed. The training algorithm was backpropagation. MLPs with two and three hidden layers showed no significant improvement over one hidden layer; consequently, only the results for one hidden layer are reported. The transfer function combination of tangent sigmoid in both the hidden layer and the output layer gave a high correlation coefficient and a low mean square error. MLPs with one hidden layer and between 2 and 12 hidden nodes were tested using the tansig activation function for the hidden layer.
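To make the procedure concrete, the sketch below trains a one-hidden-layer MLP with tanh (tansig) units in both the hidden and output layers by on-line backpropagation, and repeats the run for several hidden-layer sizes. It is a toy written under stated assumptions, not the Weka or MATLAB setup used in the study: the function names, learning rate, epoch count and synthetic data are all hypothetical.

```python
import math
import random

def init_mlp(n_in, n_hidden, rng):
    """Small random weights; the last entry of each weight row is the bias."""
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output

def mlp_forward(x, hidden, output):
    """Return the hidden activations and the network output for one input vector."""
    h = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in hidden]
    y = math.tanh(sum(w * v for w, v in zip(output, h)) + output[-1])
    return h, y

def train_epoch(data, hidden, output, lr):
    """One pass of on-line backpropagation; returns the mean squared error."""
    total = 0.0
    for x, t in data:
        h, y = mlp_forward(x, hidden, output)
        err = t - y
        total += err * err
        delta_out = err * (1.0 - y * y)  # derivative of tanh is 1 - tanh^2
        # Hidden deltas are computed before the output weights are changed.
        delta_hid = [delta_out * output[j] * (1.0 - h[j] * h[j]) for j in range(len(h))]
        for j in range(len(h)):
            output[j] += lr * delta_out * h[j]
        output[-1] += lr * delta_out
        for j, row in enumerate(hidden):
            for i in range(len(x)):
                row[i] += lr * delta_hid[j] * x[i]
            row[-1] += lr * delta_hid[j]
    return total / len(data)

def fit(data, n_hidden, epochs=150, lr=0.05, seed=1):
    """Train a fresh network and return it with its per-epoch MSE history."""
    rng = random.Random(seed)
    hidden, output = init_mlp(len(data[0][0]), n_hidden, rng)
    history = [train_epoch(data, hidden, output, lr) for _ in range(epochs)]
    return hidden, output, history

# Synthetic nonlinear data scaled to lie inside (-1, 1); not the study data.
rng = random.Random(0)
inputs = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(100)]
data = [(x, 0.5 * math.sin(3 * x[0]) + 0.3 * x[1]) for x in inputs]

final_mse = {}
for n_hidden in (2, 5, 8, 10):
    _, _, history = fit(data, n_hidden)
    final_mse[n_hidden] = history[-1]
```

Because the output unit is also a tanh, targets must be scaled into (-1, 1) before training; with real milk yields this would mean normalizing the yields first and rescaling the predictions afterwards.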
Training was limited to 5000 epochs, but in most cases there was no significant improvement in the MSE after 1000 epochs. The best MLP was obtained through structural learning in which the number of hidden nodes ranged from 2 to 10, while the training set size was set at 25%, 50%, 75% and 90% of the sample set.
Figure 7 shows the MSE of the training data for the best MLP, with 8 hidden nodes and a 50% training set size. The MSE of the training data started to increase after 500 epochs; therefore at most 500 epochs need be used for future models.
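One simple way to turn an MSE curve like that of Figure 7 into an epoch limit is to locate the epoch with the lowest recorded MSE. The helper below is a hypothetical illustration, and the MSE values in the example are made up rather than read from the figure.

```python
def best_epoch(mse_history):
    """Return the 1-based epoch with the lowest MSE, together with that MSE."""
    index = min(range(len(mse_history)), key=mse_history.__getitem__)
    return index + 1, mse_history[index]

# Made-up curve that falls and then rises again.
epoch, lowest = best_epoch([0.90, 0.50, 0.30, 0.35, 0.40])
```

Training beyond the returned epoch would not be expected to help on a curve of this shape.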


Table 3 shows the correlation coefficients between the target daughter first lactation milk yield and the value predicted by the NN for network architectures with two, five, eight and ten hidden nodes. Although the differences are not significant, the network with eight nodes and a 50% training set gives a higher correlation coefficient than most other cases. Figure 8 gives the average of the correct classification rates for different numbers of hidden nodes, assuming the best splitting of the data.


The results were consistent with Table 3. The correlation coefficient of 0.842 shows that the network is reasonably good. The effect of a smaller training set on the performance of the NN was also investigated. An additional training run using 8 hidden units was performed on a random sample of 1000 records from the training set. The correlation between the target value of daughter first lactation milk yield and the predicted value was 0.818 for this reduced set. Moreover, the reduction in training time for the network was dramatic (6 times). This finding is important in terms of both the clock time necessary to train the network and the amount of data necessary for training.
Table 3. Correlation coefficients between the target values of daughter first milk yield and the predicted values for MLP

| Number of hidden units | Correlation coefficient | Size of training set |
|---|---|---|
| 2 | 0.839 | 90% |
| 5 | 0.839 | 90% |
| 8 | 0.842 | 90% |
| 10 | 0.842 | 90% |
| 2 | 0.831 | 75% |
| 5 | 0.832 | 75% |
| 8 | 0.817 | 75% |
| 10 | 0.837 | 75% |
| 2 | 0.827 | 66% |
| 5 | 0.835 | 66% |
| 8 | 0.837 | 66% |
| 10 | 0.837 | 66% |
| 2 | 0.835 | 50% |
| 5 | 0.839 | 50% |
| 8 | 0.842 | 50% |
| 10 | 0.839 | 50% |
| 2 | 0.835 | 25% |
| 5 | 0.836 | 25% |
| 8 | 0.838 | 25% |
| 10 | 0.830 | 25% |
Table 4 shows that there was no significant difference between the sorted (ANN1) and unsorted (ANN2) predictions and the observed values (p>0.05). This shows that the results of both ANNs are reliable for milk yield prediction. The high correlations show that the predicted average milk yields were close to the observed values (Table 4). The use of sorted data (ANN1) relative to unsorted data seems to be justified in this study, which is consistent with the study by Kominakis et al (2002). Sorted data may give a better trend, properly updated weights and less bias in the network, and so result in a high correlation coefficient with the observed values.
Table 4. Correlations and comparisons between the observed and predicted ANN1 and ANN2 data

| Data | t | r |
|---|---|---|
| ANN1 | 0.121^{a} | 0.862*** |
| ANN2 | 0.132^{a} | 0.873*** |

***p<0.001; ANN1: artificial neural network prediction for the sorted data; ANN2: artificial neural network prediction for the unsorted data; t: t-value for the mean difference (^{ab} means in the same column with different superscripts are significantly different, p<0.05); r: correlation coefficient
The results obtained by the MLP with 8 hidden nodes (the best MLP) for the different feature sets were compared with those of linear regression. The correlation coefficient was high overall (Table 5). Feature sets 1 and 2 had higher correlations than the third feature set, implying that including sire information gave more accurate predictions by the neural network. Generally, more accurate predictions were obtained by the neural network than by linear regression, as can be seen by comparing the correlation coefficients and root mean square errors (RMSE) for feature sets 1 and 2 (Table 5). This suggests that a nonlinear relationship exists among the feature variables in the data and that it is learned by the hidden layer of the NN. The finding that ANN models are good alternatives to traditional approaches such as multiple regression is similar to that of other authors (Lek et al 1996).
Table 5. The correlation coefficients between target and predicted yields on test set

| Feature set | 1 (NN) | 1 (LR) | 2 (NN) | 2 (LR) | 3 (NN) | 3 (LR) |
|---|---|---|---|---|---|---|
| Milk | 0.842 (1139) | 0.833 (1176) | 0.839 (1159) | 0.8367 (1157) | 0.813 (1237) | 0.817 (1219) |

Values in brackets are the root mean square error (RMSE)
The correlation coefficient was very high for all feature sets, for both the NN and the baseline. An intelligent decision support system should contain a predictive model capable of working independently under a range of circumstances, and an NN is therefore the more appropriate choice.
Prediction from trained neural networks has proven useful and can be incorporated to enhance the capabilities of an intelligent decision support system in a cattle breeding programme. The results suggest that a nonlinear relationship exists among the feature variables in the data and that it is learned by the hidden layer of the NN.
Artificial Neural Network models are efficient tools for predicting daughter first lactation milk yield from the characteristics of the parents. The results of this research also indicate that the selection of bull dams can be done earlier, without waiting for the heifers to calve, hence reducing the generation interval and consequently increasing the genetic gain. The good performance obtained with a small dataset can also save time in decision making.
Appreciation is expressed to the Dairy Recording Society of Kenya for providing data.
Adamczyk K, Molenda M, Szarek J and Skrzynski G 2005 Prediction of bulls’ slaughter value from growth data using artificial neural network. Journal of Central European Agriculture 6(2): 133-142 http://www.agr.hr/jcea/issues/jcea62/pdf/jcea625.pdf
Banos G, Mitkas P A, Abas Z, Symeonidis A L, Milis G and Emanuelson U 2003 Quality control of national genetic evaluation results using data mining techniques; a progress report. Proceedings 2003 Interbull Annual Meeting 31: 8-15 http://issel.ee.auth.gr/ktree/Documents/Root%20Folder/ISSEL/Publications/8_DMpaper.pdf
Dekkers J C M and Gibson J P 1998 Applying breeding objectives to dairy cattle improvement. Journal of Dairy Science 81(2): 19-35 http://jds.fass.org/cgi/reprint/81/suppl_2/19.pdf
Delgado C, Rosegrant M and Meijer S 2001 The Revolution continues. Paper presented at the annual meetings of the International Agricultural Trade Research Consortium (IATRC). Auckland, New Zealand, January 18-19
DRSK 2004 Dairy Recording Services of Kenya, Annual report.
Finn G D, Lister R, Szabo R, Simonetta D, Mulder H and Young R 1996 Neural Networks applied to a large biological database to analyse dairy industry pattern. Neural Computing and Applications 4: 237-253
Haykin S 1999 Neural Networks: A comprehensive Foundation. Prentice Hall.
Kominakis A P, Abas Z, Maltaris I and Rogdakis 2002 A preliminary study of the application of artificial neural networks to prediction of milk yield in dairy sheep. Computers and Electronics in Agriculture 35: 35-48
Lek S, Delacoste M, Baran P, Dimopoulos I, Lauga J and Aulagnier S 1996 Application of neural networks to modelling nonlinear relationships in ecology. Ecological Modelling 90: 39–52
MATLAB 2002 MATrix LABoratory. Matlab 6.5 (Release 13), The Language of Technical Computing, The MathWorks, Natick, Mass, USA.
Meyer K 1989 Restricted maximum likelihood to estimate variance components for animal models with several random effects using a derivative-free algorithm. Genetics Selection Evolution 21: 317-340 http://www.gsejournal.org/index.php?option=article&access=standard&Itemid=129&url=/articles/gse/pdf/1989/03/GSE_07540264_1989_21_3_ART0008.pdf
Ministry of Livestock and Fisheries Development 2003 Annual report
Mosi B and Inyangala B 2004 A review of cattle genetic research and development in Kenya. In Proceedings of a workshop on Cattle Production in Kenya: Strategies for research planning and implementation, 15-16th December 2003. Kenya Agricultural Research Institute, pages 43-84.
Mutungi J 2004 A review of major infectious diseases of cattle and their control measures with suggested improvements on disease control to enhance cattle health productivity. In Proceedings of a workshop on Cattle Production in Kenya: Strategies for research planning and implementation, 15-16th December 2003. Kenya Agricultural Research Institute, pages 3-40.
Njubi D M, Rege J E O, Thorpe W, Collins-Lusweti E and Nyambaka R 1992 Genetic and Environmental variation in reproductive and lactational performance of Jersey cattle in the coastal lowland semi-humid tropics. Tropical Animal Health and Production 24(4): 231-241
Ojango J, Ducrocq V and Pollott G 2000 Survival analysis of factors affecting culling early in the productive life of Holstein-Friesian cattle in Kenya. Livestock Production Science 92(3): 317-322
Ojango J M K and Pollott G E 2001 Genetics of milk yield and fertility traits in Holstein-Friesian cattle on large-scale Kenyan farms. Journal of Animal Science 79: 1742-1750 http://jas.fass.org/cgi/reprint/79/7/1742.pdf
Peacock L, Worner S and Pitt J 2007 The application of artificial neural networks in plant protection. Bulletin OEPP/EPPO Bulletin 37: 277-282
Reed R D and Marks R J 1998 Neural smithing: Supervised learning in feedforward artificial neural networks. Cambridge, MA: MIT Press.
Rege J E O and Mosi R O 1989 Analysis of the Kenyan Friesian breed from 1968 to 1984: genetic and environmental trends and related parameters of milk production. Bulletin of Animal Health and Production in Africa 37:267
Rege J E O 1991 Genetic analysis of reproductive and productive performance of Friesian cattle in Kenya. I. Genetic and phenotypic parameters. Journal of Animal Breeding and Genetics 108: 412–423
Rumelhart D E, Hinton G E and Williams R J 1986 Learning representations by back-propagating errors. Nature 323: 533-536 http://www.iro.umontreal.ca/~vincentp/ift3395/lectures/backprop_old.pdf
SAS 2003 Procedures guide for personal computers (version 9 edition). SAS Institute Inc, Cary, NC, USA.
Stein R 1993 Preprocessing data for neural networks. AI Expert 8(3): 32-37
Wade K M and Lacroix R 1994 The role of artificial neural networks in animal breeding. The fifth world congress on Genetics Applied to Livestock Production, pages 31-34.
Wakhungu J W 2001 Dairy cattle breeding policy for Kenyan smallholders: an evaluation based on a demographic stationary state productivity model. PhD Thesis, University of Nairobi
Wakhungu J W and Baptist R 1992 Kenya Artificial Insemination: Policy Issues Beyond Rehabilitation and Breeding Programmes Consideration. The Kenya Veterinarian 16: 33-37
Weka 2005 Data Mining Software in Java, http://www.cs.waikato.ac.nz/ml/weka/
Widrow B, Rumelhart D E and Lehr M A 1994 Neural networks: applications in industry, business and science. Communications of the ACM 37(3): 93-105
Received 13 September 2008; Accepted 31 January 2009; Published 18 April 2009