Comparison of data mining approaches for estimating soil nutrient contents using diffuse reflectance spectroscopy
Diffuse reflectance spectroscopy (DRS) operating in the wavelength range of 350–2500 nm is emerging as a rapid and noninvasive approach for estimating soil nutrient content. The success of the DRS approach relies on the ability of data mining algorithms to extract appropriate spectral features while accounting for the nonlinearity and complexity of reflectance spectra. No comparative assessment of spectral algorithms exists for estimating the nutrient content of Indian soils. We compare the performance of partial least squares regression (PLSR), support vector regression (SVR), discrete wavelet transformation (DWT) and their combinations (DWT–PLSR and DWT–SVR) for estimating soil nutrient content. DRS models were generated for extractable phosphorus (P), potassium (K), sulphur (S), boron (B), zinc (Zn), iron (Fe) and aluminium (Al) content in Vertisols and Alfisols and were compared using the residual prediction deviation (RPD) of the validation dataset. The best DRS models yielded accurate predictions for P (RPD = 2.27) and Fe (RPD = 2.91) in Vertisols and for Fe (RPD = 2.43) in Alfisols, while B (RPD = 1.63) and Zn (RPD = 1.49) in Vertisols and K (RPD = 1.89) and Zn (RPD = 1.41) in Alfisols were predicted with moderate accuracy. DWT–SVR outperformed all other approaches for P, K and Fe in Vertisols and for P, K and Zn in Alfisols, whereas PLSR was better for B, Zn and Al in Vertisols and for B, Fe and Al in Alfisols. The DWT–SVR approach yielded parsimonious DRS models with prediction accuracy similar to or better than that of the PLSR approach. Hence, DWT–SVR may be considered a suitable data mining approach for estimating soil nutrients in Alfisols and Vertisols of India.
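Since every model comparison above rests on the RPD of the validation dataset, a minimal sketch of that statistic may be useful. It assumes the common definition of RPD as the standard deviation of the observed validation values divided by the root mean squared error of prediction; the abstract does not state the exact formula, so the choice of sample standard deviation and the function name `rpd` are illustrative assumptions, not the authors' implementation.

```python
import math
import statistics


def rpd(observed, predicted):
    """Residual prediction deviation: SD(observed) / RMSE(observed, predicted).

    Assumption: SD is the sample standard deviation (n - 1 denominator) and
    RMSE uses an n denominator; some studies use slightly different variants.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if len(observed) < 2:
        raise ValueError("at least two validation samples are required")

    # Spread of the measured nutrient values in the validation set.
    sd = statistics.stdev(observed)

    # Root mean squared error of the model predictions.
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))

    if rmse == 0:
        raise ValueError("RMSE is zero; RPD is undefined for a perfect fit")
    return sd / rmse


# Hypothetical validation values, for illustration only (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.5, 2.5, 4.5, 4.5]
example_rpd = rpd(observed, predicted)
```

A larger RPD means the prediction error is small relative to the natural variability of the nutrient in the validation set, which is why the statistic allows models for nutrients with very different concentration ranges to be compared on one scale.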