CLASSIFICATION OF RED WINES FROM CONTROLLED DESIGNATION OF ORIGIN BY ULTRAVIOLET-VISIBLE AND NEAR-INFRARED SPECTRAL ANALYSIS

Spectroscopy has become one of the most attractive and commonly used methods of analysis for many agricultural products. Chemometrics combined with ultraviolet (UV), visible (VIS) and near-infrared (NIR) spectral analysis was evaluated to classify wines from two controlled designations of origin (DO) of Spain (Rías Baixas and Ribeira Sacra). The aim of this work was to determine the feasibility of using UV-VIS-NIR spectroscopy combined with chemometric tools to discriminate between red wines of different DO. Support Vector Machine (SVM) and Linear Discriminant Analysis (LDA) were applied to classify the red wines by their UV-VIS-NIR spectra. Several pre-treatments were applied to improve the classification. The best classification of red wines was obtained with raw UV-VIS-NIR data for LDA models (100% correct classification). Results with SVM classification models were slightly lower than the LDA results (97.3% for the centred and scaled pre-treatment). This shows the importance of a good selection of the chemometric classification method. UV, VIS and NIR spectral data combined with chemometric tools showed the feasibility of classifying red wines.


INTRODUCTION
The wine production sector is one of the most important industries in Spain, which has 69 designations of origin (DO). Galicia is a region in the northwest of Spain with a long tradition of winemaking. This region has climatic and topographic conditions different from those of other production regions in Spain, giving wines with a component profile different from the others. Galicia has five DO (Rías Baixas, Ribeira Sacra, Monterrei, Valdeorras and Ribeiro) (Figueiredo-González et al., 2012).
There is public interest in wine quality, production methods and the safety of production and consumption (Cozzolino et al., 2011). Wineries demand techniques and highly controlled processes to obtain the highest quality for their products. Furthermore, the methods must be rapid, wide-ranging and easy (Garde-Cerdán et al., 2012a,b).
Grapes and grape products, such as wine, are a natural source of compounds with important health benefits such as antioxidant, anti-inflammatory, antibacterial and anticarcinogenic activities (Atanacković et al., 2012; Chouchouli et al., 2013; Toaldo et al., 2013).
Wine is a complex mixture of different compounds at various concentrations (Tarantilis et al., 2008; Astray et al., 2010; Ferrer-Gallego et al., 2011; Shen et al., 2012a). Water and ethanol are the main compounds. However, other compounds such as glycerol, sugars, organic acids, metals and polyphenols provide different characteristics to wines (Caruso et al., 2012). The methods generally used to analyse the concentration, composition and differentiation of wines, such as gas chromatography, mass spectrometry and liquid chromatography, are time-consuming and labour intensive because of sample preparation, analysis time and the preparation of reagents, which can be expensive (Cozzolino et al., 2012; Fudge et al., 2013).
Different methods and analytical techniques, together with inexpensive and powerful computers, have been developed and optimised to analyse composition in several fields such as medical, pharmaceutical, petrochemical and food production. These techniques and methods, combined with chemometric analyses, make it possible to determine the origin, adulteration and composition of foods (Cozzolino et al., 2011; Perez et al., 2011; Villagra et al., 2012; Alamprese et al., 2013). Recently, spectroscopic techniques have become promising because they increase the speed of analysis and at the same time decrease its cost. Additionally, they have minimal requirements and need no sample preparation (Canaza-Cayo et al., 2012). Techniques based on spectral data have been applied to determine the composition, characterisation and adulteration of agricultural products (Fudge et al., 2011) such as oil (Rohman and Man, 2011; Luna et al., 2013; Pizarro et al., 2013), honey (Rios-Corripio et al., 2012), meat (Villagra et al., 2012; Alamprese et al., 2013), diverse vegetables (Serranti et al., 2013) and fruit and juices (Ferrer-Gallego et al., 2012).
Near-infrared (NIR) spectroscopy is a non-destructive method for measuring chemical compounds in heterogeneous products such as wine (Chauchard et al., 2004; Martelo-Vidal et al., 2013). This technique is used to determine the properties and characteristics of foods (Cozzolino et al., 2012) and their chemical composition. Sometimes, visible (VIS) and ultraviolet (UV) spectral data are included because of the presence of pigments in foods such as red wines (Alamprese et al., 2013; Martelo-Vidal and Vázquez, 2014a). Characterisation of wines according to geographical origin using multivariate data analysis can be very useful when there is a large quantity of experimental data. The analysis of spectral data is usually performed using statistical procedures called chemometric tools.
Principal component analysis (PCA) is a multivariate technique that uses a mathematical procedure to transform a set of correlated response variables into principal components (PCs), generating a new set of non-correlated variables. These principal components represent the pattern of observations in maps (Abdi and Williams, 2010; Martelo-Vidal et al., 2013) and provide information about the structure of the data (Martelo-Vidal et al., 2013).
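As an illustration only (the study itself used the Unscrambler software), the transformation described above can be sketched with a singular value decomposition of the mean-centred data. The function name pca_scores and the synthetic matrix X_demo are hypothetical and are not the wine spectra.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project the rows of X onto the first principal components.

    X has one observation per row (e.g. one spectrum) and one variable
    per column. Returns (scores, loadings, explained_ratio).
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # mean-centre each variable
    # SVD of the centred data: the rows of Vt are the loading vectors
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)              # fraction of variance per PC
    scores = U[:, :n_components] * s[:n_components]
    return scores, Vt[:n_components], explained[:n_components]

# Synthetic, strongly correlated data (illustrative only)
rng = np.random.default_rng(0)
t = rng.normal(size=(20, 1))
X_demo = t @ np.array([[1.0, 2.0, -1.0, 0.5]]) + 0.01 * rng.normal(size=(20, 4))
scores, loadings, explained = pca_scores(X_demo)
```

The score columns are mutually uncorrelated by construction, which is the property the text refers to, and explained holds the fraction of total variance carried by each PC.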
Supervised and unsupervised methods are the two principal approaches to classifying and interpreting a data matrix (Cozzolino et al., 2011). In unsupervised methods, the samples are given to the algorithm without information on whether they belong to any class. In supervised methods, however, a set of training samples with known classes is used to build the models and to predict the output for new cases (Cetó et al., 2013).
In the supervised method Linear Discriminant Analysis (LDA), the categories are defined previously and each sample belongs to one category (Cozzolino et al., 2011; Cetó et al., 2013; Martelo-Vidal et al., 2013). LDA searches for discriminant functions that achieve maximum separation between categories by maximising the variance between classes and minimising the variance within each class (Pizarro et al., 2013). LDA can use three methods (linear, quadratic and Mahalanobis).
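For two classes, the criterion above (large between-class separation, small within-class variance) leads to Fisher's linear discriminant. The following is a minimal sketch of that linear case only, not of the Unscrambler implementation used in the study. The function names, the toy data and the small ridge term are assumptions introduced for illustration; the ridge is there because the within-class scatter matrix becomes singular when there are more variables than samples, as with full spectra.

```python
import numpy as np

def fit_fisher_lda(X, y, ridge=1e-6):
    """Two-class Fisher discriminant. Labels in y must be 0 or 1."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    Sw = Sw + ridge * np.eye(X.shape[1])   # assumption: ridge keeps Sw invertible
    w = np.linalg.solve(Sw, m1 - m0)       # direction maximising class separation
    threshold = 0.5 * (w @ m0 + w @ m1)    # midpoint of the projected class means
    return w, threshold

def predict_fisher_lda(X, w, threshold):
    """Assign class 1 to samples whose projection exceeds the threshold."""
    return (np.asarray(X, dtype=float) @ w > threshold).astype(int)

# Toy data with two well-separated groups (illustrative only)
X_toy = np.array([[1.0, 2.0], [1.5, 1.8], [1.2, 2.2],
                  [4.0, 5.0], [4.5, 5.2], [4.2, 4.8]])
y_toy = np.array([0, 0, 0, 1, 1, 1])
w, thr = fit_fisher_lda(X_toy, y_toy)
pred = predict_fisher_lda(X_toy, w, thr)
```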
Support Vector Machine (SVM) is another supervised classification method. SVM can classify linearly and non-linearly separable multivariate samples. It has been applied successfully to a large number of classification problems. The advantages of SVM over other classification methods are that it produces a unique solution and is less susceptible to overfitting (Callejón et al., 2012; Martelo-Vidal et al., 2013). SVM can use four methods (linear, polynomial, Radial Basis Function and sigmoid).
The aim of this work was to determine the feasibility of using UV, VIS and NIR spectroscopy combined with chemometric tools to discriminate between red wines of different DO. All the chemometric tools cited above were used in this study to classify red wines of DO Rías Baixas and Ribeira Sacra (Spain).
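The four SVM methods named above are usually understood as kernel functions that measure the similarity between two samples. The sketch below writes them in their common textbook form. The parameter names gamma, coef0 and degree and their default values are assumptions for illustration; the source does not report the settings used.

```python
import math

def dot(x, z):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, z))

def linear_kernel(x, z):
    return dot(x, z)

def polynomial_kernel(x, z, gamma=1.0, coef0=1.0, degree=3):
    return (gamma * dot(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=1.0):
    # Radial Basis Function: depends only on the squared distance
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def sigmoid_kernel(x, z, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, z) + coef0)
```

With the linear kernel the decision boundary is a hyperplane in the original variables; the other three allow non-linear boundaries.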

Samples
The samples were obtained from the Regulatory Council of Rías Baixas (19 samples, Table I) and the Regulatory Council of Ribeira Sacra (20 samples, Table II). All samples were stored under refrigeration at 5 °C until the time of analysis.

Spectral measurements
Samples were analysed in a V-670 spectrophotometer (Jasco Inc, Japan) using transmittance mode in the UV/VIS/NIR regions from 190 nm to 2500 nm at 2 nm intervals. A quartz cell with 1 mm path length was used to scan the samples (Martelo-Vidal et al., 2013). Samples were equilibrated at 33 °C (Cozzolino et al., 2007) for 10 min before scanning (Martelo-Vidal et al., 2013). Samples were scanned in duplicate, giving 78 spectra.
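The acquisition settings above fix the size of the data matrix. A quick check, assuming that both end points of the 190 to 2500 nm range are recorded (the source does not state this explicitly):

```python
# Wavelength axis: 190 nm to 2500 nm in 2 nm steps (end points assumed included)
start_nm, stop_nm, step_nm = 190, 2500, 2
wavelengths = list(range(start_nm, stop_nm + 1, step_nm))

# 19 Rias Baixas + 20 Ribeira Sacra samples, each scanned in duplicate
n_samples = 19 + 20
n_replicates = 2
n_spectra = n_samples * n_replicates

# One row per spectrum, one column per wavelength
matrix_shape = (n_spectra, len(wavelengths))
```

Under that assumption each spectrum has 1156 points, so the 78 spectra form a 78 x 1156 matrix, with far more variables than samples.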

Multivariate data analysis
Data were exported from Spectra Manager™ II software (Jasco Inc, Japan) and imported into Unscrambler software (version X 10.2; CAMO ASA, Oslo, Norway) for pre-treatment and classification analysis. Two replicates of each sample (78 spectra) were analysed in Unscrambler software. PCA was applied to explore the data and to obtain relevant information, such as the detection of outliers and the possible grouping of samples (Cozzolino et al., 2012; Ferrer-Gallego et al., 2012; Cetó et al., 2013; Kečkeš et al., 2013; Martelo-Vidal et al., 2013; Shen et al., 2012b).
The spectra were pre-treated with different techniques and combinations of them (application of a sequence of several pre-treatments to the same spectra) to reduce noise and to remove or minimise phenomena such as scatter effects or baseline variations (Luna et al., 2013). Multiplicative Scatter Correction (MSC), Savitzky-Golay smoothing, Centred and Scaled, Savitzky-Golay second derivative, Standard Normal Variate (SNV), De-trending, Baseline correction and their combinations were applied in this study (Martelo-Vidal and Vázquez, 2014b). LDA and SVM classification models were developed on raw and pre-treated data, using cross validation to validate the classification models.
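Three of the pre-treatments listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Unscrambler implementation: the function names are hypothetical, the sample standard deviation (ddof=1) is an assumed convention, and MSC is written against the mean spectrum as reference, which is a common default.

```python
import numpy as np

def snv(X):
    """Standard Normal Variate: centre and scale each spectrum (row) by itself."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std

def msc(X, reference=None):
    """Multiplicative Scatter Correction against a reference spectrum."""
    X = np.asarray(X, dtype=float)
    ref = X.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(X)
    for i, row in enumerate(X):
        # Fit row = intercept + slope * ref, then undo offset and slope
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected

def centre_and_scale(X):
    """Autoscaling: each variable (column) to zero mean and unit deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Synthetic spectra: one base shape with different offsets and slopes
base = np.array([0.1, 0.5, 0.9, 0.4, 0.2])
offsets = np.array([[0.0], [0.1], [0.2]])
slopes = np.array([[1.0], [1.2], [0.8]])
X_demo = offsets + slopes * base
X_msc = msc(X_demo)
X_snv = snv(X_demo)
X_auto = centre_and_scale(X_demo)
```

In this synthetic case the rows differ only by an additive offset and a multiplicative factor, so MSC maps all of them onto the mean spectrum. That is the kind of scatter effect these corrections target.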

RESULTS AND DISCUSSION
The UV-VIS-NIR spectra of the red wines are shown in Figure 1. Visual differences can be found. The UV and VIS zones (290 to 850 nm) showed clear differences between the wines studied. The NIR zone also showed differences, although they were not as clear as in the UV and VIS zones.

The first and second Savitzky-Golay derivative transformations showed the absorption zones of the wines with more pronounced peaks, as can be seen in Figure 2 (Cozzolino et al., 2003, 2004). Ethanol absorptions at 1600 and 1900 nm are related to O-H combinations and C-H stretch first overtones; water absorption bands at 950 and 1460 nm are related to the third overtone of O-H. Absorptions around 540 nm and 2200-2300 nm are related to combination vibrations and overtones of ethanol, sugars, phenolic compounds, condensed tannins and nitrogen compounds (Cozzolino et al., 2004). Aromatic acids and sugars present variations at 990 nm due to O-H stretch second overtones. Absorptions were observed at 1690 nm and 1750 nm, related to the CH3 stretch first overtone and the CH2 and C-H stretch first overtones, respectively, in glucose, ethanol and water (Martelo-Vidal et al., 2013).
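A Savitzky-Golay derivative fits a low-order polynomial in a moving window and evaluates its derivative at the window centre, which sharpens absorption peaks as described above. The sketch below uses the classical 5-point coefficients for a quadratic fit. The window width and polynomial order are assumptions for illustration, since the source does not report them, and the function name savgol_5pt is hypothetical.

```python
# Classical 5-point Savitzky-Golay coefficients (quadratic fit)
SG_SMOOTH = (-3, 12, 17, 12, -3)   # divide by 35
SG_FIRST = (-2, -1, 0, 1, 2)       # divide by 10 * h
SG_SECOND = (2, -1, -2, -1, 2)     # divide by 7 * h**2

def savgol_5pt(y, coeffs, norm):
    """Apply a 5-point filter; the two points at each end are dropped."""
    out = []
    for i in range(2, len(y) - 2):
        window = y[i - 2:i + 3]
        out.append(sum(c * v for c, v in zip(coeffs, window)) / norm)
    return out

# Check on a parabola sampled every 2 nm (same step as the spectra)
h = 2.0
x = [190.0 + h * k for k in range(9)]
y = [(xi - 198.0) ** 2 for xi in x]
first = savgol_5pt(y, SG_FIRST, 10 * h)        # estimates dy/dx = 2 * (x - 198)
second = savgol_5pt(y, SG_SECOND, 7 * h ** 2)  # estimates d2y/dx2 = 2
```

On a pure parabola the filter recovers the derivatives exactly; on real spectra the same weights also average out part of the noise.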
The score plot of the PCA of raw data (Figure 3) shows that the first two principal components explain 97 % of the total variance of the spectra of the wines analysed. The separation of DO Rías Baixas and Ribeira Sacra is not clear because the samples overlap. However, there is a tendency for wines from DO Ribeira Sacra to spread along PC1 and for wines from DO Rías Baixas to spread along PC2.
The eigenvectors of the PCA were analysed to investigate the basis of the separation obtained between DO Rías Baixas and DO Ribeira Sacra. Figure 4 shows that the first two PCs explain 97 % of the variation observed. The first PC explains 96 % of the variation and presents its highest loadings around 350 to 600 nm. This spectral region is characteristic of pigments from red wines. For example, oenin has its maximum absorption at 530 nm and malvin at 529 nm (Martelo-Vidal and Vázquez, 2014a,c). The second PC explains 1 % of the variation, and its loadings showed inverse correlation with the visible and ultraviolet regions, around 1500 nm, 1800 nm and around 2300 nm (Cozzolino et al., 2012).
LDA and SVM were performed for classification analysis of the red wines. The classification methods were applied to raw and pre-processed data.
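The classification rates discussed here come from cross validation. The sketch below shows how per-class and overall rates can be computed with a leave-one-out scheme. The leave-one-out variant is an assumption, since the source says only "cross validation", and a simple nearest-class-mean rule stands in for LDA or SVM to keep the example short. All names and the toy data are hypothetical.

```python
def nearest_mean_fit(X, y):
    """Compute the mean vector of each class (stand-in classifier)."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def nearest_mean_predict(means, row):
    """Return the label whose class mean is closest to the sample."""
    def sq_dist(mean):
        return sum((a - b) ** 2 for a, b in zip(row, mean))
    return min(means, key=lambda label: sq_dist(means[label]))

def leave_one_out_rates(X, y):
    """Percentage of left-out samples correctly classified, per class and overall."""
    correct, total = {}, {}
    for i in range(len(X)):
        # Train on every sample except sample i, then predict sample i
        means = nearest_mean_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        pred = nearest_mean_predict(means, X[i])
        total[y[i]] = total.get(y[i], 0) + 1
        correct[y[i]] = correct.get(y[i], 0) + (pred == y[i])
    per_class = {label: 100.0 * correct[label] / total[label] for label in total}
    overall = 100.0 * sum(correct.values()) / len(X)
    return per_class, overall

# Toy data: two well-separated groups with hypothetical labels
X_demo = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
          [1.0, 1.1], [1.2, 0.9], [0.9, 1.0]]
y_demo = ["A", "A", "A", "B", "B", "B"]
per_class, overall = leave_one_out_rates(X_demo, y_demo)
```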
Table III shows the proportion of DO Rías Baixas and Ribeira Sacra wines correctly classified when SVM was applied. Data pre-processed by centring and scaling showed a proportion of correct classification of 97.37 %, with 100 % of right [...]
Classification rates with LDA of Rías Baixas and Ribeira Sacra red wines according to their DO are shown in Table IV. In this case, a total classification [...] Baixas. The 100 % classification means that the variety is not the origin of the differences in the wine spectra. The regions have different climatic and topographic conditions, which give wines with different component profiles. This allows the application of this technique for the authentication of red wines even when they are made with the same grape variety.

Figure 1 - Spectral data of red wines.

Figure 4 - Eigenvectors for the PCA of two principal components (PC) of red wines analysed from Rías Baixas and Ribeira Sacra Designation of Origin.

TABLE I
Red wines of DO Rías Baixas analysed in this study

TABLE III
Proportion (%) of support vector machine (SVM) classification of red wines according to Designation of Origin