SENSORY PROFILE OF PORT WINES: CATEGORICAL PRINCIPAL COMPONENT ANALYSIS, AN APPROACH FOR SENSORY DATA TREATMENT

Port wine is a fortified wine. After the grape spirit addition, fermentation stops and the wine retains some of the natural sweetness of the grape. Port wine comes in a variety of styles, each with its own characteristic flavours: White Ports, Ruby Ports and Tawny Ports. Information about a wine's sensory characteristics is critical for the successful development and marketing of each new wine brand. This type of information can be obtained using descriptive sensory tests with trained panels that can recognize different sensory descriptors in wines. Given that the collected variables are measured on an ordinal scale, a Categorical Principal Component Analysis (CATPCA) can be performed. However, for many years, multivariate analysis has been used for wine characteristic evaluation and Principal Component Analysis (PCA) has long been applied to sensory data treatment. The two main purposes of this study were to describe a specific sensory method, used by a trained sensory panel and including the development of chemical compound references, to establish the most important descriptive and discriminative sensory attributes of different Port wine styles and brands, and to compare


INTRODUCTION
Port wine is a fortified wine made by adding a proportion of grape spirit, or brandy, to the wine before the must/wine has finished fermenting. After the grape spirit addition, fermentation stops and the wine retains some of the natural sweetness of the grape, making it rich, round and smooth on the palate. One of the interesting aspects of Port wine is its variety of styles, each with its own characteristic flavours. Ruby, Reserve and Late Bottled Vintage (LBV) Ports, usually aged in vats for two or three years, or up to six years in the case of LBV, share a deep red youthful colour and intense fruity flavours reminiscent of cherry, blackberry and blackcurrant. Tawny Ports (10, 20, 30 and 40 year old Tawny), which age for longer periods in oak casks, present a delicious nuttiness and aromas of butterscotch and fine oak wood. White Ports, made from classic white grapes, are usually aged for two or three years in large vats and are available in sweeter or drier styles.
In the Portuguese market, there are several brands within each Port wine style. Brand management in today's business world is closely tied to an organization's purpose and to the improvement of its strategies (Miralles et al., 2008). Information about a wine's sensory characteristics is critical for the successful development and marketing of each new wine brand. This type of information can be obtained using descriptive sensory tests with trained panels (Meilgaard et al., 1999; Stone and Sidel, 2004).
With the aim of detecting different sensory descriptors in wines, and given that the collected variables are measured on an ordinal scale, a Categorical Principal Component Analysis (CATPCA) can be performed.
In analyzing ordinal variables, it should be taken into account that the categories of a variable have a fixed a priori order, but this does not imply that the differences between the numeric labels of the categories have to be preserved. In the sensory sciences, many variables are nominal or ordinal and relationships between variables are frequently nonlinear. Hence, linear (standard) PCA may not be appropriate and should be used only after linearity in the ordinal variables has been verified. Nevertheless, multivariate analysis has been used for wine characteristic evaluation and Principal Component Analysis (PCA) has long been applied to sensory data treatment (Noble and Shannon, 1987; Zamora and Guirao, 2002; Vilanova et al., 2008, 2009; Rodriguez-Nogales et al., 2009; Esti et al., 2010; Dreyer et al., 2013; Liang et al., 2013). In recent years, nonlinear PCA has been introduced and developed to overcome the limitations of standard PCA (Gifi, 1990; Linting et al., 2007). The CATPCA procedure (Meulman et al., 2004) belongs to this class of methods and is based on the quantification of categorical variables by optimal scaling. CATPCA finds category quantifications that are optimal in the sense that the overall variance accounted for in the transformed variables, given the number of components, is maximized. In the optimal scaling process, information in the original categorical data is retained in the optimal quantifications, depending upon the optimal scaling level, which can be chosen for each variable separately (Meulman et al., 2004).
The two main purposes of this study were to describe a specific sensory method, used by a trained sensory panel (Monteiro et al., 2014) and including the development of chemical compound references, to establish the most important descriptive and discriminative sensory attributes of different Port wine styles and brands, and to compare the results of Principal Component Analysis (PCA) with those of Categorical Principal Component Analysis (CATPCA), in order to assess the feasibility of both techniques.
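As a rough illustration of the ordinal scaling level described above, the following Python sketch quantifies one ordinal variable against a given set of object scores: each category first receives the mean object score of the samples rated in it, and a weighted pool-adjacent-violators step then enforces the a priori category order. This is only one step of an alternating least squares scheme and not the full CATPCA procedure implemented in SPSS; the function name, the ratings and the object scores are hypothetical.

```python
from collections import defaultdict


def ordinal_quantification(categories, scores):
    """Quantify ordered categories as monotone category means of `scores`.

    categories: ordinal labels (e.g. 1..5), one per sample
    scores:     current object scores, one per sample (assumed given)
    Returns {category: quantification}, non-decreasing in category order.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for c, s in zip(categories, scores):
        sums[c] += s
        counts[c] += 1

    # Each block is [weighted mean, weight, categories pooled in the block].
    blocks = []
    for c in sorted(sums):
        blocks.append([sums[c] / counts[c], counts[c], [c]])
        # Pool adjacent violators: merge while the category order is broken.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean2, w2, cats2 = blocks.pop()
            mean1, w1, cats1 = blocks.pop()
            w = w1 + w2
            blocks.append([(mean1 * w1 + mean2 * w2) / w, w, cats1 + cats2])

    return {c: mean for mean, _, cats in blocks for c in cats}


# Hypothetical ratings on a 5-point scale and hypothetical object scores.
ratings = [1, 1, 2, 2, 3, 3, 4, 5]
object_scores = [-1.2, -0.8, 0.1, -0.3, -0.4, 0.0, 0.9, 1.7]
quant = ordinal_quantification(ratings, object_scores)
for category in sorted(quant):
    print(category, round(quant[category], 3))
```

In this toy case categories 2 and 3 end up with the same quantification, because their raw category means are out of order and are therefore pooled. The category order is kept, but the spacing between categories is no longer that of the labels 1 to 5.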

Wines
In this study, 28 samples of Port wine from three different styles were evaluated: ten White Ports, nine Ruby Ports and nine Tawny Ports, all from different Demarcated Douro Region (DDR) wineries with cellars in Oporto (Table I). The brands are coded in our work to avoid revealing commercial names. These were commercial wines presented in 0.75 L bottles and produced according to the process of each winery/wine cellar. The bottles were stored in a cellar, lying down and under the same conditions: relative humidity around 85% and a temperature around 12 °C. Prior to each tasting session, the bottles were kept at 12 °C.

Selection of descriptors for Port wines and development of references
Two wines of each Port style were tasted and discussed by twelve trained panelists over three tasting sessions in order to generate terms. A free choice of attributes to describe Port was used. Each session lasted around 1 hour. In all the sessions, the Wine Aroma Wheel (Noble and Shannon, 1987) was provided to facilitate term generation. Appearance (colour and clarity), aroma, taste, flavour and mouthfeel references were provided to facilitate the discussion. From an original long list of attributes, a reduced list was compiled by analyzing the frequency of citations. For the development of quantitative references, glasses identical to those used for wine evaluation (ISO, 1977) were used for the aroma reference presentation, in order to make reference evaluation as close as possible to wine-tasting conditions.
In an attempt to make the panelists more familiar with the wines, during the reference defining sessions, three wines, one of each style, were evaluated and discussed. After all the references were developed, 3 training sessions were carried out according to the methodology that would be used to evaluate the wines.
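The reduction of the attribute list by frequency of citation can be sketched as follows. This is a minimal Python illustration, assuming that the frequency of a term is its share of all citations and using the 2.5% cut-off given in the footnote to Table II; the function name, terms and counts are invented for illustration and are not the panel's data.

```python
from collections import Counter


def reduce_descriptors(citations, min_share=0.025):
    """Keep terms whose share of all citations exceeds `min_share`.

    citations: flat list of terms, one entry per citation by a panelist.
    Assumption: frequency = term count / total number of citations.
    """
    counts = Counter(citations)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items() if n / total > min_share}


# Hypothetical citation list (not the panel's actual terms or counts).
citations = (
    ["red fruits"] * 30 + ["dried fruits"] * 25 + ["honey"] * 20
    + ["floral"] * 15 + ["astringency"] * 8 + ["leather"] * 2
)
kept = reduce_descriptors(citations)
for term, share in sorted(kept.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {share:.1%}")
```

With these invented counts, the term cited in only 2% of cases is dropped and the other five are retained.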

Wine Tasting
In this study we analyzed 28 Port wines available in the Portuguese market. All the wines were evaluated in triplicate in nine tasting sessions, one session per week, from 10:00 to 12:00 in the morning. The wines were randomly distributed throughout the sessions of each series in such a way that the three replications were consecutive: all the samples were assessed once, then a second time and then a third time.
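One way to produce such a plan is sketched below in Python. It assumes that the nine sessions form three series of three sessions, that each series covers all 28 wines once in a fresh random order, and that the sessions within a series are of near-equal size (10, 9 and 9 wines). The text does not state how the wines were actually split across sessions, so the function, the seed and the wine codes are hypothetical.

```python
import random


def schedule(wines, n_series=3, sessions_per_series=3, seed=0):
    """Randomly split the wines over the sessions of each replicate series.

    Assumption: each series covers every wine exactly once and is spread
    over `sessions_per_series` sessions of near-equal size.
    Returns a list of sessions; each session is a list of wine codes.
    """
    rng = random.Random(seed)
    sessions = []
    for _ in range(n_series):
        order = list(wines)
        rng.shuffle(order)  # fresh random order for every series
        base, extra = divmod(len(order), sessions_per_series)
        start = 0
        for k in range(sessions_per_series):
            size = base + (1 if k < extra else 0)
            sessions.append(order[start:start + size])
            start += size
    return sessions


# Hypothetical codes: 10 White (W), 9 Ruby (R) and 9 Tawny (T) Ports.
wines = (
    [f"W{i}" for i in range(1, 11)]
    + [f"R{i}" for i in range(1, 10)]
    + [f"T{i}" for i in range(1, 10)]
)
plan = schedule(wines)
for number, session in enumerate(plan, start=1):
    print(f"Session {number}: {len(session)} wines")
```

Because every series is completed before the next one starts, each wine is assessed once, then a second time and then a third time, as described above.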
Sessions were carried out under controlled temperature conditions (20 ± 2°C) and relative humidity (60 ± 20%). Aroma references (Table II) were served in standardized wine-tasting glasses (ISO, 1977). Wine bottles were opened immediately before tasting, and 35 mL samples of each wine were served in standardized glasses. Reference and wine glasses were covered with Petri dishes and were immediately brought to the tasting booths in the sensory evaluation laboratory.
The references and wines were evaluated in isolated booths according to the methodology described by Monteiro et al. (2014). Attribute intensities were scored on a 5-point scale (ranging from 1, lowest intensity, to 5, highest intensity) by comparison with the intensity of the references. References and samples were expectorated. The panelists were instructed to rinse their mouths with water between references and between wines, and to use unsalted crackers to decrease astringency carryover. The panelists were told to rest and to leave the tasting room if necessary.

Data Analysis
All statistical analyses were performed using SPSS (IBM SPSS Statistics 20). In order to establish the most important descriptive and discriminative sensory attributes of different Port wine styles and brands, Principal Component Analysis (PCA) and Categorical Principal Component Analysis (CATPCA) were applied to the data set of 23 attributes.

RESULTS AND DISCUSSION
With the aim of establishing and interpreting the sensory descriptors of the 28 Port wines, a PCA was applied to the total data set of 23 attributes. Before using PCA, it is necessary to check some assumptions, namely the Bartlett test of sphericity (Hair et al., 2009), a statistical test for the presence of correlation among the variables, and the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy (Hair et al., 2009), which must exceed 0.5. As shown in Table III, a statistically significant Bartlett's test of sphericity (sig = 0.00) indicates that sufficient correlation exists among the variables; yet the low KMO value, 0.209, indicates poor sampling adequacy.
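The screening described above amounts to a few threshold checks. The Python sketch below applies them to the values reported in this study (KMO = 0.209, Bartlett significance reported as 0.00, 28 wines, 23 attributes). The function and argument names are illustrative; the thresholds are those this paper cites from Hair et al. (2009), including the five-to-one ratio of observations to variables.

```python
def pca_screening(kmo, bartlett_sig, n_obs, n_vars,
                  kmo_min=0.5, alpha=0.05, min_ratio=5):
    """Return which of the usual PCA prerequisites are met."""
    return {
        "bartlett_significant": bartlett_sig < alpha,
        "kmo_adequate": kmo > kmo_min,
        "enough_observations": n_obs >= min_ratio * n_vars,
    }


# Values reported in this study: Table III, 28 wines, 23 attributes.
checks = pca_screening(kmo=0.209, bartlett_sig=0.00, n_obs=28, n_vars=23)
for name, passed in checks.items():
    print(f"{name}: {'met' if passed else 'not met'}")
```

Only the Bartlett criterion is met: the KMO value is below 0.5, and 28 observations fall well short of the 115 that a five-to-one ratio would require for 23 variables.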
The principal components of the two-dimensional model are illustrated in Figure 1. The model did not highlight differences among wines from winery brands; however, the wine samples are grouped on the plane according to wine style. As stated above, the PCA was applied to the data set of 23 attributes, but only 19 of them contributed to the two-dimensional model in a meaningful way (factor loadings > 0.5, [...]

Table II (fragment): descriptor | definition | reference
[...] | [...] | Natural products placed in sensory tasting glasses. 5
Red fruits (aroma) | Aroma associated with berries, such as raspberry and strawberry. | Natural products placed in sensory tasting glasses. 5
Dried fruits (aroma) | Aroma associated with dried fruits such as almonds, nutmegs and raisins. | Maceration of 10 g of almonds, nutmegs and raisins in 100 mL of a hydroalcoholic solution, 19% (v/v) in ethanol.
(1) Only descriptors/attributes that had a frequency of citation higher than 2.5% were used.
(2) Nominal scale for aroma and flavour attribute intensity scoring:
1 = The attribute is not perceived at all
2 = Doubts about the presence of the attribute
3 = The attribute is clearly perceived, although it is slight
4 = The attribute is clearly perceived, but the intensity is lower than the reference
5 = The attribute is clearly perceived and the intensity is close or similar to the reference

[...] (Table VI). For the second component, the internal consistency coefficient is 0.862 with an eigenvalue of 5.695, indicating that its proportion of variance is 24.761%. Thus, the two components explained 74.253% of the total amount of initial variance (Table VI), a higher value than the one achieved with PCA.
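The figures quoted for the second component can be checked by hand. The Python sketch below assumes, as is usual for CATPCA output, that the proportion of variance accounted for by a component is its eigenvalue divided by the number of analysed variables (23 here) and that the internal consistency coefficient is Cronbach's alpha computed from the eigenvalue. The first-component share is obtained only by subtraction from the reported total; it is a derived value, not one quoted in the text.

```python
def variance_share(eigenvalue, n_vars):
    """Percentage of variance accounted for by one component."""
    return 100.0 * eigenvalue / n_vars


def cronbach_alpha(eigenvalue, n_vars):
    """Cronbach's alpha derived from a component eigenvalue."""
    return n_vars * (eigenvalue - 1.0) / ((n_vars - 1.0) * eigenvalue)


N_VARS = 23           # sensory attributes in the data set
EIGENVALUE_2 = 5.695  # second CATPCA component, as reported
TOTAL_SHARE = 74.253  # both components together, as reported

share_2 = variance_share(EIGENVALUE_2, N_VARS)
alpha_2 = cronbach_alpha(EIGENVALUE_2, N_VARS)
share_1 = TOTAL_SHARE - share_2  # derived by subtraction, not reported

print(f"second component: {share_2:.3f}% of variance, alpha = {alpha_2:.3f}")
print(f"first component (derived): {share_1:.3f}% of variance")
```

Under these assumptions the sketch reproduces the reported 24.761% and 0.862 for the second component, which suggests the reported values are internally consistent.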
The first principal component distinguishes Ruby brands, located on the positive axis, from White brands, located on the negative axis. In the Ruby brands, the attributes Ruby, Red fruits, Fruity flavour, Astringency and Floral were dominant, whereas in the White brands, attributes like Honey, Sweet taste, Alcoholic sensation, Balance, Acid taste and Moscatel are the ones that best characterize these wines. Tawny Port wines are characterized by the ortho-nasal attribute Dried fruits.
Principal Component Analysis and Categorical Principal Component Analysis are appropriate for "good" variable selection and dimension reduction. They can be used to analyze interrelationships among a large number of variables and to explain these variables in terms of their common underlying dimensions (factors) (Hair et al., 2009). The objective is to find a few linear combinations of the variables (factors) that can be used to summarize the data without losing too much information in the process. As mentioned before, PCA is a technique that should, in principle, be applied only when the variables are quantitative, follow a multivariate normal distribution and are linearly related to each other, and when the sample size is large enough, with at least five times as many observations as variables to be analyzed (Hair et al., 2009). This statistical procedure comprises three stages: validation of the model, factor extraction and factor rotation (optional). The first stage involves the calculation of the correlation matrix to determine the degree of association between the variables. As a rule of thumb, correlations should lie between 0.3 and 0.7. Another way of determining the appropriateness of PCA is the Bartlett test of sphericity, which tests whether the correlation matrix has significant correlations among at least some of the variables. A statistically significant Bartlett's test of sphericity (sig < 0.05) indicates that sufficient correlation exists among the variables (Hair et al., 2009). A third measure to quantify the degree of intercorrelation among the variables and the appropriateness of this method is the measure of sampling adequacy (MSA), whose value must exceed 0.5 (Hair et al., 2009).
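For reference, the two adequacy measures discussed above are usually written as follows, where $R$ is the $p \times p$ correlation matrix of the variables, $n$ the number of observations, $r_{ij}$ the simple correlations and $a_{ij}$ the partial (anti-image) correlations. These are the standard textbook forms, given here as a sketch and not taken from the SPSS output of this study.

```latex
% Bartlett's test of sphericity (H0: R is an identity matrix)
\chi^2 = -\left[(n - 1) - \frac{2p + 5}{6}\right] \ln \lvert R \rvert,
\qquad \mathrm{df} = \frac{p(p - 1)}{2}

% Kaiser-Meyer-Olkin measure of sampling adequacy
\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}
                    {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} a_{ij}^{2}}
```

With $p = 23$ attributes, the Bartlett statistic has $23 \times 22 / 2 = 253$ degrees of freedom. KMO approaches 1 when the partial correlations are small relative to the simple correlations, so a low value means that the partial correlations dominate.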
Categorical principal components analysis is a nonparametric method that quantifies categorical variables through a process called optimal quantification (also referred to as optimal scaling or optimal scoring) (Meulman et al., 2004). Optimal quantification replaces the category labels with category quantifications in such a way that as much as possible of the variance in the quantified variables is accounted for. The most important advantages of nonlinear over linear PCA are that it incorporates nominal and ordinal variables and that it can handle and discover nonlinear relationships between variables. Because CATPCA directly analyses the data matrix and not the derived correlation matrix, there need not be the usual concern to have at least five times as many observations as variables. In fact, CATPCA is suited for analyses in which there are more variables than objects (Meulman et al., 2004).
In some related works, PCA has yielded interesting results. For instance, in a work that aimed to investigate the sensory and chemical characteristics of Blanc Du Bois wines in order to characterize quality differences among them, PCA showed specific attributes to be correlated with high- or low-quality wines (Dreyer et al., 2013). In another work, which aimed to improve the local aroma and quality of wines from Cabernet Sauvignon grape must inoculated with twelve autochthonous strains of Saccharomyces cerevisiae, PCA of the active aroma compounds, whose contents were higher than their thresholds, separated the wines into four groups according to the yeasts applied in the microvinifications (Liang et al., 2013). However, as we have shown in our work, the CATPCA analysis seems to be more robust: in the CATPCA biplot the two components explained 74.253% of the total amount of initial variance, while in the PCA biplot the two components explained only 60.325%.
Moreover, the CATPCA model did not highlight differences among wines from winery brands, whereas in the PCA the Port wines are grouped according to wine style and there is some discrimination between winery brands.

CONCLUSIONS
The work presented here yielded two solutions that must be properly weighed, and showed that computational tools should be applied with care in order to avoid methodological errors.
Two components were retained in both analyses; however, the percentage of the total initial variance explained by CATPCA (74.253%) is higher than that explained by PCA (60.325%).
Clearly, the PCA violated some basic principles. The variables used were qualitative, and the Kaiser-Meyer-Olkin measure of sampling adequacy, which must exceed 0.5, gave a value of only 0.209, indicating poor sampling adequacy. The sample size should also be large enough, with at least five times as many observations as variables to be analyzed, which is not the case in our study, where we had 23 variables and 28 observations. CATPCA, in contrast, is suited even to analyses in which there are more variables than observations. Moreover, the CATPCA grouped the wines according to wine style, independently of the wine brands, and there is greater cohesion between groups, which seems appropriate for the wine samples in question.