Sparse Multiple Correspondence Analysis. Analyse des Correspondances Multiples Parcimonieuse
Open archive: Conference paper
Published by HAL CCSD
International audience. Multiple Correspondence Analysis (MCA) is the method of choice for the multivariate analysis of categorical data. In MCA, each qualitative variable is represented by a group of binary variables (with a coding scheme called “complete disjunctive coding”), and each binary variable has a weight inversely proportional to its frequency. The data matrix concatenates all these binary variables and, once normalized and centered, this data matrix is analyzed with a generalized singular value decomposition (GSVD) that incorporates the variable weights as constraints (or “metric”). The GSVD is, of course, based on the plain SVD, and so MCA can be sparsified by extending algorithms designed to sparsify the SVD. Doing so requires two additional features: including weights and being able to sparsify entire groups of variables at once. Another important feature of such a sparsification should be to preserve the orthogonality of the components. Here, we integrate all these constraints by using an exact projection scheme onto the intersection of subspaces (i.e., balls) where each ball represents a specific type of constraint. We illustrate our procedure with the data from a questionnaire survey on the perception of cheese in two French cities.
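
The non-sparse pipeline described in the abstract (complete disjunctive coding, frequency-based weights, centering, then a GSVD computed through a plain SVD) can be sketched as follows. This is only an illustration of standard MCA under stated assumptions, not the authors' sparse procedure: the function names `disjunctive_coding` and `mca` and the toy data are hypothetical, and the sparsification step (projection onto the intersection of balls) is not implemented here.

```python
import numpy as np

def disjunctive_coding(data):
    """Turn an (n, K) array of category labels into an (n, J) 0/1 indicator matrix.

    Each qualitative variable (column) becomes one block of binary columns,
    one per observed category ("complete disjunctive coding").
    """
    blocks = []
    for k in range(data.shape[1]):
        levels = np.unique(data[:, k])
        blocks.append((data[:, k][:, None] == levels[None, :]).astype(float))
    return np.hstack(blocks)

def mca(data):
    """Plain (non-sparse) MCA via the SVD of the weighted, centered indicator matrix."""
    X = disjunctive_coding(data)
    P = X / X.sum()        # normalize so that all entries sum to 1
    r = P.sum(axis=1)      # row masses
    c = P.sum(axis=0)      # column masses, proportional to category frequencies
    # Center with r c^T, then weight rows by 1/sqrt(r) and columns by 1/sqrt(c).
    # Column weights are therefore inversely related to category frequency.
    # The plain SVD of S is equivalent to the GSVD of (P - r c^T) under these metrics.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    row_scores = (U / np.sqrt(r)[:, None]) * s      # observation factor scores
    col_scores = (Vt.T / np.sqrt(c)[:, None]) * s   # category factor scores
    return s, row_scores, col_scores

# Toy example (made-up data): 6 respondents, 2 questions with 3 answers each.
data = np.array([[0, 0], [0, 1], [1, 1], [1, 2], [2, 0], [2, 2]])
singular_values, row_scores, col_scores = mca(data)
total_inertia = float(np.sum(singular_values ** 2))
```

With K variables and J categories in total, the total inertia of the indicator matrix equals J/K - 1, which gives a quick sanity check on the sketch. A sparse variant would replace the call to `np.linalg.svd` with an iterative scheme that constrains the singular vectors.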