
Mickaël Martin-Nevot

Associate member, Aix-Marseille Université, Institut universitaire de technologie (IUT)

Econometrics, finance and mathematical methods
Status
Associate professor
Abstract: Multi-criteria decision analysis in databases has been actively studied, especially through the Skyline operator. Yet few approaches offer a relevant comparison of Pareto-optimal (Skyline) points for high-cardinality result sets. We propose to improve dp-idp, a recent method inspired by tf-idf that computes a score for each Skyline point, by introducing the concept of dominance hierarchy. As dp-idp lacks efficiency and does not ensure a distinctive rank, we introduce the RankSky method, an adaptation of Google’s well-known PageRank solution, which uses a square stochastic matrix, a teleportation matrix, a damping factor, a row score eigenvector and the IPL algorithm. For the same reasons as RankSky, and also to offer a solution that can be embedded directly in a DBMS, we establish the TOPSIS-based CoSky method, derived from both information retrieval and multi-criteria analysis. CoSky automatically weights normalized attributes using the Gini index, then computes a score using Salton’s cosine toward an ideal point. By coupling multilevel Skyline with dp-idp, RankSky or CoSky, we introduce DeepSky. Implementations of the improved version of dp-idp, RankSky and CoSky are evaluated experimentally on generated synthetic data sets. All of the proposed methods show relevance and performance: dp-idp with dominance hierarchy appears twice as efficient as the original, RankSky provides a fast and robust transposition of a standard approach to Skyline ranking, and CoSky offers a far more effective solution than any other method.
Keywords: Multiple-criteria decision analysis, Skyline, Information retrieval, Ranking
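To make the ranking idea in the abstract concrete, here is a minimal Python sketch of a Skyline filter followed by a cosine score toward an ideal point. It is an illustration under stated assumptions, not the authors' implementation: the function names (`dominates`, `skyline`, `cosine_scores`), the "lower is better" convention, the sum normalization, and the exact Gini-based weighting formula are all assumptions chosen for the example, and attribute values are assumed strictly positive.

```python
import math

def dominates(p, q):
    """p dominates q if p is no worse on every attribute and strictly
    better on at least one (assumption: lower values are better)."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))

def skyline(points):
    """Keep the Pareto-optimal points, i.e. those dominated by no other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def cosine_scores(points):
    """Score each point by its cosine similarity to an ideal point.

    Assumed steps (hypothetical, for illustration only):
      1. normalize each attribute by its column sum,
      2. weight each attribute by a Gini index, 1 - sum of squared shares,
      3. take the per-attribute minimum of the weighted values as the ideal,
      4. score = cosine between the weighted point and the ideal point.
    """
    dims = range(len(points[0]))
    totals = [sum(p[j] for p in points) for j in dims]
    norm = [[p[j] / totals[j] for j in dims] for p in points]
    gini = [1.0 - sum(row[j] ** 2 for row in norm) for j in dims]
    gini_sum = sum(gini)
    weights = [g / gini_sum for g in gini]
    weighted = [[weights[j] * row[j] for j in dims] for row in norm]
    ideal = [min(row[j] for row in weighted) for j in dims]

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    return [cosine(row, ideal) for row in weighted]

def rank_skyline(points):
    """Return Skyline points sorted from highest to lowest score."""
    sky = skyline(points)
    scores = cosine_scores(sky)
    return sorted(zip(sky, scores), key=lambda pair: pair[1], reverse=True)
```

Calling `rank_skyline([(1, 9), (3, 3), (9, 1), (5, 5), (9, 9)])` first discards the two dominated points, then orders the remaining three by score. In this sketch a balanced point scores higher than points that are extreme on one attribute, because its direction is closer to that of the ideal point.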