26 May - 1 June


Tuesday 27 May
Time: 12:00 - 13:30
Location: Room B107, Building B, Université de Villetaneuse
Summary: Statistical data analysis techniques for "distribution-valued data"
Description: Rosanna Verde. In many real-world settings, data are collected or represented as frequency distributions. If Y is a continuous numerical variable, many distinct values y_i can be observed. In such cases the values are usually grouped into a smaller number H of consecutive, disjoint bins I_h (groups, classes, intervals, etc.). The frequency distribution of Y is obtained by counting the number n_h of data values falling in each I_h. The histogram is then the typical graphical representation of Y.

The interest in analyzing data expressed as frequency distributions, or as histograms, is evident in many fields of research, in particular those involving experimental data that are collected over a range of values while the measuring instrument gives only approximate (or rounded) values. One example is given by air pollution sensors located in different zones of an urban area: the distributions of the measured levels of air pollutants over a day make it possible to compare the monitored zones and then to group them into homogeneous clusters.

In the framework of Symbolic Data Analysis (SDA), multi-valued data represented by an empirical distribution (such as a histogram, an observed density or a quantile function) of a quantitative variable are called distribution-valued data, and a variable whose values are distribution-valued data is called a distributional variable. Many techniques have recently been developed for this kind of data. Empirical distribution functions can be compared using a suitable family of distances based on the Wasserstein metric, which yields interpretable results about the characteristics (or the moments) of the distributions. The Wasserstein metric is defined as the distance (in different norms) between the empirical quantile functions (the inverses of the cumulative distribution functions associated with each observed distribution).

The seminar will introduce some basic statistics for distribution-valued data. Novel univariate statistics emerge from the definition of a measure of variability that is related to a distance between distributions. Then, using the ℓ2 Wasserstein distance, it is possible to define a product operator between two distributions, which has made it possible to extend the classical covariance and correlation measures to distributional variables.

Among the data analysis techniques extended to this kind of data, the seminar will present a version of the dynamic clustering algorithm (DCA, in the style of Nuées Dynamiques; Diday (1971), Diday and Simon (1976)) based on the Wasserstein distance, with the aim of discovering typologies on the basis of the similarity of the observed distributions. An application of the DCA to data stream analysis is shown, in order to detect changes in the data structure.

Furthermore, a simple linear regression model will be proposed to estimate a distributional response variable as a linear transformation of an independent distributional variable. The main idea is to use the Wasserstein metric to measure the sum of squared errors between the observed and predicted distributional data. A dimension reduction technique (similar to principal component analysis) will also be proposed to visualize the proximities between observations and the relationships among quantile functions on factorial planes.

Some applications to real and synthetic data will be shown in order to evaluate the performance of the proposed approaches.
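
The ℓ2 Wasserstein distance between empirical quantile functions mentioned in this abstract can be sketched in a few lines of Python. This is only a minimal illustration, not the speaker's implementation: it assumes two samples of equal size, so that the sorted values give both empirical quantile functions on the same equal-probability steps. The function name wasserstein_l2 and the pollutant readings for the two zones are hypothetical.

```python
import math


def wasserstein_l2(sample_a, sample_b):
    """l2 Wasserstein distance between two empirical distributions.

    Assumption for this sketch: both samples are non-empty and have
    the same number of observations.
    """
    if not sample_a or len(sample_a) != len(sample_b):
        raise ValueError("samples must be non-empty and of equal size")
    # Sorting gives the values taken by each empirical quantile function
    # on the n equal-probability steps of width 1/n.
    quantiles_a = sorted(sample_a)
    quantiles_b = sorted(sample_b)
    # Mean squared difference between the two quantile functions.
    mean_squared_diff = sum(
        (qa - qb) ** 2 for qa, qb in zip(quantiles_a, quantiles_b)
    ) / len(quantiles_a)
    return math.sqrt(mean_squared_diff)


# Hypothetical pollutant readings for two monitored zones (made-up numbers).
zone_1 = [12.0, 15.0, 11.0, 14.0]
zone_2 = [22.0, 25.0, 21.0, 24.0]  # same shape as zone_1, shifted by 10

print(wasserstein_l2(zone_1, zone_2))  # prints 10.0
```

Because the second sample is the first one shifted by 10, the two quantile functions differ by 10 everywhere and the distance reduces to the difference in means. For samples of unequal size, or for histograms, the quantile functions would have to be evaluated on a common grid of probability levels instead.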
Time: 14:00 - 17:00
Location: Room B107, Building B, Université de Villetaneuse
Summary: The two-point and three-point functions for quadrangulations and maps
Description: Eric Fusy. For a family F of planar maps, the "k-point function" is the counting generating function of the maps in F with k marked points whose pairwise distances are prescribed. It has been known since the results of Bouttier, Di Francesco and Guitter (relying on a bijection of Schaeffer) that the two-point function of quadrangulations admits an explicit expression, and more recent results of Bouttier and Guitter (relying on a bijection of Miermont) established an explicit expression for the three-point function of quadrangulations. We will review these results and show how a recent bijection due to Ambjørn and Budd can be exploited to obtain explicit expressions for the two-point and three-point functions of general maps. Joint work with Jérémie Bouttier and Emmanuel Guitter.