Seminar on Statistics and Data Science

This seminar series is organized by the research group in statistics and features talks on advances in methods of data analysis, statistical theory, and their applications. The speakers are external guests as well as researchers from other groups at TUM. All talks in the seminar series are listed in the Munich Mathematical Calendar.

The seminar takes place in room 8101.02.110 unless announced otherwise. To stay up to date on upcoming presentations, please join our mailing list. You will receive an email to confirm your subscription.

Upcoming talks

02.07.2024 14:00 Thomas Richardson (University of Washington, Seattle): Short Course on “Graphical causal modeling” (Lecture 3/3)

This short course covers recent developments in graphical and causal modeling in Statistics/Machine Learning. It comprises the following three lectures, each two hours long:

June 25, 2024, Lecture 1: “Learning from conditional independence when not all variables are measured: Ancestral graphs and the FCI algorithm”
June 27, 2024, Lecture 2: “Identification of causal effects: A reformulation of the ID algorithm via the fixing operation”
July 2, 2024, Lecture 3: “Nested Markov models”

The course targets an audience with exposure to basic concepts in graphical and causal modeling (e.g., conditional independence, DAGs, d-separation, Markov equivalence, definition of causal effects/the do-operator).
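As a small, self-contained illustration of the d-separation prerequisite (our toy example, not course material): the moralization criterion reduces a d-separation query in a DAG to ordinary graph separation in an undirected graph. The example graph, variable names, and use of networkx are illustrative assumptions.

```python
# A minimal d-separation check via the moralization criterion, using only
# basic networkx operations. Example DAG and names are illustrative.
import networkx as nx
from itertools import combinations

def d_separated(dag, xs, ys, zs):
    """True iff xs and ys are d-separated given zs in the DAG."""
    relevant = set(xs) | set(ys) | set(zs)
    # 1. Restrict to the ancestral subgraph of all involved nodes.
    anc = set(relevant)
    for v in relevant:
        anc |= nx.ancestors(dag, v)
    sub = dag.subgraph(anc)
    # 2. Moralize: marry parents of every common child, drop directions.
    moral = nx.Graph(sub.to_undirected())
    for v in sub.nodes:
        for p, q in combinations(sub.predecessors(v), 2):
            moral.add_edge(p, q)
    # 3. Remove the conditioning set; d-separation <=> no remaining path.
    moral.remove_nodes_from(zs)
    return not any(nx.has_path(moral, x, y) for x in xs for y in ys)

# Classic collider example X -> C <- Y: conditioning on the collider C
# opens the path between X and Y.
g = nx.DiGraph([("X", "C"), ("Y", "C")])
print(d_separated(g, {"X"}, {"Y"}, set()))   # True: path blocked
print(d_separated(g, {"X"}, {"Y"}, {"C"}))   # False: collider opened
```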

Previous talks

within the last 180 days

27.06.2024 14:00 Thomas Richardson (University of Washington, Seattle): Short Course on “Graphical causal modeling” (Lecture 2/3)

This is Lecture 2 of the three-part short course “Graphical causal modeling”; see the full course description under the entry for Lecture 3/3 above.

25.06.2024 14:00 Thomas Richardson (University of Washington, Seattle): Short Course on “Graphical causal modeling” (Lecture 1/3)

This is Lecture 1 of the three-part short course “Graphical causal modeling”; see the full course description under the entry for Lecture 3/3 above.

17.06.2024 09:00 Saber Salehkaleybar (Leiden University): Causal Inference in Linear Structural Causal Models

The ultimate goal of causal inference is so-called causal effect identification (ID), which refers to quantifying the causal influence of a subset of variables on a target set. A stepping stone towards performing ID is learning the causal relationships among the variables, commonly called causal structure learning (CSL). In this talk, I mainly focus on problems pertaining to CSL and ID in linear structural causal models, which serve as the basis for problem abstraction in various scientific fields. In particular, I will review identifiability results and algorithms for CSL and ID in the presence of latent confounding. Then, I will present our recent result on the ID problem using cross-moments among observed variables and discuss its applications to natural experiments and proximal causal inference. Finally, I conclude with possible future research directions.
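A toy sketch of identification from cross-moments in a linear SCM with a latent confounder: the classical instrumental-variable ratio is the simplest special case of this idea. All names and coefficients below are hypothetical, and the talk's results go well beyond this setting.

```python
# Identify a causal effect from cross-moments of observed variables under
# latent confounding: the IV ratio Cov(z, y) / Cov(z, x). Illustrative only.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
u = rng.normal(size=n)                      # latent confounder (unobserved)
z = rng.normal(size=n)                      # observed instrument
x = 1.0 * z + 0.7 * u + rng.normal(size=n)  # treatment
y = 2.0 * x + 0.9 * u + rng.normal(size=n)  # outcome; true effect = 2.0

# Naive regression of y on x is biased by u; the cross-moment ratio is not.
naive = np.cov(x, y)[0, 1] / np.var(x)
iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]
print(f"naive: {naive:.2f}, cross-moment (IV): {iv:.2f}")  # ~2.25 vs ~2.00
```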

10.06.2024 10:30 Adèle Ribeiro (Philipps-Universität Marburg): Recent Advances in Causal Inference under Limited Domain Knowledge

One pervasive task found throughout the empirical sciences is determining the effect of interventions from observational (non-experimental) data. It is well understood that assumptions are necessary to perform causal inferences, and these are commonly articulated through causal diagrams (Pearl, 2000). Despite the power of this approach, there are settings where the knowledge necessary to fully specify a causal diagram may not be available, particularly in complex, high-dimensional domains. In this talk, I will briefly present two recent causal effect identification results that relax the stringent requirement of fully specifying a causal diagram. The first is a new graphical modeling tool called cluster DAGs (C-DAGs for short), which allows the specification of relationships among clusters of variables while the relationships between the variables within a cluster are left unspecified [1]. The second is a complete calculus and algorithm for effect identification from a partial ancestral graph (PAG), which represents a Markov equivalence class of causal diagrams, fully learnable from observational data [2]. These approaches are expected to help researchers and data scientists identify novel effects in real-world domains, where knowledge is largely unavailable and coarse.

References:
[1] Anand, T. V., Ribeiro, A. H., Tian, J., & Bareinboim, E. (2023). Causal effect identification in cluster DAGs. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 37, No. 10, pp. 12172-12179.
[2] Jaber, A., Ribeiro, A., Zhang, J., & Bareinboim, E. (2022). Causal identification under Markov equivalence: Calculus, algorithm, and completeness. Advances in Neural Information Processing Systems, 35, 3679-3690.
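A toy illustration of the C-DAG idea from [1]: only the cluster-level structure is needed for adjustment, while the structure within the cluster may remain unspecified. The simulated model below is our own assumption, not an example from the paper.

```python
# The cluster Z = {Z1, Z2} confounds X -> Y. For adjustment it suffices to
# know the cluster-level edges Z -> X and Z -> Y; the intra-cluster edge
# Z1 -> Z2 (unspecified in the C-DAG) never enters the computation.
import numpy as np

rng = np.random.default_rng(7)
n = 100_000
z1 = rng.normal(size=n)
z2 = 0.5 * z1 + rng.normal(size=n)           # intra-cluster structure
x = 0.8 * z1 - 0.4 * z2 + rng.normal(size=n)
y = 1.0 * x + 0.6 * z1 + 0.3 * z2 + rng.normal(size=n)  # true effect = 1.0

# Adjust for the whole cluster at once via least squares.
a = np.column_stack([x, z1, z2])
coef, *_ = np.linalg.lstsq(a, y, rcond=None)
print(f"effect of X adjusting for the cluster Z: {coef[0]:.2f}")  # ~1.0
```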

05.06.2024 12:15 Han Li (The University of Melbourne): Constructing hierarchical time series through clustering: Is there an optimal way for forecasting?

Forecast reconciliation has attracted significant research interest in recent years, with most studies taking the hierarchy of time series as given. We extend existing work that uses time series clustering to construct hierarchies, with the goal of improving forecast accuracy. First, we investigate multiple approaches to clustering, including not only different clustering algorithms, but also the way time series are represented and how distance between time series is defined. Second, we devise an approach based on random permutation of hierarchies, keeping the structure of the hierarchy fixed while time series are randomly allocated to clusters. Third, we propose an approach based on averaging forecasts across hierarchies constructed using different clustering methods, which is shown to outperform any single clustering method. Our findings provide new insights into the role of hierarchy construction in forecast reconciliation and offer valuable guidance for forecasting practice.
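A minimal sketch of one ingredient discussed in the talk, constructing a hierarchy by clustering series on a correlation distance. The feature representation, distance, linkage, and the naive forecasts below are all illustrative choices, not the paper's method.

```python
# Build a two-level hierarchy by agglomerative clustering on correlation
# distance, then produce naive bottom-up cluster-level "forecasts".
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(2)
T, m = 120, 8
series = np.cumsum(rng.normal(size=(m, T)), axis=1)  # m toy random walks

# Distance between series: 1 - Pearson correlation (condensed form).
corr = np.corrcoef(series)
dist = 1.0 - corr[np.triu_indices(m, k=1)]
labels = fcluster(linkage(dist, method="average"), t=3, criterion="maxclust")

# Aggregate each cluster; bottom-up "forecast" = last observed total (naive).
for c in np.unique(labels):
    members = series[labels == c]
    cluster_total = members.sum(axis=0)
    print(f"cluster {c}: {members.shape[0]} series, "
          f"naive next-step forecast {cluster_total[-1]:.2f}")
```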

15.05.2024 17:00 Richard Samworth (University of Cambridge): Optimal convex M-estimation via score matching

In the context of linear regression, we construct a data-driven convex loss function with respect to which empirical risk minimisation yields optimal asymptotic variance in the downstream estimation of the regression coefficients. Our semiparametric approach targets the best decreasing approximation of the derivative of the log-density of the noise distribution. At the population level, this fitting process is a nonparametric extension of score matching, corresponding to a log-concave projection of the noise distribution with respect to the Fisher divergence. The procedure is computationally efficient, and we prove that it attains the minimal asymptotic covariance among all convex M-estimators. As an example of a non-log-concave setting, for Cauchy errors, the optimal convex loss function is Huber-like, and our procedure yields an asymptotic efficiency greater than 0.87 relative to the oracle maximum likelihood estimator of the regression coefficients that uses knowledge of this error distribution; in this sense, we obtain robustness without sacrificing much efficiency.
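A small simulation of the robustness phenomenon described above, using a fixed Huber loss rather than the data-driven convex loss constructed in the talk; the setup is ours.

```python
# With Cauchy errors, minimising a Huber-like convex loss estimates the
# regression coefficient far better than least squares.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n = 5_000
x = rng.normal(size=n)
y = 2.0 * x + rng.standard_cauchy(size=n)    # true coefficient = 2.0

def huber_risk(beta, delta=1.0):
    r = np.abs(y - beta * x)
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta)).mean()

ols = (x @ y) / (x @ x)                      # unstable under heavy tails
hub = minimize(huber_risk, x0=np.array([0.0])).x[0]
print(f"OLS: {ols:.2f}, Huber M-estimator: {hub:.2f}")  # Huber ~2.0
```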

13.05.2024 15:15 Chandler Squires (MIT, Cambridge): Decision-centric causal structure learning: An algorithm for data-driven covariate adjustment

When learning a causal model of a system, a key motivation is the use of that model for downstream decision-making. In this talk, I will take a decision-centric perspective on causal structure learning, focused on a simple setting that is amenable to careful statistical analysis. In particular, we study causal effect estimation via covariate adjustment, when the causal graph is unknown, all variables are discrete, and the non-descendants of treatment are given.

We propose an algorithm which searches for a data-dependent "approximate" adjustment set via conditional independence testing, and analyze the bias-variance tradeoff entailed by this procedure. We prove matching upper and lower bounds on omitted confounding bias in terms of small violations of conditional independence. Further, we provide a finite-sample bound on the complexity of correctly selecting an "approximate" adjustment set and of estimating the resulting adjustment functional, using results from the property testing literature.

We demonstrate our algorithm on synthetic and real-world data, outperforming methods which ignore structure learning or which perform structure learning separately from causal effect estimation. I conclude with some open questions at the intersection of structure learning and causal effect estimation.
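A sketch of the plug-in adjustment functional for discrete data, with the adjustment set taken as given; the talk's contribution is the search for and analysis of an "approximate" adjustment set, which is not reproduced here. Names and probabilities are illustrative.

```python
# Plug-in estimate of E[Y | do(T=t)] = sum_z P(z) * E[Y | T=t, Z=z]
# for discrete data with a known adjustment set Z.
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
n = 50_000
z = rng.integers(0, 2, size=n)                       # binary covariate
t = (rng.random(n) < 0.3 + 0.4 * z).astype(int)      # treatment depends on z
y = (rng.random(n) < 0.2 + 0.3 * t + 0.2 * z).astype(int)
df = pd.DataFrame({"z": z, "t": t, "y": y})

def adjusted_mean(df, t_val):
    pz = df["z"].value_counts(normalize=True)        # marginal P(z)
    cond = df[df["t"] == t_val].groupby("z")["y"].mean()  # E[Y | t, z]
    return float((pz * cond).sum())

ate = adjusted_mean(df, 1) - adjusted_mean(df, 0)
print(f"adjusted ATE estimate: {ate:.3f}")           # close to the true 0.3
```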

26.03.2024 13:00 Tobias Boege (KTH Royal Institute of Technology, Stockholm): Colored Gaussian DAG models

Colored Gaussian DAG models generalize linear structural equation models by allowing additional equalities to be specified among the error variances and regression coefficients. We show that these models are smooth manifolds and give a characterization of their vanishing ideals up to a saturation. We also initiate the study of faithfulness and structural identifiability. Our results are facilitated by an in-depth analysis of parameter identification maps for ordinary Gaussian DAG models, and our techniques carry over easily to other classes of rationally identifiable statistical models. This is joint work with Kaie Kubjas, Pratik Misra and Liam Solus.
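A minimal worked example (ours, not from the talk) of how a color constraint creates an algebraic relation among covariances: take the DAG 1 -> 2 with both error variances colored equal.

```latex
% The colored DAG 1 -> 2 with regression coefficient \lambda and the color
% constraint that both error variances equal \sigma^2:
\[
X_1 = \varepsilon_1, \qquad X_2 = \lambda X_1 + \varepsilon_2,
\qquad \operatorname{Var}(\varepsilon_1) = \operatorname{Var}(\varepsilon_2) = \sigma^2 ,
\]
\[
\Sigma = \begin{pmatrix} \sigma^2 & \lambda\sigma^2 \\
\lambda\sigma^2 & \lambda^2\sigma^2 + \sigma^2 \end{pmatrix}
\quad\Longrightarrow\quad
\sigma_{11}\sigma_{22} - \sigma_{12}^2 - \sigma_{11}^2 = 0 .
\]
% The vanishing ideal of this colored model therefore contains a polynomial
% that the uncolored model (arbitrary error variances) does not impose.
```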

20.03.2024 12:15 Yichen Zhu (Università Bocconi, Milano): Posterior Contraction Rates for Vecchia Approximations of Gaussian Processes

Gaussian Processes (GP) are widely used to model spatial dependency in geostatistical data, yet exact Bayesian inference has an intractable time complexity of $O(n^3)$. The Vecchia approximation has become a popular solution to this computational issue, characterizing spatial dependency by a sparse directed acyclic graph (DAG) that allows scalable Bayesian inference. Despite its popularity in practice, little is understood about its theoretical properties. In this paper, we systematically study the posterior contraction rates of Vecchia approximations of GPs. Under minimal regularity conditions, we prove that, with an appropriate selection of the underlying DAG, the Vecchia-approximated GP possesses the same posterior contraction rate as the mother GP. Therefore, with optimal choices of the tuning hyper-parameters, the Vecchia approximation achieves the minimax contraction rate, providing strong frequentist guarantees for the procedure. Our theoretical findings are also demonstrated numerically using synthetic and real-world data sets.
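A compact sketch of the Vecchia idea: approximate the joint Gaussian density by a product of univariate conditionals, each conditioning on a small set of previously ordered neighbours. The kernel, ordering, and neighbour rule below are illustrative assumptions.

```python
# Vecchia log-likelihood: p(y) ~ prod_i p(y_i | y_{N(i)}), where N(i) is a
# small set of already-ordered nearest neighbours of point i.
import numpy as np

rng = np.random.default_rng(5)
n, k = 500, 10                               # n points, k neighbours
s = np.sort(rng.random(n))                   # 1-D locations, ordered
cov = np.exp(-np.abs(s[:, None] - s[None, :]) / 0.2)  # exponential kernel
y = np.linalg.cholesky(cov + 1e-8 * np.eye(n)) @ rng.normal(size=n)

def vecchia_loglik(y, cov, k):
    ll = 0.0
    for i in range(len(y)):
        nb = list(range(max(0, i - k), i))   # k nearest previous points
        if nb:
            c_nn = cov[np.ix_(nb, nb)]
            c_in = cov[i, nb]
            w = np.linalg.solve(c_nn, c_in)  # kriging weights
            mu = w @ y[nb]
            var = cov[i, i] - w @ c_in
        else:
            mu, var = 0.0, cov[i, i]
        ll += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mu) ** 2 / var)
    return ll

print(f"Vecchia log-likelihood (k={k}): {vecchia_loglik(y, cov, k):.1f}")
```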

13.03.2024 12:30 Bryon Aragam (University of Chicago): Optimal structure learning in structural equation models

We study the optimal sample complexity of structure learning in Gaussian structural equation models. In the first part of the talk, we compare the complexity of structure learning via the PC algorithm and distribution learning via the Chow-Liu algorithm in directed polytrees. We will show how both algorithms are optimal under different assumptions, and lead to different statistical complexities. Moving beyond polytrees, we then investigate the problem of neighbourhood selection, which is an important primitive when learning the overall structure of a graphical model. We will introduce a new estimator, called klBSS, and compare its performance to best subset selection (BSS). We show by example that, even when the structure is unknown, the existence of underlying structure can reduce the sample complexity of neighbourhood selection compared to classical methods such as BSS and the Lasso.
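For orientation, a sketch of classical Lasso-based neighbourhood selection (one of the baselines the talk compares against; klBSS itself is not sketched here). The toy chain model is our assumption.

```python
# Meinshausen-Buhlmann-style neighbourhood selection: regress each node on
# all others with the Lasso; nonzero coefficients indicate neighbours.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(6)
n, p = 2_000, 5
# Toy Gaussian chain X0 - X1 - X2 - X3 - X4: neighbours are adjacent nodes.
x = np.zeros((n, p))
x[:, 0] = rng.normal(size=n)
for j in range(1, p):
    x[:, j] = 0.6 * x[:, j - 1] + rng.normal(size=n)

for j in range(p):
    others = np.delete(np.arange(p), j)
    beta = LassoCV(cv=5).fit(x[:, others], x[:, j]).coef_
    nbrs = [int(o) for o, b in zip(others, beta) if abs(b) > 0.05]
    print(f"node {j}: estimated neighbours {nbrs}")
```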

28.02.2024 11:45 Søren Wengel Mogensen (Lund University): Graphical models of local independence in stochastic processes

Graphs are often used as representations of conditional independence structures of random vectors. In stochastic processes, one may use graphs to represent so-called local independence. Local independence is an asymmetric notion of independence which describes how a system of stochastic processes (e.g., point processes or diffusions) evolves over time. Let A, B, and C be three subsets of the coordinate processes of the stochastic system. Intuitively speaking, B is locally independent of A given C if at every point in time knowing the past of both A and C is not more informative about the present of B than knowing the past of C only. Directed graphs can be used to describe the local independence structure of the stochastic processes using a separation criterion which is analogous to d-separation. In such a local independence graph, each node represents an entire coordinate process rather than a single random variable.

In this talk, we will describe various properties of graphical models of local independence and then turn our attention to the case where the system is only partially observed, i.e., some coordinate processes are unobserved. In this case, one can use so-called directed mixed graphs to describe the local independence structure of the observed coordinate processes. Several directed mixed graphs may describe the same local independence model, and therefore it is of interest to characterize such equivalence classes of directed mixed graphs. It turns out that directed mixed graphs satisfy a certain maximality property which allows one to construct a simple graphical representation of an entire Markov equivalence class of marginalized local independence graphs. This is convenient as the equivalence class can be learned from data and its graphical representation concisely describes what underlying structure could have generated the observed local independencies.

Deciding Markov equivalence of two directed mixed graphs is computationally hard, and we introduce a class of equivalence relations that are weaker than Markov equivalence, i.e., lead to larger equivalence classes. The weak equivalence classes enjoy many of the same properties as the Markov equivalence classes, and they provide a computationally feasible framework while retaining a clear interpretation. We discuss how this can be used for graphical modeling and causal structure learning based on local independence.
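One common way to formalize the intuitive definition above, for counting processes with intensities (the notation is ours; the talk's framework is more general):

```latex
% B is locally independent of A given C if, for every coordinate b in B
% and all times t,
\[
\mathbb{E}\bigl[\lambda^{b}_{t} \,\big|\, \mathcal{F}^{A \cup C}_{t}\bigr]
  = \mathbb{E}\bigl[\lambda^{b}_{t} \,\big|\, \mathcal{F}^{C}_{t}\bigr],
\]
% where \lambda^{b} is the intensity of coordinate process b and
% \mathcal{F}^{S}_{t} is the history generated by the coordinates in S up
% to time t: the past of A adds nothing about the present of B beyond what
% the past of C already provides.
```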

05.02.2024 14:15 Michael Joswig (TU Berlin): What Is OSCAR?

The OSCAR project is a collaborative effort to shape a new computer algebra system, written in Julia. OSCAR is built on top of the four "cornerstone systems" ANTIC (for number theory), GAP (for group and representation theory), polymake (for polyhedral and tropical geometry) and Singular (for commutative algebra and algebraic geometry). We present examples to showcase the current version 0.14.0. This is joint work with The OSCAR Development Team, currently led by Wolfram Decker, Claus Fieker, Max Horn and Michael Joswig.

Interested participants can also install OSCAR before the workshop. More information about the installation can be found here: https://www.oscar-system.org/install/

05.02.2024 15:30 Antony Della Vecchia (TU Berlin): OSCAR demo + The mrdi File Format

After a demo of the OSCAR system, we introduce the mrdi file format and discuss the advantages of using serialization for collaborative work and scientific research. We demonstrate how users can benefit from OSCAR's built-in serialization mechanism, which employs that file format. Key applications include the reproduction of mathematical results computed with OSCAR and the interoperability between OSCAR and other software applications.

For talks more than 180 days ago, please have a look at the Munich Mathematical Calendar (filter: "Oberseminar Statistics and Data Science").