Annie Gray

PhD student in statistics
School of Mathematics, University of Bristol

annie.gray@bristol.ac.uk

Summary of research

I'm a PhD student at the Centre for Doctoral Training in Computational Statistics and Data Science at the University of Bristol, supervised by Professor Patrick Rubin-Delanchy and Professor Nick Whiteley.

My research aims to provide a general statistical grounding for manifold structure in high-dimensional data, and to demonstrate that rich topological and geometric structure can emerge from simple, generic statistical assumptions involving correlations and latent variables. This work sheds light on why PCA is effective for reducing data from high to moderate dimension before clustering, topological data analysis, nonlinear dimension reduction, regression and classification. Recently, we have been using these insights to recover hidden tree structure in data via hierarchical clustering with dot products.

Code for this research can be found on my GitHub.

Papers

Matrix factorisation and the interpretation of geodesic distance. Nick Whiteley, Annie Gray, Patrick Rubin-Delanchy. NeurIPS, 2021.

Discovering latent topology and geometry in data: a law of large dimension. Nick Whiteley, Annie Gray, Patrick Rubin-Delanchy. arXiv:2208.11665, 2022.

Hierarchical clustering with dot products recovers hidden tree structure. Annie Gray, Alexander Modell, Patrick Rubin-Delanchy, Nick Whiteley. arXiv:2305.15022. Accepted at NeurIPS (spotlight), 2023.

Applications

Through an internship and ongoing collaboration with Microsoft Research Special Projects, working in Human Rights Technology, I have been involved in developing and implementing techniques for detecting and understanding risk from relationships in large-scale datasets. One application of this work is identifying corruption in public procurement data using multiple data sources that describe relationships between companies. The open-source code for this project can be found here.