statsgen
emitanaka@fosstodon.org
emitanaka
emi.tanaka@anu.edu.au
Australian National University, 46 Sullivan’s Creek Road, R. N. Robertson Building, Room S307, Acton, ACT 2601, Australia
Hi there!
I’m an academic statistician at the Australian National University (ANU), living in Canberra, with a passion for data science and open-source software. I’m currently the Deputy Director at the Biological Data Science Institute and an Executive Editor of the R Journal. I am also affiliated with the ANU Research School of Finance, Actuarial Studies and Statistics, which I visit weekly.
My primary interest is to develop impactful methods and tools that can be readily used by practitioners. I enjoy working in a collaborative environment with people from diverse backgrounds, with the aim of enhancing our knowledge and understanding of real-world data. I work across multiple disciplines to bridge statistical concepts and findings to a broad range of individuals. To this end, I have developed numerous open-source tools, primarily as R packages, and resources aimed at making statistical methods accessible to a diverse audience. My proudest work to date is the edibble R package, where I reframe the specification of an experimental design through the so-called “grammar of experimental designs” (the “words” are the fundamental components of a comparative experiment, e.g. units, treatments and their relationships, and a design is expressed as a “sentence” by stringing together “words” according to grammatical rules).
I take a proactive approach to community development. I’m actively involved in the Statistical Society of Australia (SSA), the International Biometric Society (Australasian Region) and other committees. I recently completed my term as President of the SSA Victorian Branch, where I led the creation of the Di Cook Award (an open-source statistical software award for students) and the launch of the Statistical Computing and Visualisation Section within SSA. My contributions have been recognised by the SSA Distinguished Presenter’s Award (2021) and SSA Leadership in Statistics (2022), and I was featured on the list of 60 prominent Australian statisticians in Significance magazine (2023).
My research interests are:
- Experimental design and optimal design,
- Mixed models (also known as multi-level models, panel data models or hierarchical models),
- Data visualisation and visual inference,
- Applications in bioinformatics and selective breeding (particularly plant breeding),
- Statistical software development, particularly in R,
- Machine learning, computer vision and large language models,
- Statistical practice and workflow (encompassing reproducibility, design, infrastructure and communication).
I speak English, Japanese (conversational) and R (base + tidyverse) fluently. I’ve lived in Australia almost all of my life (±10 years standard deviation) and my university major was mathematics and statistics. Most of the code I wrote during my PhD was in Python and Bash (I am now rusty in both). I also like to dabble in front-end web development (HTML/CSS/JS). I’ve developed web apps using Shiny and I practise computational reproducibility using Git, R Markdown (now Quarto) and internal DSLs (e.g. the targets and renv R packages). I’m a very data-oriented person, to the extent that I’ve collected my own data via web scraping and APIs. I adapt quickly and am quite open to taking on new challenges – occasionally, I do text analysis and take part in machine learning competitions. I’ve now also started to do research in computer vision and large language models.
I am a big advocate of open science and an avid research software engineer. My code is mostly available on my GitHub profile, where I also host a number of R packages that I have developed.
This webpage is made using Quarto.
Contact
Please note that I will not respond to emails unless I can identify who you are (i.e. the email is not anonymous and states your affiliation).