Dr. Thomas Winterbottom
Machine learning engineer with a PhD in multimodal machine learning, 3 years of industry experience, and a further 6+ years across research and applied ML systems.
Currently working at Kromek, UK.
Delivered ML systems for production in physics and healthcare domains. Emphasis on robustness, dataset quality, and deployment under real-world constraints.
I think it's important to maintain strong mathematical and computer science foundations. Interested in linear algebra, multivariable calculus, probability, statistics, and optimisation. Moving towards manifold learning =).
Equally important to keep on top of computer science fundamentals: complexity, profiling, and systems/server management.
Recently moving into performance-aware ML engineering: CUDA, numerical stability, ExecuTorch.
Capabilities
Selected Projects
CUDA MLEM Compton Reconstruction
Implemented a CUDA-based Maximum Likelihood Expectation Maximisation (MLEM) reconstruction pipeline for radiation detection systems.
- Achieved ~2× speedup over the existing multi-threaded Julia implementation
- Optimised kernel execution via fusion and memory access pattern redesign
- Performance guided by Nsight profiling and bottleneck analysis
- Work conducted in proprietary context (Kromek); CUDA learning pre-work available below
CUDA learning pathway: github.com/Jumperkables/cuda-benchmarking
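For readers unfamiliar with MLEM, the core multiplicative update can be sketched in a few lines of NumPy. This is the generic textbook iteration under stated assumptions, not the proprietary Kromek pipeline or its CUDA kernels: the dense system matrix `A`, the measured counts `y`, the function name `mlem`, and the toy data are all illustrative.

```python
import numpy as np

def mlem(A, y, n_iters=50, eps=1e-12):
    """Generic MLEM iteration for the model y ~ Poisson(A @ x).

    A: (n_measurements, n_voxels) non-negative system matrix (assumed dense here).
    y: (n_measurements,) non-negative measured counts.
    """
    sensitivity = A.sum(axis=0)                # A^T 1, per-voxel sensitivity
    x = np.ones(A.shape[1])                    # flat, strictly positive start
    for _ in range(n_iters):
        projection = A @ x                     # forward projection
        ratio = y / np.maximum(projection, eps)
        # Back-project the ratio and apply the multiplicative update.
        x = x * (A.T @ ratio) / np.maximum(sensitivity, eps)
    return x

# Toy, noiseless example with made-up data.
rng = np.random.default_rng(0)
A = rng.random((40, 8))
x_true = rng.random(8) * 10
y = A @ x_true
x_hat = mlem(A, y, n_iters=200)
```

The update is multiplicative, so the estimate stays non-negative, and each iteration preserves the total projected counts. In a GPU version the forward projection, ratio, and back-projection are the natural candidates for kernel fusion.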
ARM Deployment of PyTorch Models (ExecuTorch)
Explored deployment of PyTorch models onto ARM-based microcontrollers (i.MX RT1064) using ExecuTorch.
- Managed cross-compilation toolchains and embedded build systems
- Worked through low-level dependency and runtime constraints
- Focused on inference feasibility in resource-constrained environments
5MAP — 5 Modalities of Air Pollution Data
Data engineering for an air pollution project combining global air quality, meteorological, and geospatial datasets across 14k+ monitoring sites.
- Engineered large-scale geospatial data pipelines combining satellite imagery, GRIB weather data, and OpenStreetMap features
- Built scraping and processing infrastructure for global grid-based environmental datasets
- Designed alignment and preprocessing strategies for heterogeneous multi-source environmental data
- Achieved near-zero sparsity across modalities in a dataset significantly larger than comparable open-source benchmarks
Dataset: github.com/Jumperkables/clean_air
Home ML Systems Infrastructure
Built and maintain a personal compute and data infrastructure environment for ML experimentation and deployment.
- 18U server rack with compute, NAS, VPN, and routing stack
- Multi-GPU setup for VRAM-heavy experimentation and mixed precision practice
- Self-hosted storage with RAIDZ2-backed redundancy
- Fine-tuned local inference models including text-to-speech systems (voices include Victor Saltzpyre and Gul’dan)
Repository: github.com/Jumperkables/sigmarvis
Fuzzy Label Representations via Neurolinguistic Word Norms
Investigated replacing one-hot encodings with similarity-based fuzzy label representations derived from neurolinguistic word norms.
- Applied to VQA classification settings within thesis work
- Observed negative results, highlighting structural limitations of the approach
- Planning to extend the concept to softmax and LLM integration via CUDA kernels
Repository: github.com/Jumperkables/llm_wordnorms
LLM Repository: github.com/Jumperkables/llm_wordnorms
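The idea of replacing one-hot targets with similarity-based fuzzy labels can be illustrated with a minimal sketch. It assumes each answer class carries a single scalar word-norm score (for example concreteness) and that similarity is a softmax over negative score distance. The scores, the `temperature` parameter, and the function names are hypothetical and are not taken from the thesis or the repository.

```python
import numpy as np

def fuzzy_targets(norms, temperature=0.1):
    """Turn per-class word-norm scores into soft label distributions.

    Row i is the target distribution used when class i is the true label.
    """
    # Classes with closer norm scores are treated as more similar.
    dist = np.abs(norms[:, None] - norms[None, :])
    logits = -dist / temperature
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def soft_cross_entropy(logits, targets):
    """Cross-entropy between model logits and soft target distributions."""
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -(targets * log_probs).sum(axis=1).mean()

# Hypothetical scores for three answer classes.
norms = np.array([0.95, 0.90, 0.20])
T = fuzzy_targets(norms)

labels = np.array([0, 2])                          # true classes for two samples
model_logits = np.array([[2.0, 1.0, -1.0],
                         [0.0, 0.0, 3.0]])
loss = soft_cross_entropy(model_logits, T[labels])
```

Each row of `T` still places the most mass on the true class, but leaks probability to classes with similar norm scores, so a near-miss prediction is penalised less than an unrelated one.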