Projects
Professional Work
A brief overview of the kinds of projects I've worked on professionally.
- Data Pipelines for Construction Projects (2023/24): Developed and optimized data pipelines using PySpark and AWS Glue to integrate and analyze data about construction projects. Refactored an existing codebase to improve efficiency, automation, maintainability, and test coverage.
- Analysis of Large-Scale Insurance Data (2022/23): Rebuilt a failed data lake project for large-scale insurance data from scratch, integrating multiple data sources and applying dynamic quality checks. With the help of Polars and test-driven development (TDD), we brought it into production in less than six months.
- Digital Twin for Public Transport with Reinforcement Learning (2021/22): Simulated hundreds of vehicles in a public transport network using real-time data, providing continuous intervention recommendations for delays.
- Detection and Analysis of Defective Vehicle Parts (2019-2021): Trained custom ML models to classify defective parts, aiming to detect production errors early. Developed data pipelines with Spark, managed workflows with Airflow, and handled model management with MLflow.
- Vehicle Diagnostic Data (2018-2019): Extracted and standardized data from various sources in the automotive industry, mainly concerning electronic control units (ECUs) and vehicle diagnostics.
- Machine Learning Platform (2015-2019): Developed an ML platform as a web service, with heavy customizations for several customers. I started this project from scratch while I was still a master's student, which allowed me to hone my Python skills and get comfortable with pandas, scikit-learn, and Keras. As of 2018, we had projects for three customers in production, and the code base was getting frequent updates.
- Semantic Web (2014-2018): Building knowledge graphs and other semantic data structures based on Wikipedia and Wikidata.
Pet Projects
Here you'll find a non-exhaustive list of software-related projects I've built or contributed to in my free time or as part of job-related open-source activity.
- cookiecutter-server (2021): a local development server to get live previews of cookiecutter templates (pypi)
- personio-py (2020-2024): a lightweight Personio API client library for Python (docs, pypi)
- AT Python Template (2020-2023): the official Python Project Template of Alexander Thamm GmbH - designed to bridge the gap between exploratory work and production-ready projects.
- Epic Search (2017): a session-based semantic search engine, written as part of my master's thesis.
- Bigger Train Stations (2017): a mod for the game Transport Fever. It adds more options to the train stations available in the game, like more tracks and longer platforms. You can get it from the Steam Workshop.
- maps4cim (2013): a real-world map generator for the traffic simulation game Cities in Motion 2. Relies on free geospatial data from the SRTM and OpenStreetMap.
- c4 (2013): An implementation of the game Connect Four in the cubic 4x4x4 version. It's written in Java, using jMonkeyEngine, a 3D game engine based on LWJGL. I never brought it to a state where I was comfortable publishing it, though...
- WoT omniscient Tables (2012): Detailed information about the vehicles in the computer game World of Tanks.
- VideoBatchProcessor (2011): A tool that scans hard drives for video files using custom criteria and creates a batch file of the selected videos, ready to be processed automatically by the open-source video transcoder HandBrake. Some of VideoBatchProcessor's features were later integrated into HandBrake.
- Catan (2011): An implementation of the classic board game Settlers of Catan, the result of a Java programming course in my third semester at LMU Munich. In a team of seven, we managed to implement a client-server application with a pretty user interface.
- FS-Location Hack (2010): Crawled the publicly accessible parts of the FS-Location database. The goal was to alert the users and operators of the social network to some privacy issues. Sadly, all the tools and documentation I wrote for this purpose are lost...
Check out my GitHub and Stack Overflow profiles to learn more.