Data Science Projects

Below are selected projects designed to demonstrate my technical skills and applied experience in data science, natural language processing, and statistical modeling. Each project highlights a specific set of tools and methods, and links to a live demo or interactive report where available.

Gender Bias in Language Prediction: Humans vs. Large Language Models

This multi-part portfolio project investigates how humans and large language models (LLMs) respond to gendered pronouns in political contexts, particularly when a pronoun follows a role noun, as in “the next president… she.”

The project extends a large-scale psycholinguistic experiment I conducted during my PhD, comparing human response patterns to model behavior and exploring whether predictive biases can be modified through targeted fine-tuning.

Key highlights:

Bayesian regression modeling with brms, applied to reaction time data from 2,000+ human participants
LLM surprisal analysis using HuggingFace models in Python
Interactive dashboards built with Shiny (R) and Streamlit (Python)
Strong focus on reproducibility, transparency, and structured reasoning

📄 View the live project website

📂 GitHub repository

🛠 Tools: R, brms, ggplot2, tidyverse, Python, HuggingFace, Streamlit, Shiny, Quarto, GitHub Pages

Till Poppels
