Problem
Classify tennis shots from video using pose trajectories instead of brittle pixel-only heuristics.
Approach
- MoveNet for per-frame human pose estimation.
- Sequence models (RNN, GRU, LSTM, BiLSTM), combined with attention and CNN encoders, to capture motion patterns.
- Built a custom dataset and an end-to-end ML pipeline aimed at near–real-time coaching and analytics.
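To make the pose-to-sequence idea concrete, here is a minimal, dependency-free sketch of two steps such a pipeline might include: normalizing MoveNet keypoints per frame, and pooling a frame sequence with softmax attention. It is an illustration under assumptions, not the thesis implementation. The function names (`normalize_frame`, `attention_pool`), the hip-centered torso-length normalization, and the hand-set `query` vector are all hypothetical. In a trained model the attention would typically run over RNN hidden states with a learned query rather than over raw features.

```python
import math

# MoveNet emits 17 COCO-ordered keypoints per frame as (y, x, score).
LEFT_SHOULDER, RIGHT_SHOULDER, LEFT_HIP, RIGHT_HIP = 5, 6, 11, 12


def normalize_frame(keypoints):
    """Center one frame on the hip midpoint and scale by torso length.

    keypoints: list of 17 (y, x, score) tuples. Returns a flat list of
    34 floats (y, x per joint); the score is dropped in this sketch.
    The normalization scheme is an assumption, not the thesis method.
    """
    hip_y = (keypoints[LEFT_HIP][0] + keypoints[RIGHT_HIP][0]) / 2
    hip_x = (keypoints[LEFT_HIP][1] + keypoints[RIGHT_HIP][1]) / 2
    sh_y = (keypoints[LEFT_SHOULDER][0] + keypoints[RIGHT_SHOULDER][0]) / 2
    sh_x = (keypoints[LEFT_SHOULDER][1] + keypoints[RIGHT_SHOULDER][1]) / 2
    # Fall back to 1.0 when shoulders and hips coincide, to avoid dividing by zero.
    torso = math.hypot(sh_y - hip_y, sh_x - hip_x) or 1.0
    features = []
    for y, x, _score in keypoints:
        features.extend([(y - hip_y) / torso, (x - hip_x) / torso])
    return features


def attention_pool(sequence, query):
    """Softmax attention over time.

    Each frame vector is scored by its dot product with `query`; the
    scores are softmax-normalized and used to average the frames.
    Returns (context_vector, attention_weights).
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in sequence]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(sequence[0])
    context = [
        sum(w * frame[i] for w, frame in zip(weights, sequence))
        for i in range(dim)
    ]
    return context, weights
```

In use, each video frame's keypoints would pass through `normalize_frame`, and the resulting sequence (or an encoder's hidden states computed from it) would be pooled into a single vector for a shot classifier. Centering and scaling by torso length is one common way to reduce sensitivity to camera distance and player position, though other normalizations are possible.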
Outcome
~94% accuracy on the thesis task, with a path toward real-world deployment that accounts for latency, data drift, and labeling constraints.