Interesting write-up.
Though, their methodology does seem a little strange.
If you cannot guarantee that the two pieces perform exactly the same before you change one element (in this case the angle, which actually means more than one change: gear angles, etc.), then your experiment is full of noise.
If they did a full DoE (design of experiments) on the watch and displayed the results, I'd be really interested.
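
To make "full DoE" a bit more concrete, here is a rough sketch of how a full factorial run list could be enumerated. The factor names and levels are made up by me for illustration and are not taken from the write-up; the point is only that every combination gets tested, so the effect of the angle can be separated from unit-to-unit differences.

```python
from itertools import product

# Hypothetical factors and levels, for illustration only (not from the write-up).
factors = {
    "angle_deg": [0, 45, 90],
    "unit": ["watch_A", "watch_B"],
    "temperature_c": [20, 35],
}

def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]

runs = full_factorial(factors)
print(len(runs))  # 3 * 2 * 2 = 12 runs
```

In practice you would also randomize the run order and repeat each run a few times, so that drift and measurement noise don't get mistaken for an effect.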