As far as I can tell, all existing certification tests are designed to allow a favored group of watches to pass. When the Poincon de Geneve decided that wire springs would be prohibited, you had better believe that first they checked to be sure their clients were not using wire springs. C.O.S.C. is the closest to providing an objective test, but they still don't test in six positions, and it's MUCH easier to get a watch to keep time in five positions than in six.
It's like the car insurance crash tests: the testing agency makes clear that the test will involve a collision from a certain angle, so the car makers build cars that are strong from that angle. Good luck, however, if your car is hit from a different angle.
I would very much like to see a test that really checks out what a watch can do. Example: at time of test, the watch must have been in its case and running for at least three years since its last maintenance; it is tested in 14 positions (2 flat, 4 vertical, 8 inclined), under a few different types of motion, and at a few different temperatures. The watch is then retested after being subjected to a magnetic field of reasonable strength.
Of course, right now if a company sends in a watch for testing and it fails, the watch simply is not sold (or is adjusted and retested). If we really want to make it fun, have the watch company send in watches in batches of 100, and have the test result reflect what percentage passed.
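To make the batch idea concrete, here is a rough sketch in Python of how such a score might be computed. Everything in it is an assumption for illustration: the names (WatchResult, watch_passes, batch_pass_percentage) are made up, and the 6 seconds/day limit is a placeholder, not a figure from any existing certification.

```python
from dataclasses import dataclass
from typing import List

# Assumed for illustration only: 14 test positions (2 flat, 4 vertical,
# 8 inclined) and a placeholder tolerance in seconds per day.
POSITIONS = 14
MAX_DEVIATION_S_PER_DAY = 6.0


@dataclass
class WatchResult:
    serial: str
    # Measured rate in seconds/day (gain positive, loss negative),
    # one value per test position.
    rates: List[float]


def watch_passes(result: WatchResult,
                 limit: float = MAX_DEVIATION_S_PER_DAY) -> bool:
    """A watch passes only if every position is within the limit."""
    if len(result.rates) != POSITIONS:
        raise ValueError("expected one rate per position")
    return all(abs(rate) <= limit for rate in result.rates)


def batch_pass_percentage(batch: List[WatchResult],
                          limit: float = MAX_DEVIATION_S_PER_DAY) -> float:
    """Score the whole batch, so failures count instead of disappearing."""
    if not batch:
        raise ValueError("batch is empty")
    passed = sum(1 for watch in batch if watch_passes(watch, limit))
    return 100.0 * passed / len(batch)
```

The point of scoring the batch rather than the individual watch is that a failed piece lowers the published number instead of quietly going back to the bench.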