Why do the sound and mouth movements not match up?

We use Voice over IP (VoIP) for the sound to avoid the echo problems that arose during our early events. This approach is more robust, but unfortunately it introduces a time lag, because the internet (Wi-Fi) connection cannot keep up with the speed of the phone.