I'm guessing it would entail each cube 'face' making a one-third rotation about its horizontal front-to-back axis, overlaid on the tesseract itself going through a one-quarter rotation?
I wonder whether that is easy to calculate. For one thing, the cube's one-third rotation would need to be relative to the viewer, and for another, its axis would change angle throughout the process...
We already know that a cube rotating in best orientation takes a one-third rotation to show an identical forward shape again (with the colours rotated). That is because there are three faces towards the front and three towards the back.
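The cube case can be checked with a few lines of Python. This is only a sketch, and it assumes "best orientation" means looking straight down the corner-to-corner diagonal (1, 1, 1) of a cube whose corners are at (±1, ±1, ±1). The helper name cycle_axes is made up for illustration; it swaps the axes round x -> y -> z -> x, which is the one-third (120 degree) turn about that diagonal.

```python
from itertools import product

def cycle_axes(v):
    # (x, y, z) -> (z, x, y): the 120-degree turn about the (1, 1, 1) diagonal
    return v[-1:] + v[:-1]

# The 8 corners of the cube
cube = set(product((-1, 1), repeat=3))
turned_once = {cycle_axes(v) for v in cube}

# Outward directions of the three faces pointing toward the viewer
front = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
front_after = [cycle_axes(n) for n in front]

# Same set of corners, so the outline the viewer sees is unchanged
same_outline = turned_once == cube

# Three turns in a row put every face back where it started
corner = (1, -1, 1)
back_home = cycle_axes(cycle_axes(cycle_axes(corner))) == corner
```

Each front face moves into the slot of the next one, so the shape is identical and only the colours have moved round.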
With a tesseract in best orientation there are four cube faces at the front and four cube faces at the back, which is why I assume it takes a quarter rotation to get back to the original shape (if coupled with the cube rotations).
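The same kind of check can be tried on the tesseract. Again this is a sketch under assumptions: corners at (±1, ±1, ±1, ±1), "best orientation" taken to mean looking down the (1, 1, 1, 1) diagonal, and the helpers cycle_axes, axis_cycle_sign and steps_to_return are hypothetical names. It tests the simplest candidate for the quarter step, which is cycling all four axes. One caveat shows up: that four-axis cycle has determinant -1, so on its own it is a quarter turn combined with a mirror flip rather than a pure rotation, whereas the three-axis cycle for the cube has determinant +1.

```python
from itertools import combinations, product

def cycle_axes(v):
    # Move every coordinate one slot along, wrapping the last to the front
    return v[-1:] + v[:-1]

def axis_cycle_sign(n):
    # Determinant of the n-axis cycle: +1 for an even number of inversions
    # (pure rotation), -1 for an odd number (rotation plus a mirror flip)
    p = cycle_axes(tuple(range(n)))
    inversions = sum(1 for i, j in combinations(range(n), 2) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def steps_to_return(v):
    # How many times cycle_axes must be applied before v comes back
    w, n = cycle_axes(v), 1
    while w != v:
        w, n = cycle_axes(w), n + 1
    return n

# The 16 corners of the tesseract
tesseract = set(product((-1, 1), repeat=4))
maps_to_itself = {cycle_axes(v) for v in tesseract} == tesseract

# Outward directions of the four cube faces pointing toward the viewer
front_cells = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
front_after = [cycle_axes(c) for c in front_cells]

steps = steps_to_return(front_cells[0])
cube_sign = axis_cycle_sign(3)
tesseract_sign = axis_cycle_sign(4)
```

So the four front cubes do swap places in a cycle of four and the shape matches at every step, but the sign suggests the four-step version is not quite the same kind of motion as the cube's one-third turn.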
But I'm guessing the rotations need to be overlaid simultaneously, otherwise the cube faces would overshoot their alignment?