See also Joris's list.
No (as of March 2019). Yes, it was published in March 2021.
No, and we have no plans to write any.
Short answer: we have no idea.
Longer answer:
Much internet commentary has focused on the quantity n_{0} = 2^{172912} appearing in the paper. This number is roughly 10^{214857091104455251940635045059417341952}.
Some people have asserted that the new algorithm only works for n ≥ n_{0}, and/or that the algorithm only becomes faster than previous algorithms for n ≥ n_{0}. These assertions are somewhat misleading.
For n < n_{0}, the new algorithm is by definition no faster than existing algorithms, because it calls an existing algorithm to do its work. The value n_{0} is simply an explicit threshold that was chosen to ensure that the inductive step of our proof of the complexity bound works out correctly. The actual value was chosen to make the proof as straightforward as possible; it was not chosen to optimise the overall running time.
On the other hand, for n slightly larger than n_{0}, the new algorithm might turn out to be either faster or slower than existing algorithms. This depends on, among other things, the implied big-O constants for both the new algorithm and the algorithm with which it is being compared. Since we do not know (yet) what these constants are, we cannot tell which is faster.
In any case, it is not really possible to make a fair comparison without first making a serious effort to optimise the new algorithm. There are many points where the algorithm could be tightened up (see for example the discussion in Section 5.4), and there are many parameters that could be tweaked (such as the threshold n_{0}, the dimension d, and the resampling parameter α). This is a very complex problem which is beyond the scope of the paper. Moreover, we observe that for implementations on actual computers, the efficiency of the new algorithm depends heavily on hardware characteristics; as the new algorithm reduces the bulk of the workload to additions and subtractions, it may be particularly well-suited for processors that perform hardware additions much faster than hardware multiplications.
Just a lucky coincidence!
If implemented in a reasonable way, and assuming O(1) processors, the space complexity should be O(n) bits.
Maybe: try this paper instead.