Well, I've been working hard and have made some major speedups and progress: modular addition, subtraction, and multiplication are now working.
On top of that, integer addition and subtraction are now ~5x as fast, and I've fixed a couple of bugs.
That means that adding/subtracting two ECC-sized numbers now takes just over a tenth of a millisecond.
I'll be pushing changes in a few moments.
I just need to get integer division working 100%, and then I can implement modular division. Once that's done, I can get straight to implementing ECC point addition and doubling. If anyone's willing to help, it would be greatly appreciated.
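For anyone wondering why modular division has to come before the point operations: both point addition and point doubling compute a slope, and that slope is a division modulo the field prime. Here is a rough Python sketch of the idea, not the project's actual code. It assumes a short Weierstrass curve y^2 = x^3 + A*x + B over a prime field, and the tiny parameters (P = 17, A = 2, B = 2) and all function names are made up for illustration. It also leans on Python's built-in `pow(x, -1, p)` (Python 3.8+) for the modular inverse, which a hand-rolled bignum library would have to implement itself.

```python
# Toy curve y^2 = x^3 + A*x + B over GF(P). Illustrative values only,
# far too small for real cryptography.
P, A, B = 17, 2, 2
INFINITY = None  # stand-in for the point at infinity


def mod_div(num, den, p=P):
    """Modular division: num * den^-1 (mod p). Requires Python 3.8+."""
    return (num * pow(den % p, -1, p)) % p


def point_double(pt):
    """Return 2*pt using the tangent-line slope."""
    if pt is INFINITY:
        return INFINITY
    x, y = pt
    if y % P == 0:
        return INFINITY  # vertical tangent
    s = mod_div(3 * x * x + A, 2 * y)  # slope needs modular division
    x3 = (s * s - 2 * x) % P
    y3 = (s * (x - x3) - y) % P
    return (x3, y3)


def point_add(p1, p2):
    """Return p1 + p2 using the chord slope (falls back to doubling)."""
    if p1 is INFINITY:
        return p2
    if p2 is INFINITY:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if (x1 - x2) % P == 0:
        if (y1 + y2) % P == 0:
            return INFINITY  # p2 is the negation of p1
        return point_double(p1)
    s = mod_div(y2 - y1, x2 - x1)  # slope needs modular division
    x3 = (s * s - x1 - x2) % P
    y3 = (s * (x1 - x3) - y1) % P
    return (x3, y3)


G = (5, 1)                 # a point on the toy curve
print(point_double(G))     # (6, 3)
print(point_add(G, (6, 3)))  # (10, 6)
```

Apart from the single division for the slope, everything here is modular addition, subtraction, and multiplication, so those existing routines would cover the rest.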