A Scalable Architecture For Hardware Acceleration Of Large Sparse Matrix Calculations