Multiplying matrices; order maters
I’m working on speeding up my did2s
package and came across a perfect example.
Version 1 below is the order I wrote in the current implementation.
I didn’t think too carefully and just wrote out the matrix product in the “natural order”.
In a typical example x1
and x10
are much larger than x2
, so
Matrix::solve(Matrix::crossprod(x10), Matrix::t(x10 * first_u))
takes a very long time
(it’s like ncol(x10)
regressions).
I played around and switched the order of multiplications and inversion (solve
) and saw an over 100x speedup!!
The following three rules help a lot with this:
Example ran on my M3 macbook pro:
library(Matrix)dim(x10)#> [1] 46500 1530dim(x1)#> [1] 46500 1530dim(x2)#> [1] 46500 40
library(tictoc)tic("v. 1")for (i in 1:5) { IF_FS_1 <- x2tx2_inv %*% crossprod(x2, x1) %*% solve(crossprod(x10), t(x10 * first_u))}toc()tic("v. 2")for (i in 1:5) { IF_FS_2 <- x2tx2_inv %*% t( (x10 * first_u) %*% solve(crossprod(x10), crossprod(x1, x2)) )}toc()tic("v. 3")for (i in 1:5) { IF_FS_3 <- tcrossprod( x2tx2_inv, (x10 * first_u) %*% solve(crossprod(x10), crossprod(x1, x2)) )}toc()tic("v. 4")for (i in 1:5) { IF_FS_4 <- x2tx2_inv %*% (t(solve( crossprod(x10), crossprod(x1, x2) )) %*% t((x10 * first_u)))}toc()tic("v. 5")for (i in 1:5) { IF_FS_5 <- (x2tx2_inv %*% t(solve( crossprod(x10), crossprod(x1, x2) ))) %*% t((x10 * first_u))}toc()expect_equal(IF_FS_1, IF_FS_2)expect_equal(IF_FS_2, IF_FS_3)expect_equal(IF_FS_3, IF_FS_4)expect_equal(IF_FS_4, IF_FS_5)
v. 1: 6.193 sec elapsedv. 2: 0.19 sec elapsedv. 3: 0.197 sec elapsedv. 4: 0.181 sec elapsedv. 5: 0.051 sec elapsed