Fig. 9

cuThomasBatch performance (execution time of gtsvStridedBatch divided by the execution time of cuThomasBatch) using single-precision (a, b) and double-precision (c, d) operations to solve multiple tridiagonal systems: 256–256 000 systems of size 64–512 (a, c) and 20–20 000 systems of size 1024–8192 (b, d).