Numerical methods and HPC
Open Access
Issue
Oil Gas Sci. Technol. – Rev. IFP Energies nouvelles
Volume 73, 2018
Article Number 63
Number of page(s) 15
DOI https://doi.org/10.2516/ogst/2018061
Published online 27 November 2018
  • National Institutes of Health. Brain Research through Advancing Innovative Neurotechnologies (BRAIN). https://www.braininitiative.nih.gov/about/index.htm.
  • Ecole Polytechnique Federale de Lausanne (EPFL). The Blue Brain Project. http://bluebrain.epfl.ch/.
  • Human Brain Project (HBP). European Commission Future and Emerging Technologies Flagship. https://www.humanbrainproject.eu/.
  • Diaz-Pier S., Naveau M., Butz-Ostendorf M., Morrison A. (2016) Automatic generation of connectivity for large-scale neuronal network models through structural plasticity, Front. Neuroanat. 10, 57. ISSN 1662-5129. doi: 10.3389/fnana.2016.00057.
  • Hines M. (1984) Efficient computation of branched nerve equations, Int. J. Bio-med. Comput. 15, 1, 69–76.
  • Conte S.D., De Boor C.W. (1980) Elementary numerical analysis: An algorithmic approach, 3rd edn., McGraw-Hill Higher Education.
  • Valero-Lara P., Pinelli A., Prieto-Matias M. (2014) Fast finite difference Poisson solvers on heterogeneous architectures, Comput. Phys. Commun. 185, 4, 1265–1272.
  • Valero-Lara P., Pinelli A., Favier J., Prieto-Matias M. (2012) Block tridiagonal solvers on heterogeneous architectures, in: 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA, Leganes, Madrid, Spain, July, pp. 609–616.
  • Davidson A.A., Zhang Y., Owens J.D. (2011) An auto-tuned method for solving large tridiagonal systems on the GPU, in: 25th IEEE International Symposium on Parallel and Distributed Processing, IPDPS, Anchorage, Alaska, USA, May, pp. 956–965.
  • Zhang Y., Cohen J., Owens J.D. (2010) Fast tridiagonal solvers on the GPU, in: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP, Bangalore, India, January, pp. 127–136.
  • NVIDIA. NVIDIA CUDA Toolkit documentation. http://docs.nvidia.com/cuda/cusparse/.
  • Ben-Shalom R., Liberman G., Korngreen A. (2013) Accelerating compartmental modeling on a graphical processing unit, Front. Neuroanat. 7, 4.
  • Stone H.S. (1973) An efficient parallel algorithm for the solution of a tridiagonal linear system of equations, J. ACM 20, 1, 27–38.
  • Valero-Lara P., Pelayo F.L. (2011) Towards a more efficient use of GPUs, in: International Conference on Computational Science and Its Applications, ICCSA 2011, Santander, Spain, June 20–23, pp. 3–9.
  • Valero-Lara P., Pelayo F.L. (2013) Analysis in performance and new model for multiple kernels executions on many-core architectures, in: IEEE 12th International Conference on Cognitive Informatics and Cognitive Computing, ICCI*CC 2013, New York, NY, USA, July 16–18, pp. 189–194.
  • Valero-Lara P. (2014) Multi-GPU acceleration of DARTEL (early detection of Alzheimer), in: 2014 IEEE International Conference on Cluster Computing, CLUSTER 2014, Madrid, Spain, September 22–26, pp. 346–354.
  • Valero-Lara P., Nookala P., Pelayo F.L., Jansson J., Dimitropoulos S., Raicu I. (2016) Many-task computing on many-core architectures, Scalable Computing: Practice and Experience 17, 1, 32–46.
  • Valero-Lara P., Martínez-Perez I., Peña A.J., Martorell X., Sirvent R., Labarta J. (2017) CuHines-batch: Solving multiple Hines systems on GPUs Human Brain Project*, in: International Conference on Computational Science, ICCS 2017, Zurich, Switzerland, June 12–14, pp. 566–575.
  • Cumming B. (2010) CoreNEURON overview. CSCS – Swiss National Supercomputing Center.
  • Zhang Y., Cohen J., Owens J.D. (2010) Fast tridiagonal solvers on the GPU, SIGPLAN Not. 45, 5, 127–136.
  • Kim H.-S., Wu S.Z., Chang L.W., Hwu W.W. (2011) A scalable tridiagonal solver for GPUs, in: 2013 42nd International Conference on Parallel Processing, pp. 444–453.
  • Sakharnykh N. (2010) Efficient tridiagonal solvers for ADI methods and fluid simulation, in: NVIDIA GPU Technology Conference, September.
  • Davidson A., Zhang Y., Owens J.D. (2011) An autotuned method for solving large tridiagonal systems on the GPU, in: IEEE International Parallel and Distributed Processing Symposium, May.
  • Valero-Lara P., Martínez-Pérez I., Sirvent R., Martorell X., Peña A.J. (2019) cuThomasBatch and cuThomasVBatch, CUDA Routines to compute batch of tridiagonal systems on NVIDIA GPUs, Concurrency and Computation: Practice and Experience.
  • Valero-Lara P., Martínez-Perez I., Sirvent R., Martorell X., Peña A.J. (2017) NVIDIA GPUs scalability to solve multiple (batch) tridiagonal systems implementation of cuThomasBatch, in: Parallel Processing and Applied Mathematics - 12th International Conference, PPAM2017, Lublin, Poland, Revised Selected Papers, Part I, September 10–13, pp. 243–253.
  • Vooturi D.T., Kothapalli K., Bhalla U.S. (2017) Parallelizing Hines matrix solver in neuron simulations on GPU, in: 24th IEEE International Conference on High Performance Computing, HiPC 2017, Jaipur, India, December 18–21, pp. 388–397.
  • Dongarra J.J., Hammarling S., Higham N.J., Relton S.D., Valero-Lara P., Zounon M. (2017) The design and performance of batched BLAS on modern high-performance computing systems, in: International Conference on Computational Science, ICCS 2017, Zurich, Switzerland, June 12–14, pp. 495–504.
  • Intel. Intel(R) Math Kernel Library - Introducing vectorized compact routines, https://software.intel.com/en-us/articles/intelr-math-kernel-library-introducing-vectorized-compact-routines.
  • Kim K.J., Costa T.B., Deveci M., Bradley A.M., Hammond S.D., Guney M.E., Knepper S., Story S., Rajamanickam S. (2017) Designing vector-friendly compact BLAS and LAPACK kernels, in: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2017, Denver, CO, USA, November 12–17, pp. 55:1–55:12.
  • Chang L.-W., Stratton J.A., Kim H.-S., Hwu W.-W. (2012) A scalable, numerically stable, high-performance tridiagonal solver using GPUs, in: SC Conference on High Performance Computing Networking, Storage and Analysis, SC '12, Salt Lake City, UT, USA, November 11–15, p. 27.
  • Zhang Y., Cohen J., Owens J.D. (2010) Fast tridiagonal solvers on the GPU, in: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2010, Bangalore, India, January 9–14, pp. 127–136.
  • László E., Giles M.B., Appleyard J. (2016) Many-core algorithms for batch scalar and block tridiagonal solvers, ACM Trans. Math. Softw. 42, 4, 31:1–31:36.