Kourtis K, Karakasis V, Goumas G, Koziris N (2011) CSX: an extended compression format for SpMV on shared memory systems. In: Proceedings of the 16th ACM SIGPLAN annual symposium on principles and practice of parallel programming (PPoPP’11), San Antonio, TX, USA, pp 12–16
Kourtis K, Goumas G, Koziris N (2008) Optimizing sparse matrix–vector multiplication using index and value compression. In: Proceedings of the 5th conference on computing frontiers (CF’08), Ischia, Italy, pp 87–96
Belgin M, Back G, Ribbens CJ (2009) Pattern-based sparse matrix representation for memory-efficient SMVM kernels. In: Proceedings of the 23rd international conference on supercomputing (ICS’09), Yorktown Heights, NY, USA, pp 100–109
Pichel JC, Heras DB, Cabaleiro JC, García-Loureiro AJ, Rivera FF (2010) Increasing the locality of iterative methods and its application to the simulation of semiconductor devices. Int J High Perform Comput Appl 24(2):136–153
Vuduc R, Demmel JW, Yelick KA (2005) OSKI: a library of automatically tuned sparse matrix kernels.
Jin H, Hood R, Mehrotra P (2009) A practical study of UPC using the NAS parallel benchmarks. In: Proceedings of the 3rd conference on partitioned global address space programming models (PGAS’09), Ashburn, Virginia, USA
El-Ghazawi T, Cantonnet F (2002) UPC performance and potential: a NPB experimental study. In: Proceedings of the 14th ACM/IEEE international conference for high performance computing, networking, storage and analysis (SC’02), Baltimore, MD, USA, pp 1–26
Chen WY, Bonachea D, Duell J, Husbands P, Iancu C, Yelick K (2003) A performance analysis of the Berkeley UPC compiler. In: Proceedings of the 17th international conference on supercomputing (ICS’03), San Francisco, CA, USA, pp 63–73
Zheng Y (2010) Optimizing UPC programs for multi-core systems.
Shan H, Blagojević F, Min SJ, Hargrove P, Jin H, Fuerlinger K, Koniges A, Wright NJ (2010) A programming model performance study using the NAS parallel benchmarks.
Mallón DA, Gómez A, Mouriño JC, Taboada GL, Teijeiro C, Touriño J, Fraguela BB, Doallo R, Wibecan B (2009) UPC performance evaluation on a multicore system. In: Proceedings of the 3rd conference on partitioned global address space programming models (PGAS’09)
El-Ghazawi T, Carlson W, Sterling T, Yelick K (2003) UPC: distributed shared-memory programming.
Berkeley UPC Project. (Last visit August 2014)
Petitet A, Whaley RC, Dongarra J, Cleary A (2014) HPL - a portable implementation of the high-performance linpack benchmark for distributed-memory computers. (Last visit August 2014)
Dongarra J, Heroux MA (2013) Toward a new metric for ranking high performance computing systems. Technical Report SAND2013-4744, Sandia National Laboratories, USA
Bailey DH, Barszcz E, Barton JT, Browning DS, Carter RL, Dagum L, Fatooh RA, Frederickson PO, Lasinski TA, Schreiber RS, Simon HD, Venkatakrishnan V, Weeratunga SK (1991) The NAS parallel benchmarks. Int J High Perform Comput Appl 5:63–73