Reading Resources: CGRA Accelerators for General-Purpose Computing

CGRA Architectures

  • Bingfeng Mei, M. Berekovic, and J. Y. Mignolet. “ADRES & DRESC: Architecture and compiler for coarse-grain reconfigurable processors.” In Fine- and coarse-grain reconfigurable computing, pp. 255-297. Springer, Dordrecht, 2007.
  • Hartej Singh, Ming-Hau Lee, Guangming Lu, Fadi J. Kurdahi, Nader Bagherzadeh, and Eliseu M. Chaves Filho. “MorphoSys: an integrated reconfigurable system for data-parallel and computation-intensive applications.” IEEE transactions on computers 49, no. 5 (2000): 465-481.
  • Manupa Karunaratne, Aditi Kulkarni Mohite, Tulika Mitra, and Li-Shiuan Peh. “HyCUBE: A CGRA with reconfigurable single-cycle multi-hop interconnect.” In 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC), pp. 1-6. IEEE, 2017.

Compilation Techniques for Mapping Loops on CGRAs

  • Hyunchul Park, Kevin Fan, Scott A. Mahlke, Taewook Oh, Heeseok Kim, and Hong-seok Kim. “Edge-centric modulo scheduling for coarse-grained reconfigurable architectures.” In Proceedings of the 17th international conference on Parallel architectures and compilation techniques, pp. 166-176. ACM, 2008.
  • Mahdi Hamzeh, Aviral Shrivastava, and Sarma Vrudhula. “REGIMap: register-aware application mapping on coarse-grained reconfigurable architectures (CGRAs).” In Proceedings of the 50th Annual Design Automation Conference, p. 18. ACM, 2013.
  • B. Ramakrishna Rau. “Iterative modulo scheduling.” International Journal of Parallel Programming 24, no. 1 (1996): 3-64.
  • Shail Dave, Mahesh Balasubramanian, and Aviral Shrivastava. “RAMP: Resource-aware mapping for CGRAs.” In 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC), pp. 1-6. IEEE, 2018.
  • Mahdi Hamzeh, Aviral Shrivastava, and Sarma Vrudhula. “EPIMap: Using epimorphism to map applications on CGRAs.” In DAC Design Automation Conference 2012, pp. 1280-1287. IEEE, 2012.
  • Liang Chen and Tulika Mitra. “Graph minor approach for application mapping on CGRAs.” ACM Transactions on Reconfigurable Technology and Systems (TRETS) 7, no. 3 (2014): 21.
  • Taewook Oh, Bernhard Egger, Hyunchul Park, and Scott Mahlke. “Recurrence cycle aware modulo scheduling for coarse-grained reconfigurable architectures.” In ACM SIGPLAN Notices, vol. 44, no. 7, pp. 21-30. ACM, 2009.

Techniques for Data Management of CGRA Accelerators

  • Yongjoo Kim, Jongeun Lee, Aviral Shrivastava, and Yunheung Paek. “Memory access optimization in compilation for coarse-grained reconfigurable architectures.” ACM Transactions on Design Automation of Electronic Systems (TODAES) 16, no. 4 (2011): 42.
  • Yongjoo Kim, Jongeun Lee, Aviral Shrivastava, Jonghee W. Yoon, Doosan Cho, and Yunheung Paek. “High throughput data mapping for coarse-grained reconfigurable architectures.” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 30, no. 11 (2011): 1599-1609.
  • Zhongyuan Zhao, Yantao Liu, Weiguang Sheng, Tushar Krishna, Qin Wang, and Zhigang Mao. “Optimizing the data placement and transformation for multi-bank CGRA computing system.” In 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1087-1092. IEEE, 2018.
  • Yansheng Wang, Leibo Liu, Shouyi Yin, Min Zhu, Peng Cao, Jun Yang, and Shaojun Wei. “On-chip memory hierarchy in one coarse-grained reconfigurable architecture to compress memory space and to reduce reconfiguration time and data-reference time.” IEEE Transactions on Very Large Scale Integration (VLSI) Systems 22, no. 5 (2013): 983-994.

Techniques for Executing Loops with Control Flow on CGRAs

  • Kyuseung Han, Junwhan Ahn, and Kiyoung Choi. “Power-efficient predication techniques for acceleration of control flow execution on CGRA.” ACM Transactions on Architecture and Code Optimization (TACO) 10, no. 2 (2013): 8.
  • Mahdi Hamzeh, Aviral Shrivastava, and Sarma Vrudhula. “Branch-aware loop mapping on CGRAs.” In Proceedings of the 51st Annual Design Automation Conference, pp. 1-6. ACM, 2014.
  • Michael Pellauer, Angshuman Parashar, Michael Adler, Bushra Ahsan, Randy Allmon, Neal Crago, Kermin Fleming et al. “Efficient control and communication paradigms for coarse-grained spatial architectures.” ACM Transactions on Computer Systems (TOCS) 33, no. 3 (2015): 10.

Techniques for Executing Nested Loops on CGRAs

  • Manupa Karunaratne, Cheng Tan, Aditi Kulkarni, Tulika Mitra, and Li-Shiuan Peh. “DNestMap: Mapping deeply-nested loops on ultra-low power CGRAs.” In 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC), pp. 1-6. IEEE, 2018.
  • Yongjoo Kim, Jongeun Lee, Toan X. Mai, and Yunheung Paek. “Improving performance of nested loops on reconfigurable array processors.” ACM Transactions on Architecture and Code Optimization (TACO) 8, no. 4 (2012): 32.
  • Jongeun Lee, Seongseok Seo, Hongsik Lee, and Hyeon Uk Sim. “Flattening-based mapping of imperfect loop nests for CGRAs.” In Proceedings of the 2014 International Conference on Hardware/Software Codesign and System Synthesis, p. 9. ACM, 2014.
  • Dajiang Liu, Shouyi Yin, Leibo Liu, and Shaojun Wei. “Polyhedral model based mapping optimization of loop nests for CGRAs.” In Proceedings of the 50th Annual Design Automation Conference, p. 19. ACM, 2013.

Multithreading on CGRAs

  • Kehuai Wu, Andreas Kanstein, Jan Madsen, and Mladen Berekovic. “MT-ADRES: multithreading on coarse-grained reconfigurable architecture.” In International Workshop on Applied Reconfigurable Computing, pp. 26-38. Springer, Berlin, Heidelberg, 2007.
  • Aviral Shrivastava, Jared Pager, Reiley Jeyapaul, Mahdi Hamzeh, and Sarma Vrudhula. “Enabling multithreading on CGRAs.” In 2011 International Conference on Parallel Processing, pp. 255-264. IEEE, 2011.
  • Andreas Kanstein, Sebastian López Suárez, and Bjorn De Sutter. “Optimizing coarse-grain reconfigurable hardware utilization through multiprocessing: An H.264/AVC decoder example.” In VLSI Circuits and Systems III, vol. 6590, p. 65900C. International Society for Optics and Photonics, 2007.

Techniques for Optimizing Register Management of CGRAs

  • Bjorn De Sutter, Paul Coene, Tom Vander Aa, and Bingfeng Mei. “Placement-and-routing-based register allocation for coarse-grained reconfigurable arrays.” ACM SIGPLAN Notices 43, no. 7 (2008): 151-160.
  • Shail Dave, Mahesh Balasubramanian, and Aviral Shrivastava. “URECA: A compiler solution to manage unified register file for CGRAs.” In Proceedings of the 21st International Conference on Design Automation and Test in Europe (DATE), 2018.
  • Zion Kwok and Steven J. E. Wilton. “Register file architecture optimization in a coarse-grained reconfigurable architecture.” In 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM’05), pp. 35-44. IEEE, 2005.