Shail Dave; Tony Nowatzki; Aviral Shrivastava
Explainable-DSE: An Agile and Explainable Exploration of Efficient Hardware/Software Codesigns of Deep Learning Accelerators Using Bottleneck Analysis Proceedings Article
In: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024, (Won Silver Medal at ACM Student Research Competition 2022-23 (Host: ACM SIGBED)).
Abstract | BibTeX | Tags: Accelerated Computing, Machine Learning, Machine Learning Accelerators | Links:
@inproceedings{Dave2024ASPLOS,
title = {Explainable-DSE: An Agile and Explainable Exploration of Efficient Hardware/Software Codesigns of Deep Learning Accelerators Using Bottleneck Analysis},
author = {Shail Dave and Tony Nowatzki and Aviral Shrivastava},
url = {https://mpslab-asu.github.io/publications/papers/Dave2024ASPLOS.pdf, pdf
https://mpslab-asu.github.io/publications/slides/Dave2024ASPLOS.pptx, slides
https://mpslab-asu.github.io/publications/posters/Dave2024ASPLOS.pdf, poster
https://youtu.be/y-F1Cp66_oQ, teaser},
year = {2024},
date = {2024-04-02},
urldate = {2024-04-02},
booktitle = {Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)},
abstract = {Effective design space exploration (DSE) is paramount for hardware/software codesigns of deep learning accelerators that must meet strict execution constraints. Because of the vast search space, existing DSE techniques can require an excessive number of trials to obtain a valid and efficient solution, since they rely on black-box explorations that do not reason about design inefficiencies. In this paper, we propose Explainable-DSE – a framework for DSE of DNN accelerator codesigns using bottleneck analysis. By leveraging information about execution costs from bottleneck models, our DSE identifies the bottlenecks, and therefore the reasons for design inefficiency, and can make mitigating acquisitions in further explorations. We describe the construction of such bottleneck models for the DNN accelerator domain. We also propose an
API for expressing such domain-specific models and integrating them into the DSE framework. The acquisitions of our DSE framework cater to multiple bottlenecks in executions of workloads like DNNs, which contain different functions with diverse execution characteristics. Evaluations on recent computer vision and language models show that Explainable-DSE mostly explores effectual candidates, achieving codesigns of 6× lower latency in 47× fewer iterations vs. non-explainable techniques using evolutionary or ML-based optimizations. By taking minutes or tens of iterations, it enables opportunities for runtime DSEs.},
note = {Won Silver Medal at ACM Student Research Competition 2022-23 (Host: ACM SIGBED)},
keywords = {Accelerated Computing, Machine Learning, Machine Learning Accelerators},
pubstate = {published},
tppubtype = {inproceedings}
}
Shail Dave; Aviral Shrivastava
Automating the Architectural Execution Modeling and Characterization of Domain-Specific Architectures Conference
Proceedings of TECHCON, 2023.
Abstract | BibTeX | Tags: Accelerated Computing, Machine Learning, Machine Learning Accelerators | Links:
@conference{Dave2023TECHCON,
title = {Automating the Architectural Execution Modeling and Characterization of Domain-Specific Architectures},
author = {Shail Dave and Aviral Shrivastava},
url = {https://mpslab-asu.github.io/publications/papers/Dave2023TECHCON.pdf, pdf},
year = {2023},
date = {2023-09-11},
urldate = {2023-09-11},
booktitle = {Proceedings of TECHCON},
abstract = {Domain-specific architectures (DSAs) are increasingly designed to efficiently process a variety of workloads, such as deep learning, linear algebra, and graph analytics. Most research efforts have focused on proposing new DSAs or efficiently exploring hardware/software designs of previously proposed architecture templates. Recent architectural modeling or simulation frameworks for DSAs can analyze execution costs, e.g., for a limited set of architectural templates for dense DNNs, such as systolic arrays or a spatial architecture with an array of processing elements and a 3-level memory hierarchy. However, they are manually developed by domain experts, contain several thousands of lines of code, and extending them to characterize new architectures, such as DSAs for sparse DNNs, is infeasible. Further, the lack of automated architecture-level execution modeling limits the design space of novel architectures that can be explored/optimized, affecting the overall efficiency of solutions, and it delays time-to-market while lowering the sustainability of the design process.
To address this issue, this paper introduces DSAProf: a framework for automated execution modeling and bottleneck characterization through a modular, dataflow-driven approach. The framework uses a flow-graph-based methodology for modeling DSAs in a modular manner via a library of architectural components and analyzing their executions. The methodology can account for analytically modeling and simulating execution intricacies in the presence of a variety of architectural features, such as asynchronous execution of workgroups, sparse data processing, arbitrary buffer hierarchies, and multi-chip or mixed-precision modules. Preliminary evaluations of modeling previously proposed DSAs for dense/sparse deep learning demonstrate that our approach is extensible to novel DSAs and can accurately and automatically characterize their latency and identify execution bottlenecks, without requiring designers to manually build an analysis tool or simulator from scratch for every DSA.},
keywords = {Accelerated Computing, Machine Learning, Machine Learning Accelerators},
pubstate = {published},
tppubtype = {conference}
}
Yi Hu; Chaoran Zhang; Edward Andert; Harshul Singh; Aviral Shrivastava; James Laudon; Yanqi Zhou; Bob Iannucci; Carlee Joe-Wong
GiPH: Generalizable Placement Learning for Adaptive Heterogeneous Computing Proceedings Article
In: Proceedings of the Sixth Conference on Machine Learning and Systems (MLSys), 2023.
BibTeX | Tags: Accelerated Computing, Machine Learning, Machine Learning Accelerators, Real-Time Systems | Links:
@inproceedings{Hu2023MLSYS,
title = {GiPH: Generalizable Placement Learning for Adaptive Heterogeneous Computing},
author = {Yi Hu and Chaoran Zhang and Edward Andert and Harshul Singh and Aviral Shrivastava and James Laudon and Yanqi Zhou and Bob Iannucci and Carlee Joe-Wong},
url = {https://mpslab-asu.github.io/publications/papers/Hu2023MLSYS.pdf, pdf},
year = {2023},
date = {2023-06-04},
urldate = {2023-06-04},
booktitle = {Proceedings of the Sixth Conference on Machine Learning and Systems (MLSys)},
keywords = {Accelerated Computing, Machine Learning, Machine Learning Accelerators, Real-Time Systems},
pubstate = {published},
tppubtype = {inproceedings}
}
Aviral Shrivastava; Xiaobo Sharon Hu
Report on the 2022 Embedded Systems Week (ESWEEK) Journal Article
In: IEEE Design & Test, vol. 40, iss. 1, pp. 108-111, 2023.
Abstract | BibTeX | Tags: Accelerated Computing, CPS, Efficient Embedded Computing, Error Resilience, Machine Learning Accelerators, Real-Time Systems | Links:
@article{Shrivastava2023D&T,
title = {Report on the 2022 Embedded Systems Week (ESWEEK)},
author = {Aviral Shrivastava and Xiaobo Sharon Hu},
url = {https://mpslab-asu.github.io/publications/papers/Shrivastava2023D&T.pdf, pdf},
year = {2023},
date = {2023-01-23},
urldate = {2023-01-23},
journal = {IEEE Design & Test},
volume = {40},
issue = {1},
pages = {108-111},
abstract = {Embedded Systems Week (ESWEEK) is the premier event covering all aspects of hardware and software design for intelligent and connected computing systems. By bringing together three leading conferences [the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES); the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS); and the International Conference on Embedded Software (EMSOFT)] and a variety of symposia, hot-topic workshops, tutorials, and education classes, ESWEEK presents to the attendees a wide range of topics unveiling state-of-the-art embedded software, embedded architectures, and embedded system designs.},
keywords = {Accelerated Computing, CPS, Efficient Embedded Computing, Error Resilience, Machine Learning Accelerators, Real-Time Systems},
pubstate = {published},
tppubtype = {article}
}
Quoc Long Vinh Ta
COMSAT: Modified Modulo Scheduling Techniques for Acceleration on Unknown Trip Count and Early Exit Loops Masters Thesis
Arizona State University, 2022.
BibTeX | Tags: Accelerated Computing, CGRA | Links:
@mastersthesis{Ta2022THESIS,
title = {COMSAT: Modified Modulo Scheduling Techniques for Acceleration on Unknown Trip Count and Early Exit Loops},
author = {Quoc Long Vinh Ta},
url = {https://mpslab-asu.github.io/publications/papers/Ta2022THESIS.pdf, pdf
https://mpslab-asu.github.io/publications/slides/Ta2022THESIS.pptx, slides
},
year = {2022},
date = {2022-12-08},
urldate = {2022-12-08},
school = {Arizona State University},
keywords = {Accelerated Computing, CGRA},
pubstate = {published},
tppubtype = {mastersthesis}
}
Shail Dave; Alberto Marchisio; Muhammad Abdullah Hanif; Amira Guesmi; Aviral Shrivastava; Ihsen Alouani; Muhammad Shafique
Special Session: Towards an Agile Design Methodology for Efficient, Reliable, and Secure ML Systems Proceedings Article
In: Proceedings of the 2022 IEEE 40th VLSI Test Symposium (VTS), 2022.
Abstract | BibTeX | Tags: Accelerated Computing, Efficient Embedded Computing, Error Resilience, Machine Learning, Machine Learning Accelerators, Soft Error | Links:
@inproceedings{DaveVTS2022,
title = {Special Session: Towards an Agile Design Methodology for Efficient, Reliable, and Secure ML Systems},
author = {Shail Dave and Alberto Marchisio and Muhammad Abdullah Hanif and Amira Guesmi and Aviral Shrivastava and Ihsen Alouani and Muhammad Shafique},
url = {https://mpslab-asu.github.io/publications/papers/Dave2022VTS.pdf, pdf
https://mpslab-asu.github.io/publications/slides/Dave2022VTS.pptx, slides},
year = {2022},
date = {2022-04-25},
urldate = {2022-04-25},
booktitle = {Proceedings of the 2022 IEEE 40th VLSI Test Symposium (VTS)},
abstract = {The real-world use cases of Machine Learning (ML) have exploded over the past few years. However, the current computing infrastructure is insufficient to support all real-world applications and scenarios. Apart from high efficiency requirements, modern ML systems are expected to be highly reliable against hardware failures as well as secure against adversarial and IP stealing attacks. Recent developments have also highlighted various privacy concerns. Towards trustworthy ML systems, in this work we highlight different challenges faced by the embedded systems community in enabling efficient, dependable, and secure deployment of ML. To address these challenges, we present an agile design methodology to generate efficient, reliable, and secure ML systems based on user-defined constraints and objectives.},
keywords = {Accelerated Computing, Efficient Embedded Computing, Error Resilience, Machine Learning, Machine Learning Accelerators, Soft Error},
pubstate = {published},
tppubtype = {inproceedings}
}
Mahesh Balasubramanian; Aviral Shrivastava
PathSeeker: A Fast Mapping Algorithm for CGRAs Proceedings Article
In: Proceedings of the 25th International Conference on Design Automation and Test in Europe (DATE), 2022.
BibTeX | Tags: Accelerated Computing, CGRA | Links:
@inproceedings{Balasubramanian2022DATE,
title = {PathSeeker: A Fast Mapping Algorithm for CGRAs},
author = {Mahesh Balasubramanian and Aviral Shrivastava},
url = {https://mpslab-asu.github.io/publications/papers/Balasubramanian2022DATE.pdf, pdf
https://mpslab-asu.github.io/publications/slides/Balasubramanian2022DATE.pptx, slides},
year = {2022},
date = {2022-03-16},
urldate = {2022-03-16},
booktitle = {Proceedings of the 25th International Conference on Design Automation and Test in Europe (DATE)},
keywords = {Accelerated Computing, CGRA},
pubstate = {published},
tppubtype = {inproceedings}
}
Shail Dave; Aviral Shrivastava
Design Space Description Language for Automated and Comprehensive Exploration of Next-Gen Hardware Accelerators Workshop
Workshop on Languages, Tools, and Techniques for Accelerator Design (LATTE), 2022, (co-located with the ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)).
Abstract | BibTeX | Tags: Accelerated Computing, CGRA, Machine Learning, Machine Learning Accelerators | Links:
@workshop{DaveLATTE2022,
title = {Design Space Description Language for Automated and Comprehensive Exploration of Next-Gen Hardware Accelerators},
author = {Shail Dave and Aviral Shrivastava},
url = {https://mpslab-asu.github.io/publications/papers/Dave2022LATTE.pdf, pdf
https://mpslab-asu.github.io/publications/slides/Dave2022LATTE.pptx, slides
https://capra.cs.cornell.edu/latte22/, workshop
https://youtu.be/Z5jZ2dbE0To, talk},
year = {2022},
date = {2022-03-01},
urldate = {2022-03-01},
booktitle = {Workshop on Languages, Tools, and Techniques for Accelerator Design (LATTE)},
abstract = {Exploration of accelerators typically involves an architectural template specified in an architecture description language (ADL). Such a template can limit the design space that can be explored, the reusability and automation of the system stack, explainability, and exploration efficiency. We envision a Design Space Description Language (DSDL) for comprehensive, reusable, explainable, and agile DSE. We describe how its flow-graph abstraction enables comprehensive DSE of modular designs, with architectural components organized in various hierarchies and groups. We discuss automation of characterizing, simulating, and programming new architectures. Lastly, we describe how DSDL flow graphs facilitate bottleneck analysis, yielding explainability of costs and selected designs and super-fast exploration.},
note = {co-located with the ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)},
keywords = {Accelerated Computing, CGRA, Machine Learning, Machine Learning Accelerators},
pubstate = {published},
tppubtype = {workshop}
}
Mahesh Balasubramanian
Compiler Design for Accelerating Applications on Coarse-Grained Reconfigurable Architectures PhD Thesis
2021.
BibTeX | Tags: Accelerated Computing, CGRA | Links:
@phdthesis{Balasubramanian2021THESIS,
title = {Compiler Design for Accelerating Applications on Coarse-Grained Reconfigurable Architectures},
author = {Mahesh Balasubramanian},
url = {https://mpslab-asu.github.io/publications/papers/Balasubramanian2021THESIS.pdf, pdf
https://mpslab-asu.github.io/publications/slides/Balasubramanian2021THESIS.pptx, slides},
year = {2021},
date = {2021-10-25},
urldate = {2021-12-13},
keywords = {Accelerated Computing, CGRA},
pubstate = {published},
tppubtype = {phdthesis}
}
Shail Dave; Riyadh Baghdadi; Tony Nowatzki; Sasikanth Avancha; Aviral Shrivastava; Baoxin Li
Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights Journal Article
In: Proceedings of the IEEE (PIEEE), 2021, (arXiv: 2007.00864).
BibTeX | Tags: Accelerated Computing, CGRA, Low-power Computing, Machine Learning, Machine Learning Accelerators | Links:
@article{DavePIEEE2021,
title = {Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights},
author = {Shail Dave and Riyadh Baghdadi and Tony Nowatzki and Sasikanth Avancha and Aviral Shrivastava and Baoxin Li},
url = {https://mpslab-asu.github.io/publications/papers/Dave2021PIEEE.pdf, paper},
year = {2021},
date = {2021-10-01},
urldate = {2021-10-01},
journal = {Proceedings of the IEEE (PIEEE)},
note = {arXiv: 2007.00864},
keywords = {Accelerated Computing, CGRA, Low-power Computing, Machine Learning, Machine Learning Accelerators},
pubstate = {published},
tppubtype = {article}
}
Abhishek Singh; Shail Dave; PanteA Zardoshti; Robert Brotzman; Chao Zhang; Xiaochen Guo; Aviral Shrivastava; Gang Tan; Michael Spear
SPX64: A Scratchpad Memory for General-Purpose Microprocessors Journal Article
In: ACM Transactions on Architecture and Code Optimization (TACO), 2021.
BibTeX | Tags: Accelerated Computing, Scratchpad Memory | Links:
@article{SinghTACO2020,
title = {SPX64: A Scratchpad Memory for General-Purpose Microprocessors},
author = {Abhishek Singh and Shail Dave and PanteA Zardoshti and Robert Brotzman and Chao Zhang and Xiaochen Guo and Aviral Shrivastava and Gang Tan and Michael Spear},
url = {https://mpslab-asu.github.io/publications/papers/Singh2020TACO.pdf, paper},
year = {2021},
date = {2021-01-18},
journal = {ACM Transactions on Architecture and Code Optimization (TACO)},
keywords = {Accelerated Computing, Scratchpad Memory},
pubstate = {published},
tppubtype = {article}
}
Mahesh Balasubramanian; Aviral Shrivastava
CRIMSON: Compute-intensive loop acceleration by Randomized Iterative Modulo Scheduling and Optimized Mapping on CGRAs Journal Article
In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2020.
BibTeX | Tags: Accelerated Computing, CGRA | Links:
@article{Balasubramanian2020TCAD,
title = {CRIMSON: Compute-intensive loop acceleration by Randomized Iterative Modulo Scheduling and Optimized Mapping on CGRAs},
author = {Mahesh Balasubramanian and Aviral Shrivastava},
url = {https://MPSLab-ASU.github.io/publications/papers/Balasubramanian2020TCAD.pdf, paper},
year = {2020},
date = {2020-09-01},
journal = {IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD)},
keywords = {Accelerated Computing, CGRA},
pubstate = {published},
tppubtype = {article}
}
Mahesh Balasubramanian; Trevor Ruiz; Brandon Cook; Mr Prabhat; Sharmodeep Bhattacharyya; Aviral Shrivastava; Kristofer Bouchard
Scaling of Union of Intersections for Inference of Granger Causal Networks from Observational Data Proceedings Article
In: Proceedings of the 34th IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2020.
BibTeX | Tags: Accelerated Computing, HPC | Links:
@inproceedings{BalasubramanianI2020IPDPS,
title = {Scaling of Union of Intersections for Inference of Granger Causal Networks from Observational Data},
author = {Mahesh Balasubramanian and Trevor Ruiz and Brandon Cook and Mr Prabhat and Sharmodeep Bhattacharyya and Aviral Shrivastava and Kristofer Bouchard},
url = {https://mpslab-asu.github.io/publications/papers/Balasubramanian2020IPDPS.pdf, paper
https://mpslab-asu.github.io/publications/slides/Balasubramanian2020IPDPS.pptx, slides},
year = {2020},
date = {2020-05-01},
booktitle = {Proceedings of the 34th IEEE International Parallel & Distributed Processing Symposium (IPDPS)},
keywords = {Accelerated Computing, HPC},
pubstate = {published},
tppubtype = {inproceedings}
}
Shail Dave; Aviral Shrivastava; Youngbin Kim; Sasikanth Avancha; Kyoungwoo Lee
dMazeRunner: Optimizing Convolutions on Dataflow Accelerators Proceedings Article
In: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, (Invited).
BibTeX | Tags: Accelerated Computing, Machine Learning, Machine Learning Accelerators | Links:
@inproceedings{DaveICASSP2020,
title = {dMazeRunner: Optimizing Convolutions on Dataflow Accelerators},
author = {Shail Dave and Aviral Shrivastava and Youngbin Kim and Sasikanth Avancha and Kyoungwoo Lee},
url = {https://mpslab-asu.github.io/publications/papers/Dave2020ICASSP.pdf, paper
https://mpslab-asu.github.io/publications/slides/Dave2020ICASSP.pptx, slides
https://github.com/MPSLab-ASU/dMazeRunner, code
https://www.youtube.com/watch?v=21F79Taelts, video},
year = {2020},
date = {2020-04-09},
urldate = {2020-04-09},
booktitle = {ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
note = {Invited},
keywords = {Accelerated Computing, Machine Learning, Machine Learning Accelerators},
pubstate = {published},
tppubtype = {inproceedings}
}
Shail Dave; Youngbin Kim; Sasikanth Avancha; Kyoungwoo Lee; Aviral Shrivastava
DMazeRunner: Executing Perfectly Nested Loops on Dataflow Accelerators Journal Article
In: ACM Transactions on Embedded Computing Systems (TECS), vol. 18, no. 5s, 2019, (Special Issue on ESWEEK 2019 - Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)).
BibTeX | Tags: Accelerated Computing, Machine Learning, Machine Learning Accelerators | Links:
@article{DaveTECS2019,
title = {DMazeRunner: Executing Perfectly Nested Loops on Dataflow Accelerators},
author = {Shail Dave and Youngbin Kim and Sasikanth Avancha and Kyoungwoo Lee and Aviral Shrivastava},
url = {https://mpslab-asu.github.io/publications/papers/Dave2019TECS.pdf, paper
https://mpslab-asu.github.io/publications/slides/Dave2019TECS.pptx, slides
https://mpslab-asu.github.io/publications/posters/Dave2019TECS.pdf, poster
https://github.com/MPSLab-ASU/dMazeRunner, code},
year = {2019},
date = {2019-01-01},
urldate = {2019-01-01},
journal = {ACM Transactions on Embedded Computing Systems (TECS)},
volume = {18},
number = {5s},
note = {Special Issue on ESWEEK 2019 - Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)},
keywords = {Accelerated Computing, Machine Learning, Machine Learning Accelerators},
pubstate = {published},
tppubtype = {article}
}
Shail Dave; Mahesh Balasubramanian; Aviral Shrivastava
RAMP: Resource-Aware Mapping for CGRAs Proceedings Article
In: Proceedings of the 55th Annual Design Automation Conference (DAC), 2018.
BibTeX | Tags: Accelerated Computing, CGRA | Links:
@inproceedings{DaveDAC2018,
title = {RAMP: Resource-Aware Mapping for CGRAs},
author = {Shail Dave and Mahesh Balasubramanian and Aviral Shrivastava},
url = {https://mpslab-asu.github.io/publications/papers/Dave2018DAC.pdf, paper
https://mpslab-asu.github.io/publications/slides/Dave2018DAC.pptx, slides
https://mpslab-asu.github.io/publications/posters/Dave2018DAC.pdf, poster
},
year = {2018},
date = {2018-01-01},
urldate = {2018-01-01},
booktitle = {Proceedings of the 55th Annual Design Automation Conference (DAC)},
keywords = {Accelerated Computing, CGRA},
pubstate = {published},
tppubtype = {inproceedings}
}
Shail Dave; Mahesh Balasubramanian; Aviral Shrivastava
URECA: A Compiler Solution to Manage Unified Register File for CGRAs Proceedings Article
In: Proceedings of the 21st International Conference on Design Automation and Test in Europe (DATE), 2018.
BibTeX | Tags: Accelerated Computing, CGRA | Links:
@inproceedings{DaveDATE2018,
title = {URECA: A Compiler Solution to Manage Unified Register File for CGRAs},
author = {Shail Dave and Mahesh Balasubramanian and Aviral Shrivastava},
url = {https://mpslab-asu.github.io/publications/papers/Dave2018DATE.pdf, paper
https://mpslab-asu.github.io/publications/slides/Dave2018DATE.pptx, slides},
year = {2018},
date = {2018-01-01},
booktitle = {Proceedings of the 21st International Conference on Design Automation and Test in Europe (DATE)},
keywords = {Accelerated Computing, CGRA},
pubstate = {published},
tppubtype = {inproceedings}
}
Mahesh Balasubramanian; Shail Dave; Aviral Shrivastava; Reiley Jeyapaul
LASER: A Hardware/Software Approach to Accelerate Complicated Loops on CGRAs Proceedings Article
In: Proceedings of the 21st International Conference on Design Automation and Test in Europe (DATE), 2018.
BibTeX | Tags: Accelerated Computing, CGRA | Links:
@inproceedings{BalasubramanianDATE2018,
title = {LASER: A Hardware/Software Approach to Accelerate Complicated Loops on CGRAs},
author = {Mahesh Balasubramanian and Shail Dave and Aviral Shrivastava and Reiley Jeyapaul},
url = {https://mpslab-asu.github.io/publications/papers/Balasubramanian2018DATE.pdf, paper
https://mpslab-asu.github.io/publications/slides/Balasubramanian2018DATE.pptx, slides},
year = {2018},
date = {2018-01-01},
booktitle = {Proceedings of the 21st International Conference on Design Automation and Test in Europe (DATE)},
keywords = {Accelerated Computing, CGRA},
pubstate = {published},
tppubtype = {inproceedings}
}
Shail Dave
Scalable Register File Architecture for CGRA Accelerators Masters Thesis
Arizona State University, 2016.
BibTeX | Tags: Accelerated Computing, CGRA | Links:
@mastersthesis{DaveMSTHESIS2016,
title = {Scalable Register File Architecture for CGRA Accelerators},
author = {Shail Dave},
url = {https://mpslab-asu.github.io/publications/papers/Dave2016THESIS.pdf, paper
https://mpslab-asu.github.io/publications/slides/Dave2016THESIS.pptx, slides},
year = {2016},
date = {2016-01-01},
school = {Arizona State University},
keywords = {Accelerated Computing, CGRA},
pubstate = {published},
tppubtype = {mastersthesis}
}
Shri Hari Rajendran Radhika; Aviral Shrivastava; Mahdi Hamzeh
Path Selection Based Acceleration of Conditionals in CGRAs Proceedings Article
In: Proceedings of the 2015 International Conference on Design Automation and Test in Europe (DATE), 2015.
BibTeX | Tags: Accelerated Computing, CGRA | Links:
@inproceedings{RadhikaDATE2015,
title = {Path Selection Based Acceleration of Conditionals in CGRAs},
author = {Shri Hari Rajendran Radhika and Aviral Shrivastava and Mahdi Hamzeh},
url = {https://mpslab-asu.github.io/publications/papers/Radhika2015DATE.pdf, paper
https://mpslab-asu.github.io/publications/slides/Radhika2015DATE.ppt, slides},
year = {2015},
date = {2015-01-01},
booktitle = {Proceedings of the 2015 International Conference on Design Automation and Test in Europe (DATE)},
keywords = {Accelerated Computing, CGRA},
pubstate = {published},
tppubtype = {inproceedings}
}