{"id":154,"date":"2011-02-01T17:48:42","date_gmt":"2011-02-01T17:48:42","guid":{"rendered":"http:\/\/aviral.lab.asu.edu\/?p=154"},"modified":"2020-01-29T17:46:06","modified_gmt":"2020-01-29T17:46:06","slug":"code-management-for-llm-multi-core-processor","status":"publish","type":"post","link":"https:\/\/labs.engineering.asu.edu\/mps-lab\/2011\/02\/code-management-for-llm-multi-core-processor\/","title":{"rendered":"Code Management for LLM multi-core processor"},"content":{"rendered":"<p>\t\t\t\tTo facilitate this code management, the IBM Cell processor provides an overlay mechanism. In a linker script, users can specify the number of regions and the mapping of functions into regions. Functions mapped to one region share the same physical location in the limited local memory and replace each other when called. The size of a region is equal to the size of the largest object mapped to that region, and the total code space required is the sum of the sizes of the regions. The goal of the code mapping problem is then to generate a linker script that minimizes the swapping of functions, improving performance by reducing data transfers between the global memory and the local memory.<\/p>\n<p>The core problems of code management are:<\/p>\n<ul>\n<li>How much space should be allocated to the code region?<\/li>\n<li>How many regions should be defined for a given code size?<\/li>\n<li>Which functions should be mapped to which regions?<\/li>\n<\/ul>\n<p><strong>Our Approach<\/strong><\/p>\n<p style=\"text-align: justify;\">We formulate our problem using the <strong>Global Call Control Flow Graph (GCCFG)<\/strong>, which captures both aggregate and temporal information about the function calls.<\/p>\n<p style=\"text-align: center;\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" src=\"http:\/\/www.public.asu.edu\/%7Ekbai3\/figures\/gccfg.png\" alt=\"\" width=\"473\" height=\"204\" \/><\/p>\n<p>In the graph above, we show how to formulate the GCCFG from the 
code. In the GCCFG, there is a strict ordering among the nodes, i.e. the left child is called before the right child. An L-node is a loop node and an I-node is a conditional node. The node weights indicate the execution counts of functions, e.g. 100 for node F6 means the L2 loop will execute 100 times. In addition, the GCCFG also identifies recursive functions.<\/p>\n<p>We also propose a cost model, the interference cost graph, to evaluate the performance based on the GCCFG of the application. Finally, we propose two heuristics: the <strong>FMUM<\/strong> Heuristic and the <strong>FMUP<\/strong> Heuristic.<\/p>\n<p><strong>FMUM<\/strong> Heuristic: Start from non-overlapping code regions and merge regions to decrease the total size of the code region until it meets the size requirement.<\/p>\n<p><strong>FMUP<\/strong> Heuristic: Start from a unified code region and partition it into different regions until it meets the constraint on the total size of the code region.<\/p>\n<p>The details can be found in the paper &#8220;<a href=\"https:\/\/129.219.108.14:2222\/CMD_FILE_MANAGER\/domains\/aviral.lab.asu.edu\/public_html\/wordpress3\/temp\/publications\/papers\/spm-code2.pdf\">Dynamic Code Mapping for Limited Local Memory Systems<\/a>&#8221;.<\/p>\n<p><strong>Publications<\/strong><\/p>\n<table style=\"height: 64px;\" border=\"0\" cellspacing=\"2\" cellpadding=\"2\" width=\"733\">\n<tbody>\n<tr>\n<td width=\"4%\" height=\"36\"><a href=\"https:\/\/129.219.108.14:2222\/CMD_FILE_MANAGER\/domains\/aviral.lab.asu.edu\/public_html\/wordpress3\/temp\/publications\/papers\/spm-code2.pdf\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.public.asu.edu\/%7Ekbai3\/figures\/pdf.png\" border=\"0\" alt=\"pdf\" width=\"36\" height=\"39\" \/><\/a><\/td>\n<td width=\"4%\"><a href=\"https:\/\/129.219.108.14:2222\/CMD_FILE_MANAGER\/domains\/aviral.lab.asu.edu\/public_html\/wordpress3\/temp\/publications\/papers\/spm-code2.pptx\"><img loading=\"lazy\" decoding=\"async\" 
src=\"http:\/\/www.public.asu.edu\/%7Ekbai3\/figures\/ppt.png\" border=\"0\" alt=\"\" width=\"35\" height=\"37\" \/><\/a><\/td>\n<td width=\"4%\"><a href=\"http:\/\/www.public.asu.edu\/%7Ekbai3\/code.rar\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.public.asu.edu\/%7Ekbai3\/figures\/rar.png\" border=\"0\" alt=\"\" width=\"39\" height=\"34\" \/><\/a><\/td>\n<td style=\"text-align: justify;\" width=\"90%\"><span style=\"text-decoration: underline;\">Dynamic Code Mapping for Limited Local Memory Systems<\/span><br \/>\n<a href=\"http:\/\/aviral.lab.asu.edu\/bibadmin\/show.php?author=Seungchul_Jung\">Seung chul Jung<\/a>, <a href=\"http:\/\/aviral.lab.asu.edu\/bibadmin\/show.php?author=Aviral_Shrivastava\">Aviral Shrivastava<\/a>, and <a href=\"http:\/\/aviral.lab.asu.edu\/bibadmin\/show.php?author=Ke_Bai\">Ke Bai<\/a><br \/>\n<strong>ASAP 2010 <\/strong>:<em>Proceedings of the 2010 International     Conference on Application-specific Systems, Architectures and      Processors<\/em><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n","protected":false},"excerpt":{"rendered":"<p>To facilitate this code management, the IBM Cell processor provides an overlay mechanism. In a linker script, users can specify the number of regions and the mapping of functions into regions. 
Functions mapped to one [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[10],"tags":[],"class_list":["post-154","post","type-post","status-publish","format-standard","hentry","category-news"],"_links":{"self":[{"href":"https:\/\/labs.engineering.asu.edu\/mps-lab\/wp-json\/wp\/v2\/posts\/154","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/labs.engineering.asu.edu\/mps-lab\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/labs.engineering.asu.edu\/mps-lab\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/labs.engineering.asu.edu\/mps-lab\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/labs.engineering.asu.edu\/mps-lab\/wp-json\/wp\/v2\/comments?post=154"}],"version-history":[{"count":0,"href":"https:\/\/labs.engineering.asu.edu\/mps-lab\/wp-json\/wp\/v2\/posts\/154\/revisions"}],"wp:attachment":[{"href":"https:\/\/labs.engineering.asu.edu\/mps-lab\/wp-json\/wp\/v2\/media?parent=154"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/labs.engineering.asu.edu\/mps-lab\/wp-json\/wp\/v2\/categories?post=154"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/labs.engineering.asu.edu\/mps-lab\/wp-json\/wp\/v2\/tags?post=154"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}