

| **Institution:** University of Edinburgh |  |
|---------------------------------------------|--|

| Unit of Assessment: 11 |  |  |
|---|---|---|
| Title of case study: Two billion devices enabled annually by optimised low power embedded processor |  |  |
| Period when the underpinning research was undertaken: 2006 – 2020 |  |  |
| Details of staff conducting the underpinning research from the submitting unit: |  |  |
| Name(s): | Role(s) (e.g. job title): | Period(s) employed by submitting HEI: |
| Björn Franke | Reader | 2003 – present |
| Tom Spink | Senior Researcher | 2016 – present |
| Nigel Topham | Professor | 2003 – present |
| Period when the claimed impact occurred: August 2013 – 2020 |  |  |
| Is this case study continued from a case study submitted in 2014? No |  |  |

## 1. Summary of the impact

Embedded processors are central to powering smart devices, from mobile phones to hard drives to smart cars. Research at the University of Edinburgh (UoE) into optimising and customising embedded processors has resulted in two prototypes: the first, an advanced and energy-efficient processor; and the second, a simulator that enables companies to custom-design their own processors. These technologies have been licensed and brought to market by leading IP developer Synopsys, boosting the company's innovative activity and business success, and passing on to its household-name clients the opportunity to create their own high-functioning, innovative products, such as Solid State Drives, ultra-fast streaming services, and automated driving technology. As Synopsys clients, 10 of the world's top 15 semiconductor companies now use the UoE-based technology in their smart devices, and together they ship more than 2,000,000,000 processors based on the UoE research every year.

## 2. Underpinning research

Underpinning research was carried out by Professor Nigel Topham (Chair of Computer Systems) and Dr Björn Franke, together with Dr Tom Spink and a team of postdocs and research students in the School of Informatics at the University of Edinburgh (UoE). Topham led the EPSRC-funded PASTA (EP/D50399X/1) and PASTA-2 (EP/I013539/1) projects as principal investigator, with Franke as co-investigator. These ran from September 2006 to August 2010, and November 2010 to July 2014, respectively. The focus of the team was to investigate new and innovative methods of automating the design of embedded processors. The UoE researchers took a system-wide approach, and thus considered aspects ranging from low-level hardware implementation, through compiler-driven instruction-set customization, and on to processor simulation and compiler optimization.

The EnCore microprocessor and the ArcSim simulation software were initially created as research prototypes and enabled the UoE team to systematically explore several thematic research areas, each of which contributed over an extended period of time towards the overall impact of the EnCore embedded microprocessor and the ArcSim (and more recently Captive) simulators.

### **Embedded Processor Research**

One of the key requirements for low-power, area-efficient CPUs is to optimize hardware resource sharing. While expert human designers do this intuitively, there are limits to the depth of search they can achieve, leading to missed opportunities and suboptimal designs. One of the PASTA project's research goals was to automate the inter-related tasks of hardware resource sharing and processor construction to produce smaller, lower-power processors. Central to this research project was the facility to create industrial-strength research prototypes through which to evaluate innovations in high-level synthesis of microprocessor designs [3.1]. This included tools based on new compiler-driven methods of extending CPU designs [3.2], new parametric resource sharing heuristics [3.3], and eventually machine-learning based methods for finding optimal design-space trade-offs [3.4]. From this research, the team created the EnCore microprocessor, initially as an experimental proof-of-concept. In 2009 the first EnCore implementation was validated through the fabrication of Calton, a prototype silicon chip. Figure 1 shows this 1x1mm device, which was implemented in a 130nm CMOS process and was fully functional in first silicon.

## Impact case study (REF3)



**Figure 1**: Image of Calton, the first EnCore processor chip, from [3.1] (not to scale)



**Figure 2**: Image of the EnCore Castle chip. 4x4mm die, 90nm CMOS (not to scale)



**Figure 3**: Image of 32-core chip. 4x4mm die, 65nm CMOS (not to scale).

Shortly after the first publication mentioning EnCore [3.1], and the availability of working silicon, the EnCore design attracted commercial interest and was licensed to Virage Logic Inc (later acquired by Synopsys Inc). This limited further published research on EnCore. Further research into processor specialisation resulted in the fabrication of a second silicon chip, Castle, another realisation of the EnCore microprocessor design (Figure 2). The Castle chip incorporated some of the synthetic extensions enabled by the ideas developed in [3.3, 3.4], to evaluate their effectiveness in real silicon. This 4x4mm device was implemented in a 90nm CMOS process and was fully functional at 600 MHz in first silicon.

Following the licensing of EnCore, and in collaboration with the licensee Synopsys, the Edinburgh team incorporated further design space exploration ideas in a 32-core chip based on the first commercial incarnation of EnCore, the Synopsys EM4, which was fabricated in 65nm CMOS (Figure 3).
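The notion of a design-space trade-off referred to in this strand of research [3.3, 3.4] can be illustrated with a minimal sketch. It is not the method used by the UoE team: it simply filters a handful of candidate processor configurations down to those that are Pareto-optimal for two metrics. The configuration names, the two metrics and all numbers are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class DesignPoint(NamedTuple):
    """One candidate processor configuration (all values are hypothetical)."""
    name: str
    area_mm2: float    # silicon area; smaller is better
    runtime_ms: float  # benchmark runtime; smaller is better


def dominates(a: DesignPoint, b: DesignPoint) -> bool:
    """True if a is no worse than b on both metrics and strictly better on one."""
    no_worse = a.area_mm2 <= b.area_mm2 and a.runtime_ms <= b.runtime_ms
    strictly_better = a.area_mm2 < b.area_mm2 or a.runtime_ms < b.runtime_ms
    return no_worse and strictly_better


def pareto_front(points: List[DesignPoint]) -> List[DesignPoint]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates: more sharing saves area but costs runtime.
candidates = [
    DesignPoint("baseline", 1.00, 10.0),
    DesignPoint("shared_alu", 0.80, 11.0),
    DesignPoint("dedicated_units", 1.40, 7.0),
    DesignPoint("wasteful", 1.50, 10.5),
]

front = pareto_front(candidates)
for point in front:
    print(point.name, point.area_mm2, point.runtime_ms)
```

In this toy data set the `wasteful` configuration is discarded because `baseline` is both smaller and faster, while the remaining three represent genuine trade-offs between area and runtime. An exhaustive filter like this is only practical for small candidate sets; the cited work addresses the far larger spaces where heuristics and prediction are needed.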

### **Processor Simulation Research**

A key requirement for processor design-space exploration is the ability to simulate new features of a microprocessor before it is actually implemented. To support this, the team developed ultra-high-speed processor simulation tools [3.5, 3.6]. ArcSim [3.5, 3.6] is arguably the fastest processor simulator in its class, due to its novel parallel JIT binary translation capability [3.4, 3.5]. It was separately licensed by Synopsys in 2012, while more recent research work [3.5, 3.6] was directly incorporated into its commercial offspring ("Synopsys DesignWare ARC nSIM").
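The general idea of parallel JIT binary translation named above can be sketched in a few lines. This is a toy illustration under stated assumptions, not ArcSim's implementation: the guest "program" is a dictionary of tiny accumulator blocks, `HOT_THRESHOLD`, `BLOCKS`, `translate` and `run` are hypothetical names, and Python source generation stands in for native code generation. The point it shows is that hot blocks are handed to a pool of worker threads for translation while the main loop keeps interpreting, so translation latency is hidden rather than stalling execution.

```python
from concurrent.futures import ThreadPoolExecutor

HOT_THRESHOLD = 3  # hypothetical: interpret a block this often before translating

# Hypothetical guest program: each block is a list of (op, operand) pairs
# acting on a single accumulator.
BLOCKS = {
    "init": [("add", 1)],
    "loop_body": [("add", 2), ("mul", 2)],
    "exit": [("add", 5)],
}


def interpret(block, acc):
    """Slow path: execute one block instruction by instruction."""
    for op, operand in block:
        if op == "add":
            acc += operand
        elif op == "mul":
            acc *= operand
    return acc


def translate(block):
    """Stand-in for JIT compilation: emit Python source and compile it."""
    lines = ["def native(acc):"]
    for op, operand in block:
        symbol = "+" if op == "add" else "*"
        lines.append(f"    acc {symbol}= {operand}")
    lines.append("    return acc")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["native"]


def run(trace, workers=2):
    """Execute a sequence of block ids, translating hot blocks in the background."""
    acc = 0
    counts, pending, translated = {}, {}, {}
    with ThreadPoolExecutor(max_workers=workers) as farm:
        for block_id in trace:
            # Install a finished translation without ever waiting for one.
            future = pending.get(block_id)
            if future is not None and future.done():
                translated[block_id] = pending.pop(block_id).result()
            if block_id in translated:
                acc = translated[block_id](acc)  # fast path
                continue
            counts[block_id] = counts.get(block_id, 0) + 1
            if counts[block_id] == HOT_THRESHOLD:
                # Hand the hot block to the worker pool and carry on interpreting.
                pending[block_id] = farm.submit(translate, BLOCKS[block_id])
            acc = interpret(BLOCKS[block_id], acc)
    return acc


TRACE = ["init"] + ["loop_body"] * 10 + ["exit"]
print(run(TRACE))
```

Whether a given iteration takes the interpreted or the translated path depends on thread timing, but both paths compute the same result, so the final accumulator value does not depend on when a translation completes.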

## 3. References to the research

- 3.1. Almer, O., Bennett, R., Böhm, I., Murray, A., Qu, X., Zuluaga, M., Franke, B., & Topham, N. P. (2009). An End-to-End Design Flow for Automated Instruction Set Extension and Complex Instruction Selection Based on GCC. In *GROW'09: Proceedings of the 1st International Workshop on GCC Research Opportunities* (pp. 1-12) (30 citations)
- 3.2. Murray, A. C., Bennett, R. V., Franke, B., & Topham, N. (2009). Code transformation and instruction set extension. *ACM Transactions on Embedded Computing Systems*, 8(4), [26]. https://doi.org/10.1145/1550987.1550989 (18 citations)
- 3.3. Zuluaga, M., & Topham, N. (2009). Design-Space Exploration of Resource-Sharing Solutions for Custom Instruction Set Extensions. *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, 28(12), 1788-1801. https://doi.org/10.1109/TCAD.2009.2026355 (35 citations)
- 3.4. Zuluaga, M., Bonilla, E., & Topham, N. (2012). Predicting best design trade-offs: a case study in processor customization. In *Proceedings of the Conference on Design, Automation and Test in Europe* (pp. 1030-1035). EDA Consortium. https://doi.org/10.1109/DATE.2012.6176647 (8 citations)
- 3.5. Bohm, I., Edler von Koch, T. J. K., Kyle, S. C., Franke, B., & Topham, N. (2011). Generalized just-in-time trace compilation using a parallel task farm in a dynamic binary translator. In *Proceedings of the 32nd ACM SIGPLAN conference on Programming Language Design and Implementation (PLDI '11)* (pp. 74-85). ACM. https://doi.org/10.1145/1993498.1993508 (67 citations; PLDI '11 acceptance rate: 23%)
- 3.6. Kristien, M., Spink, T., Wagstaff, H., Franke, B., Boehm, I., & Topham, N. (2019). Mitigating JIT Compilation Latency in Virtual Execution Environments. In *Proceedings of the 15th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments* (pp. 101-107). Association for Computing Machinery (ACM). https://doi.org/10.1145/3313808.3313818 (VEE '19 acceptance rate: 32%)

Citations based on Google Scholar, 2020-12-14.

### Key research grants

EPSRC: PASTA (EP/D50399X/1, GBP 874,124); PASTA-2 (EP/I013539/1, GBP 1,468,815)

## 4. Details of the impact

The EnCore processor and ArcSim simulator prototypes were developed initially by the University of Edinburgh (UoE) team between 2005 and 2010. In an ongoing process, UoE researchers refined these into commercially viable products, the first of which was licensed by Synopsys in 2010 and released as ARCv2EM in 2012 [5.1, para. 8]. Both prototypes have continued to evolve in a symbiotic relationship between the UoE PASTA group and Synopsys. They have laid the foundation for Synopsys to create innovations that bring success to its own business and give its clients the chance to invent technology that would not have been possible without the original research.

Synopsys have confirmed that the technologies licensed from UoE laid the foundation from which processor IP products with significant market impact were developed [5.1, paras. 13-14]. The EnCore prototype led to a wide range of Synopsys' DesignWare ARC processors, starting with the ARCv2EM in 2012. This was followed by a series of further product releases, including: the HS34 and HS36 (2013); the Linux-capable quad-core HS38 processor (2014); the EM 9D/11D signal processor (2015); the EV 6x processor, optimized for embedded vision (2016); and the superscalar HS4x (2017), which is the most recent commercial release [5.1, 5.2]. A 2017 report by the Linley Group compared the HS4x to its closest competitor manufactured by ARM, and concluded that the ARM processor delivered "less than half the peak performance of the ARC HS4x" [5.3, p. 7]. The latest ARCv3 processors deliver a threefold performance increase on their predecessors, and are flexible enough to also support architecture extensions which allow them to be customised by clients [5.1, para. 6].



The ArcSim prototype was developed by Synopsys into the DesignWare ARC nSIM, initially released in 2012, replacing all previous ARC simulation products. It too has evolved as the research has progressed, and has now become a cornerstone of the ARC product line. The company confirms it is "an essential tool enabling Synopsys to create all of its key strategic processor product families to date" [5.1, para. 11].

90% of Synopsys clients go on to customise their processors [5.4, final para.], demonstrating the commercial demand for the simulator. The nSIM's latest product feature – Near Cycle Accurate Simulation Mode (NCAM) – "is based directly on productised micro-architectural simulation technology developed in ArcSim during the PASTA research project at Edinburgh University" [5.1, para. 11].

In addition to providing Synopsys with raw technology, which the company then licensed, Edinburgh researcher Nigel Topham has enhanced Synopsys' commercial success through a specialist consultancy role (Senior Consulting Architect) [5.1, para. 4]. He continues to hold this post alongside his UoE chair, cementing the link between the PASTA research and Synopsys' ongoing business. The next generation of ARC processors – ARCv3 HS5x and HS6x – has been confirmed, based on the continuing research.

Synopsys have identified 265 customers worldwide who purchased their processors based on the UoE research [5.1, para. 1]. These range from multinational players in the semiconductor world to smaller specialised companies who make highly innovative products, covering a broad spectrum of domains such as ADAS, radar, SLAM (simultaneous localisation and mapping), robotics and augmented/virtual reality [5.5]. Together Synopsys' client companies ship more than 2,000,000,000 ARC-based chips per year [5.1, para. 1]. 10 of the world's top 15 semiconductor businesses use Synopsys' products based on the research, including household names Qualcomm, Intel and Bosch [5.1, para. 1]. The latest features of these ARC processors and nSIM have enabled some of these companies to put into development innovations for which previously no facilitating technology existed.

Big data storage company Starblaze Technology Co. Ltd. used the ARC HS4x for their Enterprise Solid State Drive (SSD) Controller. The company states that the new chips reduce latencies on input/output, and also reduce power consumption by 50% over competitive alternatives [5.6, para. 2]. In line with Starblaze's "extremely aggressive design goals" and after extensive research [5.6, para. 3], Starblaze states that the processors allow them to achieve "new levels of performance" [5.1, table 2], which have helped them to meet these aggressive goals.

Semiconductor manufacturer Broadcom Inc. used the ARC processors to develop devices with advanced video compression capabilities. These can be used for HD and ultra HD streaming of entertainment and video conferencing. The company has cited the "power performance efficiency" as one of the most attractive features of the chips [5.1, table 2], confirming that taking "advantage of the ARC processors' power-performance efficiency in a wider range of products [enables them] to deliver more differentiated solutions to [their] customers" [5.7, para. 2]. Rambus Inc. Security Division cites the "area savings and performance efficiency" as facilitating their security innovations, as well as the flexibility given by the nSIM. "The combination of configurable hardware with a full suite of software development tools [i.e. including the simulator] helps accelerate the development of our advanced security platform." [5.8, para. 3] The flexibility of the design has been cited by multiple companies as giving them a market edge, including Peraso Technologies Inc. and EZ Chip Semiconductor [5.1, table 2]. PL Sense Ltd., a company whose USP is energy reduction, states that the ARC tools have saved them time in their design process [5.1, table 2].

ARC processors continue to drive innovation in specialist technological fields. In September 2019, Synopsys and Infineon Technologies AG began a collaboration to integrate ARC EV processors into an AI accelerator within Infineon's AURIX microcontroller, for use in automotive systems including engine management, driver assistance and emission control [5.9]. The collaboration "will prepare the AURIX for data-hungry automotive applications", states Infineon [5.10, para. 3], while Synopsys has cited the specific safety features of ARC processors as contributing to driver safety [5.10, para. 5].

In an ongoing trajectory of PASTA research leading to commercial and innovative advancements, the period of mid-2013 to 2020 has been a significant one. Ultimately it has resulted, through the chain of Synopsys' business via their various clients, in a processor design and its simulator from the UoE powering billions of chips in varied devices.

## 5. Sources to corroborate the impact

- 5.1. Letter of corroboration from Synopsys
- 5.2. Selection of Synopsys press releases verifying release dates of HS38 processor, EM 9D/11D signal processors and the EV 6x processor
- 5.3. Demler, M. (2017, May). ARC HS4x and HS4xD CPUs: New Dual-Issue Architecture Boosts Embedded Processor Performance. Retrieved April 14, 2020, from http://linleygroup.com/synopsys/whitepaper/pdf/
- 5.4. McLellan, P. (2015, September 19). We're Number Two, We Try Harder. Retrieved May 6, 2020, from https://semiwiki.com/eda/synopsys/5008-were-number-two-we-try-harder/
- 5.5. Selection of statements from smaller Synopsys client companies (Arbe Robotics, FABU, Kudan, Calterah Semiconductor) from Synopsys press releases 2018-2019, confirming performance, reliability, and safety-compliance in wide-ranging specialist use of the ARC processor product family, specifically automotive innovation, SLAM, robotics, and augmented/virtual reality.
- 5.6. Synopsys, Inc. (2017, January 26). Starblaze Achieves First-Pass Silicon Success for Storage SoC with Synopsys ARC Processor and Interface IP. Retrieved September 23, 2020, from https://www.prnewswire.com/news-releases/starblaze-achieves-first-passsilicon-success-for-storage-soc-with-synopsys-arc-processor-and-interface-ip-300397158.html
- 5.7. Marmie, M. (2015, May 14). Synopsys and Broadcom Expand Collaboration to Deploy ARC Processors in Multimedia and Networking Solutions. Retrieved May 7, 2020, from https://news.synopsys.com/2015-05-14-Synopsys-and-Broadcom-Expand-Collaboration-to-Deploy-ARC-Processors-in-Multimedia-and-Networking-Solutions
- 5.8. Marmie, M. (2017, February 2). Rambus Selects Synopsys' ARC EM Processors for Embedded Security Platform. Retrieved May 6, 2020, from https://news.synopsys.com/2017-02-02-Rambus-Selects-Synopsys-ARC-EM-Processors-for-Embedded-Security-Platform
- 5.9. Simon, T. (2019, October 3). Synopsys and Infineon Prepare for Expanding AI Use in Automotive Applications. Retrieved October 7, 2020, from https://semiwiki.com/eda/synopsys/275555-synopsys-and-infineon-prepare-for-expanding-ai-use-in-automotive-applications/
- 5.10. Scharfenberg, O. (2019, September 17). Infineon and Synopsys Collaborate to Accelerate Artificial Intelligence in Automotive Applications. Retrieved May 11, 2020, from https://www.infineon.com/cms/en/about-infineon/press/market-news/2019/INFATV201909-099.html