Sitara Processors

June 29, 2017 | Author: Jigar Soni | Category: Embedded Systems, Real Time Embedded Systems



Sitara Processors: Texas Instruments' ARM® Cortex®-A Processors with Programmable Real-Time Unit
Jigar Soni
Department of Electronics and Communication Engineering, Institute of Technology, Nirma University, Ahmedabad, Gujarat, India-382481
Email: [email protected]

Abstract: Many of today's electronic products need a processing unit that offers high performance, runs complex programs, and often executes several programs in parallel by dividing one application into tasks and processes. This is usually achieved with ARM architecture based processors running high-level operating systems such as Linux, Windows CE, and Android. The main drawback of a high-level operating system is that its response time to an input is not predictable. A system may have inputs or tasks that must be recognized, and whose outputs must be produced, within a deadline. Such cases call for dedicated hardware that recognizes and processes the time-critical inputs. Texas Instruments' Sitara processors are a strong choice here because they can perform hard real-time tasks while a high-level operating system handles the remaining non-real-time processes efficiently. They do so by combining an ARM® Cortex®-A core with a dual-core RISC Programmable Real-Time Unit and Industrial Communication Subsystem that supports interfaces such as UART, EtherCAT, Ethernet, and PROFINET, connected directly to the PRU subsystem for real-time response.

Keywords: ARM, Cortex-A, Sitara, PRU, Real Time System, Texas Instruments

I. INTRODUCTION
In today's competitive world, whether in the industrial, consumer, or medical field, electronic products must meet basic criteria such as high-resolution graphics, real-time response, multitasking ability, and optimized power consumption. Most of this equipment must also communicate with the outside world to take inputs and present results. To communicate with external peripherals, a processor should implement the most widely used protocols in on-chip hardware, which removes the need for external peripheral-interface ICs. Texas Instruments Sitara processors offer all of these capabilities in a single chip. Sitara processors combine a high-performance ARM® Cortex®-A core with Programmable Real-Time Units (PRUs), and the PRUs have an integrated communication subsystem that gives real-time access to the data being transferred. With a PRU solution, system developers benefit from a common software code base, which makes it simpler to upgrade system code, and from multi-protocol processing across systems or on the same system [1]. Sitara processors have rich development support from the open-source BeagleBoard community [2]. They also have software development support through Sitara StarterWare and support Linux and Android [2].

Figure 1: Basic Sitara architecture with ARM Cores and PRUs. (Courtesy: Texas Instruments) [1]

As Figure 1 shows, Sitara processors make efficient use of an ARM® Cortex®-A core together with PRUs. Earlier high-performance systems used FPGAs for hard real-time tasks, but this increased cost, board area, power consumption, and complexity. With a PRU solution, system developers also benefit from a common software code base, and development of successive generations of systems is accelerated by simply migrating the code base to the next model in the product line [1].

II. ARM® CORTEX®-A CORE
Sitara processors come in two versions, built around either the ARM® Cortex®-A8 or the ARM® Cortex®-A9 core, both 32-bit processor designs licensed from ARM. Both the Cortex-A8 and the Cortex-A9 implement the ARMv7-A architecture. ARMv7-A supports the Thumb®-2 instruction set for higher code density, the Thumb Execution Environment (ThumbEE) for dynamically compiled code, the Vector Floating Point v3 (VFPv3) architecture for floating-point computation, NEON technology for multimedia processing [3], TrustZone, and Jazelle [4]. Table 1 lists the main differences between the ARM Cortex-A8 and the ARM Cortex-A9.

Table 1: Cortex A8 and Cortex A9 Major Feature Variance [5]

Feature                      Cortex A8        Cortex A9
Out of Order Execution       No               Yes
Vector Floating Point Unit   VFPv3            VFPv3 (optional)
Pipelined VFP                No               Yes
L2 Cache Size                256 or 512 KB    1 MB
Speed per Core (DMIPS/MHz)   2.0              2.5

A. TrustZone
TrustZone technology is included to ensure data privacy and DRM protection in consumer products that run an open OS. It protects peripherals and memory against security attacks. In the secure state, the processor runs critical code from a secure code block to handle security-sensitive tasks such as authentication and signature manipulation [6]. In this way, important code such as the system kernel can be kept safe from external threats.

B. Vector Floating Point
The VFP engine fully implements the VFPv3 data-processing instruction set, which provides hardware support for floating-point operations in half, single, and double precision. It is fully IEEE 754 compliant, with full software library support [10]. The floating-point capability of the VFP improves performance in automotive powertrain and body control applications, imaging, printing, 3D transforms, and FFT and filtering in graphics.

C. Thumb®-2
Thumb-2 is an enhancement of the 16-bit Thumb instruction set [3]. A Thumb-2 instruction can be either 16 or 32 bits long. The additional 32-bit instructions let Thumb-2 cover the functionality of the ARM instruction set. A significant difference between the two is that most 32-bit Thumb instructions are unconditional, whereas most ARM instructions can be conditional. In Thumb-2, the IT (If-Then) instruction makes the instructions that follow it conditional. Thumb-2 instructions provide approximately 26% better code density than ARM instructions and approximately 25% better code density than the older Thumb instructions [7].
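The mix of 16-bit and 32-bit instructions described above can be illustrated with a small decoder sketch. It relies on the ARMv7 encoding rule that a halfword whose top five bits are 0b11101, 0b11110, or 0b11111 is the first half of a 32-bit instruction. The function names are hypothetical and chosen only for this illustration, and the sketch determines instruction lengths only; it does not decode operations.

```python
def thumb_instruction_length(first_halfword: int) -> int:
    """Return the length in bytes (2 or 4) of a Thumb-2 instruction,
    given the first 16-bit halfword of its encoding."""
    if not 0 <= first_halfword <= 0xFFFF:
        raise ValueError("expected a 16-bit halfword")
    top5 = first_halfword >> 11  # bits [15:11] select the encoding width
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


def split_stream(halfwords):
    """Group a list of halfwords into per-instruction tuples."""
    instructions = []
    i = 0
    while i < len(halfwords):
        n = thumb_instruction_length(halfwords[i]) // 2  # halfwords to take
        instructions.append(tuple(halfwords[i:i + n]))
        i += n
    return instructions


if __name__ == "__main__":
    # 0xBF00 is a 16-bit NOP; 0xF000 begins a 32-bit encoding.
    for insn in split_stream([0xBF00, 0xF000, 0xF800, 0xE7FE]):
        print([hex(h) for h in insn])
```

Because the width is decided by the first halfword alone, a fetch unit can walk a mixed stream without any mode switch, which is what lets Thumb-2 keep the density of 16-bit code where it suffices.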

D. ThumbEE
The Thumb Execution Environment is a target for languages such as Java, C#, Perl, and Python, and allows just-in-time compilers to emit compact compiled code without reducing performance. It is normally used for code that is compiled from bytecode shortly before or during execution of the application. New features provided by ThumbEE include automatic null-pointer checks on every load and store instruction, an instruction that performs an array bounds check, and special instructions that call a handler [8]. ThumbEE delivers better code density for the compiled binary than the ARM or Thumb-2 instruction sets. To do so it uses a new processor state, the ThumbEE state, which is entered by setting the T and J bits in the CPSR.

E. NEON
NEON is a combined 64- and 128-bit SIMD (Single Instruction, Multiple Data) instruction set that provides standardized acceleration for media and signal-processing applications [8]. It comprises a comprehensive instruction set, dedicated register files, and independent execution hardware. NEON supports 8-, 16-, 32-, and 64-bit integer data and single-precision (32-bit) floating-point data, with instructions for audio, video, graphics, and gaming workloads.

F. Jazelle
Jazelle allows Java bytecode to be executed directly on the ARM architecture, as an additional instruction state alongside the existing ARM and Thumb states. Its most prominent use is to speed up Java ME games and applications. A Jazelle-aware Java Virtual Machine (JVM) attempts to run Java bytecodes in hardware. ARM claims that approximately 95% of the bytecode in typical program usage ends up being processed directly in hardware [9].

III. PROGRAMMABLE REAL TIME UNIT
PRUs are needed in systems with hard real-time constraints for the following reason. Suppose real-time tasks are handled in Linux with a real-time scheduler. Even then, the developer cannot guarantee an exact response time at the microsecond or nanosecond level. An RTOS can provide an exact response time, but it cannot offer the UI, connectivity, and file-system support of a general-purpose OS such as Linux. Combining PRUs with a high-performance ARM Cortex-A processor therefore makes the Sitara processor a strong choice for high-performance real-time systems. The PRU subsystem in a Sitara processor consists of:

- Two PRU cores, PRU0 and PRU1, with their associated data and instruction memory
- An interrupt controller and a switched central resource

The two PRUs can operate independently or in coordination with each other, depending on the nature of the instructions [11]. General-purpose systems also often have tasks that must complete within a time constraint. For example, data received on a communication bus must be moved into the processing buffer before the next data arrives. Otherwise data is lost, either because the receiver did not respond in time or because the OS managed its tasks poorly and the transfer took too long. Likewise, in an interactive system such as online video calling, audio and video must be processed fast enough. If they are not, they fall out of sync, or both are delayed to restore sync and the call is no longer real time. The same applies to stock exchange data updates, where delays can cause large financial losses.
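The variability of response time under a general-purpose OS, as described above, can be observed with a small measurement. The sketch below is only an illustration and is not taken from the paper's experiments: it asks the OS to sleep for a fixed interval and records by how much each wake-up deviates from the request. The function name and the default parameters are assumptions, and the numbers it prints depend entirely on the machine and its load.

```python
import time


def measure_sleep_overshoot(samples: int = 200, interval_s: float = 0.001):
    """Return, for each sample, how far the actual sleep time deviated
    from the requested interval, in microseconds."""
    overshoots_us = []
    for _ in range(samples):
        start = time.perf_counter_ns()
        time.sleep(interval_s)  # the OS scheduler decides when we wake up
        elapsed_ns = time.perf_counter_ns() - start
        overshoots_us.append((elapsed_ns - interval_s * 1e9) / 1000.0)
    return overshoots_us


if __name__ == "__main__":
    data = measure_sleep_overshoot()
    spread = max(data) - min(data)
    print(f"min {min(data):.1f} us, max {max(data):.1f} us, spread {spread:.1f} us")
```

The spread between the smallest and largest deviation is the jitter a designer would have to budget for. A task with a hard deadline cannot rely on the typical value, only on a guaranteed bound.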

Consider a high-level operating system such as Linux running a real-time task scheduler, where the high-priority task is to toggle one port pin of the processor when some event x occurs. In a general-purpose system without real-time processing hardware, the response time depends on two kinds of factor. One is which process is currently running on the general-purpose processor (GPP), since some critical functions such as memory management or stack management also have the highest priority. The other is the number of interconnects that must be crossed to reach the port address. This makes the response time unpredictable: we can state a maximum, but we cannot guarantee an exact time.

Figure 2: Toggling Port Pin with Real Time Scheduler. [12]

Figure 3: Toggling Port Pin with PRU. [12]

The pin-toggling experiment with and without real-time hardware shows that the PRU performs the task almost 40x faster than the real-time software scheduler. In Figure 2, toggling the pin takes about 200 ns on average, whereas in Figure 3 the PRU takes just 5 ns every time. The difference arises because the PRU has an almost direct connection to the port pin. In addition, the scheduler's response time varies from event to event, depending on the state of the processing unit and the number of interconnects, whereas the PRU's response time is guaranteed. A processor with real-time hardware therefore makes it easier for the system designer to manage real-time tasks.

A. PRU Subsystem

Figure 4: PRU Subsystem. (Courtesy: Texas Instruments) [1]

Figure 4 shows the PRU subsystem deployed in Texas Instruments Sitara processors alongside the ARM Cortex-A core. It contains two independent PRU execution units, each able to run at 200 MHz (5 ns per instruction). They are pure RISC cores with no pipelining. Each has up to 30 general-purpose inputs and up to 32 general-purpose outputs for fast I/O response, together with dedicated instruction and data RAM, 32-bit general-purpose registers, and shared RAM. The subsystem also includes an interrupt controller, a vital part that manages events and resources to provide smooth real-time response with task management. The interrupt controller supports 64 system events (32 generated externally and 32 generated by the PRU subsystem), 10 interrupt channels, and nesting and prioritization of interrupts. System events can be mapped to interrupt channels: one channel can carry multiple events, but the same event cannot be mapped to multiple channels.

The PRUs can also connect to any peripheral on the SoC through a 32-bit interconnect bus.
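The event-to-channel mapping rule of the interrupt controller (many events per channel, but only one channel per event) can be sketched as a small software model. This is only an illustration of the rule as stated in the text, using the 64 events and 10 channels it mentions. The class and method names are hypothetical and do not correspond to any TI register interface or driver API.

```python
class PruIntcModel:
    """Toy model of the mapping rule: each system event maps to at most
    one channel, while a channel may carry several events."""

    NUM_EVENTS = 64    # system events, as stated in the text
    NUM_CHANNELS = 10  # interrupt channels, as stated in the text

    def __init__(self):
        self._channel_of_event = {}

    def map_event(self, event: int, channel: int) -> None:
        if not 0 <= event < self.NUM_EVENTS:
            raise ValueError(f"event {event} out of range")
        if not 0 <= channel < self.NUM_CHANNELS:
            raise ValueError(f"channel {channel} out of range")
        current = self._channel_of_event.get(event)
        if current is not None and current != channel:
            # The same event may not be routed to a second channel.
            raise ValueError(f"event {event} already mapped to channel {current}")
        self._channel_of_event[event] = channel

    def events_on(self, channel: int):
        """Return the sorted list of events routed to a channel."""
        return sorted(e for e, c in self._channel_of_event.items() if c == channel)


if __name__ == "__main__":
    intc = PruIntcModel()
    intc.map_event(16, 0)
    intc.map_event(17, 0)      # two events share channel 0: allowed
    print(intc.events_on(0))
    try:
        intc.map_event(16, 1)  # same event on a second channel: rejected
    except ValueError as err:
        print("rejected:", err)
```

Storing the mapping as a dictionary keyed by event makes the one-channel-per-event constraint structural, so only the remapping attempt needs an explicit check.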

B. PRU Function Block
Figure 5 shows the functional block diagram of a PRU execution unit. The execution unit has 32 general-purpose registers. Of these, R30 and R31 have additional functions: writing to the GPO registers, reading from the GPI registers, and generating INTC events. The instruction RAM is 4 KB in size. It is loaded by the main code running on the general-purpose processor (here the ARM Cortex-A), with the appropriate signal applied on the PRU reset.

Figure 5: PRU Functional Block Diagram (Courtesy: Texas Instruments) [12]
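The special role of R30 and R31 can be illustrated with a small software model in which R30 is treated as the 32-bit output register and R31 as the input register. This is a sketch for explanation only, not PRU firmware. The helper names are hypothetical, and the model assumes that bit n of R30 drives output pin n and bit n of R31 reflects input pin n.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def _check_pin(pin: int) -> None:
    if not 0 <= pin <= 31:
        raise ValueError("pin index must be in 0..31")


def set_output(r30: int, pin: int) -> int:
    """Return the new R30 value with the given output pin driven high."""
    _check_pin(pin)
    return (r30 | (1 << pin)) & MASK32


def clear_output(r30: int, pin: int) -> int:
    """Return the new R30 value with the given output pin driven low."""
    _check_pin(pin)
    return r30 & ~(1 << pin) & MASK32


def toggle_output(r30: int, pin: int) -> int:
    """Return the new R30 value with the given output pin inverted."""
    _check_pin(pin)
    return (r30 ^ (1 << pin)) & MASK32


def read_input(r31: int, pin: int) -> int:
    """Return the level (0 or 1) of the given input pin from R31."""
    _check_pin(pin)
    return (r31 >> pin) & 1


if __name__ == "__main__":
    r30 = 0
    r30 = toggle_output(r30, 5)
    print(hex(r30))
    r30 = toggle_output(r30, 5)
    print(hex(r30))
```

Each helper corresponds to a single bitwise operation on a register, which is why a pin change on the PRU needs no bus transaction through the SoC interconnect.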

There is also a constant table, which stores the most frequently used constants and the base addresses of peripheral ports for easy and fast access. Part of the constant table is programmable [12]. The main part of the PRU is the execution unit, which executes instructions from the instruction RAM and produces output on its fast I/O ports. PRU instructions fall into three classes: logical, arithmetic, and data-flow control. The execution unit runs instructions in real time because it performs no optimizations such as pipelining, and every instruction completes in one clock cycle, about 5 ns [1]. The addressing modes are load immediate and load/store to memory.

C. Industrial Control Subsystem
The PRU subsystem supports several communication peripherals that are useful in industry for real-time data communication. Figure 4 shows that the PRU subsystem has UART, eCAP, MDIO, IEP, and MII_RT [12] for real-time industrial Ethernet. These peripherals process data in real time because they are connected directly to the PRU control unit, whereas in a general-purpose processor the peripherals must pass through several interfaces (L2 and L3 cache), which increases latency and, in turn, response time.

D. PRU Software Architecture
The Sitara processor is divided into two main parts, the ARM subsystem and the PRU subsystem. Normally the ARM subsystem runs a high-level OS such as Linux, while the PRU subsystem runs no OS and its code is called firmware. Linux is responsible for loading the firmware, starting and stopping the system, managing resources such as CPU and memory, sending and receiving data through shared resources, and synchronizing interrupts [12].

IV. CORTEX-A WITH PRU-ICSS

Figure 6: Sitara Processor Block Diagram (Courtesy: Texas Instruments) [1]

Figure 6 shows the block diagram of a Sitara processor, with the ARM Cortex-A processor system on the left and the PRU subsystem on the right. The figure shows that the PRU has direct access to its peripherals and to the PRU I/O ports. In the ARM subsystem, by contrast, peripheral and GPIO signals must pass through the L3 and L4 interconnects to reach the core, which definitely increases latency on the first access. The ARM processor may then establish a shorter path for faster access to that peripheral, but even this is not enough for data with real-time constraints. For applications that need both high performance and real-time response, the Sitara processor is a good choice: the system developer can divide the application into hard real-time and soft real-time sections and decide which portion of the hardware to use and when. Using the high-level OS, the developer can then manage resources from a system-level view, i.e., when to start the system, which firmware to load, how to handle shared resources, and so on. One of the best applications of the PRUs is as soft peripherals (new peripherals implemented with the existing SoC hardware and software). For example, a PRU can implement a DMA that moves data for some peripherals without involving the ARM core. In this way a PRU is useful even in non-real-time systems: it reduces the load on the main processor so that it runs faster, and it lets the user create virtual peripherals that do not exist as hardware on the chip [13].
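A soft peripheral of the kind described above is typically timed by counting instruction cycles. As a worked illustration, the sketch below computes how many PRU cycles one bit of a software UART would occupy at a given baud rate. It assumes the 200 MHz clock and one instruction per cycle stated earlier in the paper. The soft UART itself and all function names are hypothetical examples, not something described in the cited sources.

```python
PRU_CLOCK_HZ = 200_000_000  # 200 MHz, as stated for the PRU cores


def cycle_time_ns(clock_hz: int = PRU_CLOCK_HZ) -> float:
    """Duration of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz


def cycles_per_bit(baud: int, clock_hz: int = PRU_CLOCK_HZ) -> int:
    """Whole number of cycles a soft UART would wait per bit."""
    return round(clock_hz / baud)


def baud_error_percent(baud: int, clock_hz: int = PRU_CLOCK_HZ) -> float:
    """Relative error of the achieved baud rate caused by rounding
    the bit period to a whole number of cycles."""
    actual = clock_hz / cycles_per_bit(baud, clock_hz)
    return (actual - baud) / baud * 100.0


if __name__ == "__main__":
    print(f"cycle time: {cycle_time_ns():.1f} ns")
    for baud in (9600, 115200):
        print(baud, cycles_per_bit(baud), f"{baud_error_percent(baud):+.4f} %")
```

Under these assumptions the bit period is a fixed cycle count, so the only timing error left is the rounding of that count, which is far smaller than the scheduling jitter a general-purpose OS would add.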

CONCLUSION
The ARM Cortex-A core offers instruction pipelining, Vector Floating Point, Thumb-2 instructions, ThumbEE, Java bytecode support, TrustZone security support, and NEON (Single Instruction, Multiple Data) support, together with conventional features such as a load-store architecture, a barrel shifter, and support for high-level operating systems such as Linux and Ubuntu. These features make it well suited to high-performance multitasking systems. At the same time, the dual-core, pure RISC PRU subsystem runs at up to 200 MHz, and each core has dedicated fast I/O, 32 general-purpose registers, and independent instruction and data RAM, along with an interrupt controller and industrial peripheral support for real-time data transfer. Together they make Sitara processors well suited to high-level systems that must also perform and manage real-time tasks as an RTOS would, and the PRUs can additionally serve as soft peripherals that reduce the workload of the main processor.

REFERENCES
[1] Melissa Watkins and Carlos Betancourt, “Ensuring real-time Predictability,” Leveraging TI’s Sitara™ Processors Programmable Real-Time Unit. http://www.ti.com/lit/pdf/spry264
[2] Sitara™ Processors Brochure (Rev. A). http://www.ti.com/lit/pdf/sprt674
[3] Cortex™-A8 Revision: r3p2 Technical Reference Manual. http://infocenter.arm.com/help/topic/com.arm.doc.ddi0344k/DDI0344K_cortex_a8_r3p2_trm.pdf
[4] ARM_Cortex-A9. https://en.wikipedia.org/wiki/ARM_Cortex-A9 Date: 03/10/2015 Time: 03:47 PM
[5] Comparison of ARMv7-A cores. https://en.wikipedia.org/wiki/Comparison_of_ARMv7-A_cores Date: 03/10/2015 Time: 02:06 PM
[6] Architecture and Implementation of the ARM® Cortex™-A8 Microprocessor. https://pixhawk.ethz.ch/_media/software/optimization/neon_whitepaper.pdf
[7] The ARM Architecture With a focus on v7A and Cortex-A8. https://www.arm.com/files/pdf/ARM_Arch_A8.pdf
[8] ARM Architecture. https://en.wikipedia.org/wiki/ARM_architecture Date: 03/10/2015 Time: 08:01 PM
[9] Jazelle. https://en.wikipedia.org/wiki/Jazelle Date: 03/10/2015 Time: 8:45 PM
[10] Floating Point. https://www.arm.com/products/processors/technologies/vector-floatingpoint.php Date: 03/10/2015 Time: 9:52 PM
[11] Programmable Real Time Unit Sub System. http://processores.wiki.ti.com/index.php/Programmable_Realtime_Unit_Subsystem Date: 02/10/2015 Time: 9:02 AM
[12] Ron Birkett, “Enhancing Real-Time Capabilities with the PRU,” October 2014. http://events.linuxfoundation.org/sites/events/files/slides/Enhancing%20RT%20Capabilities%20with%20the%20PRU%20ELCW%20Submitted.pdf
[13] Gustavo Martinez and Gagan Maur, “Programmable Real-Time Unit (PRU): Extending Functionality of Existing SoCs.” http://www.ti.com/lit/wp/spry136a/spry136a.pdf
