Developing processor-compatible C-code for FPGA hardware acceleration

The three common processor implementation models used in FPGA cores are the microprocessor, microcontroller, and specialty processor. A microprocessor is generally a stand-alone core with limited peripherals. Microprocessors are usually implemented with at least a 32-bit or 64-bit architecture.

They are generally targeted toward advanced computing applications. Microprocessors may include performance-oriented architectural elements, such as SIMD units that provide the vector-based math functionality commonly used in math-intensive applications.
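As an illustration of the math-intensive code that SIMD units are meant to speed up, the following is a minimal sketch of a multiply-accumulate loop in C. The function name `dot_product_i16` and the choice of 16-bit inputs with a 64-bit accumulator are assumptions made for this example. Whether the loop is actually vectorized depends on the compiler, its options, and the target core.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: multiply-accumulate over two 16-bit arrays.
 * The loop body is regular and free of data-dependent branches, which
 * is the shape of code a vectorizing compiler can map onto SIMD lanes.
 * `restrict` tells the compiler the two arrays do not overlap. */
int64_t dot_product_i16(const int16_t *restrict a,
                        const int16_t *restrict b,
                        size_t n)
{
    int64_t sum = 0; /* wide accumulator so the running sum cannot overflow in practice */
    for (size_t i = 0; i < n; i++) {
        sum += (int64_t)a[i] * b[i];
    }
    return sum;
}
```

Calling `dot_product_i16(a, b, 4)` with `a = {1, 2, 3, 4}` and `b = {10, 20, 30, 40}` returns 300.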

The microprocessor design model is based on the implementation of an optimized, high-performance processor core with limited on-chip peripherals. This allows the design team to choose and implement the required peripheral functionality externally. The interface to these external peripherals is generally implemented via a high-throughput interface bus such as PCI-X.

In contrast to the microprocessor model, microcontrollers generally include significant on-chip peripheral functionality. Microcontrollers are generally targeted toward specific application markets such as motor-control or PDA devices.

The target application influences the mix of peripherals. Microcontrollers follow the system-on-a-chip (SoC) design philosophy. This philosophy encourages the implementation of as many peripherals on-chip as possible, ideally working toward a single-chip solution. Common peripheral block examples include Ethernet and USB communication and LCD controllers. Microcontrollers span a wide range of performance.

Specialty processors target very specific applications including audio processing, software defined radio, or the implementation of network protocols at the highest possible speed. While they may be categorized as either microprocessors or microcontrollers, they are listed as a separate category here because they possess specialized architectures, resources and capabilities. Examples include network processors and digital signal processors (DSPs).

Each of these processor implementation models is targeted toward different applications. Selecting a processor model to meet the specific requirements of a project involves many considerations.

The primary trade-off areas include target application, performance, architecture, integration, power and cost. A primary advantage of an FPGA embedded processor implementation is the ability to repartition hardware functionality, potentially creating new processor implementations without board re-spins.

With the processor and the circuitry it controls incorporated in the same device, the design team has control over more of the design elements, since both software and hardware functionality may be implemented using programming languages. The flexibility of software and hardware re-configuration allows the design team to determine the optimal mix of hardware and software functionality.

The ability to repartition an embedded FPGA processor design increases the number of potential design implementation options. Some functional design implementation options are presented in the following list.

Design Functional Implementation Options

1) Single processor
2) Multiple processors
3) Floating-point unit
4) State machine
5) Coprocessor
6) Dedicated FPGA logic implementation
7) Off-chip peripherals
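One way C code can stay compatible with several of the options above is to keep a single call interface and select the implementation behind it. The following is a minimal sketch under stated assumptions: the names `sum_fn`, `sum_software`, `sum_accelerated`, and `select_sum` are hypothetical, and the accelerated path is only a software stand-in, because the register interface of a real coprocessor or logic block is design-specific and is not described here.

```c
#include <stddef.h>
#include <stdint.h>

/* One call signature for a function that may be partitioned into
 * software or into FPGA logic (hypothetical example). */
typedef uint32_t (*sum_fn)(const uint8_t *data, size_t len);

/* Software implementation: runs entirely on the processor core. */
static uint32_t sum_software(const uint8_t *data, size_t len)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < len; i++) {
        acc += data[i];
    }
    return acc;
}

/* Stand-in for a coprocessor or dedicated logic block. A real design
 * would hand the buffer to the hardware here; this sketch reuses the
 * software path so that it runs anywhere. */
static uint32_t sum_accelerated(const uint8_t *data, size_t len)
{
    return sum_software(data, len);
}

/* Choose the implementation once; callers never need to change. */
sum_fn select_sum(int accelerator_present)
{
    return accelerator_present ? sum_accelerated : sum_software;
}
```

A caller would write `sum_fn sum = select_sum(have_accel);` and then call `sum(buffer, length)`, so repartitioning the function does not change the calling code.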

There are several broad processor IP categories. Some example processor-related IP cores are presented in Table 14.1 below.

Table 14.1. Typical processor IP cores

Picking the right processor core & peripherals

The processor selection affects all aspects of the system design, budget, and schedule for a project. It is typically one of the most critical decisions made by a development team because of the broad impact it has on the performance of a project.

For this reason, the selection of a processor will typically be a collaborative effort between the system, hardware and software teams. The interactions between these decisions can become complex. Some factors to consider when selecting a processor core are presented in the following list.

Processor Selection Factors 

1) Target application
2) Optimization for specific architectures or highest possible performance
3) Resource utilization
4) Simulation support
5) Testbench coverage
6) Support for individual simulation tool sets
7) Availability of real-world application-oriented simulation results
8) Documentation completeness and accuracy
9) Access to original core developers or qualified experts
10) Number and competence of IP vendor staff
11) System, hardware and software tools
12) Operating system

A processor trade-off study must compare architectural features of the candidate cores, such as the pipeline, memory interface, and core speed. It is the combination of these features, rather than any single one, that determines the true performance of the processor.

As discussed previously, a deeper pipeline may be leveraged for higher performance provided that branching is limited. Large register files reduce the number of load/store operations. Cache implementation can improve overall performance significantly by reducing the number of external memory accesses. Some architectural factors to consider when evaluating processor cores are presented in the following list.

Processor Architectural Factors

1) Type, size, and implementation of the memory/peripheral bus
2) Error detection and correction mechanisms
3) Bus transaction types such as bursting
4) Size and model of address space
5) Type and size of cache (instruction/data)
6) Type of controllers such as DMA and MMU
7) Functional elements such as the register files/execution units
8) Type of pipeline and strategies to prevent stalls; for example, branch prediction
9) Write buffers for external memory
10) Interrupt response and structure; for example, shadow registers
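To illustrate the point that a deeper pipeline performs best when branching is limited, the following minimal sketch shows the same computation written with and without a data-dependent branch in the loop body. The function names are hypothetical, and the actual instructions generated depend on the compiler and the processor core, so any benefit should be confirmed by measurement on the target.

```c
#include <stddef.h>
#include <stdint.h>

/* Counts elements above a threshold using a data-dependent branch.
 * On a deep pipeline, each mispredicted branch can cost several cycles. */
size_t count_above_branchy(const int32_t *x, size_t n, int32_t threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (x[i] > threshold) {
            count++;
        }
    }
    return count;
}

/* Same result without an if statement: the comparison evaluates to
 * 0 or 1 and is added directly, which gives the compiler the option
 * of emitting straight-line code for the loop body. */
size_t count_above_branchless(const int32_t *x, size_t n, int32_t threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        count += (size_t)(x[i] > threshold);
    }
    return count;
}
```

Both functions return the same count for any input; only the form of the loop body differs.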

Other factors to consider during a processor trade study include development tools, IP availability, supported RTOSs, and any other critical items that impact performance or development efficiency. A spreadsheet is a good tool for summarizing design options.

Consider using tools that support code optimization, and take proactive measures early in the design effort to head off significant software issues that could otherwise force a software redesign. To better understand these trade-offs, the list below presents an overview of some important processor selection criteria.

Processor Selection Criteria

1) Performance
2) Architecture
3) RTOS support
4) IP availability
5) Processor category
6) Tool features
7) Technical support
8) Reference code/examples
9) Evaluation boards
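
The criteria above can be combined into a single figure per candidate core, in the same way a spreadsheet would summarize a trade study. The following is a minimal sketch of a weighted score: the function name `weighted_score`, the weights, and the scoring scale are assumptions for illustration and are not values taken from any actual study.

```c
#include <stddef.h>

/* One entry per selection criterion in the list above. */
#define NUM_CRITERIA 9

/* Returns the weighted average of the per-criterion scores for one
 * candidate core. Weights express relative importance and scores use
 * whatever scale the team agrees on (for example 1 to 5). Returns 0.0
 * if all weights are zero. */
double weighted_score(const double weights[NUM_CRITERIA],
                      const int scores[NUM_CRITERIA])
{
    double total = 0.0;
    double weight_sum = 0.0;
    for (size_t i = 0; i < NUM_CRITERIA; i++) {
        total += weights[i] * scores[i];
        weight_sum += weights[i];
    }
    return (weight_sum > 0.0) ? total / weight_sum : 0.0;
}
```

Computing this score for each candidate core with the same weights gives a simple ranking, which should be treated as a starting point for discussion rather than a final decision.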