Competition 2024

Competition: Hardware Implementation

Accelerated Tiny-Transformer IP

FPGA-Powered Acceleration for NLP Tasks

Project Overview:

Natural Language Processing (NLP) transforms how machines understand and interact with human language. Whether predicting the next word in a sentence, translating languages in real time, or extracting contextual information from a body of text, NLP applications are increasingly prevalent in fields such as virtual assistants, translation services, and automated customer support. To meet the growing demand for efficient, real-time NLP processing in embedded systems, we propose designing and implementing a Tiny Transformer Intellectual Property (IP) core. This core will be integrated with an ARM Cortex IP, leveraging the strengths of both the processing system (PS) and programmable logic (PL) parts of a System on Chip (SoC) to create a highly efficient solution for real-time NLP tasks.

Objectives:

1. Design and Implementation of Tiny Transformer IP:
- Develop a compact and efficient transformer IP core using high-level synthesis (HLS), tailored for resource-constrained environments.
- Include essential components such as an encoder, decoder, attention blocks, normalization layers, and feed-forward neural networks.
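As an illustration of the attention component listed above, the following is a minimal sketch of single-head scaled dot-product attention written in plain C++ in the fixed-bound style that HLS tools usually accept. The dimensions `SEQ_LEN` and `D_HEAD`, the `Matrix` type, and the function name `attention` are assumptions made for this example and are not taken from the project's actual IP. Floating point is used for readability; a real core would likely use fixed-point types.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical dimensions, chosen only for illustration.
constexpr std::size_t SEQ_LEN = 4;  // tokens per sequence
constexpr std::size_t D_HEAD = 8;   // width of one attention head

using Matrix = std::array<std::array<float, D_HEAD>, SEQ_LEN>;

// out = softmax(Q * K^T / sqrt(D_HEAD)) * V for a single head.
// All loop bounds are compile-time constants so a synthesis tool could
// unroll or pipeline them.
void attention(const Matrix& q, const Matrix& k, const Matrix& v, Matrix& out) {
    const float scale = 1.0f / std::sqrt(static_cast<float>(D_HEAD));
    for (std::size_t i = 0; i < SEQ_LEN; ++i) {
        std::array<float, SEQ_LEN> score{};
        float max_score = 0.0f;
        for (std::size_t j = 0; j < SEQ_LEN; ++j) {
            float dot = 0.0f;
            for (std::size_t d = 0; d < D_HEAD; ++d) dot += q[i][d] * k[j][d];
            score[j] = dot * scale;
            if (j == 0 || score[j] > max_score) max_score = score[j];
        }
        // Subtract the row maximum before exponentiating for numerical stability.
        float sum = 0.0f;
        for (std::size_t j = 0; j < SEQ_LEN; ++j) {
            score[j] = std::exp(score[j] - max_score);
            sum += score[j];
        }
        assert(sum > 0.0f);
        // Weighted sum of the value rows, normalised by the softmax denominator.
        for (std::size_t d = 0; d < D_HEAD; ++d) {
            float acc = 0.0f;
            for (std::size_t j = 0; j < SEQ_LEN; ++j) acc += score[j] * v[j][d];
            out[i][d] = acc / sum;
        }
    }
}
```

With all-zero queries every key receives the same weight, so each output row is the average of the value rows, which makes a convenient sanity check for a testbench.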

2. Integration with ARM Cortex IP:
- Utilize the ARM Cortex IP as the processing system (PS) for handling high-level control and preprocessing tasks.
- Integrate the Tiny Transformer IP as the programmable logic (PL) part to accelerate computationally intensive transformer operations.
- Establish seamless communication between the PS and PL using the AXI interface.
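The PS-to-PL control path described above can be sketched as a memory-mapped register block that the processor programs and then polls. The register layout below (`AccelRegs`, `CTRL_START`, `STATUS_DONE`) and the function `run_job` are hypothetical; the real register map depends on how the IP is generated and is not given in this project description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register layout of the accelerator's control interface.
struct AccelRegs {
    volatile std::uint32_t ctrl;      // bit 0: start, written by the PS
    volatile std::uint32_t status;    // bit 0: done, set by the PL
    volatile std::uint32_t src_addr;  // address of the input buffer
    volatile std::uint32_t dst_addr;  // address of the output buffer
    volatile std::uint32_t length;    // number of elements to process
};

constexpr std::uint32_t CTRL_START = 1u << 0;
constexpr std::uint32_t STATUS_DONE = 1u << 0;

// Program one job and poll for completion.
// Returns true if the done bit was observed within max_polls reads.
bool run_job(AccelRegs& regs, std::uint32_t src, std::uint32_t dst,
             std::uint32_t len, std::uint32_t max_polls) {
    assert(len > 0);
    regs.src_addr = src;
    regs.dst_addr = dst;
    regs.length = len;
    regs.ctrl = CTRL_START;  // kick off the accelerator
    for (std::uint32_t i = 0; i < max_polls; ++i) {
        if (regs.status & STATUS_DONE) return true;
    }
    return false;  // timed out
}
```

On real hardware the `AccelRegs` reference would point at the base address assigned to the IP on the AXI bus. In a host-side test it can simply refer to an ordinary struct whose `status` field is preset.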

3. System Architecture Development:
- Implement a host CPU that interacts with the Tiny Transformer IP via PCIe and manages data flow.
- Integrate BRAM for intermediate storage and a DDR controller for main memory access.
- Optimize the data path and memory hierarchy to ensure low-latency and high-throughput processing.
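One common way to use BRAM as intermediate storage, as outlined above, is to tile matrix operations so that each tile is copied once from main memory into a small local buffer and then reused. The sketch below shows the idea in plain C++; the tile size `TILE` and the function `tiled_matmul` are assumptions for illustration, and the local arrays only stand in for on-chip buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t TILE = 4;  // hypothetical tile edge length

// C = A * B for square n x n row-major matrices, with n a multiple of TILE.
void tiled_matmul(const std::vector<float>& a, const std::vector<float>& b,
                  std::vector<float>& c, std::size_t n) {
    assert(n % TILE == 0);
    assert(a.size() == n * n && b.size() == n * n);
    c.assign(n * n, 0.0f);
    for (std::size_t ti = 0; ti < n; ti += TILE) {
        for (std::size_t tj = 0; tj < n; tj += TILE) {
            for (std::size_t tk = 0; tk < n; tk += TILE) {
                // Local tiles model on-chip buffers filled from main memory.
                float a_tile[TILE][TILE];
                float b_tile[TILE][TILE];
                for (std::size_t i = 0; i < TILE; ++i) {
                    for (std::size_t j = 0; j < TILE; ++j) {
                        a_tile[i][j] = a[(ti + i) * n + tk + j];
                        b_tile[i][j] = b[(tk + i) * n + tj + j];
                    }
                }
                // Multiply the two tiles and accumulate into the output.
                for (std::size_t i = 0; i < TILE; ++i) {
                    for (std::size_t j = 0; j < TILE; ++j) {
                        float acc = 0.0f;
                        for (std::size_t k = 0; k < TILE; ++k) {
                            acc += a_tile[i][k] * b_tile[k][j];
                        }
                        c[(ti + i) * n + tj + j] += acc;
                    }
                }
            }
        }
    }
}
```

Each element loaded into a tile is reused `TILE` times before it is discarded, which reduces traffic to external memory relative to an untiled loop nest.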

4. Performance Evaluation:
- Benchmark the integrated system against conventional CPU-only implementations to demonstrate improved performance.
- Assess power consumption and resource utilization to validate the efficiency of the Tiny Transformer IP in embedded scenarios.
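The comparison described above typically reduces to two derived figures: speedup (baseline latency divided by accelerator latency) and energy per inference (average power multiplied by latency). A minimal sketch of those calculations follows; the `Measurement` struct and function names are hypothetical, and no measured values from the project are implied.

```cpp
#include <cassert>

// One measured configuration: latency per inference and average power.
struct Measurement {
    double latency_s;  // seconds per inference
    double power_w;    // average power in watts
};

// Speedup of the accelerated system over the CPU-only baseline.
double speedup(const Measurement& baseline, const Measurement& accel) {
    assert(accel.latency_s > 0.0);
    return baseline.latency_s / accel.latency_s;
}

// Energy consumed per inference, in joules.
double energy_per_inference_j(const Measurement& m) {
    assert(m.latency_s >= 0.0 && m.power_w >= 0.0);
    return m.latency_s * m.power_w;
}
```

Reporting energy per inference alongside speedup matters because a design can be faster while drawing more power, and only the product shows whether it is more efficient overall.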

Expected Outcomes:

The successful completion of this project will result in a highly optimized Tiny Transformer IP core integrated with an ARM Cortex IP. The project will produce a complete RTL-to-GDSII flow, enabling the tape-out of our accelerator on a 65 nm technology node. This integration will provide a robust solution for deploying transformer-based models in resource-constrained devices, enabling real-time processing of NLP tasks with significantly reduced latency and power consumption. This advancement will pave the way for sophisticated applications in IoT devices, edge computing, and mobile platforms, making advanced NLP capabilities more accessible and efficient.

Project Milestones

Post Silicon
Architectural Design	Getting Started	Specifying a SoC	data model	IP Selection	Verification Methodology
Behavioural Design	Behavioural Modelling	Generate RTL	RTL Verification	Simulation
Logical Design	Technology Selection	Synthesis	Design for Test	Logical verification
Physical Design	Floor Planning	Preparation	Clock Tree Synthesis	Routing	Timing closure	Physical Verification	Tape Out

Architectural Design

Design Flow: Architectural Design
Target Date: March 1, 2024
Completed Date: July 31, 2024
Project Kickoff:
- Define project objectives and scope.
- Review existing technologies and research relevant to Tiny Transformers and ARM Cortex integration.

Behavioural Design

Design Flow: Behavioural Design
Target Date: April 1, 2024
Completed Date: July 31, 2024
Design Phase:
- Develop initial architecture for Tiny Transformer IP.
- Begin high-level synthesis (HLS) of essential transformer components (encoder, decoder, attention blocks).

Behavioural Design

Design Flow: Behavioural Design
Target Date: May 1, 2024
Completed Date: July 31, 2024
Implementation Phase:
- Complete HLS of the input-embedding components of the Tiny Transformer block.
- Develop communication protocols between PS and PL parts of the SoC.
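To make the input-embedding step above concrete, the sketch below looks up a learned vector for each token and adds a positional encoding. The sinusoidal encoding is the standard one from the original Transformer formulation and is an assumption here, since the project description does not say which positional scheme is used. The sizes `VOCAB`, `D_MODEL`, and `MAX_LEN` and the function name `embed` are likewise hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical sizes, chosen only for illustration.
constexpr std::size_t VOCAB = 16;   // number of distinct tokens
constexpr std::size_t D_MODEL = 8;  // embedding width
constexpr std::size_t MAX_LEN = 4;  // tokens per sequence

using EmbeddingTable = std::array<std::array<float, D_MODEL>, VOCAB>;
using Tokens = std::array<std::uint8_t, MAX_LEN>;
using Sequence = std::array<std::array<float, D_MODEL>, MAX_LEN>;

// out[pos] = table[token at pos] + sinusoidal positional encoding of pos.
void embed(const EmbeddingTable& table, const Tokens& tokens, Sequence& out) {
    for (std::size_t pos = 0; pos < MAX_LEN; ++pos) {
        assert(tokens[pos] < VOCAB);
        for (std::size_t d = 0; d < D_MODEL; ++d) {
            // Each sin/cos pair of dimensions shares one frequency:
            // exponent = 2i / D_MODEL, where i is the pair index.
            const float exponent =
                static_cast<float>(d - d % 2) / static_cast<float>(D_MODEL);
            const float angle =
                static_cast<float>(pos) / std::pow(10000.0f, exponent);
            const float pe = (d % 2 == 0) ? std::sin(angle) : std::cos(angle);
            out[pos][d] = table[tokens[pos]][d] + pe;
        }
    }
}
```

In hardware the positional terms depend only on `pos` and `d`, so they could be precomputed into a small lookup table instead of evaluating `sin` and `cos` at run time.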

Behavioural Design

Design Flow: Behavioural Design
Target Date: June 1, 2024
Completed Date: July 31, 2024
System Architecture Development:
- Implement attention block and normalization block.
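For reference, the normalization block named above can be sketched as standard layer normalization over one token vector. This is a minimal floating-point model under assumed names (`D_MODEL`, `Vec`, `layer_norm`) and is not the project's actual implementation, which would likely use fixed-point arithmetic and an approximated reciprocal square root.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

constexpr std::size_t D_MODEL = 8;  // hypothetical embedding width

using Vec = std::array<float, D_MODEL>;

// out[i] = gamma[i] * (x[i] - mean) / sqrt(var + eps) + beta[i],
// where mean and var are taken over the D_MODEL elements of x.
void layer_norm(const Vec& x, const Vec& gamma, const Vec& beta, Vec& out,
                float eps = 1e-5f) {
    assert(eps > 0.0f);
    float mean = 0.0f;
    for (std::size_t i = 0; i < D_MODEL; ++i) mean += x[i];
    mean /= static_cast<float>(D_MODEL);

    float var = 0.0f;
    for (std::size_t i = 0; i < D_MODEL; ++i) {
        const float diff = x[i] - mean;
        var += diff * diff;
    }
    var /= static_cast<float>(D_MODEL);  // population variance

    const float inv_std = 1.0f / std::sqrt(var + eps);
    for (std::size_t i = 0; i < D_MODEL; ++i) {
        out[i] = gamma[i] * (x[i] - mean) * inv_std + beta[i];
    }
}
```

With `gamma` set to all ones and `beta` to all zeros, the output has approximately zero mean and unit variance, which is a simple property to check when comparing the HLS result against a software model.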

Accelerator Design Flow

Design Flow: Accelerator Design Flow
Target Date: July 1, 2024
Completed Date: July 31, 2024

Implemented the encoder block, with hardware utilization of 22% of LUTs and 7% of BRAM.

Milestone #6
Target Date: August 1, 2024

Milestone #7
Target Date: September 1, 2024

Milestone #8
Target Date: October 1, 2024

Milestone #9
Target Date: November 1, 2024

Milestone #10
Target Date: December 1, 2024

Milestone #11
Target Date: January 1, 2025

Milestone #12
Target Date: February 1, 2025

Milestone #13
Target Date: March 1, 2025

Team

Name

Abhishek

Comments

Tiny-Trans

Team Members:

Abhishek Yadav (yadav.49@iitj.ac.in)

Ayush Dixit (m23eev006@iitj.ac.in)

Binod Kumar (binod@iitj.ac.in)

Project Creator: Abhishek, at Indian Institute of Technology Jodhpur
Technology: Accelerators
Interests: Hardware design
Design Flow: Accelerator Design Flow
Submitted on Wed, 31/07/2024 - 09:57
