blog: high-performance-computing notebook #17

Merged
jackfiled merged 4 commits from feat-hpc into master 2025-08-31 13:54:09 +08:00
75 changed files with 58 additions and 62 deletions
Showing only changes of commit 4972d862b3

@@ -1,10 +0,0 @@
----
-title: High Performance Computing 25 SP Quantum Computing
-date: 2025-06-12T19:26:24.6668760+08:00
-tags:
-- 高性能计算
-- 学习资料
----
-<!--more-->

@@ -1,12 +1,13 @@
 ---
 title: High Performance Computing 25 SP NVIDIA
-date: 2025-04-24T19:02:36.1077330+08:00
+date: 2025-08-31T13:50:42.8639950+08:00
 tags:
 - 高性能计算
 - 学习资料
 ---
 Fxxk you, NVIDIA!
 <!--more-->

@@ -1,11 +1,13 @@
 ---
-title: High Performance Computing 2025 SP Non Stored Program Computing
-date: 2025-05-29T18:29:28.6155560+08:00
+title: High Performance Computing 25 SP Non Stored Program Computing
+date: 2025-08-31T13:51:17.5260660+08:00
 tags:
 - 高性能计算
 - 学习资料
 ---
 No Von Neumann Machines.
 <!--more-->
@@ -60,7 +62,7 @@ There are two types of semi-custom ASICs:
 Standard-cell-based ASICs are also called **cell-based ASICs (CBIC)**.
-![image-20250815093113115](./hpc-2025-non-stored-program-computing/image-20250815093113115.png)
+![image-20250815093113115](./hpc-2025-non-stored-program-computing/image-20250815093113115.webp)
 > The *gate* is used as a unit to measure how many logic elements a semiconductor device can hold.
@@ -84,7 +86,7 @@ Depending on the structure, the standard PLD can be divided into:
 - Programmable Logic Array (PLA): a programmable array of AND gates feeding a programmable array of OR gates.
 - Complex Programmable Logic Device (CPLD) and Field Programmable Gate Array (FPGA): complex enough to be called an *architecture*.
-![image-20250817183832472](./hpc-2025-non-stored-program-computing/image-20250817183832472.png)
+![image-20250817183832472](./hpc-2025-non-stored-program-computing/image-20250817183832472.webp)
@@ -96,7 +98,7 @@ Depending on the structure, the standard PLD can be divided into:
 ### FPGA Architecture
-![image-20250817184419856](./hpc-2025-non-stored-program-computing/image-20250817184419856.png)
+![image-20250817184419856](./hpc-2025-non-stored-program-computing/image-20250817184419856.webp)
 #### Configurable Logic Block(CLB) Architecture
@@ -116,7 +118,7 @@ A LUT is a RAM with a data width of 1 bit, and its content is programmed at power-up.
 The figure below shows how a LUT works:
-![image-20250817185111521](./hpc-2025-non-stored-program-computing/image-20250817185111521.png)
+![image-20250817185111521](./hpc-2025-non-stored-program-computing/image-20250817185111521.webp)
 The configuration memory holds the outputs of the truth table entries, so that when the FPGA restarts it runs the same *program*.
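
As a rough illustration of the point above, a LUT can be modelled in Python as a list of configuration bits indexed by the input signals. This is a minimal sketch rather than anything taken from the post or a vendor tool: `make_lut` and `lut_read` are hypothetical helper names, and treating input 0 as the least significant address bit is an assumption.

```python
def make_lut(func, n_inputs):
    """Compute the configuration bits (truth-table outputs) for an n-input function."""
    bits = []
    for address in range(2 ** n_inputs):
        # Unpack the address into input bits; input 0 is the least significant bit (assumed).
        inputs = [(address >> i) & 1 for i in range(n_inputs)]
        bits.append(func(*inputs) & 1)
    return bits


def lut_read(bits, *inputs):
    """Evaluate the LUT: the inputs form an address into the 1-bit-wide memory."""
    address = sum(bit << i for i, bit in enumerate(inputs))
    return bits[address]


# "Program" a 3-input LUT as a 3-way XOR; the bit list plays the role of the configuration memory.
xor3_bits = make_lut(lambda a, b, c: a ^ b ^ c, 3)
```

Reloading the same `xor3_bits` list corresponds to the FPGA restarting with the same configuration: the logic function is fully determined by those eight bits.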
@@ -126,7 +128,7 @@ And as the truth table entries are just bits, the program of FPGA is called as *
 Using the input signals as an address, the LUT can also be configured as a RAM. In normal LUT mode only read operations are performed; for write operations, the address decoders generate clock signals for the latches.
-![image-20250817185859510](./hpc-2025-non-stored-program-computing/image-20250817185859510.png)
+![image-20250817185859510](./hpc-2025-non-stored-program-computing/image-20250817185859510.webp)
 #### Routing Architecture
@@ -134,7 +136,7 @@ The logic blocks are connected to each other through a programmable routing network. And
 A horizontal and vertical mesh of wire segments is interconnected by programmable switches called programmable interconnect points (PIPs).
-![image-20250817192006784](./hpc-2025-non-stored-program-computing/image-20250817192006784.png)
+![image-20250817192006784](./hpc-2025-non-stored-program-computing/image-20250817192006784.webp)
 These PIPs are implemented using a transmission gate controlled by a memory bit from the configuration memory.
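
A PIP can be sketched in software as a switch gated by one configuration bit. The model below is only an illustration under stated assumptions: `HIGH_Z`, `pip`, and `drive` are hypothetical names, a disconnected segment is represented by `None`, and analog effects of a real transmission gate are ignored.

```python
HIGH_Z = None  # assumed stand-in for a disconnected (high-impedance) wire segment


def pip(config_bit, signal):
    """Model one PIP: pass the signal through only when its configuration bit is 1."""
    return signal if config_bit else HIGH_Z


def drive(config_bits, sources):
    """Connect several source segments to one destination wire, one PIP per source."""
    passed = [pip(bit, sig) for bit, sig in zip(config_bits, sources)]
    driven = [sig for sig in passed if sig is not HIGH_Z]
    if not driven:
        return HIGH_Z  # no PIP is enabled, so the wire floats
    if len(set(driven)) > 1:
        raise ValueError("conflicting drivers on the same wire segment")
    return driven[0]
```

With the configuration bits `[0, 1, 0]`, only the second source reaches the destination wire, which is the sense in which the routing is "programmed" by the configuration memory.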
@@ -146,7 +148,7 @@ Several types of PIPs are used in the FPGA:
 - Non-decoded MUX: n wire segments, each with a configuration bit.
 - Compound cross-point: consists of 6 breakpoint PIPs and can route two isolated signal nets.
-![image-20250817194355228](./hpc-2025-non-stored-program-computing/image-20250817194355228.png)
+![image-20250817194355228](./hpc-2025-non-stored-program-computing/image-20250817194355228.webp)
 #### Input/Output Architecture
@@ -158,7 +160,7 @@ The programmable Input/Output cells consist of three parts:
 - Routing resources.
 - Programmable I/O voltage and current levels.
-![image-20250817195139631](./hpc-2025-non-stored-program-computing/image-20250817195139631.png)
+![image-20250817195139631](./hpc-2025-non-stored-program-computing/image-20250817195139631.webp)
 #### Fine-grained and Coarse-grained Architecture
@@ -186,9 +188,9 @@ Three types of interconnected devices have been commonly used to connect there w
 ### FPGA Design Flow
-![image-20250817195714935](./hpc-2025-non-stored-program-computing/image-20250817195714935.png)
+![image-20250817195714935](./hpc-2025-non-stored-program-computing/image-20250817195714935.webp)
-![image-20250817200350750](./hpc-2025-non-stored-program-computing/image-20250817200350750.png)
+![image-20250817200350750](./hpc-2025-non-stored-program-computing/image-20250817200350750.webp)
 The FPGA configuration techniques include:
@@ -222,7 +224,7 @@ OpenCL is not a traditional hardware description language. And OpenCL needs
 The following figure shows how the OpenCL-FPGA compiler turns a vector addition function into a circuit.
-![image-20250829210329225](./hpc-2025-non-stored-program-computing/image-20250829210329225.png)
+![image-20250829210329225](./hpc-2025-non-stored-program-computing/image-20250829210329225.webp)
 The compiler generates three stages for this function:

Binary file not shown. (11 files)

@@ -1,11 +1,12 @@
 ---
-title: High Performance Computing 2025 SP OpenCL Programming
-date: 2025-05-29T18:29:14.8444660+08:00
+title: High Performance Computing 25 SP OpenCL Programming
+date: 2025-08-31T13:51:02.0181970+08:00
 tags:
 - 高性能计算
 - 学习资料
 ---
 Open Computing Language.
 <!--more-->

@@ -1,11 +1,12 @@
 ---
 title: High Performance Computing 25 SP Potpourri
-date: 2025-06-12T18:45:49.2698190+08:00
+date: 2025-08-31T13:51:29.8809980+08:00
 tags:
 - 高性能计算
 - 学习资料
 ---
 Potpourri has a good taste.
 <!--more-->

@@ -1,11 +1,12 @@
 ---
-title: High Performance Computing 2025 SP Programming CUDA
-date: 2025-05-15T19:13:48.8893010+08:00
+title: High Performance Computing 25 SP Programming CUDA
+date: 2025-08-31T13:50:53.6891520+08:00
 tags:
 - 高性能计算
 - 学习资料
 ---
 Compute Unified Device Architecture
 <!--more-->