blog: hpc cpu architecture
Some checks failed
Build blog docker image / Build-Blog-Image (push) Failing after 5m36s
Some checks failed
Build blog docker image / Build-Blog-Image (push) Failing after 5m36s
This commit is contained in:
parent
dcad453eb1
commit
05ea729950
100
YaeBlog/source/posts/hpc-2025-cpu-architecture.md
Normal file
100
YaeBlog/source/posts/hpc-2025-cpu-architecture.md
Normal file
|
@ -0,0 +1,100 @@
|
|||
---
|
||||
title: High Performance Computing 2025 SP CPU Architecture
|
||||
date: 2025-03-13T23:59:08.8167680+08:00
|
||||
tags:
|
||||
- 学习资料
|
||||
- 高性能计算
|
||||
---
|
||||
|
||||
How to use the newly available transistors?
|
||||
|
||||
<!--more-->
|
||||
|
||||
Parallelsim:
|
||||
|
||||
Instruction Level Parallelism(ILP):
|
||||
|
||||
- **Implicit/transparent** to users/programmers.
|
||||
- Instruction pipelining.
|
||||
- Superscalar execution.
|
||||
- Out of order execution.
|
||||
- Register renaming.
|
||||
- Speculative execution.
|
||||
- Branch prediction.
|
||||
|
||||
Task Level Parallelism(TLP):
|
||||
|
||||
- **Explicit** to users/programmers.
|
||||
- Multiple threads or processes executed simultaneously.
|
||||
- Multi-core processors.
|
||||
|
||||
Data Parallelism:
|
||||
|
||||
- Vector processors and SIMD.
|
||||
|
||||
Von Neumann Architecture: the **stored-program** concept. Three components: processor, memory and data path.
|
||||
|
||||
Bandwidth: the gravity of modern computer system.
|
||||
|
||||
## Instruction Pipelining
|
||||
|
||||
Divide incoming instructions into a series of sequential steps performed by different processor unit to keep every part of the processor busy.
|
||||
|
||||
Superscalar execution can execute more than one instruction during a clock cycle.
|
||||
|
||||
Order of order execution.
|
||||
|
||||
Very long instruction word(VLIW): allows programs to explicitly specify instructions to execute at the same time.
|
||||
|
||||
EPIC: Explicit parallel instruction computing.
|
||||
|
||||
Move the complexity of instruction scheduling from the CPU hardware to the software compiler:
|
||||
|
||||
- Check dependencies between instructions.
|
||||
- Assign instructions to the functional units.
|
||||
- Determine when instructions are initiated placed together into a single word.
|
||||
|
||||

|
||||
|
||||
Comparisons between different architecture:
|
||||
|
||||

|
||||
|
||||
## Multi-Core Processor Gala
|
||||
|
||||
Symmetric multiprocessing(SMP): a multiprocessor computer hardware and software architecture.
|
||||
|
||||
Two or more identical processors are connected to a **single shared main memory** and have full access to all input and output devices.
|
||||
|
||||
> Current trend: computer clusters, SMP computers connected with network.
|
||||
|
||||
Multithreading: exploiting thread-level parallelism.
|
||||
|
||||
Multithreading allows multiple threads to share the functional units of a single processor in an overlapping fashion **duplicating only private state**. A thread switch should be much more efficient than a process switch.
|
||||
|
||||
Hardware approaches to multithreading:
|
||||
|
||||
**fine-grained multithreading**:
|
||||
|
||||
- Switches between threads on each clock.
|
||||
- Hide the throughput losses that arise from the both short and long stalls.
|
||||
- Disadvantages: slow down the execution of an individual thread.
|
||||
|
||||
**Coarse-grained multithreading**:
|
||||
|
||||
- Switch threads only on costly stalls.
|
||||
- Limited in its ability to overcome throughput losses
|
||||
|
||||
**Simultaneous multithreading(SMT)**:
|
||||
|
||||
- A variation on fine-grained multithreading
|
||||
|
||||

|
||||
|
||||
## Data Parallelism: Vector Processors
|
||||
|
||||
Provides high-level operations that work on vectors.
|
||||
|
||||
Length of the array also varies depending on hardware.
|
||||
|
||||
SIMD and its generalization in vector parallelism approach improved efficiency by the same operation be performed on multiple data elements.
|
BIN
YaeBlog/source/posts/hpc-2025-cpu-architecture/image-20250313184421305.png
(Stored with Git LFS)
Normal file
BIN
YaeBlog/source/posts/hpc-2025-cpu-architecture/image-20250313184421305.png
(Stored with Git LFS)
Normal file
Binary file not shown.
BIN
YaeBlog/source/posts/hpc-2025-cpu-architecture/image-20250313184732892.png
(Stored with Git LFS)
Normal file
BIN
YaeBlog/source/posts/hpc-2025-cpu-architecture/image-20250313184732892.png
(Stored with Git LFS)
Normal file
Binary file not shown.
BIN
YaeBlog/source/posts/hpc-2025-cpu-architecture/image-20250313190913475.png
(Stored with Git LFS)
Normal file
BIN
YaeBlog/source/posts/hpc-2025-cpu-architecture/image-20250313190913475.png
(Stored with Git LFS)
Normal file
Binary file not shown.
Loading…
Reference in New Issue
Block a user