feat: rewrite about page for 2026. (#21)
Some checks failed
Build blog docker image / Build-Blog-Image (push) Failing after 14s
Signed-off-by: jackfiled <xcrenchangjun@outlook.com> Reviewed-on: #21
106
source/posts/hpc-2025-program-smp-platform.md
Normal file
@@ -0,0 +1,106 @@
---
title: High Performance Computing 25 SP Programming SMP Platform
date: 2025-05-10T00:17:26.5784020+08:00
tags:
- High Performance Computing
- Study Materials
---

Sharing an address space simplifies parallel programming.

<!--more-->

### Shared Address Space Programming Paradigm

Shared address space programming paradigms vary in their mechanisms for data sharing, their concurrency models, and their support for synchronization.

- Process-based models
- Lightweight processes and threads

### Thread

A thread is a single stream of control in the flow of a program.

**Logical memory model of a thread**:

All memory is globally accessible to every thread, and threads are invoked as function calls. In practice, the stack of each function call is treated as local to the thread that runs it.

Benefits of threads:

- Creating a new thread takes less time than creating a new process.
- Terminating a thread takes less time than terminating a process.

The taxonomy of threads:

- User-Level Thread (ULT): The kernel is not aware of the existence of threads. All thread management is done by the application using a thread library. Thread switching does not require kernel-mode privileges, and scheduling is application-specific.
- Kernel-Level Thread (KLT): All thread management is done by the kernel. There is no thread library, only an API to the kernel thread facility. Switching between threads requires the kernel, and scheduling is done on a per-thread basis.
- Combined ULT and KLT approach: User-level threads are mapped onto kernel-level threads, combining the best of both approaches.

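The logical memory model described above can be illustrated with a minimal sketch, assuming POSIX threads (the library the next section is about). The names `store_square`, `sum_of_squares`, and `squares` are hypothetical and exist only for this illustration. Each thread is started by handing a function to `pthread_create`, and every thread writes into the same global array.

```c
#include <pthread.h>
#include <stddef.h>

#define SQUARE_THREADS 4

/* Global memory: visible to every thread in the process. */
static int squares[SQUARE_THREADS];

/* Thread entry point: invoked like a function with one pointer argument. */
static void *store_square(void *arg) {
    int id = *(int *)arg;    /* local variable on this thread's own stack */
    squares[id] = id * id;   /* each thread writes a distinct element */
    return NULL;
}

/* Starts the threads, waits for them, and returns the sum of the squares. */
int sum_of_squares(void) {
    pthread_t threads[SQUARE_THREADS];
    int ids[SQUARE_THREADS];
    int sum = 0;

    for (int i = 0; i < SQUARE_THREADS; i++) {
        ids[i] = i;
        pthread_create(&threads[i], NULL, store_square, &ids[i]);
    }
    for (int i = 0; i < SQUARE_THREADS; i++)
        pthread_join(threads[i], NULL);   /* wait until thread i has finished */
    for (int i = 0; i < SQUARE_THREADS; i++)
        sum += squares[i];
    return sum;
}
```

Calling `sum_of_squares()` from `main` runs the four threads and returns once all of them have been joined. Because each thread writes a different array element, this sketch needs no locking.
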
## PThread Programming

Potential problems with threads:

- Conflicting accesses to shared memory
- Race conditions
- Starvation
- Priority inversion
- Deadlock

### Mutual Exclusion

Mutex locks are used to implement critical sections and atomic operations.

A mutex lock has two states: locked and unlocked. At any point in time, only one thread can hold the lock on a mutex.

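As a minimal sketch of a critical section, assuming POSIX threads, the following code protects a shared counter with a mutex. The names `counter`, `counter_lock`, `increment_counter`, and `run_locked_counter` are hypothetical. Without the lock, the concurrent `counter++` operations would race and increments could be lost.

```c
#include <pthread.h>
#include <stddef.h>

#define COUNTER_THREADS 4
#define INCREMENTS_PER_THREAD 100000

static long counter = 0;   /* shared data */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread body: the increment is the critical section. */
static void *increment_counter(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
        pthread_mutex_lock(&counter_lock);     /* unlocked -> locked */
        counter++;                             /* one thread at a time */
        pthread_mutex_unlock(&counter_lock);   /* locked -> unlocked */
    }
    return NULL;
}

/* Runs the incrementing threads and returns the final counter value. */
long run_locked_counter(void) {
    pthread_t threads[COUNTER_THREADS];

    counter = 0;
    for (int i = 0; i < COUNTER_THREADS; i++)
        pthread_create(&threads[i], NULL, increment_counter, NULL);
    for (int i = 0; i < COUNTER_THREADS; i++)
        pthread_join(threads[i], NULL);
    return counter;
}
```

With the lock in place, `run_locked_counter()` returns exactly `COUNTER_THREADS * INCREMENTS_PER_THREAD`.
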
### Producer-Consumer Work Queues

The producer creates tasks and inserts them into a work queue.

The consumer threads pick up tasks from the work queue and execute them.

Locks represent serialization points, since critical sections must be executed by threads one after the other.

**Important**: Minimize the size of critical sections.

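A minimal sketch of such a work queue, assuming POSIX threads, a queue with a single slot, one producer, and one consumer. All names (`slot_lock`, `slot_full`, `queue_producer`, `queue_consumer`, `run_work_queue`) are hypothetical. The critical sections only move a task into or out of the slot; the task itself is "executed" (here, added to a sum) outside the lock, which keeps the serialized part small.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define TASK_COUNT 100

static pthread_mutex_t slot_lock = PTHREAD_MUTEX_INITIALIZER;
static int slot_full = 0;       /* 1 while the single-slot queue holds a task */
static int slot_task = 0;       /* the queued task (just a number here) */
static long consumed_sum = 0;   /* written by the consumer thread only */

static void *queue_producer(void *arg) {
    (void)arg;
    for (int task = 1; task <= TASK_COUNT; task++) {
        int inserted = 0;
        while (!inserted) {
            pthread_mutex_lock(&slot_lock);
            if (!slot_full) {               /* short critical section */
                slot_task = task;
                slot_full = 1;
                inserted = 1;
            }
            pthread_mutex_unlock(&slot_lock);
            if (!inserted)
                sched_yield();              /* slot busy: let the consumer run */
        }
    }
    return NULL;
}

static void *queue_consumer(void *arg) {
    (void)arg;
    for (int n = 0; n < TASK_COUNT; n++) {
        int task = 0, extracted = 0;
        while (!extracted) {
            pthread_mutex_lock(&slot_lock);
            if (slot_full) {                /* short critical section */
                task = slot_task;
                slot_full = 0;
                extracted = 1;
            }
            pthread_mutex_unlock(&slot_lock);
            if (!extracted)
                sched_yield();              /* slot empty: let the producer run */
        }
        consumed_sum += task;               /* "execute" the task outside the lock */
    }
    return NULL;
}

/* Runs one producer and one consumer; returns the sum of all consumed tasks. */
long run_work_queue(void) {
    pthread_t producer_thread, consumer_thread;

    slot_full = 0;
    consumed_sum = 0;
    pthread_create(&producer_thread, NULL, queue_producer, NULL);
    pthread_create(&consumer_thread, NULL, queue_consumer, NULL);
    pthread_join(producer_thread, NULL);
    pthread_join(consumer_thread, NULL);
    return consumed_sum;
}
```

Both threads poll the slot in a loop, so they burn CPU cycles whenever the slot is not in the state they need.
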
### Condition Variables for Synchronization

`pthread_mutex_trylock` reduces idle time, but it introduces the overhead of polling for the availability of locks.

Condition variables provide an interrupt-driven mechanism, as opposed to a polled one: availability is signaled to the waiting thread.

A **condition variable** is a data object used for synchronizing threads. It allows a thread to block itself until specified data reaches a predefined state.

When a thread performs a condition wait, it is not runnable and uses no CPU cycles until it is woken up, whereas a thread that polls a mutex with `pthread_mutex_trylock` consumes CPU cycles on every attempt.

**Common Errors**: One cannot assume any order of execution among threads. Any required order must be established explicitly with mutexes, condition variables, and joins.

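A minimal sketch of a condition wait, assuming POSIX threads and a single-slot queue shared by one producer and one consumer. All names (`cv_lock`, `slot_filled`, `slot_emptied`, `cv_producer`, `cv_consumer`, `run_condvar_queue`) are hypothetical. Each thread blocks in `pthread_cond_wait` until the other one signals that the slot has changed state, and re-checks the predicate in a `while` loop after every wakeup.

```c
#include <pthread.h>
#include <stddef.h>

#define CV_TASK_COUNT 100

static pthread_mutex_t cv_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t slot_filled = PTHREAD_COND_INITIALIZER;
static pthread_cond_t slot_emptied = PTHREAD_COND_INITIALIZER;
static int cv_slot_full = 0;   /* the predicate guarded by cv_lock */
static int cv_slot_task = 0;
static long cv_sum = 0;        /* written by the consumer thread only */

static void *cv_producer(void *arg) {
    (void)arg;
    for (int task = 1; task <= CV_TASK_COUNT; task++) {
        pthread_mutex_lock(&cv_lock);
        while (cv_slot_full)   /* blocks; the mutex is released while waiting */
            pthread_cond_wait(&slot_emptied, &cv_lock);
        cv_slot_task = task;
        cv_slot_full = 1;
        pthread_cond_signal(&slot_filled);   /* wake the waiting consumer */
        pthread_mutex_unlock(&cv_lock);
    }
    return NULL;
}

static void *cv_consumer(void *arg) {
    (void)arg;
    for (int n = 0; n < CV_TASK_COUNT; n++) {
        pthread_mutex_lock(&cv_lock);
        while (!cv_slot_full)
            pthread_cond_wait(&slot_filled, &cv_lock);
        int task = cv_slot_task;
        cv_slot_full = 0;
        pthread_cond_signal(&slot_emptied);  /* wake the waiting producer */
        pthread_mutex_unlock(&cv_lock);
        cv_sum += task;                      /* work done outside the lock */
    }
    return NULL;
}

/* Runs one producer and one consumer; returns the sum of all consumed tasks. */
long run_condvar_queue(void) {
    pthread_t producer_thread, consumer_thread;

    cv_slot_full = 0;
    cv_sum = 0;
    pthread_create(&producer_thread, NULL, cv_producer, NULL);
    pthread_create(&consumer_thread, NULL, cv_consumer, NULL);
    pthread_join(producer_thread, NULL);
    pthread_join(consumer_thread, NULL);
    return cv_sum;
}
```

The two `pthread_join` calls are what establish that `cv_sum` is complete before it is read; nothing else in the program guarantees that order.
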
## MPI Programming

Low-cost message-passing architecture.

Mapping of MPI Processes:

MPI views the processes as a one-dimensional topology, but in many parallel programs processes are arranged in higher-dimensional topologies. Each MPI process must therefore be mapped to a process in the higher-dimensional topology.

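The arithmetic behind such a mapping can be sketched in plain C, assuming a two-dimensional grid and a row-major ordering. The type `grid_coords` and the functions `rank_to_coords` and `coords_to_rank` are hypothetical helpers written for this illustration, not MPI calls. MPI itself offers Cartesian topology routines such as `MPI_Cart_create` and `MPI_Cart_coords` for this purpose.

```c
/* Position of a process in a 2D grid. */
struct grid_coords {
    int row;
    int col;
};

/* Row-major mapping from a linear rank to (row, col) in a grid with `cols` columns. */
struct grid_coords rank_to_coords(int rank, int cols) {
    struct grid_coords c = { rank / cols, rank % cols };
    return c;
}

/* Inverse mapping from (row, col) back to the linear rank. */
int coords_to_rank(struct grid_coords c, int cols) {
    return c.row * cols + c.col;
}
```

For example, in a grid with 4 columns, rank 5 maps to row 1, column 1, and mapping those coordinates back yields rank 5 again.
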
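Non-blocking Send and Receive:

`MPI_Isend` and `MPI_Irecv` start the operation and return immediately. Each call allocates a request object and returns a handle to it through its `MPI_Request *` argument. The operation is completed later with `MPI_Wait` or checked with `MPI_Test`.
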
## OpenMP

OpenMP is a standard for directive-based parallel programming.

It provides thread-based, explicit parallelism.

OpenMP uses the fork-join model: the master thread forks a team of threads at the start of a parallel region, and the team joins back into the master thread at the end of the region.

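A minimal sketch of the fork-join model, assuming a compiler with OpenMP support. The function name `parallel_sum` is hypothetical. The directive forks a team of threads for the loop, the `reduction` clause gives each thread a private partial sum, and the partial sums are combined when the team joins.

```c
/* Sums the integers 1..n; loop iterations are divided among the team of threads. */
long parallel_sum(int n) {
    long sum = 0;

    /* Fork: a team of threads is created and shares the loop iterations. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= n; i++)
        sum += i;
    /* Join: the team has finished here and the partial sums are combined. */

    return sum;
}
```

With GCC, OpenMP is enabled by the `-fopenmp` flag. Without it the directive is ignored and the loop runs serially with the same result.
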