From 0f346d9ded8c84a026697a148157320dd5f6444b Mon Sep 17 00:00:00 2001 From: jackfiled Date: Sat, 10 May 2025 01:15:02 +0800 Subject: [PATCH] blog: three blogs: blog: hpc-2025-distributed-system blog: hpc-2025-heterogeneous-system blog: hpc-2025-program-smp-platform --- .../posts/hpc-2025-distributed-system.md | 224 ++++++++++++++++++ .../image-20250410193527994.webp | 3 + .../image-20250417184421464.webp | 3 + .../image-20250417185200176.webp | 3 + .../image-20250417190247790.webp | 3 + .../image-20250417191509682.webp | 3 + .../image-20250417191526416.webp | 3 + .../image-20250417192453944.webp | 3 + .../image-20250424183610157.webp | 3 + .../image-20250424183629681.webp | 3 + .../image-20250424183645210.webp | 3 + .../posts/hpc-2025-heterogeneous-system.md | 80 +++++++ .../image-20250417195644624.webp | 3 + .../image-20250417200241703.webp | 3 + .../image-20250424184701573.webp | 3 + .../image-20250424185022360.webp | 3 + .../image-20250424185048036.webp | 3 + .../image-20250424185152081.webp | 3 + .../image-20250424185219673.webp | 3 + .../image-20250424185322963.webp | 3 + .../image-20250424185354247.webp | 3 + .../image-20250424185449577.webp | 3 + .../image-20250424185541483.webp | 3 + .../image-20250424190159059.webp | 3 + .../posts/hpc-2025-program-smp-platform.md | 106 +++++++++ .../image-20250327200344104.webp | 3 + .../image-20250403183104279.webp | 3 + .../image-20250403191254323.webp | 3 + .../image-20250403195750934.webp | 3 + 29 files changed, 488 insertions(+) create mode 100644 YaeBlog/source/posts/hpc-2025-distributed-system.md create mode 100644 YaeBlog/source/posts/hpc-2025-distributed-system/image-20250410193527994.webp create mode 100644 YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417184421464.webp create mode 100644 YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417185200176.webp create mode 100644 YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417190247790.webp create mode 100644 
YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417191509682.webp create mode 100644 YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417191526416.webp create mode 100644 YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417192453944.webp create mode 100644 YaeBlog/source/posts/hpc-2025-distributed-system/image-20250424183610157.webp create mode 100644 YaeBlog/source/posts/hpc-2025-distributed-system/image-20250424183629681.webp create mode 100644 YaeBlog/source/posts/hpc-2025-distributed-system/image-20250424183645210.webp create mode 100644 YaeBlog/source/posts/hpc-2025-heterogeneous-system.md create mode 100644 YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250417195644624.webp create mode 100644 YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250417200241703.webp create mode 100644 YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424184701573.webp create mode 100644 YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185022360.webp create mode 100644 YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185048036.webp create mode 100644 YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185152081.webp create mode 100644 YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185219673.webp create mode 100644 YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185322963.webp create mode 100644 YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185354247.webp create mode 100644 YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185449577.webp create mode 100644 YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185541483.webp create mode 100644 YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424190159059.webp create mode 100644 YaeBlog/source/posts/hpc-2025-program-smp-platform.md create mode 100644 YaeBlog/source/posts/hpc-2025-program-smp-platform/image-20250327200344104.webp 
create mode 100644 YaeBlog/source/posts/hpc-2025-program-smp-platform/image-20250403183104279.webp create mode 100644 YaeBlog/source/posts/hpc-2025-program-smp-platform/image-20250403191254323.webp create mode 100644 YaeBlog/source/posts/hpc-2025-program-smp-platform/image-20250403195750934.webp diff --git a/YaeBlog/source/posts/hpc-2025-distributed-system.md b/YaeBlog/source/posts/hpc-2025-distributed-system.md new file mode 100644 index 0000000..c3df497 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-distributed-system.md @@ -0,0 +1,224 @@ +--- +title: High Performance Computing 25 SP Distributed System +date: 2025-05-10T00:31:39.3109950+08:00 +tags: +- 高性能计算 +- 学习资料 +--- + + +The motivation for distributed systems is resource sharing. + + + +### Definition of a Distributed System + +- A collection of independent computers that appears to its users as a single coherent system. +- A system in which hardware and software components located at networked computers communicate and coordinate their actions only by passing messages. + +Important aspects: + +- Components are autonomous. +- The system appears to its users as a single system, i.e. **transparency**. + +### Kinds of Systems + +**Clustering**: + +A cluster is a group of independent resources that are interconnected and work as a single system. + +A general prerequisite of hardware clustering is that its component systems have reasonably identical hardware and operating systems to provide similar performance levels when one failed component is to be replaced by another. + +**Peer-to-Peer Network**: + +P2P systems are quite popular for file sharing, content distribution and Internet telephony. + +**Grid Computing**: + +A computing grid (or *computational grid*) provides a platform in which computing resources are organized into one or more logical pools. + +**Cloud Computing**: + +Cloud computing enables clients to outsource their software usage, data storage and even the computing infrastructure to remote data centers. 
+ +![image-20250410193527994](./hpc-2025-distributed-system/image-20250410193527994.webp) + +**Fog Computing**: + +Fog computing focuses processing efforts at the local area network end of the chain. + +**Edge Computing**: + +Edge computing takes localized processing a bit farther, pushing these efforts closer to the data sources. + +**Near Resources Computing**: + +As CPUs have become more powerful, so have I/O devices, so domain-specific computation can be offloaded from the CPU to them. + +### Features of Distributed Systems + +**Transparency**: + +- Access transparency. +- Location transparency. +- Migration transparency. +- Relocation transparency. +- Replication transparency. +- Concurrency transparency. +- Failure transparency. + +**Openness**: + +- Open distributed systems: offer services according to standard rules that describe the syntax and semantics of those services. +- Services are specified through *interfaces*. + +**Scalability**: + +Size scalability: supporting more users and more resources. Typical obstacles are: + +- Centralized services: a single server for all users. +- Centralized data: a single on-line telephone book. +- Centralized algorithms: doing routing based on complete information. + +### Common Problems in Distributed Systems + +1. Leader Election +2. Mutual Exclusion +3. Time Synchronization +4. Global State +5. Multicasting +6. Replica Management + +### Time in Distributed Systems + +Atomic clocks: modern timekeepers use atomic clocks as a de facto primary standard of time. + +**Happened Before Relationship**: + +Three basic rules govern the causal ordering of events; together they define the *happened before*, a.k.a. the *causally ordered before*, relationship. + +- Rule 1: Let each process have a physical clock whose value is monotonically increasing. If *a* and *b* are two events in the same process and *a* occurs earlier than *b* on that clock, then a < b. +- Rule 2: If *a* is the event of sending a message by process *P*, and *b* is the event of receiving the same message by another process *Q*, then a < b. +- Rule 3: If a < b and b < c, then a < c (transitivity). 
+ +The space time diagrams show such relationship: + +![image-20250417184421464](./hpc-2025-distributed-system/image-20250417184421464.webp) + +**Logical Clocks**: + +A logical clock is an event counter that respects causal ordering. + +**Vector Clocks**: + +The primary goal of vector clocks is to detect causality, which is the major weakness of logical clocks. + +![image-20250424183610157](./hpc-2025-distributed-system/image-20250424183610157.webp) + +![image-20250424183629681](./hpc-2025-distributed-system/image-20250424183629681.webp) + +![image-20250424183645210](./hpc-2025-distributed-system/image-20250424183645210.webp) + +**Synchronization Classification**: + +Types of synchronization: + +- External synchronization +- Internal synchronization +- Phase synchronization + +> Types of clocks: +> +> - Unbounded +> - Bounded +> +> Unbounded clocks are not realistic but are easier to deal with in the design of algorithms. Real clocks are always bounded. + +**External Synchronization**: + +To maintain the reading of each clock as close to the UTC as possible. + +The NTP is an external synchronization protocol. + +**Internal Synchronization**: + +To keep the readings of a system of autonomous clocks closely synchronized with one another, despite the failure or malfunction of one or more clocks. + +Of course external synchronization implies internal synchronization. + +**Phase Synchronization**: + +Many distributed computations run in phases: in a given phase all processes execute some actions which are followed by the next phase. + +## Data Center Organization + +A data center is a facility used to house computer systems and associated components. + +![image-20250417185200176](./hpc-2025-distributed-system/image-20250417185200176.webp) + +## Cloud Computing + +Cloud computing is a specialized form of distributed computing that introduces utilization models for remotely provisioning scalable and measured resources. 
+ +>**NIST definition**: +> +>Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model is composed of five essential characteristics, three service models, and four deployment models. + +![image-20250417190247790](./hpc-2025-distributed-system/image-20250417190247790.webp) + +**Cloud Characteristics**: + +- On-demand Usage +- Ubiquitous Access +- Multitenancy +- Elasticity +- Measured Usage +- Resiliency + +**Cloud Delivery Models**: + +A cloud service delivery model represents a specific pre-packaged combination of IT resources offered by a cloud provider. + +- Infrastructure as a Service `IaaS` +- Platform as a Service `PaaS` +- Software as a Service `SaaS` + +**Hypervisor**: + +Type 1 hypervisor: + +![image-20250417191509682](./hpc-2025-distributed-system/image-20250417191509682.webp) + +Type 2 hypervisor: + +![image-20250417191526416](./hpc-2025-distributed-system/image-20250417191526416.webp) + +**CPU Virtualization**: + +Intel VT-x and AMD SVM: + +- Introduce virtualization technology processors with an extra instruction set called Virtual Machine Extensions or VMX. +- Add additional operating modes for host and guest. +- Support for swapping state between guest and host. +- Support for hiding privileged state. + +![image-20250417192453944](./hpc-2025-distributed-system/image-20250417192453944.webp) + +## Big Data Processing + +**MapReduce Programming Model** + +MapReduce is based on a very simple idea for parallel processing of data-intensive applications supporting arbitrarily divisible load sharing. + +> The so-called single program, multiple data (SPMD) paradigm. 
+ +**MapReduce Logical Data Flow**: + +The input data and output data of both the Map and reduce functions has a particular structure. + +Sending computation toward data rather than sending data toward computation. + +**Resilient Distributed Dataset** + +An RDD is a read-only partitioned collection of records. + diff --git a/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250410193527994.webp b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250410193527994.webp new file mode 100644 index 0000000..aab3726 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250410193527994.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:87b35cff45f7a9236e4f7ba5420bb55e10089a94c902d429e3d49028acd992ff +size 40522 diff --git a/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417184421464.webp b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417184421464.webp new file mode 100644 index 0000000..bb2b191 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417184421464.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:deee98ae1d92208ce8eb558b683b874d2591ae7e638a4aa7d8f3f39364cb069e +size 25654 diff --git a/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417185200176.webp b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417185200176.webp new file mode 100644 index 0000000..210a797 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417185200176.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:32fef176f981353f1e6e1a55c3775343c1d22ae1b268a89c2788e992f90f8298 +size 28724 diff --git a/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417190247790.webp b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417190247790.webp new file mode 100644 index 0000000..e6cc808 --- /dev/null +++ 
b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417190247790.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:2a853bc4d218f2b55f0a46200c0e1489f8b37f55ff6927cd012a90b58b7c7341 +size 22476 diff --git a/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417191509682.webp b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417191509682.webp new file mode 100644 index 0000000..dea8277 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417191509682.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:005a0b076f873a12008ff8e629a17fc0a50c27e709f0f49e385b36c0f9de0e7a +size 18788 diff --git a/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417191526416.webp b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417191526416.webp new file mode 100644 index 0000000..f71cb05 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417191526416.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f9982b1fdb5e51d856bb16201811ae39218885363aa80e9f1b655b22c5c49fc4 +size 47384 diff --git a/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417192453944.webp b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417192453944.webp new file mode 100644 index 0000000..f0ab3aa --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250417192453944.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a43f217d5bc520e84d9e3696734b2a3b8a356b0d6270fc33e772e0a051bab0f9 +size 69182 diff --git a/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250424183610157.webp b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250424183610157.webp new file mode 100644 index 0000000..370c235 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250424183610157.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid 
sha256:32c0a5f07df25bb3f3365a6eda2e1a4d9a0b68dbc10526741d1e80c5e9f8d56b +size 36830 diff --git a/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250424183629681.webp b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250424183629681.webp new file mode 100644 index 0000000..a98926b --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250424183629681.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:61b577a989170b2a17598c37df0f8675bb4688c9ea3643de360d88366abf4266 +size 65378 diff --git a/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250424183645210.webp b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250424183645210.webp new file mode 100644 index 0000000..bcbdfe6 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-distributed-system/image-20250424183645210.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:ee1f195051b08d90e9155ea6a1ea9d4ed2cb35f2284e7522da481400f667a4e3 +size 29944 diff --git a/YaeBlog/source/posts/hpc-2025-heterogeneous-system.md b/YaeBlog/source/posts/hpc-2025-heterogeneous-system.md new file mode 100644 index 0000000..54ded24 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-heterogeneous-system.md @@ -0,0 +1,80 @@ +--- +title: High Performance Computing 25 SP Heterogeneous Computing +date: 2025-05-10T00:36:20.5391570+08:00 +tags: +- 高性能计算 +- 学习资料 +--- + + +Heterogeneous Computing is on the way! + + + +## GPU Computing Ecosystem + +CUDA: NVIDIA's architecture for GPU computing. + +![image-20250417195644624](./hpc-2025-heterogeneous-system/image-20250417195644624.webp) + +## Internal Buses + +**HyperTransport**: + +Primarily a low-latency, direct chip-to-chip interconnect; it also supports mapping to board-to-board interconnects such as PCIe. + +**PCI Express** + +Switched and point-to-point connection. 
+ +**NVLink** + +![image-20250417200241703](./hpc-2025-heterogeneous-system/image-20250417200241703.webp) + +**OpenCAPI** + +In the professional world, heterogeneous computing was mostly limited to HPC; in the consumer world it is a "nice to have". + +OpenCAPI has since been absorbed by CXL. + +## CPU-GPU Arrangement + +![image-20250424184701573](./hpc-2025-heterogeneous-system/image-20250424184701573.webp) + +### First Stage: Intel Northbridge + +![image-20250424185022360](./hpc-2025-heterogeneous-system/image-20250424185022360.webp) + +### Second Stage: Symmetric Multiprocessors + +![image-20250424185048036](./hpc-2025-heterogeneous-system/image-20250424185048036.webp) + +### Third Stage: Nonuniform Memory Access + +The memory controller is integrated directly into the CPU. + +![image-20250424185152081](./hpc-2025-heterogeneous-system/image-20250424185152081.webp) + +In this context, a system with multiple CPUs is called NUMA: + +![image-20250424185219673](./hpc-2025-heterogeneous-system/image-20250424185219673.webp) + +There can also be multiple GPUs: + +![image-20250424185322963](./hpc-2025-heterogeneous-system/image-20250424185322963.webp) + +### Fourth Stage: Integrated PCIe in CPU + +![image-20250424185354247](./hpc-2025-heterogeneous-system/image-20250424185354247.webp) + +There is also the term *integrated GPU*, which refers to a GPU integrated into the CPU chipset. 
+ +![image-20250424185449577](./hpc-2025-heterogeneous-system/image-20250424185449577.webp) + +And the integrated GPU can work with discrete GPUs: + +![image-20250424185541483](./hpc-2025-heterogeneous-system/image-20250424185541483.webp) + +### Final Stage: Multi GPU Board + +![image-20250424190159059](./hpc-2025-heterogeneous-system/image-20250424190159059.webp) diff --git a/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250417195644624.webp b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250417195644624.webp new file mode 100644 index 0000000..88ea13c --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250417195644624.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:1c900e50ff87390ebda85f9d67e3a38ca4cf44af8e05738bdab048a01eb9922a +size 20734 diff --git a/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250417200241703.webp b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250417200241703.webp new file mode 100644 index 0000000..0caecc5 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250417200241703.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:0dae80ace7bc2918887f3c343ce2642c1ef98197a02d66ee679f767a769ef242 +size 49858 diff --git a/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424184701573.webp b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424184701573.webp new file mode 100644 index 0000000..4ab29af --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424184701573.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:c674ca7e69cbcf22fad70f3d50eb74ac00d087f3963f0efcf989d33057877670 +size 55266 diff --git a/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185022360.webp b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185022360.webp new file mode 100644 index 0000000..01e51f2 --- 
/dev/null +++ b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185022360.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:facf46bba579b084af9c30bbbfa67ca4861091179b4dcdac625140eba063dece +size 13764 diff --git a/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185048036.webp b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185048036.webp new file mode 100644 index 0000000..0b76a77 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185048036.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:464e7414f5a4dc821561b985aab944e633dd4dab693fc6c75da245c5a969b7c6 +size 11604 diff --git a/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185152081.webp b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185152081.webp new file mode 100644 index 0000000..99e5944 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185152081.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5c8fb5d6381ab77461c0e8ea2fcc25887d1e80d8034687cc98b91a93bee0d03e +size 8946 diff --git a/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185219673.webp b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185219673.webp new file mode 100644 index 0000000..771fa6b --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185219673.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8466f02dbf307374fd2b16d3ceef32759af45016808af2d06347be05a6aed9c2 +size 11440 diff --git a/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185322963.webp b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185322963.webp new file mode 100644 index 0000000..43577d0 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185322963.webp @@ -0,0 +1,3 @@ +version 
https://git-lfs.github.com/spec/v1 +oid sha256:f1ab29cfe58e3cb5137c5e7d7fe167f82a5a3c9331cbb29019b6614a5ae2dffc +size 14300 diff --git a/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185354247.webp b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185354247.webp new file mode 100644 index 0000000..4e97e9a --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185354247.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:180ce40db7a628338e349c5443133b5448d8e90c676b0a6cdc60e7ec49ed4922 +size 11668 diff --git a/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185449577.webp b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185449577.webp new file mode 100644 index 0000000..9e05280 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185449577.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8aa7937e9ecc5eefda100ab6eef58bbb02aa34e89d551d7d6113732fb93e2db3 +size 7570 diff --git a/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185541483.webp b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185541483.webp new file mode 100644 index 0000000..54a0fdb --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424185541483.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:a1203d511a25cc66168356ff8356dc0d5527651b6f38acd2124e1284408e3277 +size 10742 diff --git a/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424190159059.webp b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424190159059.webp new file mode 100644 index 0000000..2118571 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-heterogeneous-system/image-20250424190159059.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:f847c355799225bdf76a8a6c52bcbff06d937042ea3de1a41fd31f4ad10b43dd +size 17618 diff 
--git a/YaeBlog/source/posts/hpc-2025-program-smp-platform.md b/YaeBlog/source/posts/hpc-2025-program-smp-platform.md new file mode 100644 index 0000000..1a4c19e --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-program-smp-platform.md @@ -0,0 +1,106 @@ +--- +title: High Performance Computing 25 SP Programming SMP Platform +date: 2025-05-10T00:17:26.5784020+08:00 +tags: +- 高性能计算 +- 学习资料 +--- + + + +Sharing an address space simplifies programming. + + + +### Shared Address Space Programming Paradigm + +Paradigms vary in their mechanisms for data sharing, concurrency models and support for synchronization. + +- Process-based models. +- Lightweight processes and threads. + +### Thread + +A thread is a single stream of control in the flow of a program. + +**Logical memory model of a thread**: + +All memory is globally accessible to every thread and threads are invoked as function calls. + +![image-20250327200344104](./hpc-2025-program-smp-platform/image-20250327200344104.webp) + +Benefits of threads: + +- It takes less time to create a new thread than a new process. +- It takes less time to terminate a thread than a process. + +The taxonomy of threads: + +- User-Level Thread (ULT): The kernel is not aware of the existence of threads. All thread management is done by the application using a thread library. Thread switching does not require kernel mode privileges and scheduling is application specific. +- Kernel-Level Thread (KLT): All thread management is done by the kernel. There is no thread library, only an API to the kernel thread facility. Switching between threads requires the kernel, and scheduling is done on a thread basis. +- Combined ULT and KLT approaches: combine the best of both approaches. + +![image-20250403183104279](./hpc-2025-program-smp-platform/image-20250403183104279.webp) + +## PThread Programming + +Potential problems with threads: + +- Conflicting access to shared memory. +- Race conditions. 
+- Starvation +- Priority inversion +- Deadlock + +### Mutual Exclusion + +Mutex locks are used to implement critical sections and atomic operations. + +A mutex has two states: locked and unlocked. At any point in time, only one thread can hold a mutex lock. + +### Producer-Consumer Work Queues + +The producer creates tasks and inserts them into a work queue. + +The consumer threads pick up tasks from the work queue and execute them. + +Locks represent serialization points since critical sections must be executed by threads one after the other. + +**Important**: Minimize the size of critical sections. + +### Condition Variables for Synchronization + +`pthread_mutex_trylock` reduces idling time but introduces the overhead of polling for the availability of locks. + +Condition variables provide an interrupt-driven mechanism, as opposed to a polled one: availability is signaled. + +A **condition variable** is a data object used for synchronizing threads. It allows a thread to block itself until specified data reaches a predefined state. + +When a thread performs a condition wait, it is not runnable and uses no CPU cycles, whereas a thread polling for a mutex lock consumes CPU cycles. + +**Common Errors**: One cannot assume any order of execution; ordering must be explicitly established with mutexes, condition variables and joins. + +## MPI Programming + +Low cost message passing architecture. + +![image-20250403191254323](./hpc-2025-program-smp-platform/image-20250403191254323.webp) + +Mapping of MPI Processes: + +MPI views the processes as a one-dimensional topology, but in many parallel programs processes are naturally arranged in higher-dimensional topologies. Each MPI process must therefore be mapped to a process in the higher-dimensional topology. + +Non-blocking Send and Receive: + +The `MPI_Isend` and `MPI_Irecv` functions allocate a request object and return a pointer to it. + +## OpenMP + +A standard for directive-based parallel programming. + +Thread-based parallelism and explicit parallelism. 
+ +Use fork-join model: + +![image-20250403195750934](./hpc-2025-program-smp-platform/image-20250403195750934.webp) + diff --git a/YaeBlog/source/posts/hpc-2025-program-smp-platform/image-20250327200344104.webp b/YaeBlog/source/posts/hpc-2025-program-smp-platform/image-20250327200344104.webp new file mode 100644 index 0000000..258b9c4 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-program-smp-platform/image-20250327200344104.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:81a2610afcd3913bfeb523eb861d63680fa81b7a9741c1cf10c4ecca16081558 +size 25110 diff --git a/YaeBlog/source/posts/hpc-2025-program-smp-platform/image-20250403183104279.webp b/YaeBlog/source/posts/hpc-2025-program-smp-platform/image-20250403183104279.webp new file mode 100644 index 0000000..4519ed0 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-program-smp-platform/image-20250403183104279.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:305c571375a81ada7dbaf9e3c32afffecad3a9b2a60d7a148a94ecf93336555d +size 33248 diff --git a/YaeBlog/source/posts/hpc-2025-program-smp-platform/image-20250403191254323.webp b/YaeBlog/source/posts/hpc-2025-program-smp-platform/image-20250403191254323.webp new file mode 100644 index 0000000..67a253b --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-program-smp-platform/image-20250403191254323.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:061682bb2fbc7c9eab831d262dc6419a1d8d58a0c40bc012be7d8a5a091c3bf2 +size 63130 diff --git a/YaeBlog/source/posts/hpc-2025-program-smp-platform/image-20250403195750934.webp b/YaeBlog/source/posts/hpc-2025-program-smp-platform/image-20250403195750934.webp new file mode 100644 index 0000000..48a67d5 --- /dev/null +++ b/YaeBlog/source/posts/hpc-2025-program-smp-platform/image-20250403195750934.webp @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:874702176e18c0e122750042b0985c52e2eccb0d3444323d3d4601318535937c +size 24408