















































| Type                | Latency | Bandwidth | Cost                                                   |
|---------------------|---------|-----------|--------------------------------------------------------|
| Gigabit Ethernet    | ~1 ms   | 0.1 GB/s  | ~50 USD / port                                         |
| 10 Gigabit Ethernet | ~100 µs | 1.0 GB/s  | ~500 USD / port                                        |
| QDR InfiniBand      | ~1 µs   | 3.6 GB/s  | ~1000 USD / port (Mellanox 36-port InfiniBand switch)  |

Notes about TCP/IP (window based):

- Protocol settings can greatly affect actual throughput (e.g. only using some % of the available bandwidth).
- At 10 Gbps network speed, new packets arrive faster than current standard systems can process a packet. This …
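The latency and bandwidth figures above can be combined into a simple cost model: the time to send a message is roughly a fixed latency plus the message size divided by the bandwidth. The sketch below is only an illustration under that assumption. The function names, the 64 KiB window, and the 1 ms round-trip time are hypothetical choices, not values from the slides.

```python
# Latency/bandwidth cost model, using the figures from the interconnect table.
# Latency is in seconds, bandwidth in bytes per second.
INTERCONNECTS = {
    "Gigabit Ethernet": (1e-3, 0.1e9),
    "10 Gigabit Ethernet": (100e-6, 1.0e9),
    "QDR InfiniBand": (1e-6, 3.6e9),
}


def transfer_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time for one message: fixed startup cost plus size-proportional cost."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def tcp_window_limit(window_bytes, rtt_s):
    """Upper bound on throughput of a window-based protocol.

    At most one window of data can be in flight per round trip, so the
    throughput cannot exceed window / round-trip time.
    """
    return window_bytes / rtt_s


if __name__ == "__main__":
    for name, (latency, bandwidth) in INTERCONNECTS.items():
        for size in (1_000, 1_000_000):
            t = transfer_time(size, latency, bandwidth)
            print(f"{name:20s} {size:>9d} B  {t * 1e6:12.1f} µs")

    # Assumed values for illustration only: 64 KiB window, 1 ms round-trip time.
    limit = tcp_window_limit(64 * 1024, 1e-3)
    print(f"window-limited throughput: {limit / 1e6:.1f} MB/s")
```

For small messages the latency term dominates, which is why the low-latency interconnect pays off even when bandwidth is not the bottleneck. With the assumed 64 KiB window and 1 ms round trip, the window bound is about 65.5 MB/s, below the 0.1 GB/s that Gigabit Ethernet offers. This is one way protocol settings can leave part of the link unused.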

















#### Changing Conventional Wisdom

Power

- Was: Power is free, transistors are expensive
- Now: "Power wall": power is expensive (we can put more on a chip than we can afford to turn on)

ILP

- Was: Sufficiently increasing Instruction Level Parallelism via compilers and innovation (out-of-order, speculation, VLIW, …)
- Now: "ILP wall": law of diminishing returns on more hardware for ILP

Memory

- Was: Multiplies are slow, memory access is fast
- Now: "Memory wall": memory is slow, multiplies are fast (200 clock cycles to DRAM memory, 4 clocks for a multiply)

Tim Conrad
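To make the memory-wall figures concrete, the following back-of-the-envelope sketch charges 4 cycles per multiply and 200 cycles per DRAM access, as quoted above. It is a deliberately crude model: it assumes the costs simply add up and ignores caches, pipelining, and overlap of computation with memory traffic. The function names are hypothetical.

```python
# Cycle costs quoted on the slide; treated here as fixed constants.
DRAM_ACCESS_CYCLES = 200
MULTIPLY_CYCLES = 4


def estimated_cycles(n_multiplies, n_dram_accesses):
    """Total cycles if multiplies and DRAM accesses are executed strictly one after another."""
    return n_multiplies * MULTIPLY_CYCLES + n_dram_accesses * DRAM_ACCESS_CYCLES


def memory_fraction(n_multiplies, n_dram_accesses):
    """Share of the estimated cycles spent waiting on DRAM."""
    total = estimated_cycles(n_multiplies, n_dram_accesses)
    return n_dram_accesses * DRAM_ACCESS_CYCLES / total


if __name__ == "__main__":
    # One DRAM access per multiply versus one DRAM access per eight multiplies.
    for accesses in (1000, 125):
        frac = memory_fraction(1000, accesses)
        print(f"1000 multiplies, {accesses:4d} DRAM accesses: {frac:.1%} of cycles on memory")
```

Even when only one in eight operands has to come from DRAM, this model attributes most of the cycles to memory access. This is the reason algorithm design now focuses on data movement rather than arithmetic.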







"Ironically, as numerical analysis is applied to larger and more complex problems, non-numerical issues play a larger role. Mesh generation is an excellent example of this phenomenon. Solving current problems in structural mechanics or fluid dynamics with finite difference or finite element methods *depends on the construction of high-quality meshes of surfaces and volumes. Geometric design and construction of these meshes are typically much more time-consuming than the simulations that are performed with them.*"


John Guckenheimer, "Numerical Computation in the Information Age" in June 1998 issue of SIAM News.















































































#### Summary

#### Part I: Introduction to HPC


- We are hitting a brick wall (clock/memory/ILP wall): new concepts for algorithm design and implementation are needed

#### Part II: Illustrative Example

- Communication is expensive
- Octrees can be an alternative data structure for meshing and multigrid methods (if done right)

#### Part III: Data Storage

- We are producing more data than we can store
- Parallel file systems are not the only answer
- Need hierarchies / load-balancing even on file system level
- Light-weight DB approaches can be an option











