From 4d9d76afd3f47ef50cfe636e1ae76b7233d971a7 Mon Sep 17 00:00:00 2001
From: Adrian Kummerlaender
Date: Wed, 10 Jul 2019 00:02:24 +0200
Subject: Add README.md

---
 README.md | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)
 create mode 100644 README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..6d0f8e5
--- /dev/null
+++ b/README.md
@@ -0,0 +1,50 @@
+# Symbolically generated GPU-based LBM
+
+Experimental generation of OpenCL kernels using SymPy, Mako and PyOpenCL.
+
+* Implements a straight forward AB pattern
+* All memory offsets are statically resolved
+* Underlying symbolic formulation is optimized using CSE
+* Characteristic constants of D2Q9 and D3Q27 are transparently recovered using only discrete velocities
+
+## Performance
+
+Theoretical maximum performance on tested hardware:
+
+| GPU    | Bandwidth   | D2Q9   |        | D3Q19  |        | D3Q27  |        |
+| ------ | ----------- | ------ | ------ | ------ | ------ | ------ | ------ |
+|        |             | single | double | single | double | single | double |
+| K2200  | 63.2 GiB/s  | 893    | 459    | 435    | 220    | 308    | 156    |
+| P100   | 512.6 GiB/s | 7242   | 3719   | 3528   | 1787   | 2502   | 1262   |
+
+### Maximum measured performance...
+
+| GPU    | D2Q9   |        | D3Q19  |        | D3Q27  |        |
+| ------ | ------ | ------ | ------ | ------ | ------ | ------ |
+|        | single | double | single | double | single | double |
+| K2200  | 843.4  | 326.4  | 423.2  | 163.8  | 303.0  | 116.0  |
+| P100   | 6957.4 | 3585.0 | 3420.2 | 1763.8 | 2374.6 | 1259.6 |
+
+### ...relative to theoretical maximum
+
+| GPU    | D2Q9   |        | D3Q19  |        | D3Q27  |        |
+| ------ | ------ | ------ | ------ | ------ | ------ | ------ |
+|        | single | double | single | double | single | double |
+| K2200  | 94.4%  | 71.1%  | 97.3%  | 74.5%  | 98.4%  | 74.4%  |
+| P100   | 96.1%  | 96.4%  | 96.9%  | 98.7%  | 94.9%  | 99.8%  |
+
+### CSE impact on P100
+
+| CSE    | D2Q9   |        | D3Q19  |        | D3Q27  |        |
+| ------ | ------ | ------ | ------ | ------ | ------ | ------ |
+|        | single | double | single | double | single | double |
+| No     | 6957.4 | 2814.4 | 2581.8 | 998.8  | 1576.4 | 647.4  |
+| Yes    | 6922.4 | 3585.0 | 3420.2 | 1763.8 | 2374.6 | 1259.6 |
+
+| CSE    | D2Q9   |        | D3Q19  |        | D3Q27  |        |
+| ------ | ------ | ------ | ------ | ------ | ------ | ------ |
+|        | single | double | single | double | single | double |
+| No     | 96.1%  | 75.7%  | 73.2%  | 55.9%  | 63.0%  | 51.3%  |
+| Yes    | 95.6%  | 96.4%  | 96.9%  | 98.7%  | 94.9%  | 99.8%  |
+
+For more details see the `results/` and `notebook/` directories.
-- 
cgit v1.2.3
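
Note on the performance tables: the README does not say how the theoretical maxima are derived or which unit they use. The sketch below is an assumption, not the author's stated method. It treats the figures as millions of lattice cell updates per second and assumes each update reads and writes `q` populations and additionally touches one 4-byte per-cell value. Under those assumptions the rounded results match the "Theoretical maximum performance" table, and dividing the measured values by the rounded maxima gives the listed percentages. The function names and the `extra_bytes` parameter are hypothetical.

```python
GIB = 2**30  # bytes per GiB


def theoretical_mlups(bandwidth_gib_s, q, value_bytes, extra_bytes=4):
    """Bandwidth-bound estimate in million cell updates per second.

    Assumption (not stated in the README): one cell update moves
    2 * q populations (read + write) plus `extra_bytes` of per-cell data.
    """
    bytes_per_cell = 2 * q * value_bytes + extra_bytes
    return bandwidth_gib_s * GIB / bytes_per_cell / 1e6


def efficiency_percent(measured, bandwidth_gib_s, q, value_bytes):
    """Measured throughput relative to the rounded theoretical estimate."""
    peak = round(theoretical_mlups(bandwidth_gib_s, q, value_bytes))
    return 100.0 * measured / peak


if __name__ == "__main__":
    # Bandwidths are taken from the README table.
    for gpu, bandwidth in (("K2200", 63.2), ("P100", 512.6)):
        for name, q in (("D2Q9", 9), ("D3Q19", 19), ("D3Q27", 27)):
            single = round(theoretical_mlups(bandwidth, q, 4))
            double = round(theoretical_mlups(bandwidth, q, 8))
            print(f"{gpu} {name}: single={single} double={double}")
```

For example, `efficiency_percent(843.4, 63.2, 9, 4)` corresponds to the K2200 D2Q9 single-precision entry of the relative table.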