path: root/codegen_lbm.py
Age | Commit message | Author
2019-06-12 | Collect moments outside of the lattice class | Adrian Kummerlaender
2019-06-12 | Move kernel template into separate file | Adrian Kummerlaender
2019-06-12 | Allocate moments buffer only on device | Adrian Kummerlaender
2019-06-11 | Move equilibrization to kernel | Adrian Kummerlaender
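
For context on what "equilibrization" computes: the commit itself does not show the formula, so the following is only a minimal sketch using the standard second-order D2Q9 equilibrium. The names `VELOCITIES`, `WEIGHTS`, `equilibrium` and `moments`, as well as the ordering of the discrete velocities, are assumptions for illustration and not taken from `codegen_lbm.py`.

```python
# Standard D2Q9 lattice: nine discrete velocities and their weights.
# The ordering is an assumption made for this sketch.
VELOCITIES = [(-1, -1), (-1, 0), (-1, 1),
              ( 0, -1), ( 0, 0), ( 0, 1),
              ( 1, -1), ( 1, 0), ( 1, 1)]
WEIGHTS = [1/36, 1/9, 1/36,
           1/9,  4/9, 1/9,
           1/36, 1/9, 1/36]


def equilibrium(rho, u):
    """Return the nine equilibrium populations for density rho and velocity u."""
    uu = u[0] * u[0] + u[1] * u[1]
    f_eq = []
    for (cx, cy), w in zip(VELOCITIES, WEIGHTS):
        cu = cx * u[0] + cy * u[1]
        # Second-order expansion with lattice speed of sound c_s^2 = 1/3.
        f_eq.append(w * rho * (1 + 3 * cu + 4.5 * cu * cu - 1.5 * uu))
    return f_eq


def moments(f):
    """Recover density and velocity from a set of nine populations."""
    rho = sum(f)
    ux = sum(c[0] * fi for c, fi in zip(VELOCITIES, f)) / rho
    uy = sum(c[1] * fi for c, fi in zip(VELOCITIES, f)) / rho
    return rho, (ux, uy)
```

Initializing a cell with `equilibrium(rho, u)` and reading it back with `moments` returns the same density and velocity, which is a quick sanity check for a generated kernel.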
2019-06-11 | Move D2Q9 codegen into separate file | Adrian Kummerlaender
2019-06-11 | Preshift population field pointer | Adrian Kummerlaender
    Now averaging ~820 MLUPS again
2019-06-11 | Statically resolve indices as far as possible | Adrian Kummerlaender
    Interestingly, this seems to lose up to 10 MLUPS at first glance. On the other hand, such a small difference could also be a temporary load issue.
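
To illustrate what resolving indices at generation time can look like, here is a minimal sketch that folds each population's neighbor offset into a literal constant while emitting C-like kernel source. It assumes a structure-of-arrays layout (population `i` stored in block `i * nx * ny`) and row-major cell ids, and it ignores boundary wrap-around. The function name `generate_streaming_loads` and the identifiers `f_prev`, `f_curr_i` and `gid` are hypothetical and not taken from the repository.

```python
def generate_streaming_loads(nx, ny, velocities):
    """Emit one C-like load per population with a precomputed constant offset.

    Assumed layout for this sketch: population i of cell gid = y*nx + x
    lives at f_prev[i*nx*ny + gid].
    """
    n_cells = nx * ny
    lines = []
    for i, (cx, cy) in enumerate(velocities):
        # Population i streams in from the neighbor at (x - cx, y - cy),
        # which is a fixed distance away in the flattened array.
        offset = i * n_cells - (cy * nx + cx)
        sign = "+" if offset >= 0 else "-"
        lines.append(
            f"const float f_curr_{i} = f_prev[gid {sign} {abs(offset)}];"
        )
    return lines


if __name__ == "__main__":
    for line in generate_streaming_loads(4, 4, [(0, 0), (1, 0), (0, 1)]):
        print(line)
```

Because the lattice size is known when the source is generated, the compiler sees plain integer literals instead of per-thread index arithmetic.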
2019-06-11 | Move index calculation to compile time | Adrian Kummerlaender
2019-06-11 | Templatize assignment loops | Adrian Kummerlaender
2019-06-11 | Start to use codegen for actual kernel generation | Adrian Kummerlaender
2019-06-10 | Improve plot generation | Adrian Kummerlaender
    * Only update moment field when it is actually needed
      * => ~825 MLUPS
    * Defer plot generation until the actual simulation is done
2019-06-10 | Reduce thread block size | Adrian Kummerlaender
    => ~780 MLUPS
2019-06-10 | Improve plot output | Adrian Kummerlaender
2019-06-10 | Add fixed velocity boundaries to generated LBM kernel | Adrian Kummerlaender
    Interestingly this increased performance to ~750 MLUPS compared to ~665 MLUPS.
2019-06-09 | First test of partially generated LBM kernel | Adrian Kummerlaender
    A kernel extracted from `lbn_codegen.ipynb` yields ~665 MLUPS compared to the ~600 MLUPS produced by a manually optimized kernel. Note that this new kernel currently doesn't handle boundary conditions (but dropping in a density condition doesn't impact performance).
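
The MLUPS figures quoted throughout this log are million lattice (cell) updates per second. A minimal sketch of how such a number is obtained from a timed run follows. The helper names `mlups` and `relative_gain` and the lattice size in the usage example are assumptions for illustration. Only the ~665 and ~600 MLUPS values come from the commit message above.

```python
def mlups(nx, ny, steps, seconds):
    """Million lattice updates per second for an nx * ny lattice."""
    return nx * ny * steps / (seconds * 1e6)


def relative_gain(new, old):
    """Fractional speedup of `new` over `old` (e.g. 0.1 means 10 % faster)."""
    return (new - old) / old


if __name__ == "__main__":
    # Hypothetical run: 1000 x 1000 cells, 100 steps, 0.5 s wall-clock time.
    print(mlups(1000, 1000, 100, 0.5))
    # Generated kernel versus manually optimized kernel, per the commit message.
    print(relative_gain(665, 600))
```

By this measure the first generated kernel is roughly 11 % faster than the manually optimized one.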