Age | Commit message (Collapse) | Author |
|
* Only update moment field when it is actually needed
* => ~825 MLUPS
* Defer plot generation until the actual simulation is done
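
A minimal sketch of the two changes above, not the code from this commit: the moment field is collected only every `moment_interval` steps, and plotting is left until the timed loop has finished. The names `mlups`, `run`, `step`, `collect_moments` and `moment_interval` are hypothetical and chosen for illustration only.

```python
import time

def mlups(cells, steps, seconds):
    """Million lattice updates per second: cell updates / (runtime * 1e6)."""
    return cells * steps / (seconds * 1e6)

def run(step, collect_moments, n_steps, moment_interval):
    """Hypothetical driver: time the update loop and store a moment
    snapshot only every `moment_interval` steps, not after every step."""
    snapshots = []
    start = time.perf_counter()
    for i in range(1, n_steps + 1):
        step()  # one lattice update (assumed to be supplied by the caller)
        if i % moment_interval == 0:
            snapshots.append(collect_moments())
    elapsed = time.perf_counter() - start
    # Plots would be generated from `snapshots` here, after timing has stopped.
    return snapshots, elapsed
```

The elapsed time returned by `run` can be passed to `mlups` together with the lattice size and step count, so plotting cost never enters the measurement.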
|
=> ~780 MLUPS
|
Interestingly, this increased performance from ~665 MLUPS to ~750 MLUPS.
|
|
A kernel extracted from `lbn_codegen.ipynb` yields ~665 MLUPS compared
to the ~600 MLUPS produced by a manually optimized kernel.
Note that this new kernel doesn't handle boundary conditions yet (but
dropping in a density condition doesn't impact performance).
|