Note how this basically required no changes besides generalizing cell indexing
and adding the symbolic formulation of a D3Q19 BGK collision step.
Increasing the neighborhood communication from 9 to 19 cells leads to a
significant performance "regression": the 3D kernel yields ~360 MLUPS
compared to the 2D version's ~820 MLUPS.
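
For orientation, a D3Q19 BGK collision step might look like the plain NumPy sketch below. This is not the symbolic formulation the commit refers to. The velocity ordering, the names `velocities`, `weights` and `bgk_collide`, and the relaxation time `tau` are assumptions made for illustration.

```python
import itertools
import numpy as np

# D3Q19 velocity set: the rest vector, 6 axis neighbors and 12 diagonal
# neighbors. The ordering here is arbitrary (an assumption, not the
# repository's ordering). The 8 cube corners are dropped.
velocities = np.array([c for c in itertools.product((-1, 0, 1), repeat=3)
                       if sum(abs(x) for x in c) <= 2])

# Standard D3Q19 weights: 1/3 (rest), 1/18 (axis), 1/36 (diagonal).
weights = np.array([{0: 1/3, 1: 1/18, 2: 1/36}[int(np.abs(c).sum())]
                    for c in velocities])

def bgk_collide(f, tau):
    """Relax the 19 populations of a single cell toward equilibrium."""
    rho = f.sum()                    # density
    u = f @ velocities / rho         # velocity, shape (3,)
    cu = velocities @ u              # c_i . u for each direction, shape (19,)
    f_eq = weights * rho * (1 + 3*cu + 4.5*cu**2 - 1.5*(u @ u))
    return f - (f - f_eq) / tau
```

The collision conserves density and momentum per cell, so only the subsequent streaming step touches the 19-cell neighborhood.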
|
A kernel extracted from `lbn_codegen.ipynb` yields ~665 MLUPS compared
to the ~600 MLUPS produced by a manually optimized kernel.
Note that this new kernel currently doesn't handle boundary conditions (but
dropping in a density condition doesn't impact performance).
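
MLUPS stands for million lattice updates per second. A minimal sketch of how such a figure can be derived from a timed run follows. The helper name `mlups` and its parameters are hypothetical and not taken from the repository.

```python
def mlups(cells, steps, seconds):
    """Million lattice (cell) updates per second for a timed run."""
    return cells * steps / seconds / 1e6

# Ratio of the two throughput figures quoted above: generated vs. manual kernel.
speedup = 665 / 600  # roughly 1.11
```

Under these assumptions, updating a lattice of 1000 x 1000 cells for 100 steps in one second corresponds to `mlups(1000 * 1000, 100, 1.0)`, i.e. 100 MLUPS.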
|
Note that NumPy arrays are indexed in matrix order: row first, then column.
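
A small sketch of what this means for a 2D lattice. The lattice dimensions and the mapping of x to columns and y to rows are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical 2D lattice: 4 cells wide (x) and 3 cells high (y).
width, height = 4, 3
lattice = np.zeros((height, width))  # shape is (rows, columns), i.e. (y, x)

x, y = 3, 1
lattice[y, x] = 1.0  # row index first, column index second

# In NumPy's default C order the last index (x) is contiguous in memory,
# so the flat offset of cell (x, y) is y * width + x.
flat_index = y * width + x
```

Writing `lattice[x, y]` instead would address a different cell, or raise an `IndexError` once `x` exceeds the number of rows.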
|