Age | Commit message | Author |
|
This way the expanded call to pow2 is resolved into a common subexpression.
|
|
Currently only for the SSS streaming pattern.
CudaCodePrinter in `utility/printer.py` is required to add an 'f' suffix
to all single-precision floating-point literals. If this is not done
(when targeting single precision), most calculations happen in double
precision, which destroys performance. (In OpenCL this is not necessary
as we can simply set the `-cl-single-precision-constant` flag. Sadly,
such a flag doesn't seem to exist for nvcc.)
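
To illustrate what the suffix does, here is a minimal sketch that appends 'f' to decimal floating-point literals in already generated C/CUDA source text. It is not the CudaCodePrinter implementation: the function name `add_single_precision_suffix` and the regex-based post-processing are assumptions made for this example. It also ignores hexadecimal float literals, string literals and comments.

```python
import re

# Decimal floating-point literals such as 1.0, .5, 2., 1e-3 or 1.5E+2.
# The lookarounds skip literals that already carry a suffix and digits
# that are part of an identifier.
FLOAT_LITERAL = re.compile(r"""
    (?<![\w.])
    (
        (?:\d+\.\d*|\.\d+)(?:[eE][+-]?\d+)?
      | \d+[eE][+-]?\d+
    )
    (?![\w.])
""", re.VERBOSE)


def add_single_precision_suffix(source: str) -> str:
    """Append 'f' to every unsuffixed floating-point literal (hypothetical helper)."""
    return FLOAT_LITERAL.sub(r"\1f", source)


print(add_single_precision_suffix("f_next[0] = 0.5*x + 1.0/3.0;"))
# → f_next[0] = 0.5f*x + 1.0f/3.0f;
```

Integer literals such as the array index `0` are left untouched. Without the suffix, `0.5*x` with a `float x` would be evaluated in double precision because unsuffixed literals have type `double` in C and CUDA.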
|
|
|
|
This paves the way for dropping in other LBM collision models.
As a side benefit, the default momenta calculation is now fully inlined where possible.
|
|
|
|
No guarantee for correctness - I mostly fiddled this together in order
to use common nixpkgs python package functions for including boltzgen
in other shell environments.
|
|
It's time to extract the generator-part of my GPU LBM playground and turn it
into a nice reusable library. The goal is to produce a framework that can be
used to generate collision and streaming programs from symbolic descriptions,
i.e. it should be possible to select an LB model, the desired boundary
conditions as well as a data structure / streaming model and use this
information to automatically generate matching OpenCL / CUDA / C++
programs.
|