Age | Commit message | Author |
|
Note that special care has to be taken to provide ghost cells around
active cells so the algorithm has somewhere to stream to and from.
This is also the case for the AB pattern, but there they only have to
be equilibrated once instead of after every other time step.
Even when such an equilibration is performed there is still a
potential bug, as inbound populations at the outer boundary are never
streamed to (this is not a problem for AB using pull-only streaming).
A vectorizable solution may require direction-specific ghost cell
equilibration.
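
As a rough illustration of the ghost cell setup described above, the following sketch pads a D2Q9 lattice with one layer of ghost cells and resets them to the equilibrium at rest. The function names (`make_lattice`, `is_ghost`, `equilibrate_ghost_cells`), the nested-list layout, and the choice of density 1 and zero velocity are assumptions made for illustration and are not taken from the repository.

```python
# D2Q9 weights: one rest population, four axis-aligned, four diagonal.
# For zero velocity the equilibrium is simply f_i = w_i * rho.
WEIGHTS = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def make_lattice(nx, ny):
    """Allocate (nx + 2) x (ny + 2) cells: nx x ny active cells plus a ghost layer."""
    return [[[0.0] * 9 for _ in range(ny + 2)] for _ in range(nx + 2)]


def is_ghost(x, y, nx, ny):
    """A cell is a ghost cell if it lies on the outermost layer of the padded grid."""
    return x == 0 or y == 0 or x == nx + 1 or y == ny + 1


def equilibrate_ghost_cells(lattice, nx, ny, rho=1.0):
    """Overwrite all ghost cell populations with the equilibrium at rest."""
    for x in range(nx + 2):
        for y in range(ny + 2):
            if is_ghost(x, y, nx, ny):
                lattice[x][y] = [w * rho for w in WEIGHTS]


lattice = make_lattice(3, 2)
equilibrate_ghost_cells(lattice, 3, 2)
```

Going by the message above, an AB pattern would call such a routine once, whereas the AA pattern would need it after every other time step. The sketch treats all directions alike and does not address the inbound population issue mentioned above.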
|
This should allow for plugging in e.g. an AA pattern implementation
without touching any file but `AA.$target.mako`.
OpenCL and C++ target templates now look basically the same and could
potentially be merged. However this would decrease flexibility should
more differences appear in the future. Maintaining separate template
files is an acceptable overhead to preserve flexibility.
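
The naming scheme hinted at by `AA.$target.mako` can be sketched as a simple substitution. The helper `template_name` and the target names `cl` and `cpp` are hypothetical and only meant to show how one template file per streaming pattern and target could be located.

```python
from string import Template

# Hypothetical naming scheme: <pattern>.<target>.mako
NAME_SCHEME = Template("$pattern.$target.mako")


def template_name(pattern, target):
    """Build the template file name for a streaming pattern and a target."""
    return NAME_SCHEME.substitute(pattern=pattern, target=target)


print(template_name("AA", "cl"))
print(template_name("AB", "cpp"))
```

Under this assumption, adding a new pattern amounts to adding one file per target, which matches the intent stated above.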
|
This paves the way for dropping in other LBM collision models.
As a side benefit the default momenta calculation is now fully inlined where possible.
|
SOA and AOS should not be target-specific. Neighbor offset calculation and the
bijection between gid and cell coordinates should be customizable.
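
To make the terms above concrete, here is a minimal sketch of a gid/cell bijection together with AOS and SOA population indexing for a 2D grid. The row-major ordering and all function names are assumptions for illustration. The point of the note above is that these mappings should be interchangeable.

```python
def gid_of(x, y, nx):
    """Map cell coordinates to a linear cell index (row-major, assumed)."""
    return y * nx + x


def cell_of(gid, nx):
    """Inverse of gid_of: recover (x, y) from the linear cell index."""
    return (gid % nx, gid // nx)


def index_aos(gid, i_pop, q):
    """Array of structures: the q populations of one cell are contiguous."""
    return gid * q + i_pop


def index_soa(gid, i_pop, n_cells):
    """Structure of arrays: one contiguous array per population."""
    return i_pop * n_cells + gid


def neighbor_offset(dx, dy, nx):
    """Offset in gid space when moving by (dx, dy); ignores domain boundaries."""
    return dy * nx + dx


nx, ny, q = 4, 3, 9
gid = gid_of(2, 1, nx)
print(gid, cell_of(gid, nx))
print(index_aos(gid, 3, q), index_soa(gid, 3, nx * ny))
```

Because `neighbor_offset` is independent of the memory layout, the same streaming code can be combined with either `index_aos` or `index_soa`.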
|
Requires different function naming as OpenCL 1.2 doesn't support overloads.
The OpenCL kernel code generated using this commit was successfully tested
on an actual GPU. Time to set up some automatic validation.
|
It is more flexible to place dispatching that depends on the OpenCL thread ID in a separate function.
|
Selection of the desired templates is possible via a new `functions` parameter.
|
Yields ~160 MLUPs on a Xeon E3-1241 for a double-precision D2Q9 lid-driven cavity.
Obviously not anywhere near what is possible on GPUs but respectable for a CPU implementation,
especially considering how simple it is.
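
For reference, MLUPs (million lattice updates per second) is the number of cell updates divided by the runtime in microseconds. The helper below is a small sketch of that arithmetic. The grid size, step count and runtime in the example are made-up inputs chosen only to show the calculation. They are not the measurement reported above.

```python
def mlups(n_cells, n_steps, seconds):
    """Million lattice updates per second for a run over n_cells and n_steps."""
    return n_cells * n_steps / (seconds * 1e6)


# Hypothetical run: 512 x 512 cells, 1000 steps, 1.6384 seconds of wall time.
print(round(mlups(512 * 512, 1000, 1.6384), 1))
```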
|
It's time to extract the generator-part of my GPU LBM playground and turn it
into a nice reusable library. The goal is to produce a framework that can be
used to generate collision and streaming programs from symbolic descriptions.
That is, it should be possible to select an LB model, the desired boundary
conditions as well as a data structure / streaming model and use this
information to automatically generate matching OpenCL / CUDA / C++
programs.
|