<feed xmlns='http://www.w3.org/2005/Atom'>
<title>boltzgen, branch master</title>
<subtitle>Symbolic generation of LBM kernels</subtitle>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/'/>
<entry>
<title>Implement basic multi-cuboid communication for CUDA target</title>
<updated>2020-02-02T21:43:41+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2020-02-02T21:43:41+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=a90f1faedac705242a674ba94ea7bc5438cab078'/>
<id>a90f1faedac705242a674ba94ea7bc5438cab078</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Rename 'collide_and_stream' to 'collide'</title>
<updated>2020-02-02T20:52:05+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2020-02-02T20:52:05+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=d58914164adc5876a5911edec85e2ebd43288ec9'/>
<id>d58914164adc5876a5911edec85e2ebd43288ec9</id>
<content type='text'>
Streaming is only implicit depending on the selected propagation pattern.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Streaming is only implicit depending on the selected propagation pattern.
</pre>
</div>
</content>
</entry>
<entry>
<title>Implement AA for CUDA target</title>
<updated>2020-01-17T20:05:40+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2020-01-17T20:05:23+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=25c210daa7c45d937bcc336ca887bfba71000a23'/>
<id>25c210daa7c45d937bcc336ca887bfba71000a23</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Implement SSS for OpenCL target</title>
<updated>2020-01-10T23:15:08+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2020-01-10T23:11:28+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=b5a24f31871d900342a3c47398cc75e22bad0b6f'/>
<id>b5a24f31871d900342a3c47398cc75e22bad0b6f</id>
<content type='text'>
Sadly, OpenCL kernels don't accept pointer-to-pointer arguments, which
complicates the control structure implementation.

A workaround is to cast them into `uintptr_t`, which is guaranteed to be large
enough to fit any pointer on the device. Special care has to be taken to always
perform the pointer shifts on actual floating point pointers and not on
type-less pointers.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Sadly, OpenCL kernels don't accept pointer-to-pointer arguments, which
complicates the control structure implementation.

A workaround is to cast them into `uintptr_t`, which is guaranteed to be large
enough to fit any pointer on the device. Special care has to be taken to always
perform the pointer shifts on actual floating point pointers and not on
type-less pointers.
</pre>
</div>
</content>
</entry>
<entry>
<title>Match OpenCL and CUDA cell list dispatch templates</title>
<updated>2019-11-12T21:54:11+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-12T21:54:11+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=aa509dd4ebbb9d1d8ad6ebfe05111228fd9ae7c0'/>
<id>aa509dd4ebbb9d1d8ad6ebfe05111228fd9ae7c0</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix order of CSE and pow2 expansion</title>
<updated>2019-11-12T17:57:27+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-12T17:57:27+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=a93535c32231d98ef8d080adac626f88b18f9db5'/>
<id>a93535c32231d98ef8d080adac626f88b18f9db5</id>
<content type='text'>
This way the expanded call to pow2 is resolved into a common subexpression.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This way the expanded call to pow2 is resolved into a common subexpression.
</pre>
</div>
</content>
</entry>
<entry>
<title>Implement basic CUDA target</title>
<updated>2019-11-10T20:18:57+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-10T20:14:07+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=4a2885ad3ae0396486d288df94339d0c45e6db8b'/>
<id>4a2885ad3ae0396486d288df94339d0c45e6db8b</id>
<content type='text'>
Currently only for the SSS streaming pattern.

CudaCodePrinter in `utility/printer.py` is required to add an 'f' suffix
to all single precision floating point literals. If this is not done
(when targeting single precision), most calculations happen in double
precision, which destroys performance. (In OpenCL this is not necessary
as we can simply set the `-cl-single-precision-constant` flag. Sadly,
such a flag doesn't seem to exist for nvcc.)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently only for the SSS streaming pattern.

CudaCodePrinter in `utility/printer.py` is required to add an 'f' suffix
to all single precision floating point literals. If this is not done
(when targeting single precision), most calculations happen in double
precision, which destroys performance. (In OpenCL this is not necessary
as we can simply set the `-cl-single-precision-constant` flag. Sadly,
such a flag doesn't seem to exist for nvcc.)
</pre>
</div>
</content>
</entry>
<entry>
<title>Add support for population padding to SOA layout</title>
<updated>2019-11-09T22:46:14+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-09T22:46:14+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=d136bb30bc8a9393372ec905aea500a0b61000e3'/>
<id>d136bb30bc8a9393372ec905aea500a0b61000e3</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Implement basic version of the SSS pattern for C++ target</title>
<updated>2019-11-09T19:40:33+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-09T19:21:27+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=27ce855378a80dff680c2989800af1f4e69975fe'/>
<id>27ce855378a80dff680c2989800af1f4e69975fe</id>
<content type='text'>
An interesting extension of the AA pattern. The main advantage of this is
that updating pointers in a control structure is much more elegant than
duplicating all function implementations as is required by the normal AA
pattern. For more details see [1].

Only works for the SOA layout.

On a pure memory access level this pattern is equivalent to the AA pattern.
The difference is how the memory locations are calculated (by pointer swap
&amp; shift or by different indexing functions for odd and even time steps).

[1]: "An auto-vectorization friendly parallel lattice Boltzmann streaming
      scheme for direct addressing" by Mohrhard et al. (2019)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
An interesting extension of the AA pattern. The main advantage of this is
that updating pointers in a control structure is much more elegant than
duplicating all function implementations as is required by the normal AA
pattern. For more details see [1].

Only works for the SOA layout.

On a pure memory access level this pattern is equivalent to the AA pattern.
The difference is how the memory locations are calculated (by pointer swap
&amp; shift or by different indexing functions for odd and even time steps).

[1]: "An auto-vectorization friendly parallel lattice Boltzmann streaming
      scheme for direct addressing" by Mohrhard et al. (2019)
</pre>
</div>
</content>
</entry>
<entry>
<title>Add optional OpenGL interop helper function for OpenCL target</title>
<updated>2019-11-09T15:19:40+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-09T15:19:40+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=286e243a171c8bcdfc91b5b6dcdd937ac95b0b7b'/>
<id>286e243a171c8bcdfc91b5b6dcdd937ac95b0b7b</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Mark equilibrilize, momenta result values as const</title>
<updated>2019-11-08T23:08:36+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-08T23:08:36+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=cb818d5a55361b6aea2af3d6713ff98886c400bc'/>
<id>cb818d5a55361b6aea2af3d6713ff98886c400bc</id>
<content type='text'>
Doesn't change the outcome but is more in line with how the rest of the
generated code looks.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Doesn't change the outcome but is more in line with how the rest of the
generated code looks.
</pre>
</div>
</content>
</entry>
<entry>
<title>Rename OpenCL cell list wrapper functions</title>
<updated>2019-11-08T22:48:45+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-08T22:48:45+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=d3c24f497f29ba9f26a87c3099f1e46688d0414b'/>
<id>d3c24f497f29ba9f26a87c3099f1e46688d0414b</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Check whether template for requested streaming pattern exists</title>
<updated>2019-11-06T20:32:26+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-06T20:32:26+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=7253ffa4b7755e211ffde2bed25652477ea33e5d'/>
<id>7253ffa4b7755e211ffde2bed25652477ea33e5d</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Update README</title>
<updated>2019-11-05T22:46:47+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-05T22:46:47+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=814e6253475c7955eb6a46d814e5a86974e58613'/>
<id>814e6253475c7955eb6a46d814e5a86974e58613</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Implement AA pattern for OpenCL target</title>
<updated>2019-11-05T22:34:14+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-05T22:33:47+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=62e3d5708470415b9ea2f0a737acaf4e2d00bb21'/>
<id>62e3d5708470415b9ea2f0a737acaf4e2d00bb21</id>
<content type='text'>
Works well but function naming is getting kind of clunky, e.g. "velocity_momenta_boundary_tick_cells".

This could be hidden to a degree by providing branching wrappers for
the odd and even time step implementations. However, this would not
vectorize when targeting Intel via OpenCL.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Works well but function naming is getting kind of clunky, e.g. "velocity_momenta_boundary_tick_cells".

This could be hidden to a degree by providing branching wrappers for
the odd and even time step implementations. However, this would not
vectorize when targeting Intel via OpenCL.
</pre>
</div>
</content>
</entry>
<entry>
<title>Add cell index generator method to Geometry class</title>
<updated>2019-11-05T22:22:36+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-05T22:22:36+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=019e8d57c6266ce0b26d8eacab984f303442a184'/>
<id>019e8d57c6266ce0b26d8eacab984f303442a184</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix OpenCL vector indexing</title>
<updated>2019-11-05T22:20:44+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-05T22:20:44+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=f3035c9d8d5a643ee7e9e86a58688c4b2f86319c'/>
<id>f3035c9d8d5a643ee7e9e86a58688c4b2f86319c</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Implement AA pattern for C++ target</title>
<updated>2019-11-05T18:57:17+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-05T18:57:17+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=02cb01c94fe26d425371ab74feeb50e8a9bf6bf6'/>
<id>02cb01c94fe26d425371ab74feeb50e8a9bf6bf6</id>
<content type='text'>
Note that special care has to be taken to provide ghost cells around
active cells so the algorithm has somewhere to stream to and from.

This is also the case for the AB pattern but there they only have to
be equilibrated once instead of after every other time step.

Even when such an equilibration is performed there is still a
potential bug as inbound populations at the outer boundary are never
streamed to (this is not a problem for AB using pull-only streaming).
A vectorizable solution may require direction-specific ghost cell
equilibration.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Note that special care has to be taken to provide ghost cells around
active cells so the algorithm has somewhere to stream to and from.

This is also the case for the AB pattern but there they only have to
be equilibrated once instead of after every other time step.

Even when such an equilibration is performed there is still a
potential bug as inbound populations at the outer boundary are never
streamed to (this is not a problem for AB using pull-only streaming).
A vectorizable solution may require direction-specific ghost cell
equilibration.
</pre>
</div>
</content>
</entry>
<entry>
<title>Drop AB suffix from streaming pattern definition names</title>
<updated>2019-11-04T22:45:02+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-04T22:45:02+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=78f5edec8151db38ebf933e915fcca5f65b1cad5'/>
<id>78f5edec8151db38ebf933e915fcca5f65b1cad5</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Extract streaming pattern into Mako definitions</title>
<updated>2019-11-04T22:38:36+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-04T22:38:36+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=5828235f806c3e87a5b1eed34ef69ef317a110bd'/>
<id>5828235f806c3e87a5b1eed34ef69ef317a110bd</id>
<content type='text'>
This should allow for plugging in e.g. an AA pattern implementation
without touching any file but `AA.$target.mako`.

OpenCL and C++ target templates now look basically the same and could
potentially be merged. However, this would decrease flexibility should
more differences appear in the future. Maintaining separate template
files is an acceptable overhead to preserve flexibility.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This should allow for plugging in e.g. an AA pattern implementation
without touching any file but `AA.$target.mako`.

OpenCL and C++ target templates now look basically the same and could
potentially be merged. However, this would decrease flexibility should
more differences appear in the future. Maintaining separate template
files is an acceptable overhead to preserve flexibility.
</pre>
</div>
</content>
</entry>
<entry>
<title>Improve lattice, model selection error reporting</title>
<updated>2019-11-02T16:32:41+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-02T16:32:41+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=05e74fb112f5b5f645b649c587d18052c7b7f9df'/>
<id>05e74fb112f5b5f645b649c587d18052c7b7f9df</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Import `sympy.ccode` inside templates instead of as argument</title>
<updated>2019-11-02T16:29:56+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-02T16:29:56+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=f233beddfc74d5933f46684adab5298e03c08871'/>
<id>f233beddfc74d5933f46684adab5298e03c08871</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Restructure LBM model / lattice distinction</title>
<updated>2019-11-02T16:18:32+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-11-02T16:18:32+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=24847cbb2567f508a7c30b39c6fb7ba6379d1adc'/>
<id>24847cbb2567f508a7c30b39c6fb7ba6379d1adc</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Call symbolic generator inside code templates</title>
<updated>2019-10-31T12:13:00+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-31T12:13:00+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=e2b00f4ec963060be98939c7b0d12d6c00e50a02'/>
<id>e2b00f4ec963060be98939c7b0d12d6c00e50a02</id>
<content type='text'>
This paves the way for dropping in other LBM collision models.
As a side benefit, the default momenta calculation is now fully inlined where possible.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This paves the way for dropping in other LBM collision models.
As a side benefit, the default momenta calculation is now fully inlined where possible.
</pre>
</div>
</content>
</entry>
<entry>
<title>Move C++ example to boltzgen_examples repository</title>
<updated>2019-10-30T15:12:20+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-30T15:12:20+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=c82b38122cc3ab7717cb0ba9ec530b4658bd03e4'/>
<id>c82b38122cc3ab7717cb0ba9ec530b4658bd03e4</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Don't try to reuse population layout for moment array indexing</title>
<updated>2019-10-29T19:33:30+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-29T19:33:30+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=09f45c1d1da37bf4f6fa4094eb2d2ea18e8aaf21'/>
<id>09f45c1d1da37bf4f6fa4094eb2d2ea18e8aaf21</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Return cell id as string expression when required</title>
<updated>2019-10-29T18:53:05+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-29T18:53:05+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=1ce3b58dabba59741343cbd9e7e4c9f58f10f91b'/>
<id>1ce3b58dabba59741343cbd9e7e4c9f58f10f91b</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Add support for generating custom templates in boltzgen's context</title>
<updated>2019-10-29T18:38:32+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-29T18:38:32+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=61d77cf8aa815b456d873ff3c01e54ad21a8fce9'/>
<id>61d77cf8aa815b456d873ff3c01e54ad21a8fce9</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Rename target module to memory</title>
<updated>2019-10-29T18:30:50+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-29T18:30:50+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=a0438d434a8dde45e6fdab38d44181f0cd0cb2c6'/>
<id>a0438d434a8dde45e6fdab38d44181f0cd0cb2c6</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Move further generator arguments into its constructor</title>
<updated>2019-10-29T18:25:38+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-29T18:25:38+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=a6dcb57cff9a1dd9de7e5fafdc87230489be87b9'/>
<id>a6dcb57cff9a1dd9de7e5fafdc87230489be87b9</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Extract cell indexing function</title>
<updated>2019-10-29T15:05:04+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-29T15:05:04+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=d801f538a090838a15e74282239369b73723c4f4'/>
<id>d801f538a090838a15e74282239369b73723c4f4</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Unify AOS, SOA specific cell preshift between targets</title>
<updated>2019-10-29T09:56:41+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-29T09:56:41+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=dbd9a340a7809a770d52d10154712278431acdc3'/>
<id>dbd9a340a7809a770d52d10154712278431acdc3</id>
<content type='text'>
SOA and AOS should not be target specific, neighbor offset calculation /
bijection between gid and cell coordinates should be customizable.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
SOA and AOS should not be target specific, neighbor offset calculation /
bijection between gid and cell coordinates should be customizable.
</pre>
</div>
</content>
</entry>
<entry>
<title>Set default order for custom ndindex overload</title>
<updated>2019-10-28T21:33:53+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-28T21:29:50+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=15c0cba693646269c04f245ca52f405ddfdb4a07'/>
<id>15c0cba693646269c04f245ca52f405ddfdb4a07</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Use order-accepting ndindex method for generating cell indices</title>
<updated>2019-10-28T21:00:05+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-28T21:00:05+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=72e7097073b03e6584f603b905b5a4fc236d7def'/>
<id>72e7097073b03e6584f603b905b5a4fc236d7def</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Add basic setup.py</title>
<updated>2019-10-28T20:52:34+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-28T20:52:34+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=091811762b7f2cbb6575139276ea16bf54d3456b'/>
<id>091811762b7f2cbb6575139276ea16bf54d3456b</id>
<content type='text'>
No guarantee of correctness - I mostly fiddled this together in order
to use common nixpkgs Python package functions for including boltzgen
in other shell environments.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
No guarantee of correctness - I mostly fiddled this together in order
to use common nixpkgs Python package functions for including boltzgen
in other shell environments.
</pre>
</div>
</content>
</entry>
<entry>
<title>Optionally generate cell-list-based OpenCL dispatch functions</title>
<updated>2019-10-27T21:22:24+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-27T21:22:24+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=536b25e2c2b742c17d023d2b3386fed4dc60a339'/>
<id>536b25e2c2b742c17d023d2b3386fed4dc60a339</id>
<content type='text'>
Requires different function naming as OpenCL 1.2 doesn't support overloads.

The OpenCL kernel code generated using this commit was successfully tested
on an actual GPU. Time to set up some automatic validation.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Requires different function naming as OpenCL 1.2 doesn't support overloads.

The OpenCL kernel code generated using this commit was successfully tested
on an actual GPU. Time to set up some automatic validation.
</pre>
</div>
</content>
</entry>
<entry>
<title>Verify precision parameter</title>
<updated>2019-10-27T21:09:27+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-27T21:09:27+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=1b9ac6e7aee3cf63495a65c2d7dbf79a0be23d7d'/>
<id>1b9ac6e7aee3cf63495a65c2d7dbf79a0be23d7d</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Add README</title>
<updated>2019-10-27T19:11:36+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-27T19:11:36+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=89f60f39a226bef5ccd8c52dbd57891c3a4d74c7'/>
<id>89f60f39a226bef5ccd8c52dbd57891c3a4d74c7</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Accept cell id as parameter in OpenCL functions</title>
<updated>2019-10-27T18:50:21+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-27T18:50:21+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=a99edaaa0e9a95354f68614cd2f4ab801179b946'/>
<id>a99edaaa0e9a95354f68614cd2f4ab801179b946</id>
<content type='text'>
It is more flexible to place OpenCL thread-ID-dependent dispatching in a separate function.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It is more flexible to place OpenCL thread-ID-dependent dispatching in a separate function.
</pre>
</div>
</content>
</entry>
<entry>
<title>Add bounce back boundary condition</title>
<updated>2019-10-27T18:40:56+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-27T18:40:56+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=dcd162aef328cde66082a3333740ec6f58298a4c'/>
<id>dcd162aef328cde66082a3333740ec6f58298a4c</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Tidy up shell environment</title>
<updated>2019-10-27T18:18:20+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-27T18:18:20+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=6e99286c9a756272e972fe21212b094de95d36f6'/>
<id>6e99286c9a756272e972fe21212b094de95d36f6</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Use Mako defines to generate momenta boundaries</title>
<updated>2019-10-27T18:11:26+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-27T18:11:26+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=4315b5c62fe862dcacbe2a8958732c8a66bdb6e1'/>
<id>4315b5c62fe862dcacbe2a8958732c8a66bdb6e1</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Extract target-dependent floating point type name</title>
<updated>2019-10-27T15:11:17+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-27T15:11:17+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=30f262287bd43015b155b0e882e6478cfae2780c'/>
<id>30f262287bd43015b155b0e882e6478cfae2780c</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Move layout implementations into separate folder</title>
<updated>2019-10-27T15:10:33+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-27T15:03:07+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=6938758f6c1754f0ee49d0709dd0ca376a146010'/>
<id>6938758f6c1754f0ee49d0709dd0ca376a146010</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Disable bytecode caching</title>
<updated>2019-10-27T15:02:31+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-27T15:02:31+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=74c60dcbe56862d73b000d569423cb298fb06686'/>
<id>74c60dcbe56862d73b000d569423cb298fb06686</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Separate functions into separate template files</title>
<updated>2019-10-27T13:05:21+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-27T13:05:21+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=18c54d79699db7554faa851c87d7113db67a8a08'/>
<id>18c54d79699db7554faa851c87d7113db67a8a08</id>
<content type='text'>
Selection of the desired templates is possible via a new `functions` parameter.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Selection of the desired templates is possible via a new `functions` parameter.
</pre>
</div>
</content>
</entry>
<entry>
<title>Add extra toggle for OpenMP in C++ test function</title>
<updated>2019-10-26T21:00:50+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-26T21:00:50+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=7fa72d8718d96727bcfd60cc3bcb1609526d3c9b'/>
<id>7fa72d8718d96727bcfd60cc3bcb1609526d3c9b</id>
<content type='text'>
Yields ~160 MLUPs on a Xeon E3-1241 for D2Q9 double precision lid-driven cavity.
Obviously not anywhere near what is possible on GPUs but respectable for a CPU implementation,
especially considering how simple it is.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Yields ~160 MLUPs on a Xeon E3-1241 for D2Q9 double precision lid-driven cavity.
Obviously not anywhere near what is possible on GPUs but respectable for a CPU implementation,
especially considering how simple it is.
</pre>
</div>
</content>
</entry>
<entry>
<title>Change C++ test function to LDC with optional VTK output</title>
<updated>2019-10-26T20:44:59+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-26T20:44:59+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=2a976c2c60565ea3f904feaf4ea573b2769e3084'/>
<id>2a976c2c60565ea3f904feaf4ea573b2769e3084</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Generate primitive velocity momenta BC for C++ target</title>
<updated>2019-10-26T20:44:09+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-26T20:44:09+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=820ed4b674a199c252e1b77ee7013f330ef284bb'/>
<id>820ed4b674a199c252e1b77ee7013f330ef284bb</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>Support passing additional string arguments to the generator</title>
<updated>2019-10-26T13:01:39+00:00</updated>
<author>
<name>Adrian Kummerlaender</name>
</author>
<published>2019-10-26T13:01:39+00:00</published>
<link rel='alternate' type='text/html' href='https://code.kummerlaender.eu/boltzgen/commit/?id=1bf577b1c5e606ac2c0553857297ce8c0c04ccb7'/>
<id>1bf577b1c5e606ac2c0553857297ce8c0c04ccb7</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
</feed>
