author | Adrian Kummerlaender | 2019-11-10 21:14:07 +0100 |
---|---|---|
committer | Adrian Kummerlaender | 2019-11-10 21:18:57 +0100 |
commit | 4a2885ad3ae0396486d288df94339d0c45e6db8b (patch) | |
tree | 1a0b5aa000bbcde65fa020381a02b19bb452e284 /boltzgen/utility/__init__.py | |
parent | d136bb30bc8a9393372ec905aea500a0b61000e3 (diff) | |
Implement basic CUDA target
Currently only for the SSS streaming pattern.
CudaCodePrinter in `utility/printer.py` is required to add an 'f' suffix
to all single precision floating point literals. If this is not done
(when targeting single precision), unsuffixed literals are treated as
double, so most calculations happen in double precision, which destroys
performance. (In OpenCL this is not necessary
as we can simply set the `-cl-single-precision-constant` flag. Sadly
such a flag doesn't seem to exist for nvcc.)
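
The literal-suffix idea described above can be sketched with a small SymPy printer subclass. This is an illustration only, not the contents of `utility/printer.py`: the class name `SinglePrecisionPrinter` and the helper `to_single_precision_c` are hypothetical, and the sketch assumes that overriding `_print_Float` on SymPy's `C99CodePrinter` is sufficient. It covers `Float` literals only; rationals and other constants would likely need similar handling.

```python
try:
    from sympy.printing.c import C99CodePrinter      # SymPy >= 1.7
except ImportError:
    from sympy.printing.ccode import C99CodePrinter  # older SymPy releases

from sympy import Float, Symbol


class SinglePrecisionPrinter(C99CodePrinter):
    """Hypothetical printer that marks every Float literal as single precision."""

    def _print_Float(self, flt):
        # By default the base printer emits an unsuffixed literal such as "0.5",
        # which C and CUDA compilers treat as a double. Appending "f" keeps the
        # surrounding arithmetic in single precision.
        return super()._print_Float(flt) + "f"


def to_single_precision_c(expr):
    """Hypothetical helper: print `expr` as C code with 'f'-suffixed floats."""
    return SinglePrecisionPrinter().doprint(expr)


x = Symbol("x")
print(to_single_precision_c(Float(0.5) * x))
```

Subclassing the stock C printer keeps all other expression handling unchanged, so only the literal formatting differs between a single and a double precision target.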
Diffstat (limited to 'boltzgen/utility/__init__.py')
-rw-r--r-- | boltzgen/utility/__init__.py | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/boltzgen/utility/__init__.py b/boltzgen/utility/__init__.py
index fa9c760..5905c36 100644
--- a/boltzgen/utility/__init__.py
+++ b/boltzgen/utility/__init__.py
@@ -1,5 +1,6 @@
 from . import optimizations
 from . import ndindex
+from . import printer
 from sympy.codegen.ast import Assignment