author Adrian Kummerlaender 2019-06-08 23:08:28 +0200
committer Adrian Kummerlaender 2019-06-08 23:08:28 +0200
commit 5ac924371a7e53641a2f726a9f431ab8cb99f9fb
tree 7e62a64ca253c8c18a85c2131b141125eb115b33
parent 4a6b0bb928db91d57eaa09b656d296e79eafe7ed
Performance optimizations
Starting point: ~200 MLUPS on an NVIDIA K2200

Changes that did not noticeably impact performance:
* Memory layout AOS vs. SOA (weird, probably highly platform dependent)
* Propagate on read
* Tagging pointers as read / write only
* Manual code inlining

Changes that made things worse:
* Bad thread block sizes

The actual issue:
* Hidden double precision computations

=> Code now yields ~600 MLUPS
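
One common source of hidden double precision in OpenCL C is an unsuffixed floating-point literal such as `1.0/6.0`, which is typed `double` and promotes the surrounding expression. The commit message does not say how the issue showed up in this code base, so the following is only a minimal sketch of one possible check, assuming the kernel source is available as a Python string before it is handed to pyopencl. The helper name `to_single_precision` is hypothetical and not part of this repository.

```python
import re

# Decimal floating-point literals that carry no type suffix, e.g. 1.0, .5, 2., 1e-3.
# Literals already suffixed (2.0f) and plain integers (array indices) are left alone.
# Hexadecimal float literals are not handled by this sketch.
_UNSUFFIXED_FLOAT = re.compile(
    r"""
    (?<![\w.])                                   # not inside an identifier or number
    (
        (?:\d+\.\d*|\.\d+)(?:[eE][+-]?\d+)?      # literal with a decimal point
        |
        \d+[eE][+-]?\d+                          # literal with exponent only
    )
    (?![\w.])                                    # no suffix or further digits follow
    """,
    re.VERBOSE,
)


def to_single_precision(kernel_source: str) -> str:
    """Append an 'f' suffix to every unsuffixed floating-point literal."""
    return _UNSUFFIXED_FLOAT.sub(r"\g<1>f", kernel_source)


if __name__ == "__main__":
    line = "f_next[i] = f_curr[i] + 1.0/6.0 * (f_eq - f_curr[i]);"
    print(to_single_precision(line))
```

Applied to a generated kernel string, this rewrites `1.0/6.0` to `1.0f/6.0f` while leaving index expressions such as `f_curr[i]` unchanged. Whether this matches the fix made in this commit is an assumption.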
Diffstat (limited to 'shell.nix')
-rw-r--r-- shell.nix 1
1 files changed, 1 insertions, 0 deletions
diff --git a/shell.nix b/shell.nix
index c431426..0022724 100644
--- a/shell.nix
+++ b/shell.nix
@@ -23,6 +23,7 @@ pkgs.stdenvNoCC.mkDerivation rec {
   local-python = custom-python.withPackages (python-packages: with python-packages; [
     numpy
+    sympy
     pyopencl
     pyopengl
     pygobject3