aboutsummaryrefslogtreecommitdiff log msg author committer range
blob: 98780e0f72ffa462d223aa64310337b9571a9997 (plain)
 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500  .highlight .hll { background-color: #ffffcc } .highlight .c { color: #888888 } /* Comment */ .highlight .err { color: #a61717; background-color: #e3d2d2 } /* Error */ .highlight .k { color: #008800; font-weight: bold } /* Keyword */ .highlight .ch { color: #888888 } /* Comment.Hashbang */ .highlight .cm { color: #888888 } /* Comment.Multiline */ .highlight .cp { color: #cc0000; font-weight: bold } /* Comment.Preproc */ .highlight .cpf { color: #888888 } /* Comment.PreprocFile */ .highlight .c1 { color: #888888 } /* Comment.Single */ .highlight .cs { color: #cc0000; font-weight: bold; background-color: #fff0f0 } /* Comment.Special */ .highlight .gd { color: #000000; background-color: #ffdddd } /* Generic.Deleted */ .highlight .ge { font-style: italic } /* Generic.Emph */ .highlight .gr { color: #aa0000 } /* Generic.Error */ .highlight .gh { color: #333333 } /* Generic.Heading */ .highlight .gi { color: #000000; background-color: #ddffdd } /* Generic.Inserted */ .highlight .go { color: #888888 } /* Generic.Output */ .highlight .gp { color: #555555 } /* Generic.Prompt */ .highlight .gs { font-weight: bold } /* Generic.Strong */ .highlight .gu { color: #666666 } /* Generic.Subheading */ .highlight .gt { color: #aa0000 } /* Generic.Traceback */ .highlight .kc { color: #008800; font-weight: bold } /* Keyword.Constant */ .highlight .kd { color: #008800; font-weight: bold } /* Keyword.Declaration */ .highlight .kn { color: #008800; font-weight: bold } /* Keyword.Namespace */ .highlight .kp { color: #008800 } /* Keyword.Pseudo */ .highlight .kr { color: #008800; font-weight: bold } /* Keyword.Reserved */ .highlight .kt { color: #888888; font-weight: bold } /* Keyword.Type */ .highlight .m { color: #0000DD; font-weight: bold } /* Literal.Number */ .highlight .s { color: #dd2200; background-color: #fff0f0 } /* Literal.String */ .highlight .na { color: #336699 } /* Name.Attribute */ .highlight .nb { color: #003388 } /* Name.Builtin */ .highlight .nc { color: #bb0066; font-weight: bold } /* Name.Class */ .highlight .no { color: #003366; font-weight: bold } /* Name.Constant */ .highlight .nd { color: #555555 } /* Name.Decorator */ .highlight .ne { color: #bb0066; font-weight: bold } /* Name.Exception */ .highlight .nf { color: #0066bb; font-weight: bold } /* Name.Function */ .highlight .nl { color: #336699; font-style: italic } /* Name.Label */ .highlight .nn { color: #bb0066; font-weight: bold } /* Name.Namespace */ .highlight .py { color: #336699; font-weight: bold } /* Name.Property */ .highlight .nt { color: #bb0066; font-weight: bold } /* Name.Tag */ .highlight .nv { color: #336699 } /* Name.Variable */ .highlight .ow { color: #008800 } /* Operator.Word */ .highlight .w { color: #bbbbbb } /* Text.Whitespace */ .highlight .mb { color: #0000DD; font-weight: bold } /* Literal.Number.Bin */ .highlight .mf { color: #0000DD; font-weight: bold } /* Literal.Number.Float */ .highlight .mh { color: #0000DD; font-weight: bold } /* Literal.Number.Hex */ .highlight .mi { color: #0000DD; font-weight: bold } /* Literal.Number.Integer */ .highlight .mo { color: #0000DD; font-weight: bold } /* Literal.Number.Oct */ .highlight .sa { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Affix */ .highlight .sb { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Backtick */ .highlight .sc { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Char */ .highlight .dl { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Delimiter */ .highlight .sd { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Doc */ .highlight .s2 { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Double */ .highlight .se { color: #0044dd; background-color: #fff0f0 } /* Literal.String.Escape */ .highlight .sh { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Heredoc */ .highlight .si { color: #3333bb; background-color: #fff0f0 } /* Literal.String.Interpol */ .highlight .sx { color: #22bb22; background-color: #f0fff0 } /* Literal.String.Other */ .highlight .sr { color: #008800; background-color: #fff0ff } /* Literal.String.Regex */ .highlight .s1 { color: #dd2200; background-color: #fff0f0 } /* Literal.String.Single */ .highlight .ss { color: #aa6600; background-color: #fff0f0 } /* Literal.String.Symbol */ .highlight .bp { color: #003388 } /* Name.Builtin.Pseudo */ .highlight .fm { color: #0066bb; font-weight: bold } /* Name.Function.Magic */ .highlight .vc { color: #336699 } /* Name.Variable.Class */ .highlight .vg { color: #dd7700 } /* Name.Variable.Global */ .highlight .vi { color: #3333bb } /* Name.Variable.Instance */ .highlight .vm { color: #336699 } /* Name.Variable.Magic */ .highlight .il { color: #0000DD; font-weight: bold } /* Literal.Number.Integer.Long */# Expressive meta templates for flexible handling of compile-time constants So [we](http://www.lbrg.kit.edu/) recently released a new version of [OpenLB](http://www.openlb.net) which includes a major refactoring of the central datastructure used to handle various kinds of compile-time constants required by the simulation. This article will summarize the motivation and design of this new concept as well as highlight a couple of tricks and pitfalls in the context of template metaprogramming. ## What is a descriptor? Every simulation based on Lattice Boltzmann Methods can be characterized by a set of constants such as the modelled spatial dimension, the number of neighbors in the underlying regular grid, the weights used to compute equilibrium distributions or the lattice speed of sound. Due to OpenLB's goal of offering a wide variety of LB models to address many different kinds of flow problems, the constants are not hardcoded throughout the codebase but rather maintained in compile-time data structures. Any usage of these constants can then refer to the characterizing descriptor data structure. cpp /// Old equilibrium implementation using descriptor data static T equilibrium(int iPop, T rho, const T u[DESCRIPTOR::d], const T uSqr) { T c_u = T(); for (int iD=0; iD < DESCRIPTOR::d; ++iD) { c_u += DESCRIPTOR::c[iPop][iD]*u[iD]; } return rho * DESCRIPTOR::t[iPop] * ( (T)1 + DESCRIPTOR::invCs2 * c_u + DESCRIPTOR::invCs2 * DESCRIPTOR::invCs2 * (T)0.5 * c_u * c_u - DESCRIPTOR::invCs2 * (T)0.5 * uSqr ) - DESCRIPTOR::t[iPop]; }  As many parts of the code do not actually care which specific descriptor is used, most classes and functions are templates that accept any user-defined descriptor type. This allows us to e.g. select descriptor specific optimizations[^1] via plain template specializations. To continue, the descriptor concept is tightly coupled to the definition of the cells that make up the simulation lattice. The reason for this connection is that we require some place to store the essential per-direction population fields for each node of the lattice. In OpenLB this place is currently the Cell class[^2] which locally maintains the population data and as such implements a collision-optimized _array of structures_ memory layout. As a side note this was the initial motivation for rethinking the descriptor concept as we require more flexible structures to turn this into a more efficient _structures of arrays_ situation[^3]. [^1]: e.g. collision steps where all generic code is resolved using common subexpression elimination in order to minimze the number of floating point operations [^2]: see src/core/cell.h for further reading [^3]: The performance LBM codes is in general not bound by the available processing power but but rather by how well we utilize the available memory bandwidth. i.e. we want to optimize memory throughput as much as possible which leads us to the need for more efficient streaming steps that in turn require changes to the memory layout. ## What was used prior to refactoring? To better appreciate the new concept we should probably first take a closer look at how stuff this stuff was implemented previously. As a starting point all descriptors were derived from a descriptor base type such as D2Q9DescriptorBase for two dimensional lattices with nine discrete velocities: cpp template struct D2Q9DescriptorBase { typedef D2Q9DescriptorBase BaseDescriptor; enum { d = 2, q = 9 }; ///< number of dimensions/distr. functions static const int vicinity; ///< size of neighborhood static const int c[q][d]; ///< lattice directions static const int opposite[q]; ///< opposite entry static const T t[q]; ///< lattice weights static const T invCs2; ///< inverse square of speed of sound };  As we can see this is a plain struct template with some static member constants to store the data. This in itself is not problematic and worked just fine since the project's inception. Note that the template allows for specification of the floating point type used for all non-integer data. This is required to e.g. use automatic differentiation types that allow for taking the derivative of the whole simulation in order to apply optimization techniques. cpp template const int D2Q9DescriptorBase::vicinity = 1; template const int D2Q9DescriptorBase::c [D2Q9DescriptorBase::q][D2Q9DescriptorBase::d] = { { 0, 0}, {-1, 1}, {-1, 0}, {-1,-1}, { 0,-1}, { 1,-1}, { 1, 0}, { 1, 1}, { 0, 1} }; template const int D2Q9DescriptorBase::opposite[D2Q9DescriptorBase::q] = { 0, 5, 6, 7, 8, 1, 2, 3, 4 }; template const T D2Q9DescriptorBase::t[D2Q9DescriptorBase::q] = { (T)4/(T)9, (T)1/(T)36, (T)1/(T)9, (T)1/(T)36, (T)1/(T)9, (T)1/(T)36, (T)1/(T)9, (T)1/(T)36, (T)1/(T)9 }; template const T D2Q9DescriptorBase::invCs2 = (T)3;  The actual data was stored in a separate header src/dynamics/latticeDescriptors.hh. All in all this very straight forward approach worked as expected and could be fully resolved at compile time to avoid unnecessary run time jumps inside critical code sections as far as the descriptor concept is concerned. The real issue starts when we take a look at the so called _external fields_: cpp struct Force2dDescriptor { static const int numScalars = 2; static const int numSpecies = 1; static const int forceBeginsAt = 0; static const int sizeOfForce = 2; }; struct Force2dDescriptorBase { typedef Force2dDescriptor ExternalField; }; template struct ForcedD2Q9Descriptor : public D2Q9DescriptorBase, public Force2dDescriptorBase { };  Some LBM models require additional per-cell data such as external force vectors or values to model chemical properties. As we can see the declaration of these _external fields_ is another task of the descriptor data structure and _the_ task that was solved the ugliest in our original implementation. cpp // Set force vectors in all cells of material number 1 sLattice.defineExternalField( superGeometry, 1, DESCRIPTOR::ExternalField::forceBeginsAt, DESCRIPTOR::ExternalField::sizeOfForce, force );  For example this is a completely unsafe access to raw memory as forceBeginsAt and sizeOfForce define arbitrary memory offsets. And while we might not care about security in this context you can probably imagine the kinds of obscure bugs caused by potentially faulty and inconsistent handling of such offsets. To make things worse the naming of external field indices and size constants was inconsistent between different fields and stuff only worked as long as a unclear set of naming and layout conventions was followed. If you want to risk an even closer look[^4] you can download [version 1.2 or earlier](https://www.openlb.net/download/) and start your dive in src/dynamics/latticeDescriptors.h. Otherwise we are going to continue with a description of the new approach. [^4]: Note that this examination of the issues with the previous descriptor concept is not aimed to be a strike at its original developers but rather as an example of how things can get out of hand when expanding a initial concept to cover more and more stuff. As far as legacy code is concerned this is still relatively tame and obviously the niceness of such scaffolding for the actual simulation is a side show when one first and foremost wants to generate new results. ## What is a meta descriptor? The initial spark for the development of the new meta descriptor concept was the idea to define external fields as the parametrization of a multilinear function on the foundational D and Q constants of each descriptor[^5]. Lists of such functions could then be passed around via variadic template argument lists. This would then allow for handling external fields in a manner that is both flexible and consistent across all descriptors. [^5]: i.e. each field describes its size as a function $f : \mathbb{N}_0^3 \to \mathbb{N}_0, (a,b,c) \mapsto a + b D + c Q$ Before we delve into the details if how these expectations were implemented let us first take a look at how the basic D2Q9 descriptor is defined in the latest OpenLB release: cpp template struct D2Q9 : public DESCRIPTOR_BASE<2,9,POPULATION,FIELDS...> { typedef D2Q9 BaseDescriptor; D2Q9() = delete; }; namespace data { template <> constexpr int vicinity<2,9> = 1; template <> constexpr int c<2,9>[9][2] = { { 0, 0}, {-1, 1}, {-1, 0}, {-1,-1}, { 0,-1}, { 1,-1}, { 1, 0}, { 1, 1}, { 0, 1} }; template <> constexpr int opposite<2,9>[9] = { 0, 5, 6, 7, 8, 1, 2, 3, 4 }; template <> constexpr Fraction t<2,9>[9] = { {4, 9}, {1, 36}, {1, 9}, {1, 36}, {1, 9}, {1, 36}, {1, 9}, {1, 36}, {1, 9} }; template <> constexpr Fraction cs2<2,9> = {1, 3}; }  These few compact lines[^51] describe the whole structure including all of its data. The various functions to access this data are auto-generated in a generic fashion using template metaprogramming and the previously verbose definition of a forced LB model reduces to a single self-explanatory type alias: [^51]: See src/dynamics/latticeDescriptors.h cpp using ForcedD2Q9Descriptor = D2Q9;  Descriptor data is now exposed via an adaptable set of free functions templated on the descriptor type. This was required to satisfy a secondary goal of decoupling descriptor data definitions and accesses in order to add support for both transparent auto-generation and platform adaptation (i.e. adding workarounds for porting the code to the GPU). cpp /// Refactored generic equilibrium implementation static T equilibrium(int iPop, T rho, const T u[DESCRIPTOR::d], const T uSqr) { T c_u = T{}; for (int iD = 0; iD < DESCRIPTOR::d; ++iD) { c_u += descriptors::c(iPop,iD) * u[iD]; } return rho * descriptors::t(iPop) * ( T{1} + descriptors::invCs2() * c_u + descriptors::invCs2() * descriptors::invCs2() * T{0.5} * c_u * c_u - descriptors::invCs2() * T{0.5} * uSqr ) - descriptors::t(iPop); }  The inclusion of the descriptors namespace slightly increases the verbosity of functions such as the one above. If things get to bad we can use local namespace inclusion as a workaround. But even if this was not possible the transparent extensibility (i.e. the ability to customize the underlying implementation without changing all call sites) more than makes up for increasing the character count of some sections. ## Implementation Back in 2013 I experimented with [_mapping binary structures as tuples using template metaprogramming_](/article/mapping_binary_structures_as_tuples_using_template_metaprogramming/) in order to develop the foundations for a graph database. Surprisingly there were quite a few parallels between what I was doing then to what I am describing in this article. While I neither used the resulting [BinaryMapping](https://github.com/KnairdA/BinaryMapping) library for the development of [GraphStorage](https://github.com/KnairdA/GraphStorage) nor ever used this then LevelDB-based graph _database_ for more than a couple of basic examples, it was a welcome surprise to think back to my first steps doing more template-centered C++ programming. cpp /// Base descriptor of a D-dimensional lattice with Q directions and a list of additional fields template struct DESCRIPTOR_BASE { /// Deleted constructor to enforce pure usage as type and prevent implicit narrowing conversions DESCRIPTOR_BASE() = delete; /// Number of dimensions static constexpr int d = D; /// Number of velocities static constexpr int q = Q; /* [...] */ };  As the description of any LBM model includes at least a number of spatial dimensions D and a number of discrete velocities Q these two constants are the required template arguments of the new DESCRIPTOR_BASE class template[^61]. Until we finally get concepts in C++, the members of the FIELDS list are by convention expected to offer a size and getLocalIndex template methods accepting these two foundational constants. [^61]: See src/dynamics/descriptorBase.h cpp /// Base of a descriptor field whose size is defined by A*D + B*Q + C template struct DESCRIPTOR_FIELD_BASE { /// Deleted constructor to enforce pure usage as type and prevent implicit narrowing conversions DESCRIPTOR_FIELD_BASE() = delete; /// Evaluates the size function template static constexpr unsigned size() { return A * D + B * Q + C; } /// Returns global index from local index and provides out_of_range safety template static constexpr unsigned getLocalIndex(const unsigned localIndex) { return localIndex < (A*D+B*Q+C) ? localIndex : throw std::out_of_range("Index exceeds data field"); } };  Most[^62] fields use the DESCRIPTOR_FIELD_BASE template as a base class. This template parametrizes the previously mentioned multilinear size function and allows for sharing field definitions between all descriptors. cpp // Field types need to be distinct (i.e. not aliases) in order for DESCRIPTOR_BASE::index to work // (Field size parametrized by: Cs + Ds*D + Qs*Q) Cs Ds Qs struct POPULATION : public DESCRIPTOR_FIELD_BASE<0, 0, 1> { }; struct FORCE : public DESCRIPTOR_FIELD_BASE<0, 1, 0> { }; struct SOURCE : public DESCRIPTOR_FIELD_BASE<1, 0, 0> { }; /* [...] */  Let us take the FORCE field as an example[^63]: This field represents a cell-local force vector and as such requires exactly D floating point values worth of storage. Correspondingly its base class is DESCRIPTOR_FIELD_BASE<0,1,0> which yields a size of 2 for two-dimensional and 3 for three-dimensional descriptors. [^62]: e.g. there is also a TENSOR base template that encodes the size of a tensor of order D (which is not a linear function) [^63]: Common field definitions are collected in src/dynamics/descriptorField.h Building upon this common field structure allows us to write down a getIndexFromFieldList helper function template that automatically calculates the starting offset of any element in an arbitrary list of fields: cpp template < unsigned D, unsigned Q, typename WANTED_FIELD, typename CURRENT_FIELD, typename... FIELDS, // WANTED_FIELD equals the head of our field list, terminate recursion std::enable_if_t::value, int> = 0 > constexpr unsigned getIndexFromFieldList() { return 0; } template < unsigned D, unsigned Q, typename WANTED_FIELD, typename CURRENT_FIELD, typename... FIELDS, // WANTED_FIELD doesn't equal the head of our field list std::enable_if_t::value, int> = 0 > constexpr unsigned getIndexFromFieldList() { // Break compilation when WANTED_FIELD is not provided by list of fields static_assert(sizeof...(FIELDS) > 0, "Field not found."); // Add size of current field to implicit offset and continue search // for WANTED_FIELD in the tail of our field list return CURRENT_FIELD::template size() + getIndexFromFieldList(); }  As far as template metaprogramming is concerned this code is quite basic -- we simply recursively traverse the variadic field list and sum up the field sizes along the way. This function is wrapped by the DESCRIPTOR_BASE::index method template that exposes the memory offset of a given field. We are left with a generic interface that replaces our previous inconsistent and hard to maintain field offsets in the vein of DESCRIPTOR::ExternalField::forceBeginsAt. cpp /// Returns index of WANTED_FIELD /** * Fails compilation if WANTED_FIELD is not contained in FIELDS. * Branching that depends on this information can be realized using provides. **/ template static constexpr int index(const unsigned localIndex=0) { return getIndexFromFieldList() + WANTED_FIELD::template getLocalIndex(localIndex); }  As we will see in the section on _improved field access_ this method is not commonly used in user code but rather as a building block for self-documenting field accessors. One might notice that the abstraction layers are starting to pile up -- luckily all of them are by themselves rather plain constexpr function templates and can as such still be fully collapsed during compile time. ### Fraction types The alert reader might have noticed that the type of per-direction weight constants descriptors::data::t was changed to Fraction in our new meta descriptor. The reason for this is that we use variable templates to store these values and C++ sadly doesn't allow partial specializations in this context. To elaborate, we are not allowed to write: cpp template constexpr Fraction t[9] = { T{4}/T{9}, T{1}/T{36}, T{1}/T{9}, T{1}/T{36}, T{1}/T{9}, T{1}/T{36}, T{1}/T{9}, T{1}/T{36}, T{1}/T{9} };  To work around this issue I wrote a small floating-point independent fraction type: cpp class Fraction { private: const int _numerator; const int _denominator; public: /* [...] */ template constexpr T as() const { return T(_numerator) / T(_denominator); } template constexpr T inverseAs() const { return _numerator != 0 ? T(_denominator) / T(_numerator) : throw std::invalid_argument("inverse of zero is undefined"); } };  This works well for both integral and automatically differentiable floating point types and even yields a more pleasant syntax for defining fractional descriptor values due to C++'s implicit constructor calls. One remaining hiccup is the representation of values such as square roots that are not easily expressed as readable rational numbers. Such weights are required by some more exotic LB models and currently stored by explicit specialization for any required type. A slightly surprising fact in this context is that the C++ standard doesn't require some functions such as std::sqrt to be constexpr. This problem remained undetected for quite a while as e.g. GCC fixes this issue in a non-standard extension. So in the long term we are going to have to invest some more effort into adding compile-time math functions in the vein of [GCEM](https://www.kthohr.com/gcem.html). ### Tagging free functions As I hinted previously one major change besides the refactoring of the actual descriptor structure was the introduction of an abstraction layer between data and call sites. i.e. where we previously wrote DESCRIPTOR::t[i] to directly access the ith weight we now call a free function descriptors::t(i). The advantage of this additional layer is the ability to transparently switch out the underlying data source. Furthermore we can easily expand such free functions to distinguish between various descriptor specializations at compile time via tagging. cpp template constexpr T t(unsigned iPop, tag::DEFAULT) { return data::t[iPop].template as(); } template constexpr T t(unsigned iPop) { return t(iPop, typename DESCRIPTOR::category_tag()); }  This powerful concept uses C++'s function overload resolution to transparently call different implementations based on the given template arguments in a very compact fashion. As an example we can mark a descriptor using some non-default tag tag::SPECIAL and implement a function T t(unsigned iPop, tag::SPECIAL) to do some _special_ stuff for this descriptor -- the definition of both the tag and its function overload can be written anywhere in the codebase and will be automatically resolved by the generic implementation. This adds a whole new level of extensibility to OpenLB and is currently used to e.g. handle the special requirements of MRT LBM models. ### Extracting tags from lists One might have noticed that we accessed a DESCRIPTOR::category_tag typedef to select the correct function overload. While the canonical way to do function tagging is to simply define this type on a case by case basis in any tagged structure, I chose to develop something slightly more sophisticated: Tags are represented as special zero-size fields[^7] and passed to the descriptor specialization alongside any other fields. This feels quite nice and results in a very expressive and self-documenting interface for defining new descriptors. [^7]: See src/dynamics/descriptorTag.h cpp /// Base of a descriptor tag struct DESCRIPTOR_TAG { template static constexpr unsigned size() { return 0; // a tag doesn't have a size } };  As such DESCRIPTOR_BASE is the only place where the category_tag type is defined. To do this we filter the given list of fields and select the first _tag-field_ that is derived from our desired _tag-group_ tag::CATEGORY. cpp template using field_with_base = typename std::conditional< std::is_void::type>::value, FALLBACK, typename utilities::meta::list_item_with_base::type >::type; /* [...] */ using category_tag = tag::field_with_base< tag::CATEGORY, tag::DEFAULT, FIELDS...>;  In order to implement the utilities::meta::list_item_with_base meta template I referred back to the [_Scheme metaphor for template metaprogramming_](/article/using_scheme_as_a_metaphor_for_template_metaprogramming/) which results in a readable filtering operation based on the tools offered by the standard library's type traits: cpp /// Get first type based on BASE contained in a given type list /** * If no such list item exists, type is void. **/ template < typename BASE, typename HEAD = void, // Default argument in case the list is empty typename... TAIL > struct list_item_with_base { using type = typename std::conditional< std::is_base_of::value, HEAD, typename list_item_with_base::type >::type; }; template struct list_item_with_base { using type = typename std::conditional< std::is_base_of::value, HEAD, void >::type; };  ### Improved field access The last remaining cornerstone of OpenLB's new meta descriptor concept is the introduction of a set of convenient functions to access a cell's field values via the field's name. By taking this final step we get the ability to write simulation code that doesn't handle any raw memory offsets in addition to being more compact. Furthermore we can now in theory completely modify the underlying field storage structures without forcing the user code to change. cpp /// Return pointer to FIELD of cell template std::enable_if_t(), T*> getFieldPointer() { const int offset = DESCRIPTOR::template index(); return &(this->data[offset]); } template std::enable_if_t(), T*> getFieldPointer() { throw std::invalid_argument("DESCRIPTOR does not provide FIELD."); return nullptr; }  The foundation of all field accessors is a new Cell::getFieldPointer method template that resolves the field location using the DESCRIPTOR_BASE::index and DESCRIPTOR_BASE::size functions we defined previously. Note that we had to loosen our newly gained compile-time guarantee of a field's existence in favour of generating runtime exception code. The reason for this is that most current builds include code that depends on a certain set of fields even if those fields are not actually provided by a given descriptor. While we are going to resolve this unsatisfying situation in the future, this workaround offered an acceptable compromise. cpp /// Set value of FIELD from a vector template std::enable_if_t<(X::template size() > 1), void> setField(const Vector()>& field) { std::copy_n( field.data, DESCRIPTOR::template size(), getFieldPointer()); } /// Set value of FIELD from a scalar template std::enable_if_t<(X::template size() == 1), void> setField(T value) { getFieldPointer()[0] = value; }  Note that disabling a member function specialization depending on its parent's template arguments is only possible with some indirection: The parent template argument DESCRIPTOR is passed as the default value to the member function's X argument. This parameter can then be used by std::enable_if as one would expect. ## Two days of merging It is probably clear that the set of changes summarized so far mark a far reaching revamp of the existing codebase -- in fact there was scarcely a file untouched after I got everything to work again. As we do not live in an ideal world where I could have developed this in isolation while any other developments are stopped, both the initial prototype and the following rollout to all of OpenLB had to be developed on a seperate branch. Due to the additional hindrance that I am not actually working anywhere close to full-time on this[^8] these changes took quite a few months from inception to full realization. Correspondingly the meta descriptor and master branch had diverged significantly by the time we felt ready to merge -- you can imagine how unpleasant it was to fiddle this back together. [^8]: After all I am still primarily a mathematics student I found the three-way merge functionality offered by [Meld](https://meldmerge.org/) to be a most useful tool during this endeavour. My fingers were still twitching in a rhythmic pattern after two days of using this utility to more or less manually merge everything back together but it was still worlds better than the alternative of e.g. resolving the conflicts in a normal text editor. Sadly even in retrospect I can not think of a better alternative to letting the branches diverge this far: A significant chunk of all lines had to be changed in randomly non-trivial ways and there was no discrete point in between where you could push these changes to the rest of the team with a good conscience. At least further changes to e.g. the foundational cell data structures should now prove to be significantly easier than they would have been without this refactor. ## Summary All in all I am quite satisfied with how this new concept turned out in practice: The code is smaller and more self-documenting while growing in extensibility and consistency. The internally increased complexity is restricted to a set of classes and meta templates that the ordinary user that just wants to write a simulation should never come in contact with. Some listings in this article might look cryptic at first but as far as template metaprogramming goes this is still reasonable -- we did not run into any serious portability issues and everything works as expected in GCC, Clang and Intel's C++ compiler[^9]. [^9]: I was surprised to learn how big of an advantage the Intel compiler can provide: In some settings the generated code runs up to 20 percent faster compared to what GCC or Clang produce. To conclude things I want to encourage everyone to check out the latest [OpenLB](https://www.openlb.net/news/openlb-release-1-3-available/) release to see these and other interesting new features in practive. Should this article have awoken any interest in CFD using Lattice Boltzmann Methods, a fun introduction is provided by my [previous article](/article/fun_with_compute_shaders_and_fluid_dynamics/) on just this topic.