Intrepid2
Implementation of a general sum factorization algorithm, using a novel approach developed by Roberts, for integration. Uses hierarchical parallelism.
#include <Intrepid2_IntegrationToolsDef.hpp>
Public Member Functions | |
| F_IntegratePointValueCache (Data< Scalar, DeviceType > integralData, TensorData< Scalar, DeviceType > leftComponent, Data< Scalar, DeviceType > composedTransform, TensorData< Scalar, DeviceType > rightComponent, TensorData< Scalar, DeviceType > cellMeasures, int a_offset, int b_offset, int leftFieldOrdinalOffset, int rightFieldOrdinalOffset) | |
| template<size_t maxComponents, size_t numComponents = maxComponents> | |
| KOKKOS_INLINE_FUNCTION int | incrementArgument (Kokkos::Array< int, maxComponents > &arguments, const Kokkos::Array< int, maxComponents > &bounds) const |
| KOKKOS_INLINE_FUNCTION int | incrementArgument (Kokkos::Array< int, Parameters::MaxTensorComponents > &arguments, const Kokkos::Array< int, Parameters::MaxTensorComponents > &bounds, const int &numComponents) const |
| runtime-sized variant of incrementArgument; gets used by approximate flop count. | |
| template<size_t maxComponents, size_t numComponents = maxComponents> | |
| KOKKOS_INLINE_FUNCTION int | nextIncrementResult (const Kokkos::Array< int, maxComponents > &arguments, const Kokkos::Array< int, maxComponents > &bounds) const |
| KOKKOS_INLINE_FUNCTION int | nextIncrementResult (const Kokkos::Array< int, Parameters::MaxTensorComponents > &arguments, const Kokkos::Array< int, Parameters::MaxTensorComponents > &bounds, const int &numComponents) const |
| runtime-sized variant of nextIncrementResult; gets used by approximate flop count. | |
| template<size_t maxComponents, size_t numComponents = maxComponents> | |
| KOKKOS_INLINE_FUNCTION int | relativeEnumerationIndex (const Kokkos::Array< int, maxComponents > &arguments, const Kokkos::Array< int, maxComponents > &bounds, const int startIndex) const |
| template<int rank> | |
| KOKKOS_INLINE_FUNCTION enable_if_t< rank==3 &&rank==integralViewRank, Scalar & > | integralViewEntry (const IntegralViewType &integralView, const int &cellDataOrdinal, const int &i, const int &j) const |
| template<int rank> | |
| KOKKOS_INLINE_FUNCTION enable_if_t< rank==2 &&rank==integralViewRank, Scalar & > | integralViewEntry (const IntegralViewType &integralView, const int &cellDataOrdinal, const int &i, const int &j) const |
| KOKKOS_INLINE_FUNCTION void | runSpecialized3 (const TeamMember &teamMember) const |
| Hand-coded 3-component version. | |
| template<size_t numTensorComponents> | |
| KOKKOS_INLINE_FUNCTION void | run (const TeamMember &teamMember) const |
| KOKKOS_INLINE_FUNCTION void | operator() (const TeamMember &teamMember) const |
| long | approximateFlopCountPerCell () const |
| returns an estimate of the number of floating point operations per cell (counting sums, subtractions, divisions, and multiplies, each of which counts as one operation). | |
| int | teamSize (const int &maxTeamSizeFromKokkos) const |
| returns the team size that should be provided to the policy constructor, based on the Kokkos maximum and the amount of thread parallelism we have available. | |
| size_t | team_shmem_size (int numThreads) const |
| Provide the shared memory capacity. | |
Private Types | |
| using | ExecutionSpace = typename DeviceType::execution_space |
| using | TeamPolicy = Kokkos::TeamPolicy<DeviceType> |
| using | TeamMember = typename TeamPolicy::member_type |
| using | IntegralViewType = Kokkos::View<typename RankExpander<Scalar, integralViewRank>::value_type, DeviceType> |
Private Attributes | |
| IntegralViewType | integralView_ |
| TensorData< Scalar, DeviceType > | leftComponent_ |
| Data< Scalar, DeviceType > | composedTransform_ |
| TensorData< Scalar, DeviceType > | rightComponent_ |
| TensorData< Scalar, DeviceType > | cellMeasures_ |
| int | a_offset_ |
| int | b_offset_ |
| int | leftComponentSpan_ |
| int | rightComponentSpan_ |
| int | numTensorComponents_ |
| int | leftFieldOrdinalOffset_ |
| int | rightFieldOrdinalOffset_ |
| size_t | fad_size_output_ = 0 |
| Kokkos::Array< int, Parameters::MaxTensorComponents > | leftFieldBounds_ |
| Kokkos::Array< int, Parameters::MaxTensorComponents > | rightFieldBounds_ |
| Kokkos::Array< int, Parameters::MaxTensorComponents > | pointBounds_ |
| int | maxFieldsLeft_ |
| int | maxFieldsRight_ |
| int | maxPointCount_ |
Implementation of a general sum factorization algorithm, using a novel approach developed by Roberts, for integration. Uses hierarchical parallelism.
F_Integrate, Mora and Demkowicz, and every other approach we are aware of cache partial sums at intermediate component levels, so their cached values are indexed by component basis ordinals. Here, we instead integrate the first component over its dimension(s) and store values at the integration points of the remaining dimensions, so that our caches are indexed by point ordinals. If there are L_x, L_y, and L_z quadrature points in dimensions x, y, and z, a 3D, 3-component integral requires a cache of size L_y * L_z + 1. The standard approach requires a cache of size (p_x+1)*(p_y+1). So long as one is not over-integrating by too much, these sizes are about the same. The main advantage we expect from this approach is improved data locality.
Definition at line 1025 of file Intrepid2_IntegrationToolsDef.hpp.
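The point-indexed caching described above can be illustrated with a small serial sketch. This is not the Intrepid2 implementation: it uses plain `std::vector` storage instead of `Data`/`TensorData`, has no Kokkos parallelism, and every name in it (`Table`, `Tables`, `flat`, `integrateDirect`, `integratePointCache`) is hypothetical. It assumes three scalar-valued 1D components with non-empty tables and a weight array that already combines quadrature weights with cell measures. For simplicity it keeps a second buffer of L_z values, so its scratch storage does not match the cache size quoted above.

```cpp
#include <array>
#include <cstddef>
#include <vector>

using Table  = std::vector<std::vector<double>>; // [fieldOrdinal][pointOrdinal]
using Tables = std::array<Table, 3>;             // 1D tables for x, y, z

// Flattens an index triple, last component fastest.
inline std::size_t flat(std::size_t a, std::size_t b, std::size_t c,
                        std::size_t nb, std::size_t nc)
{
  return (a * nb + b) * nc + c;
}

// Baseline: every matrix entry sums over all Lx*Ly*Lz points.
std::vector<double> integrateDirect(const Tables& left, const Tables& right,
                                    const std::vector<double>& weights)
{
  const std::size_t Lx = left[0][0].size(), Ly = left[1][0].size(), Lz = left[2][0].size();
  const std::size_t nl1 = left[1].size(),  nl2 = left[2].size();
  const std::size_t nr1 = right[1].size(), nr2 = right[2].size();
  const std::size_t numLeft  = left[0].size()  * nl1 * nl2;
  const std::size_t numRight = right[0].size() * nr1 * nr2;
  std::vector<double> integrals(numLeft * numRight, 0.0);
  for (std::size_t i = 0; i < numLeft; ++i)
    for (std::size_t j = 0; j < numRight; ++j)
      for (std::size_t x = 0; x < Lx; ++x)
        for (std::size_t y = 0; y < Ly; ++y)
          for (std::size_t z = 0; z < Lz; ++z)
            integrals[i * numRight + j] += weights[flat(x, y, z, Ly, Lz)]
              * left[0][i / (nl1 * nl2)][x] * right[0][j / (nr1 * nr2)][x]
              * left[1][(i / nl2) % nl1][y] * right[1][(j / nr2) % nr1][y]
              * left[2][i % nl2][z]         * right[2][j % nr2][z];
  return integrals;
}

// Sum factorization with caches indexed by point ordinals.
std::vector<double> integratePointCache(const Tables& left, const Tables& right,
                                        const std::vector<double>& weights)
{
  const std::size_t Lx = left[0][0].size(), Ly = left[1][0].size(), Lz = left[2][0].size();
  const std::size_t nl0 = left[0].size(),  nl1 = left[1].size(),  nl2 = left[2].size();
  const std::size_t nr0 = right[0].size(), nr1 = right[1].size(), nr2 = right[2].size();
  const std::size_t numRight = nr0 * nr1 * nr2;
  std::vector<double> integrals(nl0 * nl1 * nl2 * numRight, 0.0);
  std::vector<double> cacheYZ(Ly * Lz); // x integrated out; indexed by (y, z) points
  std::vector<double> cacheZ(Lz);       // x and y integrated out; indexed by z points

  for (std::size_t i0 = 0; i0 < nl0; ++i0)
    for (std::size_t j0 = 0; j0 < nr0; ++j0)
    {
      // Integrate the first component once per (i0, j0) pair.
      for (std::size_t y = 0; y < Ly; ++y)
        for (std::size_t z = 0; z < Lz; ++z)
        {
          double sum = 0.0;
          for (std::size_t x = 0; x < Lx; ++x)
            sum += left[0][i0][x] * right[0][j0][x] * weights[flat(x, y, z, Ly, Lz)];
          cacheYZ[y * Lz + z] = sum;
        }
      for (std::size_t i1 = 0; i1 < nl1; ++i1)
        for (std::size_t j1 = 0; j1 < nr1; ++j1)
        {
          // Integrate the second component, reusing the (y, z) point cache.
          for (std::size_t z = 0; z < Lz; ++z)
          {
            double sum = 0.0;
            for (std::size_t y = 0; y < Ly; ++y)
              sum += cacheYZ[y * Lz + z] * left[1][i1][y] * right[1][j1][y];
            cacheZ[z] = sum;
          }
          for (std::size_t i2 = 0; i2 < nl2; ++i2)
            for (std::size_t j2 = 0; j2 < nr2; ++j2)
            {
              double sum = 0.0;
              for (std::size_t z = 0; z < Lz; ++z)
                sum += cacheZ[z] * left[2][i2][z] * right[2][j2][z];
              integrals[flat(i0, i1, i2, nl1, nl2) * numRight
                        + flat(j0, j1, j2, nr1, nr2)] = sum;
            }
        }
    }
  return integrals;
}
```

Both functions return the same matrix up to floating-point rounding. In the cached version, the innermost loops read only 1D tables and a buffer whose length is set by the number of quadrature points, not by the number of basis functions.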
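| using ExecutionSpace = typename DeviceType::execution_space | private |
Definition at line 1027 of file Intrepid2_IntegrationToolsDef.hpp.
| using IntegralViewType = Kokkos::View<typename RankExpander<Scalar, integralViewRank>::value_type, DeviceType> | private |
Definition at line 1031 of file Intrepid2_IntegrationToolsDef.hpp.
| using TeamMember = typename TeamPolicy::member_type | private |
Definition at line 1029 of file Intrepid2_IntegrationToolsDef.hpp.
| using TeamPolicy = Kokkos::TeamPolicy<DeviceType> | private |
Definition at line 1028 of file Intrepid2_IntegrationToolsDef.hpp.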
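| F_IntegratePointValueCache (Data< Scalar, DeviceType > integralData, TensorData< Scalar, DeviceType > leftComponent, Data< Scalar, DeviceType > composedTransform, TensorData< Scalar, DeviceType > rightComponent, TensorData< Scalar, DeviceType > cellMeasures, int a_offset, int b_offset, int leftFieldOrdinalOffset, int rightFieldOrdinalOffset) | inline |
Definition at line 1057 of file Intrepid2_IntegrationToolsDef.hpp.
| long approximateFlopCountPerCell () const | inline |
returns an estimate of the number of floating point operations per cell (counting sums, subtractions, divisions, and multiplies, each of which counts as one operation).
Definition at line 1630 of file Intrepid2_IntegrationToolsDef.hpp.
| template<size_t maxComponents, size_t numComponents = maxComponents> KOKKOS_INLINE_FUNCTION int incrementArgument (Kokkos::Array< int, maxComponents > &arguments, const Kokkos::Array< int, maxComponents > &bounds) const | inline |
Definition at line 1109 of file Intrepid2_IntegrationToolsDef.hpp.
| KOKKOS_INLINE_FUNCTION int incrementArgument (Kokkos::Array< int, Parameters::MaxTensorComponents > &arguments, const Kokkos::Array< int, Parameters::MaxTensorComponents > &bounds, const int &numComponents) const | inline |
runtime-sized variant of incrementArgument; gets used by approximate flop count.
Definition at line 1126 of file Intrepid2_IntegrationToolsDef.hpp.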
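Judging by its signature, incrementArgument advances a multi-index (`arguments`) within per-component limits (`bounds`). The following is a minimal sketch of one common way to do this, an odometer-style increment, written with `std::array` so that it compiles without Kokkos. The names `incrementMultiIndex` and `countEnumerated` are hypothetical. The choice of the last component as the fastest-moving one, and the return convention (position incremented, or -1 on wrap-around), are assumptions, not documented behavior of the Intrepid2 function.

```cpp
#include <array>
#include <cstddef>

// Odometer-style increment of a multi-index; the last component moves fastest.
// Assumes every entry of bounds is at least 1.
// Returns the position that was incremented, or -1 if every component wrapped
// back to zero (the enumeration is complete).
template <std::size_t N>
int incrementMultiIndex(std::array<int, N>& arguments, const std::array<int, N>& bounds)
{
  int r = static_cast<int>(N) - 1;
  while (r >= 0 && arguments[static_cast<std::size_t>(r)] + 1 >= bounds[static_cast<std::size_t>(r)])
  {
    arguments[static_cast<std::size_t>(r)] = 0; // wrap this component and carry
    --r;
  }
  if (r >= 0) ++arguments[static_cast<std::size_t>(r)];
  return r;
}

// Counts the multi-indices visited when enumerating from all zeros.
template <std::size_t N>
int countEnumerated(const std::array<int, N>& bounds)
{
  std::array<int, N> arguments{}; // all zeros
  int count = 1;
  while (incrementMultiIndex(arguments, bounds) >= 0) ++count;
  return count;
}
```

With bounds {2, 3}, the sketch visits (0,0), (0,1), (0,2), (1,0), (1,1), (1,2) and then wraps. A return value that identifies the incremented position lets a caller tell which components changed, and therefore which cached partial results are no longer valid.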
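| template<int rank> KOKKOS_INLINE_FUNCTION enable_if_t< rank==2 &&rank==integralViewRank, Scalar & > integralViewEntry (const IntegralViewType &integralView, const int &cellDataOrdinal, const int &i, const int &j) const | inline |
Definition at line 1205 of file Intrepid2_IntegrationToolsDef.hpp.
| template<int rank> KOKKOS_INLINE_FUNCTION enable_if_t< rank==3 &&rank==integralViewRank, Scalar & > integralViewEntry (const IntegralViewType &integralView, const int &cellDataOrdinal, const int &i, const int &j) const | inline |
Definition at line 1197 of file Intrepid2_IntegrationToolsDef.hpp.
| template<size_t maxComponents, size_t numComponents = maxComponents> KOKKOS_INLINE_FUNCTION int nextIncrementResult (const Kokkos::Array< int, maxComponents > &arguments, const Kokkos::Array< int, maxComponents > &bounds) const | inline |
Definition at line 1144 of file Intrepid2_IntegrationToolsDef.hpp.
| KOKKOS_INLINE_FUNCTION int nextIncrementResult (const Kokkos::Array< int, Parameters::MaxTensorComponents > &arguments, const Kokkos::Array< int, Parameters::MaxTensorComponents > &bounds, const int &numComponents) const | inline |
runtime-sized variant of nextIncrementResult; gets used by approximate flop count.
Definition at line 1159 of file Intrepid2_IntegrationToolsDef.hpp.
| KOKKOS_INLINE_FUNCTION void operator() (const TeamMember &teamMember) const | inline |
Definition at line 1611 of file Intrepid2_IntegrationToolsDef.hpp.
| template<size_t maxComponents, size_t numComponents = maxComponents> KOKKOS_INLINE_FUNCTION int relativeEnumerationIndex (const Kokkos::Array< int, maxComponents > &arguments, const Kokkos::Array< int, maxComponents > &bounds, const int startIndex) const | inline |
Definition at line 1175 of file Intrepid2_IntegrationToolsDef.hpp.
| template<size_t numTensorComponents> KOKKOS_INLINE_FUNCTION void run (const TeamMember &teamMember) const | inline |
Definition at line 1472 of file Intrepid2_IntegrationToolsDef.hpp.
| KOKKOS_INLINE_FUNCTION void runSpecialized3 (const TeamMember &teamMember) const | inline |
Hand-coded 3-component version.
Definition at line 1212 of file Intrepid2_IntegrationToolsDef.hpp.
References Intrepid2::Data< DataScalar, DeviceType >::extent_int(), and Intrepid2::Data< DataScalar, DeviceType >::rank().
| size_t team_shmem_size (int numThreads) const | inline |
Provide the shared memory capacity.
Definition at line 1786 of file Intrepid2_IntegrationToolsDef.hpp.
| int teamSize (const int &maxTeamSizeFromKokkos) const | inline |
returns the team size that should be provided to the policy constructor, based on the Kokkos maximum and the amount of thread parallelism we have available.
Definition at line 1777 of file Intrepid2_IntegrationToolsDef.hpp.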
| int a_offset_ | private |
Definition at line 1037 of file Intrepid2_IntegrationToolsDef.hpp.
| int b_offset_ | private |
Definition at line 1038 of file Intrepid2_IntegrationToolsDef.hpp.
| TensorData< Scalar, DeviceType > cellMeasures_ | private |
Definition at line 1036 of file Intrepid2_IntegrationToolsDef.hpp.
| Data< Scalar, DeviceType > composedTransform_ | private |
Definition at line 1034 of file Intrepid2_IntegrationToolsDef.hpp.
| size_t fad_size_output_ = 0 | private |
Definition at line 1045 of file Intrepid2_IntegrationToolsDef.hpp.
| IntegralViewType integralView_ | private |
Definition at line 1032 of file Intrepid2_IntegrationToolsDef.hpp.
| TensorData< Scalar, DeviceType > leftComponent_ | private |
Definition at line 1033 of file Intrepid2_IntegrationToolsDef.hpp.
| int leftComponentSpan_ | private |
Definition at line 1039 of file Intrepid2_IntegrationToolsDef.hpp.
| Kokkos::Array< int, Parameters::MaxTensorComponents > leftFieldBounds_ | private |
Definition at line 1049 of file Intrepid2_IntegrationToolsDef.hpp.
| int leftFieldOrdinalOffset_ | private |
Definition at line 1042 of file Intrepid2_IntegrationToolsDef.hpp.
| int maxFieldsLeft_ | private |
Definition at line 1053 of file Intrepid2_IntegrationToolsDef.hpp.
| int maxFieldsRight_ | private |
Definition at line 1054 of file Intrepid2_IntegrationToolsDef.hpp.
| int maxPointCount_ | private |
Definition at line 1055 of file Intrepid2_IntegrationToolsDef.hpp.
| int numTensorComponents_ | private |
Definition at line 1041 of file Intrepid2_IntegrationToolsDef.hpp.
| Kokkos::Array< int, Parameters::MaxTensorComponents > pointBounds_ | private |
Definition at line 1051 of file Intrepid2_IntegrationToolsDef.hpp.
| TensorData< Scalar, DeviceType > rightComponent_ | private |
Definition at line 1035 of file Intrepid2_IntegrationToolsDef.hpp.
| int rightComponentSpan_ | private |
Definition at line 1040 of file Intrepid2_IntegrationToolsDef.hpp.
| Kokkos::Array< int, Parameters::MaxTensorComponents > rightFieldBounds_ | private |
Definition at line 1050 of file Intrepid2_IntegrationToolsDef.hpp.
| int rightFieldOrdinalOffset_ | private |
Definition at line 1043 of file Intrepid2_IntegrationToolsDef.hpp.