Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
88 commits
Select commit Hold shift + click to select a range
930ae9d
Parallel reconstruction
dhyan1272 Aug 22, 2025
e0452c1
Templated parameter not required
dhyan1272 Aug 22, 2025
5c4d42b
Timers updated
dhyan1272 Aug 25, 2025
adbff4f
Sending cell counts to neighbouring procs, collecting cell countsfrom…
dhyan1272 Aug 29, 2025
5f1045f
Sending and receiving 4 ints
dhyan1272 Sep 2, 2025
ac55908
Some more updates
dhyan1272 Sep 4, 2025
45030a1
Missing bracket
dhyan1272 Sep 5, 2025
8352108
Send gloablIds of haloCells To Owners
dhyan1272 Sep 8, 2025
6410755
HaloToOwners gets localId of the owners
dhyan1272 Sep 8, 2025
1fe2f41
OwnerToHalo Mapping
dhyan1272 Sep 9, 2025
0fdfcd7
One time arrays created
dhyan1272 Sep 9, 2025
25b6fd4
Gather working
dhyan1272 Sep 11, 2025
5aef1c5
Gather and scatter operations
dhyan1272 Sep 12, 2025
d7d466b
Couple of more specific debugs for the 2562 mesh
dhyan1272 Sep 12, 2025
685349e
Working 4 proc 2562 mesh case
dhyan1272 Sep 15, 2025
7e48e47
Removed Duplicate meshField
dhyan1272 Sep 15, 2025
238b37f
Modularized assembly per process, then communicate, and then take con…
dhyan1272 Sep 17, 2025
e28a1fb
Matrix formualtion within MPI communication within polyMPO
dhyan1272 Sep 18, 2025
92937fc
Cleanup
dhyan1272 Sep 18, 2025
9f8bbdd
Keep debugging statements using polyMPO::MP_DEBUG
dhyan1272 Sep 18, 2025
501a5ed
Removed pair<int, int>, instead 2 vectors storing ownerOwnerLocalIDs …
dhyan1272 Sep 19, 2025
b50a5ed
std::copy instead of for loop, intilaize to 0 and some formatting
dhyan1272 Sep 19, 2025
b406e85
Added timers
dhyan1272 Sep 22, 2025
e1555a2
Timers with processID
dhyan1272 Sep 23, 2025
1aac185
Fence before timers
dhyan1272 Sep 24, 2025
9e7a093
Bug in tracking for multi-CPU, out of bounds error
dhyan1272 Oct 3, 2025
89adb70
Starting gnomonic projection in polyMPO
dhyan1272 Oct 7, 2025
d359c5b
Checked gnomonic projection of MPs, vertices
dhyan1272 Oct 8, 2025
bb7da4e
Gnomonic Projection - Advection
dhyan1272 Oct 13, 2025
c0fbe6a
Clean up and option to use 2D/3D basis functions
dhyan1272 Oct 14, 2025
abc23be
New regularization and matrix inversion
dhyan1272 Oct 28, 2025
b45c37b
With Debugging statements
dhyan1272 Nov 6, 2025
da48034
Matrix inversion, Chad's fork
dhyan1272 Nov 19, 2025
a5ea31a
Removed old routines that does reconstruction assembly in MPAS
dhyan1272 Nov 21, 2025
36837b3
RotatedFlag in p_mesh
dhyan1272 Nov 26, 2025
2010d21
Removed warnings
dhyan1272 Nov 26, 2025
1939c58
Remove cudaDeviceSynchronize check for github automatic testing
dhyan1272 Nov 26, 2025
9d039ff
Replace max by Kokkos::max inside kernels
dhyan1272 Nov 26, 2025
dd91008
All ctests pass
dhyan1272 Nov 26, 2025
f881586
All ctests pass (MP rec test bot done)
dhyan1272 Nov 26, 2025
5b3d76e
Strain Rate at MP calcualtion
dhyan1272 Dec 15, 2025
9cdbde5
CleanUp_v1
dhyan1272 Dec 15, 2025
f58ddb6
CleanUp_v2
dhyan1272 Dec 15, 2025
533c8ab
CleanUp_v3
dhyan1272 Dec 16, 2025
8ea47ec
CleanUp_v4
dhyan1272 Dec 16, 2025
59f056a
CleanUp_v5
dhyan1272 Dec 18, 2025
189d00a
Ctests passing in remus
dhyan1272 Dec 22, 2025
3935c5b
A couple of comments in the tests
dhyan1272 Dec 22, 2025
44d1f40
CleanUp for PR_v1
dhyan1272 Dec 22, 2025
5856e02
CleanUp for PR_v2
dhyan1272 Dec 22, 2025
7635543
CleanUp for PR_v3
dhyan1272 Dec 23, 2025
0ca0073
CleanUp for PR_v4
dhyan1272 Dec 23, 2025
5ff1728
StrainRate Stress
dhyan1272 Jan 9, 2026
8c917ff
StressDivergence added
dhyan1272 Jan 22, 2026
cf5afec
Stress Divergence calculation updated
dhyan1272 Feb 2, 2026
f525ddb
StressDivergence within Elastic Time Loop compared
dhyan1272 Feb 4, 2026
036fa3d
Stress Divergence Verified
dhyan1272 Feb 9, 2026
d8260f4
Renamed APIs, removed debug comments
dhyan1272 Feb 11, 2026
12eaa25
Removed debug comments
dhyan1272 Feb 17, 2026
b695be2
MPI verified EVP 16 proc, 2562
dhyan1272 Feb 18, 2026
ed0dcaf
Two fields instead of 4 while doing stress divergence
dhyan1272 Feb 18, 2026
ee2f65e
Minor formatiing
dhyan1272 Feb 18, 2026
bc2061f
Grid Solve single proc smallest mesh check
dhyan1272 Mar 2, 2026
42e4c6d
Passed nVerticesOwned from MPAS
dhyan1272 Mar 3, 2026
2f2fe5e
Constitutive model number passed
dhyan1272 Mar 3, 2026
22249f0
Setting MPStrainRate
dhyan1272 Apr 2, 2026
b380e26
Removed debug statements
dhyan1272 Apr 2, 2026
2c095bb
Some more cleaning
dhyan1272 Apr 2, 2026
5b68b7d
More cleanup of writing vtk files
dhyan1272 Apr 2, 2026
3d90b53
Remove print statement
dhyan1272 Apr 6, 2026
639b1fb
Stress Transport consistent with MPAS
dhyan1272 May 8, 2026
59bb2ca
Stress Divergence Consistent with MPAS
dhyan1272 May 8, 2026
dc25517
Stress Divergence two way communication restored
dhyan1272 May 10, 2026
20c3582
Porting more stuff to polyMPO
dhyan1272 May 11, 2026
1e14123
Disp increment in polyMPO
dhyan1272 May 15, 2026
bba4699
deluDyn aggregate in polyMPO
dhyan1272 May 18, 2026
070cf15
polyMPO changes to do increments completely on GPUs
dhyan1272 May 27, 2026
3b0c12b
Removed un-necessary halo contributions, Added timer for stressDiv co…
dhyan1272 Jun 10, 2026
ba05127
Temporary sets of communication routines
dhyan1272 Jun 10, 2026
fef6858
no memcpy
dhyan1272 Jun 10, 2026
a8dae06
Improved timers
dhyan1272 Jun 12, 2026
b292e21
Test memcpy by changing SD field to contiguous RightLayout
dhyan1272 Jun 12, 2026
d746e76
Passing Host arry directly to communicate routine for preparing buffer
dhyan1272 Jun 13, 2026
b0dc417
Uses a host buffer for meshFields to avoid GPU-CPU copy
dhyan1272 Jun 15, 2026
18a2f19
MLS, Vel uses new MPI Comm
dhyan1272 Jun 16, 2026
e50ea6a
BCs in polyMPO
dhyan1272 Jun 22, 2026
888c28f
Moved startComm and deleted old non-templated comm routines
dhyan1272 Jun 24, 2026
c87fa7f
API to do halo exchange of velocity post BC in elstic loop
dhyan1272 Jun 26, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion src/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ set(HEADERS
pmpo_MPMesh_assembly.hpp
pmpo_c.h
pmpo_createTestMPMesh.hpp
pmpo_const_relation.hpp
)

set(SOURCES
Expand Down Expand Up @@ -53,4 +54,4 @@ add_library(polyMPO INTERFACE)
target_link_libraries(polyMPO INTERFACE ${polyMPO_EXPORTED_TARGETS})
bob_export_target(polyMPO)

bob_end_subdir()
bob_end_subdir()
957 changes: 657 additions & 300 deletions src/pmpo_MPMesh.cpp

Large diffs are not rendered by default.

264 changes: 236 additions & 28 deletions src/pmpo_MPMesh.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -18,30 +18,231 @@ template <> const MaterialPointSlice meshFieldIndexToMPSlice < MeshF_OnSurfVeloI
#define maxMPsPerElm 8

class MPMesh{
private:

bool isPreComputed;


public:

MPMesh() : isPreComputed(false){};
void computeMatricesAndSolve();
void resetPreComputeFlag();
Kokkos::View<double*[vec4d_nEntries]> precomputedVtxCoeffs;

Mesh* p_mesh;
MaterialPoints* p_MPs;

std::map<MeshFieldIndex, std::function<void()>> reconstructSlice = std::map<MeshFieldIndex, std::function<void()>>();
//For MPI Communication
int numOwnersTot, numHalosTot;
std::vector<int> numOwnersOnOtherProcs;
std::vector<int> numHalosOnOtherProcs;
std::vector<int>haloOwnerProcs;
std::vector<std::vector<int>> haloOwnerLocalIDs;
std::vector<std::vector<int>> ownerOwnerLocalIDs;
std::vector<std::vector<int>> ownerHaloLocalIDs;

void startCommunication();

void communicate_and_take_halo_contributions(const Kokkos::View<double**>& meshField, int nEntities, int numEntries, int mode, int op);

//Now Kokkos views are made 1D
template <typename ViewType>
void communicate_and_take_halo_contributions1(
const ViewType& meshField,
int nEntities,
int numEntries,
int mode ,
int op){

int self;
MPI_Comm comm = p_MPs->getMPIComm();
MPI_Comm_rank(comm, &self);

Kokkos::Timer timer;
auto reconVals_host = Kokkos::create_mirror_view_and_copy(Kokkos::HostSpace(), meshField);
pumipic::RecordTime("SD: GPU-CPU copy-" + std::to_string(self), timer.seconds());

timer.reset();
std::vector<std::vector<int>> recvIDVec;
std::vector<std::vector<double>> recvDataVec;
pumipic::RecordTime("SD: Recv Vec Allocation-" + std::to_string(self), timer.seconds());

timer.reset();
//communicateFields1(fieldData1, nEntities, numEntries, mode, recvIDVec, recvDataVec);
communicateFields1(reconVals_host, nEntities, numEntries, mode, recvIDVec, recvDataVec);
pumipic::RecordTime("SD: IP Comm-" + std::to_string(self), timer.seconds());

timer.reset();
int numProcsTot = recvIDVec.size();
//Flatten IDs
int totalSize = 0;
std::vector<int> offsets(numProcsTot, 0);
for(int i=0; i<numProcsTot; i++) {
offsets[i] = totalSize;
totalSize += recvIDVec[i].size();
}
std::vector<int> flatIDVec(totalSize, 0);
for(int i=0; i<numProcsTot; i++){
std::copy(recvIDVec[i].begin(), recvIDVec[i].end(), flatIDVec.begin() + offsets[i]);
}
pumipic::RecordTime("SD: Flatten IDs-" + std::to_string(self), timer.seconds());

timer.reset();
Kokkos::View<int*> recvIDGPU("recvIDGPU", totalSize);
auto hostView = Kokkos::View<int*, Kokkos::HostSpace>("recvIDCPU", totalSize);
std::copy(flatIDVec.begin(), flatIDVec.end(), hostView.data());
Kokkos::deep_copy(recvIDGPU, hostView);
Kokkos::fence();
pumipic::RecordTime("SD: Copy CPU-GPU-" + std::to_string(self), timer.seconds());

//Flatten Data
timer.reset();
int totalSize_data=0;
std::vector<int> offsets_data(numProcsTot, 0);
for(int i=0; i<numProcsTot; i++){
offsets_data[i] = totalSize_data;
totalSize_data += recvDataVec[i].size();
}
std::vector<double> flatDataVec(totalSize_data, 0);
for(int i=0; i<numProcsTot; i++) {
std::copy(recvDataVec[i].begin(), recvDataVec[i].end(), flatDataVec.begin() + offsets_data[i]);
}
pumipic::RecordTime("SD: Flatten Data-" + std::to_string(self), timer.seconds());

timer.reset();
Kokkos::View<double*> recvDataGPU("recvDataGPU", totalSize_data);
auto hostView_data= Kokkos::View<double*, Kokkos::HostSpace>("recvDataCPU", totalSize_data);
std::copy(flatDataVec.begin(), flatDataVec.end(), hostView_data.data());
Kokkos::deep_copy(recvDataGPU, hostView_data);
Kokkos::fence();
assert(totalSize_data == totalSize*numEntries);
for (int i=0; i<numProcsTot; i++){
assert(recvDataVec[i].size() == recvIDVec[i].size() * numEntries);
}
pumipic::RecordTime("SD: Copy CPU-GPU2-" + std::to_string(self), timer.seconds());

//Take contributions from other procs
timer.reset();
Kokkos::parallel_for("halo contribution", recvIDGPU.size(), KOKKOS_LAMBDA(const int i){
int vertex = recvIDGPU(i);
for(int k=0; k<numEntries; k++){
if(op==0) Kokkos::atomic_add(&meshField(vertex,k), recvDataGPU(i*numEntries+k));
if(op==1) meshField(vertex, k) = recvDataGPU(i * numEntries + k);
}
});
Kokkos::fence();
pumipic::RecordTime("SD: Contribution" + std::to_string(self), timer.seconds());
}


void communicateFields(const std::vector<std::vector<double>>& fieldData, const int numEntities, const int numEntries, int mode,
std::vector<std::vector<int>>& recvIDVec, std::vector<std::vector<double>>& recvDataVec);


template <class ViewType>
void communicateFields1(
const ViewType& fieldData,
const int numEntities, const int numEntries, int mode,
std::vector<std::vector<int>>& recvIDVec,
std::vector<std::vector<double>>& recvDataVec){

int self, numProcsTot;
MPI_Comm comm = p_MPs->getMPIComm();
MPI_Comm_rank(comm, &self);
MPI_Comm_size(comm, &numProcsTot);

assert(numEntities == numOwnersTot + numHalosTot);

std::vector<std::vector<double>> sendDataVec(numProcsTot);

recvIDVec.resize(numProcsTot);
recvDataVec.resize(numProcsTot);

for(int i = 0; i < numProcsTot; i++){
if(i==self) continue;

int numToSend = 0, numToRecv = 0;
if(mode == 0) {
//gather (halos send to owners)
numToSend = numOwnersOnOtherProcs[i];
numToRecv = numHalosOnOtherProcs[i];
}
else{
//scatter (owners send to halos)
numToSend = numHalosOnOtherProcs[i];
numToRecv = numOwnersOnOtherProcs[i];
}

if(numToSend > 0){
sendDataVec[i].reserve(numToSend*numEntries);
}
if(numToRecv > 0){
recvDataVec[i].resize(numToRecv*numEntries);
recvIDVec[i].resize(numToRecv);
}
}

if(mode == 0){
// Halos sends to owners
for (int iEnt = 0; iEnt < numHalosTot; iEnt++){
auto ownerProc = haloOwnerProcs[iEnt];
for (int iDouble = 0; iDouble < numEntries; iDouble++)
sendDataVec[ownerProc].push_back(fieldData(numOwnersTot+iEnt, iDouble));
}
}
else if(mode == 1){
// Owner sends to halos
for (size_t iProc=0; iProc<ownerOwnerLocalIDs.size(); iProc++) {
for (auto& ownerID : ownerOwnerLocalIDs[iProc]) {
for (int iDouble = 0; iDouble < numEntries; iDouble++)
sendDataVec[iProc].push_back(fieldData(ownerID, iDouble));
}
}
}

std::vector<MPI_Request> requests;
requests.reserve(4*numProcsTot);
for(int proc = 0; proc < numProcsTot; proc++){
if(proc == self) continue;
if(mode == 0 && numHalosOnOtherProcs[proc]){
assert(recvIDVec[proc].size() == (size_t)numHalosOnOtherProcs[proc]);
assert(recvDataVec[proc].size() == recvIDVec[proc].size() * (size_t)numEntries);
MPI_Request req3, req4;
MPI_Irecv(recvIDVec[proc].data(), recvIDVec[proc].size(), MPI_INT, proc, 1, comm, &req3);
MPI_Irecv(recvDataVec[proc].data(), recvDataVec[proc].size(), MPI_DOUBLE, proc, 2, comm, &req4);
requests.push_back(req3);
requests.push_back(req4);
}
if(mode == 0 && numOwnersOnOtherProcs[proc]) {
assert(haloOwnerLocalIDs[proc].size() == (size_t)numOwnersOnOtherProcs[proc]);
assert(sendDataVec[proc].size() == haloOwnerLocalIDs[proc].size() * (size_t)numEntries);
MPI_Request req1, req2;
MPI_Isend(haloOwnerLocalIDs[proc].data(), haloOwnerLocalIDs[proc].size(), MPI_INT, proc, 1, comm, &req1);
MPI_Isend(sendDataVec[proc].data(), sendDataVec[proc].size(), MPI_DOUBLE, proc, 2, comm, &req2);
requests.push_back(req1);
requests.push_back(req2);
}

if(mode == 1 && numOwnersOnOtherProcs[proc]){
MPI_Request req3, req4;
MPI_Irecv(recvIDVec[proc].data(), recvIDVec[proc].size(), MPI_INT, proc, 1, comm, &req3);
MPI_Irecv(recvDataVec[proc].data(), recvDataVec[proc].size(), MPI_DOUBLE, proc, 2, comm, &req4);
requests.push_back(req3);
requests.push_back(req4);
}
if(mode == 1 && numHalosOnOtherProcs[proc]) {
MPI_Request req1, req2;
MPI_Isend(ownerHaloLocalIDs[proc].data(), ownerHaloLocalIDs[proc].size(), MPI_INT, proc, 1, comm, &req1);
MPI_Isend(sendDataVec[proc].data(), sendDataVec[proc].size(), MPI_DOUBLE, proc, 2, comm, &req2);
requests.push_back(req1);
requests.push_back(req2);
}
}
MPI_Waitall(requests.size(), requests.data(), MPI_STATUSES_IGNORE);
}

MPMesh(Mesh* inMesh, MaterialPoints* inMPs):
p_mesh(inMesh), p_MPs(inMPs) {
p_mesh(inMesh), p_MPs(inMPs) {
};

~MPMesh() {
delete p_mesh;
delete p_MPs;
}

//MP advection and tracking
void CVTTrackingEdgeCenterBased(Vec2dView dx);
void CVTTrackingElmCenterBased(const int printVTPIndex = -1);
void T2LTracking(Vec2dView dx);
Expand All @@ -50,38 +251,45 @@ class MPMesh{
void push_swap();
void push_swap_pos();
void push();

//Used before advection to interpolate fields from mesh to MPs
//And also before reconstruction
void calcBasis();

//Reconstruction
DoubleView assemblyV0();
template <MaterialPointSlice index>
DoubleView wtScaAssembly();
template <MaterialPointSlice index>
Vec2dView wtVec2Assembly();
template <MeshFieldIndex meshFieldIndex>
void assembly(int order, MeshFieldType type, bool basisWeightFlag, bool massWeightFlag);
template <MeshFieldIndex meshFieldIndex>
void assemblyVtx0();
template <MeshFieldIndex meshFieldIndex>
void assemblyElm0();
template <MeshFieldIndex meshFieldIndex>
void assemblyVtx1();
template <MeshFieldIndex meshFieldIndex>
void subAssemblyVtx1(int vtxPerElm, int nCells, int comp, double* array);

void subAssemblyCoeffs(int vtxPerElm, int nCells, double* m11, double* m12, double* m13, double* m14,
double* m22, double* m23, double* m24,
double* m33, double* m34,
double* m44);
void solveMatrixAndRegularize(int nVertices, double* m11, double* m12, double* m13, double* m14,
double* m22, double* m23, double* m24,
double* m33, double* m34,
double* m44);
void reconstruct_coeff_full();
void invertMatrix(const Kokkos::View<double**>& vtxMatrices, const double& radius);
Kokkos::View<double*[vec3d_nEntries][vec4d_nEntries]> precomputedVtxCoeffs_new;
Kokkos::View<double*> nearAnEdge;
Kokkos::View<double*> vtxMatrixMass;

//Not used currently
std::map<MeshFieldIndex, std::function<void()>> reconstructSlice = std::map<MeshFieldIndex, std::function<void()>>();
template <MaterialPointSlice index>
DoubleView wtScaAssembly();
template <MaterialPointSlice index>
Vec2dView wtVec2Assembly();
template <MeshFieldIndex meshFieldIndex>
void assembly(int order, MeshFieldType type, bool basisWeightFlag, bool massWeightFlag);
template<MeshFieldIndex meshFieldIndex>
void setReconstructSlice(int order, MeshFieldType type);
void reconstructSlices();

void printVTP_mesh(int printVTPIndex);
void writeMPTrackingVTP(int printVTPIndex, int numMPs, const Vec3dView& history, const Vec3dView& resultLeft,
const Vec3dView& resultRight, const Vec3dView& mpTgtPosArray);


void calculateStrain();
void calculateStress(const int constitutive_relation);
void calculateStressDivergence();
};

}//namespace polyMPO end
Expand Down
Loading
Loading