cuda porting #11

Open: wants to merge 8 commits into base: cuda_tgv
66 changes: 30 additions & 36 deletions README.md
@@ -1,57 +1,51 @@
# ASTR code
Version 2.0

ASTR code is a high-order finite-difference flow solver for compressible turbulence research.
ASTR code is a high-order finite-difference flow solver for compressible turbulence research. This project explores the use of CUDA Fortran to parallelise the ASTR code. The 3D TGV (Taylor-Green vortex) solver has been parallelised using CUDA.

# Download, Installation and Compilation
Required dependencies: Fortran 90, MPI, HDF5
Required dependencies: Fortran 90, NVIDIA HPC SDK, GNU Make

## Download the astr code:
The installation guide for NVIDIA HPC SDK can be found at [Installation Guide](https://docs.nvidia.com/hpc-sdk/hpc-sdk-install-guide/index.html)
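
Before building, it can help to confirm that the toolchain is visible on `PATH`. The following is a minimal sketch, assuming a POSIX shell. `have_tool` is a hypothetical helper name, and `nvfortran` and `make` are the tools the `Makefile` in `miniapps/tgv_solver_3d` relies on.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named command can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report the status of each tool the Makefile needs.
for tool in nvfortran make; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

If `nvfortran` is reported as missing after installing the SDK, the SDK's compiler directory probably still needs to be added to `PATH`, as the installation guide describes.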

git clone git@github.com:astr-code/astr.git
## Download the gpu accelerated astr code:

## Compilation and installation:
ASTR 2.0 supports both gnu make and cmake.
For the use of gnu make, do:
make
in the directory containing src folder, and the executable will be found as ./bin/astr
Clone the git repository:
```
$ git clone https://github.com/terencel411/astr.git
```
## Compilation:
The `Makefile` gives a complete and safe way of compiling the code.

The cmake gives a more complete and safe way of compiling and installing the code.
By default ASTR solves equations under non-dimensional form, unless the chemstry is included.
Go to the directory where the miniapps code is present

create a case folder, e.g.
mkdir test_case
sh path_to_the_source/script/astr.case.creater #create a new case
```
$ cd astr/miniapps/tgv_solver_3d
```

cmake path_to_the_source
cmake --build
cmake --install
ctest -L nondim
The cpu and the gpu accelerated codes (`tgvsolver_cpu.f90` and `tgvsolver_gpu.cuf`) are present in the same directory and can be compiled using the following `make` commands.

The code will be installed in test_case and excutable can be found at test_case/bin/astr
Compile and execute the cpu & gpu code

If you want to use the chemstry function, you need first to install cantera. After download and unpack the cantera, you can use the following script to install:
python scons/scripts/scons.py build python_package=none FORTRAN=<your fortran compiler> f90_interface=y prefix=<the directory of centera to install> boost_inc_dir=<to boost directory>
```
$ make all
```

python scons/scripts/scons.py test
The cpu and gpu code can also be compiled and executed separately

python scons/scripts/scons.py install
```
$ make cpu
$ make gpu
```

you may need to make and test ASTR with chemstry with following cmake commands:
## Acceleration Comparison
The time acceleration statistics can be obtained by running the following command.

cmake -DCHEMISTRY=TRUE -DCANTERA_DIR=path_to_cantera path_to_the_source
```
$ make compare
```

cmake --build

cmake --install

ctest -L combustion


## Run the solver:

Once the excutable is built, a typical simulation can be run as,
mpirun -np 8 ./astr run ./datin/input_file
A text file `time_report.txt` will be generated with the acceleration statistics.



44 changes: 44 additions & 0 deletions miniapps/tgv_solver_3d/Makefile
@@ -0,0 +1,44 @@
# Compiler
FC = nvfortran

# Executable files
EXEC1 = tgv_cpu
EXEC2 = tgv_gpu
EXEC3 = time_stats

# Flags
FLAGS = -O1

SRC1 = tgvsolver_cpu.f90
SRC2 = tgvsolver_gpu.cuf
SRC3 = compstats.f90

# all: $(EXEC1) $(EXEC2) $(EXEC3)
# builds and runs the cpu & gpu targets; run `make compare` afterwards for the timing report
all: cpu gpu

# Compile
# nvfortran -O3 tgvsolver.cuf -o tgv
# $^ is the source file
# $@ is the exec file

# $(EXEC1): $(SRC1)
# $(FC) $^ -o $@

cpu: $(SRC1)
	$(FC) $(FLAGS) $^ -o $(EXEC1)
	./$(EXEC1)

gpu: $(SRC2)
	$(FC) $(FLAGS) $^ -o $(EXEC2)
	./$(EXEC2)

compare: $(SRC3)
	$(FC) $^ -o $(EXEC3)
	./$(EXEC3)

# Remove all three executables
clean:
	rm -f $(EXEC1) $(EXEC2) $(EXEC3)

# Phony targets
.PHONY: all clean cpu gpu compare
75 changes: 75 additions & 0 deletions miniapps/tgv_solver_3d/comparison.f90
@@ -0,0 +1,75 @@
program comparison
implicit none
character(len=200) :: line1, line2
integer :: ios1, ios2, choice
logical :: flag1=.true.

print *, "Enter your choice: "
read(*,*) choice

if(choice == 1) then
open(unit=10, file='gradcal_cpu.txt', status='old', action='read')
open(unit=20, file='gradcal_gpu.txt', status='old', action='read')
open(unit=30, file='gradcal.txt', status='unknown', action='write')
open(unit=40, file='gradcal_ne.txt', status='unknown', action='write')
else if(choice == 2) then
open(unit=10, file='convection_cpu.txt', status='old', action='read')
open(unit=20, file='convection_gpu.txt', status='old', action='read')
open(unit=30, file='convection.txt', status='unknown', action='write')
open(unit=40, file='convection_ne.txt', status='unknown', action='write')
else if(choice == 3) then
open(unit=10, file='diffusion_cpu.txt', status='old', action='read')
open(unit=20, file='diffusion_gpu.txt', status='old', action='read')
open(unit=30, file='diffusion.txt', status='unknown', action='write')
open(unit=40, file='diffusion_ne.txt', status='unknown', action='write')
else if(choice == 4) then
open(unit=10, file='filterq_cpu.txt', status='old', action='read')
open(unit=20, file='filterq_gpu.txt', status='old', action='read')
open(unit=30, file='filterq.txt', status='unknown', action='write')
open(unit=40, file='filterq_ne.txt', status='unknown', action='write')
else if(choice == 5) then
open(unit=10, file='q2fvar_cpu.txt', status='old', action='read')
open(unit=20, file='q2fvar_gpu.txt', status='old', action='read')
open(unit=30, file='q2fvar.txt', status='unknown', action='write')
open(unit=40, file='q2fvar_ne.txt', status='unknown', action='write')
else if(choice == 6) then
open(unit=10, file='bchomo_cpu.txt', status='old', action='read')
open(unit=20, file='bchomo_gpu.txt', status='old', action='read')
open(unit=30, file='bchomo.txt', status='unknown', action='write')
open(unit=40, file='bchomo_ne.txt', status='unknown', action='write')
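else
! No files are opened for any other value, so stop before the read loop.
print *, "Invalid choice: expected a value from 1 to 6"
stop 1
end if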


do
read(10, '(A)', iostat=ios1) line1
read(20, '(A)', iostat=ios2) line2

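! Stop at the first end-of-file so that a stale line is never compared;
! if only one file has ended, the outputs differ in length.
if (ios1 /= 0 .or. ios2 /= 0) then
if ((ios1 /= 0) .neqv. (ios2 /= 0)) flag1 = .false.
exit
end if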

write(30, '(A)') trim(line1)
write(30, '(A)') trim(line2)
write(30, '(A)') '----------------------------------------------------------------------'

if(line1/=line2) then
flag1 = .false.
write(40, '(A)') trim(line1)
write(40, '(A)') trim(line2)
write(40, '(A)') '----------------------------------------------------------------------'
end if

end do

if(flag1) then
write(30, '(A)') 'Data is same'
write(40, '(A)') '----- NO DATA -----'
else
write(30, '(A)') 'Data is not same'
end if
write(30, '(A)') '----------------------------------------------------------------------'

close(10)
close(20)
close(30)
close(40)
end program comparison
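
The choice-to-file mapping and the same/not-same verdict above can also be sketched in shell. This is an illustrative alternative, not part of the PR. `stem_for_choice` and `compare_choice` are hypothetical names, and `cmp -s` compares bytes exactly, whereas the Fortran program compares the first 200 characters of each line and ignores trailing blanks.

```shell
#!/bin/sh
# Map a menu choice (1-6) to the file stem used by comparison.f90.
stem_for_choice() {
    case "$1" in
        1) echo gradcal ;;
        2) echo convection ;;
        3) echo diffusion ;;
        4) echo filterq ;;
        5) echo q2fvar ;;
        6) echo bchomo ;;
        *) return 1 ;;
    esac
}

# Compare <stem>_cpu.txt with <stem>_gpu.txt in the current directory.
compare_choice() {
    stem=$(stem_for_choice "$1") || {
        echo "unknown choice: $1" >&2
        return 2
    }
    # cmp -s is silent and succeeds only when the files are byte-identical;
    # a missing file is therefore also reported as "not same".
    if cmp -s "${stem}_cpu.txt" "${stem}_gpu.txt"; then
        echo "Data is same"
    else
        echo "Data is not same"
    fi
}
```

For example, `compare_choice 3` checks `diffusion_cpu.txt` against `diffusion_gpu.txt`.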
91 changes: 91 additions & 0 deletions miniapps/tgv_solver_3d/compstats.f90
@@ -0,0 +1,91 @@
program merge_files
use ieee_arithmetic
!
implicit none
!
character(len=40) :: line1, line2
integer :: ios1, ios2, i1, i2
real :: num1, num2, num3
logical :: flag = .true.
character(len=10) :: second_last_word
integer :: pos_last_space, pos_second_last_space, ln

ln = 0

open(unit=10, file='time_report_cpu.txt', status='old', action='read')
open(unit=20, file='time_report_gpu.txt', status='old', action='read')
open(unit=30, file='time_report.txt', status='replace', action='write')

do
read(10, '(A)', iostat=ios1) line1
read(20, '(A)', iostat=ios2) line2

if (ios1 /= 0 .or. ios2 /= 0) then
write(30, '(A, A, A)') line1, line2, '---------------------'
exit
end if

ln = ln + 1
flag = .true.

! read(line1, '(F13.3)', IOSTAT=i) num1
! read(line1, '(F13.3)', IOSTAT=i) num2
! num3 = num1/num2

pos_last_space = len_trim(line1)
do while (pos_last_space > 1 .and. line1(pos_last_space:pos_last_space) /= ' ')
pos_last_space = pos_last_space - 1
end do

pos_second_last_space = pos_last_space - 1
do while (pos_second_last_space > 1 .and. line1(pos_second_last_space:pos_second_last_space) /= ' ')
pos_second_last_space = pos_second_last_space - 1
end do

second_last_word = adjustl(line1(pos_second_last_space+1:pos_last_space-1))
read(second_last_word, '(F13.3)', iostat=i1) num1

! print *, line1, second_last_word

pos_last_space = len_trim(line2)
do while (pos_last_space > 1 .and. line2(pos_last_space:pos_last_space) /= ' ')
pos_last_space = pos_last_space - 1
end do

pos_second_last_space = pos_last_space - 1
do while (pos_second_last_space > 1 .and. line2(pos_second_last_space:pos_second_last_space) /= ' ')
pos_second_last_space = pos_second_last_space - 1
end do

second_last_word = adjustl(line2(pos_second_last_space+1:pos_last_space-1))
read(second_last_word, '(F13.3)', iostat=i2) num2

if (ln == 1) then
write(30, '(A, A, A)') line1, line2, '-----acceleration----'
cycle
end if

if (ln == 2) then
write(30, '(A, A, A)') line1, line2, '---------------------'
cycle
end if

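! Skip the ratio when either value failed to parse (non-zero iostat leaves
! num1/num2 holding a stale value) or when either value is zero.
if (i1 /= 0 .or. i2 /= 0 .or. num1 == 0 .or. num2 == 0) flag = .false.
! print *, num1, num2, flag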

if (flag) then
num3 = num1/num2
write(30, '(A, A, F13.1)') line1, line2, num3
else
write(30, '(A, A)') line1, line2
end if

if (line1 == '.end' .or. line2 == '.end') print *, "end", line1, line2
end do

close(10)
close(20)
close(30)

print*,'-- acceleration statistics report generated : time_report.txt --'
end program merge_files
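
The per-line ratio computed above (CPU time divided by GPU time, taken from the second-to-last word of each line) can be sketched with `paste` and `awk`. This is a minimal sketch, not the PR's implementation. `speedup` is a hypothetical name, the layout of the report files is assumed to be whitespace-separated with the time as the second-to-last field, and the demo values are placeholders invented for illustration, not measured timings.

```shell
#!/bin/sh
# speedup CPU_REPORT GPU_REPORT
# Prints "<first word> <cpu/gpu ratio>" for each pair of lines whose
# second-to-last field is a non-zero number in both files.
# Assumes neither file contains the '|' character.
speedup() {
    paste -d '|' "$1" "$2" | awk -F'|' '
    {
        n1 = split($1, a, " ")   # " " splits on runs of whitespace
        n2 = split($2, b, " ")
        if (n1 < 2 || n2 < 2) next
        t1 = a[n1 - 1] + 0       # non-numeric text converts to 0
        t2 = b[n2 - 1] + 0
        if (t1 == 0 || t2 == 0) next
        printf "%s %.1f\n", a[1], t1 / t2
    }'
}

# Placeholder inputs, invented for illustration (not measured timings).
tmp=$(mktemp -d)
printf 'step 8.0 s\n' > "$tmp/cpu.txt"
printf 'step 2.0 s\n' > "$tmp/gpu.txt"
speedup "$tmp/cpu.txt" "$tmp/gpu.txt"   # prints "step 4.0"
rm -r "$tmp"
```

Splitting on whitespace fields rather than scanning back for single spaces means that several blanks between columns do not change which field is read. Header and separator lines are skipped because their second-to-last field does not parse as a non-zero number.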