Commit 5a678ef

update README.
1 parent 95a4e06 commit 5a678ef

2 files changed

+40
-0
lines changed

artifacts/run_all_ncu_cutlass.sh

File mode changed: 100644 → 100755.

artifacts/table6/README.md

Lines changed: 40 additions & 0 deletions
@@ -37,3 +37,43 @@ The profiling results shown in Table 6 are based on [NVIDIA Nsight Compute (ncu)
In the output file of the profile results, you will find the memory traffic behavior of the kernel of interest. You can then further process and analyze these results.

We cannot pre-assign kernel names because libraries such as Triton call extra kernels in their internal implementations, so filtering by name is not feasible. Instead, we run the profiler multiple times (e.g., three) and inspect the log output, then run the tested program several times (e.g., five) to identify recurring patterns. These patterns let us pinpoint the actual kernel calls and post-process the ncu profiling logs to compute the memory traffic across the memory hierarchy.
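One possible way to look for such patterns is sketched below. It is not the script used to produce Table 6. The helper name `stable_kernels` and its input format are assumptions made for illustration: each input file is assumed to list the kernel names launched in one run, one name per line, and extracting those names from the ncu logs is not shown. The function prints the kernels that are launched the same number of times in every run, which makes them candidates for the kernels of interest.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the artifact scripts.
# Usage: stable_kernels run1.txt run2.txt run3.txt
# Each input file is ASSUMED to list the kernel names launched in one run,
# one name per line; extracting those names from the ncu logs is not shown.
stable_kernels() {
  awk '
    FNR == 1 { nruns++ }                      # first line of a file starts a new run
    { count[$0, nruns]++; seen[$0] = 1 }      # per-run launch count for each kernel
    END {
      for (name in seen) {
        first = count[name, 1] + 0            # launches in the first run
        stable = (first > 0)                  # must appear in the first run
        for (i = 2; i <= nruns; i++)
          if (count[name, i] + 0 != first) stable = 0
        if (stable) printf "%s\t%d\n", name, first
      }
    }
  ' "$@" | LC_ALL=C sort
}
```

Each output line holds a kernel name and its per-run launch count, separated by a tab. Kernels that appear in only some runs, or with varying counts, are dropped and may need manual inspection.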

### Run the test

We have prepared a testing environment on the provided server to run the tests.

1. [run_all_ncu_cutlass.sh](../run_all_ncu_cutlass.sh) runs the test for FlashAttention-2 implemented in CUTLASS.

```bash
sudo -i  # Switch to root account
cd /home/sosp/nnfusion/artifacts
./run_all_ncu_cutlass.sh
```

2. [run_all_ncu_flash2.sh](../run_all_ncu_flash2.sh) runs the test for FlashAttention-2 implemented in PyTorch.

```bash
sudo -i  # Switch to root account
cd /home/sosp/nnfusion/artifacts
# Choose the environment you want to test
source /home/sosp/env/torch_env.sh
./run_all_ncu_flash2.sh
```

3. [run_all_ncu_ft.sh](../run_all_ncu_ft.sh) runs the tests for BigBird and FlashAttention implemented in FractalTensor.

```bash
sudo -i  # Switch to root account
cd /home/sosp/nnfusion/artifacts
./run_all_ncu_ft.sh
```

4. [run_all_ncu_pt.sh](../run_all_ncu_pt.sh) runs the tests for BigBird and FlashAttention implemented in PyTorch.

```bash
sudo -i  # Switch to root account
cd /home/sosp/nnfusion/artifacts
# Choose the environment you want to test
source /home/sosp/env/torch_env.sh
./run_all_ncu_pt.sh
```
