Documentation review #1451
Includes screenshots and minor edits to the documentation, including a new index page.
Signed-off-by: Melissa Weber Mendonça <melissawm@gmail.com>
Signed-off-by: Pavithra Eswaramoorthy <pavithraes@outlook.com>
docs/hlo_op_stats.md
Outdated
  For GPUs, HLO ops have an N:M relationship with the kernels that actually get
- executed. For statistics at the kernel level, see the GPU Kernel Stats tool.
+ executed. For statistics at the kernel level, see the _GPU Kernel Stats_ tool.
I assume the GPU Kernel Stats tool is this one: https://www.tensorflow.org/guide/profiler#gpu_kernel_stats
We may want to create a page from it with similar (but updated) content.
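As a rough illustration of the N:M relationship mentioned in the hunk above, the following Python sketch builds both directions of an op-to-kernel mapping. All op and kernel names are hypothetical and made up for this example; they are not taken from a real profile.

```python
from collections import defaultdict

# Hypothetical (hlo_op, kernel) launch records; names are invented for illustration.
launches = [
    ("fusion.1", "elementwise_kernel"),
    ("fusion.2", "elementwise_kernel"),  # two ops served by the same kernel
    ("dot.3", "gemm_kernel"),
    ("dot.3", "reduce_kernel"),          # one op that launches two kernels
]

# Build both directions of the mapping.
kernels_by_op = defaultdict(set)
ops_by_kernel = defaultdict(set)
for op, kernel in launches:
    kernels_by_op[op].add(kernel)
    ops_by_kernel[kernel].add(op)

for op in sorted(kernels_by_op):
    print(op, "->", sorted(kernels_by_op[op]))
for kernel in sorted(ops_by_kernel):
    print(kernel, "<-", sorted(ops_by_kernel[kernel]))
```

Because neither side maps one-to-one, per-op statistics and per-kernel statistics have to be aggregated separately, which is why the two tools are distinct.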
@@ -60,18 +60,20 @@ The HLO Op Stats tool has the following key components:

### HLO Operation Statistics Table Details


Some of the columns listed below do not show up for me: every column past Avg. time is missing. This may be because I can't seem to resize the output on Colab, so the columns could be overflowing the screen. I tried different browsers and window sizes and still could not see them.
@@ -8,11 +8,11 @@ utilizes hardware resources, identify performance bottlenecks, and optimize your
model for faster execution. The Trace Viewer UI is based on the one used in
`chrome://tracing` and therefore requires that you use the Chrome browser.
This worked fine for me on Firefox, so maybe this is not a requirement anymore?
* Clicking on an XLA op provides additional information in the details pane.
  For example, it links to the op in the Graph Viewer tool. It may also
  provide pointers to source code and/or the Python stack trace, the framework
  op that caused this XLA op to get generated, etc. (if present in the
  profile). It may also show FLOPS (number of floating point operations
  executed by the op) and bytes accessed by the op; this information is
  statically acquired from XLA during compilation, rather than runtime
  information from the profile.
I reorganized this content to be included in the details pane description as I felt it would be more self-contained there.
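To make "statically acquired" in the hunk above concrete, here is a minimal Python sketch that derives a FLOP count and a bytes-accessed figure for a matrix multiply from operand shapes alone, without executing anything. The helper `static_matmul_cost`, the `StaticCost` type, and the counting convention (one multiply plus one add per inner-product term, each operand read once, the result written once) are assumptions made for illustration; this is not XLA's actual cost analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticCost:
    flops: int
    bytes_accessed: int


def static_matmul_cost(m: int, k: int, n: int, bytes_per_element: int = 4) -> StaticCost:
    """Estimate the cost of an (m, k) @ (k, n) matmul from shapes only.

    Hypothetical helper for illustration; not XLA's cost model.
    """
    # One multiply and one add for every (row, column, inner index) triple.
    flops = 2 * m * k * n
    # Assume both operands are read once and the result is written once.
    elements = m * k + k * n + m * n
    return StaticCost(flops=flops, bytes_accessed=elements * bytes_per_element)


cost = static_matmul_cost(128, 256, 64)
print(cost.flops, cost.bytes_accessed)
```

Numbers derived this way depend only on shapes and element size, so they stay the same across runs, unlike the measured durations that come from the profile.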
docs/graph_viewer.md
Outdated
I notice a formatting issue with this page -- I'll push a commit to update it.
I have just added a skeleton of a page for the GPU Kernel Stats tool. I'm not sure if this is something we want to keep or not, please let me know!
* For TPUs only, time per HLO by replica group: A drop down lets you pick
  from the different collective operations executed during the profiling
  session. Different instances of that collective op may have been
  executed among different replica groups (e.g.,
  [AllGather](https://openxla.org/xla/operation_semantics#allgather)); a
  pie chart shows the distribution of time between these different
  instances.
I do not see this component, not sure if something went wrong on my side.
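For reference while checking this component, the distribution the hunk describes can be sketched as a simple aggregation: sum the time of one collective op per replica group, then divide by the total. The sample data, the replica group labels, and the helper `time_share_by_group` below are hypothetical and only illustrate the idea; they do not reflect how the tool computes its chart.

```python
from collections import defaultdict

# Hypothetical (replica_group, duration_us) samples for a single collective op.
samples = [
    ("{0,1}", 120.0),
    ("{2,3}", 80.0),
    ("{0,1}", 100.0),
    ("{2,3}", 100.0),
]


def time_share_by_group(samples):
    """Return each replica group's fraction of the total time."""
    totals = defaultdict(float)
    for group, duration_us in samples:
        totals[group] += duration_us
    grand_total = sum(totals.values())
    if grand_total == 0:
        # Avoid dividing by zero when no time was recorded.
        return {group: 0.0 for group in totals}
    return {group: total / grand_total for group, total in totals.items()}


shares = time_share_by_group(samples)
for group, share in sorted(shares.items()):
    print(f"{group}: {share:.0%}")
```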
* A chart titled **Time spent on outside compilation**. Outside
  compilation is a TensorFlow feature that enables certain ops within an
  XLA computation to transparently run on the host CPU rather than the
  accelerator device (e.g., `tf.summary` or `tf.print` that requires I/O
  access that the device does not possess).
This chart shows up empty for me. Not sure if this is the correct wording or if it should be reviewed.
This PR includes a number of documentation improvements, including a few content changes and some screenshots of the XProf TensorBoard plugin.
I have a few questions, which I will ask as comments below. This PR is co-authored by @pavithraes, and we are happy to address any feedback.
We also found a couple of other documents containing similar content:
Should we include some of that content here, link out to those pages, or update the pages themselves to have up-to-date content?
Thank you!