Documentation review #1451

Open: wants to merge 10 commits into base: master

@melissawm commented May 14, 2025

This PR makes a number of documentation improvements, including a few content changes and some screenshots of the XProf TensorBoard plugin.

I have a few questions, which I will ask as comments below. This PR is co-authored by @pavithraes, and we are happy to address any feedback.

We also found a couple of other documents containing similar content:

Should we include some of that content here, link out to those pages, or update the pages themselves to have up-to-date content?

Thank you!

melissawm and others added 5 commits May 14, 2025 12:57
Includes screenshots and minor edits to the documentation, including a new index page.

Signed-off-by: Melissa Weber Mendonça <melissawm@gmail.com>
Signed-off-by: Pavithra Eswaramoorthy <pavithraes@outlook.com>

  For GPUs, HLO ops have an N:M relationship with the kernels that actually get
- executed. For statistics at the kernel level, see the GPU Kernel Stats tool.
+ executed. For statistics at the kernel level, see the _GPU Kernel Stats_ tool.
Author

I assume the GPU Kernel Stats tool is this one: https://www.tensorflow.org/guide/profiler#gpu_kernel_stats

We may want to create a page from it with similar (but updated) content.

@@ -60,18 +60,20 @@ The HLO Op Stats tool has the following key components:

### HLO Operation Statistics Table Details

![Tensorboard HLO Op Stats Table](images/hlo_op_stats_2.png)
Author

Some of the columns listed below do not show up at all for me (every column past Avg. time is missing). I wonder whether this is because I can't seem to resize the output on Colab, so the columns may simply be overflowing the screen. I tried different browsers and window sizes and still could not see them.

@@ -8,11 +8,11 @@ utilizes hardware resources, identify performance bottlenecks, and optimize your
model for faster execution. The Trace Viewer UI is based on the one used in
`chrome://tracing` and therefore requires that you use the Chrome browser.
Author

This worked fine for me on Firefox, so maybe this is not a requirement anymore?

Comment on lines -151 to -158
* Clicking on an XLA op provides additional information in the details pane.
For example, it links to the op in the Graph Viewer tool. It may also
provide pointers to source code and/or the Python stack trace, the framework
op that caused this XLA op to get generated, etc. (if present in the
profile). It may also show FLOPS (number of floating point operations
executed by the op) and bytes accessed by the op; this information is
statically acquired from XLA during compilation, rather than runtime
information from the profile.
Author

I reorganized this content to be included in the details pane description as I felt it would be more self-contained there.
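
To illustrate what "statically acquired" means in the bullet quoted above: FLOPs and bytes accessed can be derived from operand shapes alone, without running anything. The following is a minimal sketch for a single matrix multiplication, not XLA's actual cost model. The helper `matmul_static_cost`, the `StaticCost` type, the 2·M·K·N counting convention, and the assumption that each buffer touches memory exactly once are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticCost:
    flops: int           # floating point operations, counted from shapes only
    bytes_accessed: int  # bytes read plus bytes written, assuming no reuse


def matmul_static_cost(m: int, k: int, n: int, dtype_bytes: int = 4) -> StaticCost:
    """Estimate the cost of an (m, k) @ (k, n) matmul from shapes alone.

    Hypothetical helper: counts one multiply and one add per inner-product
    term, and assumes each operand and the result touch memory exactly once.
    """
    flops = 2 * m * k * n
    bytes_accessed = (m * k + k * n + m * n) * dtype_bytes
    return StaticCost(flops=flops, bytes_accessed=bytes_accessed)


# Example shapes are arbitrary; float32 (4 bytes) is the assumed element size.
cost = matmul_static_cost(128, 256, 512)
print(cost.flops, cost.bytes_accessed)
```

Because these numbers depend only on shapes and element size, they stay the same across runs, whereas the timings recorded in a profile vary from run to run.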

Member

I notice a formatting issue with this page -- I'll push a commit to update it.

@melissawm
Author

I have just added a skeleton page for the GPU Kernel Stats tool. I'm not sure whether we want to keep it, so please let me know!

Comment on lines -53 to -59
* For TPUs only, time per HLO by replica group: A drop down lets you pick
from the different collective operations executed during the profiling
session. Different instances of that collective op may have been
executed among different replica groups (e.g.,
[AllGather](https://openxla.org/xla/operation_semantics#allgather)); a
pie chart shows the distribution of time between these different
instances.
Author

I do not see this component; I'm not sure whether something went wrong on my side.
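
For reference, the distribution that the quoted bullet describes amounts to grouping the time of one collective op by replica group and normalizing. Below is a minimal sketch with made-up durations. The record layout, the replica group labels, and the function name `time_share_by_replica_group` are assumptions for illustration, not XProf's data model.

```python
from collections import defaultdict

# Illustrative records only: (collective op, replica group, duration in microseconds).
events = [
    ("all-gather", "{0,1}", 120.0),
    ("all-gather", "{2,3}", 80.0),
    ("all-gather", "{0,1}", 40.0),
    ("all-reduce", "{0,1,2,3}", 300.0),
]


def time_share_by_replica_group(records, op_name):
    """Return each replica group's fraction of the total time for one op."""
    totals = defaultdict(float)
    for op, group, duration_us in records:
        if op == op_name:
            totals[group] += duration_us
    grand_total = sum(totals.values())
    if grand_total == 0:
        # The op did not appear in the records, so there is nothing to chart.
        return {}
    return {group: total / grand_total for group, total in totals.items()}


shares = time_share_by_replica_group(events, "all-gather")
print(shares)
```

An empty result here corresponds to a collective op that never ran during the profiling session, which is one possible reason a chart like this would have nothing to show.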

Comment on lines +37 to +41
* A chart titled **Time spent on outside compilation**. Outside
compilation is a TensorFlow feature that enables certain ops within an
XLA computation to transparently run on the host CPU rather than the
accelerator device (e.g., `tf.summary` or `tf.print` that requires I/O
access that the device does not possess).
Author

This chart shows up empty for me. I'm not sure whether the wording here is correct or whether it should be reviewed.

2 participants