feat: Github Pages for Javadocs #6069

alicejli · 2023-06-26T15:52:28Z

Pared down version of #6032
This only includes the workflows/scripts/config files. The Javadocs should get populated at the next release/manual trigger. This should make it easier to review as it does not contain the static Javadoc files.

Once this is confirmed to work and is merged, I will submit a follow-up PR that updates the Javadoc links to point to this repo instead of my fork.

.github/workflows/javadocs_scripts/google-cloud-java_javadocs_modules.txt

.github/workflows/generate_javadocs_google-cloud-java.yaml

.github/workflows/generate_javadocs_sdk-platform-java.yaml

burkedavison · 2023-06-26T16:22:23Z

.github/workflows/generate_javadocs_sdk-platform-java.yaml

+        SDK_PLATFORM_JAVA_TAG=$(grep -oP "(?<=${SDK_PLATFORM_JAVA}: \")[^\"]*" "$VARIABLES_FILE" | tr -d '\n')
+        echo "SDK_PLATFORM_JAVA_TAG=$SDK_PLATFORM_JAVA_TAG" >> $GITHUB_ENV
+
+    - name: Generate javadocs for modules within sdk-platform-java repo


The three new generate workflows have very similar logic, but slightly different setup steps.

Do they need to be different?

I am definitely open to refactoring how these workflows are set up.
I split them into 3 workflows mainly because each one has to navigate a different repo structure, which is why they're all similar but slightly different (more detail below).

generate_javadocs_sdk-platform-java and generate_javadocs_google-cloud-java are the most similar: each grabs the correct version tag from the variables.yaml file, checks out its repo at that tag, generates the javadocs for the appropriate modules, and then adds them to the site.

generate_javadocs_handwritten_libraries uses a separate file containing the repos and version tags (handwritten_libraries_javadocs_modules.txt), as it needs to check out each repo, generate the javadocs, and add them to the site.

We could combine generate_javadocs_sdk-platform-java and generate_javadocs_google-cloud-java, but one benefit of keeping them as separate workflows is that javadoc generation for the google-cloud-java repo takes a while (~1-1.5 hours), so if it fails, the failure is siloed from the other javadoc generation.

Let's see if we can factor out the common config and keep only the logic specific to each workflow in its file.
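One way to share the version-tag lookup between the workflows is a small helper that takes the key as a parameter. The following is only a sketch of that idea, mirroring the grep pattern in the diff above; the function name, the key names, and the sample contents of variables.yaml are hypothetical and not taken from this PR.

```python
import re
from typing import Optional


def read_version_tag(variables_text: str, key: str) -> Optional[str]:
    """Return the quoted value that follows `key: "` in the text, or None."""
    # Same idea as the grep lookbehind in the workflow: capture everything
    # after `key: "` up to the next double quote.
    match = re.search(re.escape(key) + r': "([^"]*)', variables_text)
    return match.group(1) if match else None


# Hypothetical file contents; the real variables.yaml keys and values may differ.
SAMPLE = 'sdk_platform_java: "v2.22.0"\ngoogle_cloud_java: "v1.13.0"\n'

if __name__ == "__main__":
    for key in ("sdk_platform_java", "google_cloud_java"):
        print(key, read_version_tag(SAMPLE, key))
```

With a helper like this, each workflow would only need to pass its own key, and the lookup itself could be tested outside of GitHub Actions.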

.github/workflows/javadocs_scripts/libraryTable_generation.py

burkedavison · 2023-06-26T19:40:58Z

.github/workflows/javadocs_scripts/parse_versionstxt.py

+output_file = 'variables.yaml'
+
+# Excludes lines in versions.txt files that contain any of the following strings. Since we do not want to publish separate Javadocs for `google-cloud-<service>`, `grpc-google-<service>`, and `proto-google-<service>` artifacts, the latter two packages are excluded.
+exclude_packages = ['gapic-generator-java', 'google-cloud-java', 'grpc-google-cloud', 'proto-google-cloud', 'google-cloud-bom', 'full-convergence-check', 'java-cloud-bom-tests', 'gax-httpjson', 'google-cloud-shared-dependencies']


How does this list relate to exclude_packages in parse_versionstxt_addl_repos.py?

Root question: Throughout these scripts, we have a mix of configuration (these lists) and logic (what to do). Can we split them, so it becomes easy to add/remove a new library/repo?

These were copied from each other. I have refactored the scripts based on your feedback so that there are only 2 exclusion lists (one in libraryTable_generation.py and one in update_googlecloudjava_javadocs_modules.py), and have also refactored the parse_versionstxt.py script so that we no longer need parse_versionstxt_addl_repos.py.

Understood on the config/logic mix; I've refactored this a bit so that it's hopefully easier to follow. Current logic is as follows:

Any modules in the sdk-platform-java repo that we generate javadocs for are explicitly listed in the sdk-platform-java_javadocs_modules.txt file. If we add any new modules, we will need to update that file.

Any new modules in the google-cloud-java repo will be automatically included.

Any new handwritten libraries that are part of the google-cloud-bom/pom.xml will be automatically included.

If there are any modules not part of the above, then we will need to figure out how to manually add them.
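To illustrate the config/logic split raised above, here is a minimal sketch in which the exclusion list is plain data and the filtering function knows nothing about specific packages. The list is copied from the exclude_packages snippet quoted in the review; the function name, the sample lines, and the assumed `module:released-version:current-version` layout of versions.txt are assumptions for illustration, not the actual script.

```python
from typing import Iterable, List

# Configuration only: substrings copied from the exclude_packages list in the review.
EXCLUDE_PACKAGES = [
    'gapic-generator-java', 'google-cloud-java', 'grpc-google-cloud',
    'proto-google-cloud', 'google-cloud-bom', 'full-convergence-check',
    'java-cloud-bom-tests', 'gax-httpjson', 'google-cloud-shared-dependencies',
]


def select_modules(lines: Iterable[str], exclude: Iterable[str]) -> List[str]:
    """Logic only: return module names from lines that match no excluded substring."""
    exclude = tuple(exclude)
    modules = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if any(token in line for token in exclude):
            continue  # skip lines containing an excluded substring
        modules.append(line.split(":", 1)[0])  # module name precedes the first colon
    return modules


# Hypothetical versions.txt lines; the version numbers are made up.
SAMPLE_LINES = [
    "# module:released-version:current-version",
    "",
    "google-cloud-accessapproval:2.20.0:2.21.0-SNAPSHOT",
    "grpc-google-cloud-accessapproval-v1:2.20.0:2.21.0-SNAPSHOT",
    "proto-google-cloud-accessapproval-v1:2.20.0:2.21.0-SNAPSHOT",
    "google-cloud-java:1.13.0:1.14.0-SNAPSHOT",
]

if __name__ == "__main__":
    print(select_modules(SAMPLE_LINES, EXCLUDE_PACKAGES))
```

Keeping the list separate means that adding or removing a library is a one-line data change, and the list could later move into its own file without touching the logic.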

.github/workflows/javadocs_scripts/parse_versionstxt_addl_repos.py

site/config.toml

site/package.json

site/static/favicons/android-144x144.png

meltsufin · 2023-06-28T18:51:04Z

Is the site directory included in the source intentionally?

alicejli · 2023-06-28T19:04:46Z

Is the site directory included in the source intentionally?

Yep - there are some files within it that are needed to populate the Github Pages site (e.g. config.toml, some css/html stuff) that are not re-created with every release.

meltsufin

Some questions/comments:

What is the purpose of Hugo? I don't see it described in the design doc.
Would it be possible to move most of the scripting out of the GitHub workflow configs so that they can be tested locally more easily? I think it is worthwhile to be careful not to lock ourselves into GitHub Actions because it hinders local testing and migration to an alternative build system.
What are the entrypoints? It looks like there are two main problems that are being solved here: (1) For a given libraries-bom version create a table of all modules in that release; (2) generate the javadocs for those modules. So, would it make sense to have an entrypoint script for each of those problems? I prefer this kind of separation because in the future we might want to download pre-built javadocs instead of re-generating them.

meltsufin · 2023-06-28T20:25:36Z

Is the site directory included in the source intentionally?

Yep - there are some files within it that are needed to populate the Github Pages site (e.g. config.toml, some css/html stuff) that are not re-created with every release.

Did you hand-write all of those files, or were they imported from somewhere?

alicejli · 2023-06-28T21:16:41Z

Some questions/comments:

What is the purpose of Hugo? I don't see it described in the design doc.

Thanks for pointing that out! I added a section in the design doc for it. Hugo is a static site generator and Docsy is a theme for Hugo (and is used by gRPC for their website which is the primary reason I used it).

Would it be possible to move most of the scripting out of the GitHub workflow configs so that they can be tested locally more easily? I think it is worthwhile to be careful not to lock ourselves into GitHub Actions because it hinders local testing and migration to an alternative build system.

I take your point; I need to think through what this might look like. We have time tomorrow to discuss.

What are the entrypoints? It looks like there are two main problems that are being solved here: (1) For a given libraries-bom version create a table of all modules in that release; (2) generate the javadocs for those modules. So, would it make sense to have an entrypoint script for each of those problems? I prefer this kind of separation because in the future we might want to download pre-built javadocs instead of re-generating them.

I like how you articulated this. All of the scripts under .github/workflows/javadocs_scripts are used for entrypoint (1) - generating the table of modules for javadoc generation, and they're invoked in the update_site_var.yaml workflow.

All of entrypoint 2 (aka the javadoc generation) is happening in the 3 other workflows (generate_javadocs_google-cloud-java, generate_javadocs_handwritten_libraries, generate_javadocs_sdk-platform-java). The main reason I have them as 3 separate workflows instead of 1 is that they are being generated out of separate repos, and the generate_javadocs_google-cloud-java job takes a while (~1.5 hours) so if the other 2 workflows run successfully, at least some of the site is updated.
Burke had a similar comment though, so it's probably worth me combining all the javadoc generation workflows into one.
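The two-entrypoint separation discussed above could look roughly like the sketch below: one function builds the module table, and a second one obtains javadocs through a pluggable provider, so that generating locally can later be swapped for downloading pre-built javadocs. All names here are hypothetical, and both providers are placeholders that only return a path string rather than running the javadoc tool or fetching anything.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
Provider = Callable[[str, str], str]


def build_module_table(bom_modules: Dict[str, str]) -> List[Row]:
    """Entrypoint 1: one row per module in a libraries-bom release, sorted by name."""
    return [
        {"module": name, "version": version}
        for name, version in sorted(bom_modules.items())
    ]


def publish_javadocs(table: List[Row], provider: Provider) -> Dict[str, str]:
    """Entrypoint 2: obtain javadocs for every row using the given provider."""
    return {row["module"]: provider(row["module"], row["version"]) for row in table}


def generate_locally(module: str, version: str) -> str:
    # Placeholder: a real provider would check out the repo and run javadoc.
    return f"generated/{module}/{version}"


def download_prebuilt(module: str, version: str) -> str:
    # Placeholder: a real provider would fetch an already published javadoc jar.
    return f"downloaded/{module}/{version}"


if __name__ == "__main__":
    # Hypothetical modules and versions.
    table = build_module_table({"gax": "2.30.0", "api-common": "2.13.0"})
    print(publish_javadocs(table, generate_locally))
```

Because the table is plain data, either entrypoint can be run and tested locally without a workflow runner.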

alicejli · 2023-06-28T21:33:02Z

Is the site directory included in the source intentionally?

Yep - there are some files within it that are needed to populate the Github Pages site (e.g. config.toml, some css/html stuff) that are not re-created with every release.

Did you hand-write all of those files, or were they imported from somewhere?

The content/docs/_index.md, config.toml, and custom files in layouts/shortcodes (javaModuleLibraryReferenceLinks.html, javaModuleProductReferenceLinks.html, libraryTable.html, variable.html) were manually edited by me.

The files within data/ and javadocHelpers are created via the scripts in .github/workflows/javadocs_scripts.

Everything else within site is boilerplate from the Docsy template (https://github.com/google/docsy-example).

meltsufin · 2023-06-28T22:06:51Z

Is the site directory included in the source intentionally?

Yep - there are some files within it that are needed to populate the Github Pages site (e.g. config.toml, some css/html stuff) that are not re-created with every release.

Did you hand-write all of those files, or were they imported from somewhere?

The content/docs/_index.md, config.toml, and custom files in layouts/shortcodes (javaModuleLibraryReferenceLinks.html, javaModuleProductReferenceLinks.html, libraryTable.html, variable.html) were manually edited by me.

The files within data/ and javadocHelpers are created via the scripts in .github/workflows/javadocs_scripts.

Everything else within site is boilerplate from the Docsy template (https://github.com/google/docsy-example).

What's the benefit of keeping the generated code in the repository? Also, the whole Hugo and Docsy setup seems a bit heavy for what appears to be a single documentation index page. Can we just use some plain HTML with a bit of CSS?

alicejli · 2023-06-29T17:14:37Z

Had an offline discussion with @burkedavison and @meltsufin; documenting below for reference:

This PR will get reworked to populate the table directly into this repo's README.md file as part of the release.

Will then see how we can mirror this README.md somewhere on Cloud RAD.
Will spin up a separate thread around auditing Maven artifact generation to see if it makes sense to change what is currently published in Maven both for ease of customer use and for javadoc generation.
The question of hosting standard Javadocs is part of a larger question that we will discuss internally.
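As a rough illustration of populating a table directly into README.md at release time, the sketch below rewrites only the text between two marker comments and leaves the rest of the file untouched. The marker strings, function names, and sample rows are hypothetical; the follow-up PR may take a different approach.

```python
from typing import Iterable, Tuple

# Hypothetical marker comments delimiting the generated section of the README.
BEGIN = "<!-- MODULE-TABLE-START -->"
END = "<!-- MODULE-TABLE-END -->"


def render_table(rows: Iterable[Tuple[str, str]]) -> str:
    """Render (module, version) pairs as a Markdown table."""
    lines = ["| Module | Version |", "| --- | --- |"]
    lines += [f"| {name} | {version} |" for name, version in rows]
    return "\n".join(lines)


def update_readme(readme: str, rows: Iterable[Tuple[str, str]]) -> str:
    """Replace whatever sits between the markers with a freshly rendered table."""
    # str.index raises ValueError if a marker is missing, which fails loudly
    # instead of silently appending a second table.
    start = readme.index(BEGIN) + len(BEGIN)
    end = readme.index(END, start)
    return readme[:start] + "\n" + render_table(rows) + "\n" + readme[end:]


if __name__ == "__main__":
    readme = f"# Title\n\n{BEGIN}\nold content\n{END}\n\nFooter\n"
    print(update_readme(readme, [("gax", "2.30.0")]))
```

Running the update twice with the same rows yields the same file, so a release job could call it unconditionally.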

alicejli · 2023-07-11T14:50:47Z

Closing this in favor of #6083

feat: Github Pages for Javadocs

f6869bc

alicejli requested a review from a team June 26, 2023 15:52

product-auto-label bot added the size: xl Pull request size is extra large. label Jun 26, 2023

burkedavison reviewed Jun 26, 2023

alicejli added 18 commits June 27, 2023 09:12

feat: Github Pages for Javadocs

c8f0e4c

remove google-cloud-java_javadocs_modules.txt as it will get generated with the update_site_var workflow. Update name of job in generate_javadocs_sdk-platform-java.yaml

6e3bbd3

remove extra file

c3cb231

remove google-cloud-java_javadocs_modules.txt as it will get generated with the update_site_var workflow

bc5e872

remove default Docsy stuff

fa2aa43

Merge branch 'main' into GithubPagesWorkflows

1361a4e

update file names

6a278ce

standardize capitalization

eb91862

update file names and comment

19dacb3

refactor parse_versionstxt.py and update update_site_var

d105f44

add content file

af2fec5

update content page

e411871

update scripts

e832a7d

remove hardcoded exclude list

c39d30e

update parse_pom_auth_library.py

afa330b

remove unused icons and favicons

a11af7c

Merge branch 'main' into GithubPagesWorkflows

0a50b71

move txt files to site and update names

b7e5815

alicejli requested a review from meltsufin June 28, 2023 17:04

alicejli mentioned this pull request Jun 28, 2023

feat: Enable Github Pages hosting Standard Javadocs #6032

Closed

meltsufin reviewed Jun 28, 2023

Merge branch 'main' into GithubPagesWorkflows

9ea370e

move txt files to separate folder

b9dbcd6

alicejli added the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label Jun 29, 2023

alicejli mentioned this pull request Jul 11, 2023

feat: add table of modules for a libraries-bom version to README #6083

Merged

alicejli closed this Jul 11, 2023

alicejli deleted the GithubPagesWorkflows branch July 25, 2023 16:53
