@alicejli
Contributor

Pared down version of #6032
This only includes the workflows/scripts/config files. The Javadocs should get populated at the next release/manual trigger. This should make it easier to review as it does not contain the static Javadoc files.

Once this is confirmed to work and merged, I will submit a follow-up PR that updates the links to the Javadocs to be based off of this repo instead of my fork.

@alicejli requested a review from a team on June 26, 2023, 15:52
@product-auto-label (bot) added the "size: xl" label (Pull request size is extra large.) on Jun 26, 2023
SDK_PLATFORM_JAVA_TAG=$(grep -oP "(?<=${SDK_PLATFORM_JAVA}: \")[^\"]*" "$VARIABLES_FILE" | tr -d '\n')
echo "SDK_PLATFORM_JAVA_TAG=$SDK_PLATFORM_JAVA_TAG" >> $GITHUB_ENV

- name: Generate javadocs for modules within sdk-platform-java repo
Member

The three new generate workflows have very similar logic, but slightly different setup steps.

Do they need to be different?

Contributor Author

I am definitely open to refactoring how these workflows are set up.
I split them into 3 separate workflows mainly because of how our repos are structured, which is why they are all similar but slightly different (more detail below).

  1. generate_javadocs_sdk-platform-java and generate_javadocs_google-cloud-java are the most similar: each grabs the version tag for its repo from the variables.yaml file, checks out the repo at that tag, generates the javadocs for the appropriate modules, and then adds them to the site.

  2. generate_javadocs_handwritten_libraries uses a separate file containing the repos and version tags (handwritten_libraries_javadocs_modules.txt), as it needs to check out each repo, generate the javadocs, and add them to the site.

We could combine generate_javadocs_sdk-platform-java and generate_javadocs_google-cloud-java, but one benefit of keeping them as separate workflows is that javadoc generation for the google-cloud-java repo takes a while (~1-1.5 hours), so if it fails, the failure is siloed from the other javadoc generation.
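
As a rough illustration of the tag lookup that the first two workflows share, the sketch below mirrors the `grep -oP` step from the diff in Python. It is only a sketch under assumptions: `read_version_tag` is a hypothetical helper (not a function in this PR), and the key names and version values in the sample text are made up. Unlike the `grep` call, it escapes the key so that it is matched literally.

```python
import re
from typing import Optional


def read_version_tag(variables_text: str, key: str) -> Optional[str]:
    """Return the text that follows `key: "` up to the next quote, or None."""
    # re.escape keeps characters such as '-' or '.' in the key literal.
    pattern = re.compile(re.escape(key) + r': "([^"\n]*)')
    match = pattern.search(variables_text)
    return match.group(1) if match else None


# Hypothetical contents of a variables.yaml-style file.
sample = 'sdk-platform-java: "v2.22.0"\ngoogle-cloud-java: "v1.14.0"\n'
print(read_version_tag(sample, "sdk-platform-java"))
```

Keeping this lookup in a script rather than inline in the workflow would also make it possible to run it locally against a copy of variables.yaml.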

Contributor Author

See if we can factor out the common configuration so that each workflow file contains only the logic specific to it.

output_file = 'variables.yaml'

# Excludes lines in versions.txt files that contain any of the following strings. Since we do not want to publish separate Javadocs for `google-cloud-<service>`, `grpc-google-<service>`, and `proto-google-<service>` artifacts, the latter two packages are excluded.
exclude_packages = ['gapic-generator-java', 'google-cloud-java', 'grpc-google-cloud', 'proto-google-cloud', 'google-cloud-bom', 'full-convergence-check', 'java-cloud-bom-tests', 'gax-httpjson', 'google-cloud-shared-dependencies']
Member

How does this list relate to exclude_packages in parse_versionstxt_addl_repos.py?

Root question: Throughout these scripts, we have a mix of configuration (these lists) and logic (what to do). Can we split them, so it becomes easy to add/remove a new library/repo?
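
One way to picture the split asked about here is to keep the exclusion strings as plain data and pass them into a small filter function. The sketch below is an assumption about how that could look, not the scripts in this PR: `filter_modules` is a hypothetical name, the list is a shortened copy of the one in the diff, and the sample lines assume a `artifact:released-version:current-version` layout for versions.txt.

```python
from typing import Iterable, List

# Configuration: a shortened, illustrative copy of the exclusion list.
EXCLUDE_PACKAGES = ["grpc-google-cloud", "proto-google-cloud", "google-cloud-bom"]


def filter_modules(lines: Iterable[str], excluded: Iterable[str]) -> List[str]:
    """Logic: keep the artifact name of every line that contains no excluded string."""
    excluded = list(excluded)
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if any(token in line for token in excluded):
            continue  # substring match, as described in the comment on the list
        kept.append(line.split(":")[0])
    return kept


# Hypothetical versions.txt-style lines.
sample_lines = [
    "# Format: artifact:released-version:current-version",
    "google-cloud-asset:3.20.0:3.20.0",
    "grpc-google-cloud-asset-v1:3.20.0:3.20.0",
    "proto-google-cloud-asset-v1:3.20.0:3.20.0",
]
print(filter_modules(sample_lines, EXCLUDE_PACKAGES))
```

With this shape, adding or removing a library only touches the list, and the list could later move into a separate config file without changing the function.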

Contributor Author

These were copied from each other. I have refactored the scripts based on your feedback so that there are only 2 exclusion lists (one in libraryTable_generation.py and one in update_googlecloudjava_javadocs_modules.py), and have also refactored the parse_versionstxt.py script such that we no longer need parse_versionstxt_addl_repos.py.

Understood on the config/logic mix; I've refactored this a bit so that it's hopefully easier to follow. Current logic is as follows:

  1. Any modules in the sdk-platform-java repo that we generate javadocs for are explicitly listed in the sdk-platform-java_javadocs_modules.txt file. If we add any new modules, we will need to update that file.

  2. Any new modules in the google-cloud-java repo will be automatically included.

  3. Any new handwritten libraries that are part of the google-cloud-bom/pom.xml will be automatically included.

If there are any modules not part of the above, then we will need to figure out how to manually add them.

@meltsufin
Member

Is the site directory included in the source intentionally?

@alicejli
Contributor Author

> Is the site directory included in the source intentionally?

Yep - there are some files within it that are needed to populate the GitHub Pages site (e.g. config.toml and some CSS/HTML files) and that are not re-created with every release.

Member

@meltsufin left a comment

Some questions/comments:

  1. What is the purpose of Hugo? I don't see it described in the design doc.
  2. Would it be possible to move most of the scripting out of the GitHub workflow configs so that they can be tested locally more easily? I think it is worthwhile to be careful not to lock ourselves into GitHub Actions because it hinders local testing and migration to an alternative build system.
  3. What are the entrypoints? It looks like there are two main problems that are being solved here: (1) For a given libraries-bom version create a table of all modules in that release; (2) generate the javadocs for those modules. So, would it make sense to have an entrypoint script for each of those problems? I prefer this kind of separation because in the future we might want to download pre-built javadocs instead of re-generating them.

@meltsufin
Member

> > Is the site directory included in the source intentionally?
>
> Yep - there are some files within it that are needed to populate the GitHub Pages site (e.g. config.toml, some css/html stuff) that are not re-created with every release.

Did you hand-write all of those files, or import them from somewhere?

@alicejli
Contributor Author

> Some questions/comments:
>
> 1. What is the purpose of Hugo? I don't see it described in the design doc.

Thanks for pointing that out! I added a section in the design doc for it. Hugo is a static site generator and Docsy is a theme for Hugo (and is used by gRPC for their website which is the primary reason I used it).

> 2. Would it be possible to move most of the scripting out of the GitHub workflow configs so that they can be tested locally more easily? I think it is worthwhile to be careful not to lock ourselves into GitHub Actions because it hinders local testing and migration to an alternative build system.

I take your point; I need to think through what this might look like. We have time tomorrow to discuss.

> 3. What are the entrypoints? It looks like there are two main problems that are being solved here: (1) For a given libraries-bom version create a table of all modules in that release; (2) generate the javadocs for those modules. So, would it make sense to have an entrypoint script for each of those problems? I prefer this kind of separation because in the future we might want to download pre-built javadocs instead of re-generating them.

I like how you articulated this. All of the scripts under .github/workflows/javadocs_scripts are used for entrypoint (1) - generating the table of modules for javadoc generation, and they're invoked in the update_site_var.yaml workflow.

All of entrypoint 2 (aka the javadoc generation) is happening in the 3 other workflows (generate_javadocs_google-cloud-java, generate_javadocs_handwritten_libraries, generate_javadocs_sdk-platform-java). The main reason I have them as 3 separate workflows instead of 1 is that they are being generated out of separate repos, and the generate_javadocs_google-cloud-java job takes a while (~1.5 hours) so if the other 2 workflows run successfully, at least some of the site is updated.
Burke had a similar comment though, so it's probably worth me combining all the javadoc generation workflows into one.
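
To make the two-entrypoint idea concrete, here is a minimal sketch of a single command-line script with one subcommand per problem, which could be run locally and called from any CI system. Everything in it is an assumption for illustration: the subcommand names, `build_table`, `generate_javadocs`, and the sample BOM version are hypothetical, and the function bodies are placeholders rather than the real generation logic.

```python
import argparse
from typing import List, Optional


def build_table(bom_version: str) -> str:
    """Entrypoint (1): placeholder for building the module table for a libraries-bom version."""
    return f"table for libraries-bom {bom_version}"


def generate_javadocs(modules: List[str]) -> List[str]:
    """Entrypoint (2): placeholder for generating (or later downloading) javadocs per module."""
    return [f"javadocs for {module}" for module in modules]


def main(argv: Optional[List[str]] = None) -> List[str]:
    parser = argparse.ArgumentParser(description="Hypothetical javadoc tooling")
    sub = parser.add_subparsers(dest="command", required=True)

    table = sub.add_parser("table", help="create the table of modules")
    table.add_argument("--bom-version", required=True)

    docs = sub.add_parser("javadocs", help="generate javadocs for modules")
    docs.add_argument("modules", nargs="+")

    args = parser.parse_args(argv)
    if args.command == "table":
        return [build_table(args.bom_version)]
    return generate_javadocs(args.modules)


# Example invocation with a made-up BOM version.
print(main(["table", "--bom-version", "26.1.0"]))
```

Because the second subcommand only receives a list of modules, swapping regeneration for a download of pre-built javadocs would not affect the table step.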

@alicejli
Contributor Author

> > > Is the site directory included in the source intentionally?
>
> > Yep - there are some files within it that are needed to populate the GitHub Pages site (e.g. config.toml, some css/html stuff) that are not re-created with every release.
>
> Did you hand-write all of those files, or import them from somewhere?

The content/docs/_index.md, config.toml, and custom files in layouts/shortcodes (javaModuleLibraryReferenceLinks.html, javaModuleProductReferenceLinks.html, libraryTable.html, variable.html) were manually edited by me.

The files within data/ and javadocHelpers are created via the scripts in .github/workflows/javadocs_scripts.

Everything else within site is boilerplate from the Docsy template (https://github.com/google/docsy-example).

@meltsufin
Member

> > > > Is the site directory included in the source intentionally?
>
> > > Yep - there are some files within it that are needed to populate the GitHub Pages site (e.g. config.toml, some css/html stuff) that are not re-created with every release.
>
> > Did you hand-write all of those files, or import them from somewhere?
>
> The content/docs/_index.md, config.toml, and custom files in layouts/shortcodes (javaModuleLibraryReferenceLinks.html, javaModuleProductReferenceLinks.html, libraryTable.html, variable.html) were manually edited by me.
>
> The files within data/ and javadocHelpers are created via the scripts in .github/workflows/javadocs_scripts.
>
> Everything else within site is boilerplate from the Docsy template (https://github.com/google/docsy-example).

What's the benefit of keeping the generated code in the repository? Also, the whole Hugo and Docsy setup seems a bit heavy for what appears to be a single documentation index page. Can we just use some plain HTML with a bit of CSS?

@alicejli
Contributor Author

Had an offline discussion with @burkedavison and @meltsufin; documenting below for reference:

  1. This PR will get reworked to populate the table directly into this repo's README.md file as part of the release.

     [image]

  2. Will then see how we can mirror this README.md somewhere on Cloud RAD.

  3. Will spin up a separate thread around auditing Maven artifact generation to see if it makes sense to change what is currently published in Maven both for ease of customer use and for javadoc generation.

  4. The question of hosting standard Javadocs is part of a larger question that we will discuss internally.
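
For the README approach, a minimal sketch of rendering a module table as Markdown that a release step could write into README.md. The column names, the sample rows, and the `render_markdown_table` helper are assumptions for illustration; the thread does not spell out the actual table layout.

```python
from typing import List, Tuple


def render_markdown_table(rows: List[Tuple[str, str]]) -> str:
    """Render (module, version) pairs as a Markdown table, sorted by module name."""
    lines = ["| Module | Version |", "| --- | --- |"]
    for name, version in sorted(rows):
        lines.append(f"| {name} | {version} |")
    return "\n".join(lines)


# Hypothetical module names and versions.
print(render_markdown_table([("google-cloud-storage", "2.22.4"), ("gax", "2.28.1")]))
```

Sorting the rows keeps the generated table stable between releases, so README diffs only show real version changes.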

@alicejli added the "do not merge" label (Indicates a pull request not ready for merge, due to either quality or timing.) on Jun 29, 2023
@alicejli
Contributor Author

Closing this in favor of #6083

@alicejli closed this on Jul 11, 2023
@alicejli deleted the GithubPagesWorkflows branch on July 25, 2023, 16:53