fix: improve information on CancellationException for LROs#2236
fix: improve information on CancellationException for LROs#2236diegomarquezp merged 22 commits intomainfrom
Conversation
| if (super.shouldRetry(nextAttemptSettings)) { | ||
| return true; | ||
| } | ||
| if (LOGGER.isLoggable(Level.SEVERE)) { |
There was a problem hiding this comment.
QQ, is there a reason SEVERE logging level was picked? It seems like that is meant for more fatal application issues (and less for an RPC timeout), but I also am not sure if there is a better level (Info? Warning?).
There was a problem hiding this comment.
That makes sense. My reasoning was more like "exception=severe", but in the context of a timeout on a request I agree something like WARNING is better
There was a problem hiding this comment.
I decreased it to WARNING
| */ | ||
| public class OperationTimedPollAlgorithm extends ExponentialRetryAlgorithm { | ||
|
|
||
| @VisibleForTesting |
There was a problem hiding this comment.
nit: I don't think the logger needs a VisibleForTesting annotation.
| @Test | ||
| public void testGRPC_LROUnsuccessfulResponse_exceedsTotalTimeout_throwsDeadlineExceededException() | ||
| throws Exception { | ||
| FakeLogHandler logHandler = new FakeLogHandler(); |
There was a problem hiding this comment.
nit: Might be better to create field for FakeLogHandler that all tests can use without having to initialize in multiple tests, if possible
setUp() -> Creates it
clean() -> clears all the message so the next test can run
There was a problem hiding this comment.
Makes sense, thanks! I only added a setUp() since creating a new handler wipes out the old handler with its maessages
| FakeLogHandler logHandler = new FakeLogHandler(); | ||
| OperationTimedPollAlgorithm.LOGGER.addHandler(logHandler); |
lqiu96
left a comment
There was a problem hiding this comment.
LGTM, added a few nits. Can you also have @burkedavison or @JoeWang1127 take a quick look? Thanks for this @diegomarquezp
| */ | ||
| public class OperationTimedPollAlgorithm extends ExponentialRetryAlgorithm { | ||
|
|
||
| public static Logger LOGGER = Logger.getLogger(OperationTimedPollAlgorithm.class.getName()); |
There was a problem hiding this comment.
Why the logger has to be public?
There was a problem hiding this comment.
This is so it can be accessed by the test in order to add a log handler. @lqiu96, maybe we can fall back to @VisibleForTesting because it is... visible only for testing?
There was a problem hiding this comment.
Ah, right. Let's try the the default package scope, which I believe should still be accessible. If not, then we can add the annotation.
There was a problem hiding this comment.
There're some suggestions made by Quality Gate: https://sonarcloud.io/project/issues?id=googleapis_gapic-generator-java&pullRequest=2236&resolved=false&types=CODE_SMELL
| int attemptCount = operationFuture.getPollingFuture().getAttemptSettings().getAttemptCount(); | ||
| assertThat(attemptCount).isGreaterThan(1); | ||
| assertEquals(1, logHandler.getAllMessages().size()); | ||
| assertThat(logHandler.getAllMessages().get(0)) |
There was a problem hiding this comment.
Is there a guarantee that the message is the first element of the list?
There was a problem hiding this comment.
So far the implementation has only one case where the logger is used. I think the current assertions are a safe heuristic. Maybe I can assert that at least one message contains the entry I'm looking for?
There was a problem hiding this comment.
Maybe I can assert that at least one message contains the entry I'm looking for?
I think this maybe better.
gax-java/gax/src/test/java/com/google/api/gax/util/FakeLogHandler.java
Outdated
Show resolved
Hide resolved
|
I don't have a lot of context of this change so I provide some general suggestions. |
| @@ -39,6 +43,13 @@ | |||
| */ | |||
| public class ITLongRunningOperation { | |||
There was a problem hiding this comment.
Are we sure that this already-complex integration test the best place to add these new assertions? Rephrased: Is the showcase server needed to verify the client's logging behavior?
The changes we're making are in com.google.api.gax.longrunning.OperationTimedPollAlgorithm, but we don't yet have any unit tests for this class. Is it possible for us to write a direct unit test that focuses purely on the logging behavior of OperationTimedPollAlgorithm.shouldRetry()?
Without looking at this PR and only the logic we're trying to test, I would have expected to see a set of unit tests that were roughly:
// WARNING: This is all completely pseudocode...
public void testTroubleshootingLinkProvided() {
LOGGER.addHandler(handler);
assertThrows(..., algorithm.shouldRetry(NEXT_ATTEMPT_SHOULD_NOT_RETRY));
assertThat(handler.getLogMessage().exists())
assertThat(handler.getLogMessage().contains(LRO_TROUBLESHOOTING_LINK));
}
public void testTroubleshootingLinkNotProvided() {
LOGGER.addHandler(handler);
assertThat(algorithm.shouldRetry(NEXT_ATTEMPT_SHOULD_RETRY), is(true));
assertThat(handler.getLogMessage().doesNotExist())
}
Then the integration test can be kept clear of client-side-only implementation details, allowing the person debugging a problem to focus only on the relevant interactions.
There was a problem hiding this comment.
@burkedavison thank you. At some point writing the test in this file helped me find that the exception thrown in OperationTimedPollAlgorithm does not bubble up to the client (hence the use of a logger), but maybe that's just part of finding out about the behavior of LROs. I will move the test case to a unit test in gax
| FakeApiClock clock = new FakeApiClock(System.nanoTime()); | ||
| OperationTimedPollAlgorithm algorithm = | ||
| OperationTimedPollAlgorithm.create(FAST_RETRY_SETTINGS, clock); | ||
| TimedAttemptSettings settings = | ||
| TimedAttemptSettings.newBuilder() | ||
| .setGlobalSettings(FAST_RETRY_SETTINGS) | ||
| .setRetryDelay(Duration.ofMillis(1l)) | ||
| .setRpcTimeout(Duration.ofMillis(1l)) | ||
| .setRandomizedRetryDelay(Duration.ofMillis(1l)) | ||
| .setAttemptCount(0) | ||
| .setFirstAttemptStartTimeNanos(clock.nanoTime()) | ||
| .build(); |
There was a problem hiding this comment.
Thank you for the improvement by moving this to a unit test!
At this point, I'm just nit-picking: you could choose to extract all three of these variables to class fields, and further reduce both of these tests to their essential checks.
| .setAttemptCount(0) | ||
| .setFirstAttemptStartTimeNanos(clock.nanoTime()) | ||
| .build(); | ||
| clock.incrementNanoTime(1 * 1000 * 1000 * 1000); |
There was a problem hiding this comment.
Consider a quick comment here to clarify what part of the logic this is triggering to force a no-retry.
For example (if this is accurate):
clock.incrementNanoTime(1 * 1000 * 1000 * 1000); // Force Rpc TimeoutThere was a problem hiding this comment.
good point. I added this comment
| private static TimedAttemptSettings timedAttemptSettings; | ||
| private static FakeApiClock clock; |
There was a problem hiding this comment.
By making these static and initializing them in an instance method (@Before), the test now becomes flaky if multithreading is ever enabled. One test may overwrite the reference during a different test.
| private static TimedAttemptSettings timedAttemptSettings; | |
| private static FakeApiClock clock; | |
| private TimedAttemptSettings timedAttemptSettings; | |
| private FakeApiClock clock; |
There was a problem hiding this comment.
ah, good catch. thanks. I didn't think through the need for static in this test class. I just switched them as instance variables
edit: I didn't notice this comment had a suggestion but I added the same change in another commit
|
[gapic-generator-java-root] Kudos, SonarCloud Quality Gate passed! |
|
[java_showcase_integration_tests] Kudos, SonarCloud Quality Gate passed! |








Fixes #1858
When a LRO runs out of retries/time, it throws a
CancellationExceptionthat is then swallowed by guava'sAbstractFuture, meaning there is no simple way to bubble up a custom exception with message.This approach uses a logger with a message that points to LRO documentation to ensure the user is informed on what they can do about this.