Get optimizations from target-postgres dependency #3
Merged
Conversation
KBorders01
approved these changes
Jun 9, 2022
KBorders01
left a comment
These are good optimizations. Let’s submit them back to data mill as well if that can be done quickly and easily.
Author
I opened a PR for that repo: https://github.com/datamill-co/target-snowflake/pull/35/files
However, there are a bunch of open PRs there which are quite old, and there hasn't been a merge since Jan 2021, so it seems that it isn't being maintained anymore.
Ah, okay, that's unfortunate. Hopefully this will be good enough that we
don't need to do too many more fixes in the future.
…On Thu, Jun 9, 2022 at 1:27 PM Nick Smith ***@***.***> wrote:
These are good optimizations. Let’s submit them back to data mill as well
if that can be done quickly and easily.
I opened a PR for that repo at
https://github.com/datamill-co/target-snowflake/pull/35/files
However, there are a bunch of open PRs there which are quite old, and
there hasn't been a merge since Jan 2021, so it seems that it isn't being
maintained anymore.
Update the target-postgres dependency to get the latest master branch commit, which is many commits ahead of the 0.2.4 package version. This provides several performance optimizations that have been added, such as:
Additionally, bypass table insertion work when the record count is zero. The code still respects the persist_empty_tables setting to manage the table schema itself, but will not go through the expensive process of performing a zero-record insertion. That process takes ~6s per table, which in the GitHub tap means a few minutes of completely wasted time for an ETL process that has little-to-no new data.
Testing
Tested locally with tap-github running on minwareco/repotest, which brought runtime down from ~1m45s to ~40s. Most of the remaining run time will be eliminated when https://minware.atlassian.net/browse/MW-496 is completed, which will cause only one GitHub ingest pipeline per org to process the global data (e.g. collaborators, issues).
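For readers who want a picture of the zero-record bypass described in this PR, here is a minimal sketch in Python. It is not the code from this change: FakeTarget, upsert_table_schema, insert_records, and write_batch are hypothetical names invented for illustration, and only the persist_empty_tables setting comes from the description. The idea is simply that an empty batch never reaches the expensive insertion path, while the schema is still managed when persist_empty_tables is enabled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FakeTarget:
    """Hypothetical stand-in for the real target; it only records which steps ran."""
    persist_empty_tables: bool = False
    calls: List[str] = field(default_factory=list)

    def upsert_table_schema(self, table: str) -> None:
        # Assumed cheap step: create or update the table definition.
        self.calls.append(f"schema:{table}")

    def insert_records(self, table: str, records: List[Dict]) -> None:
        # Assumed expensive step: the one the PR avoids for empty batches.
        self.calls.append(f"insert:{table}:{len(records)}")


def write_batch(target: FakeTarget, table: str, records: List[Dict]) -> int:
    """Write a batch, skipping the insertion path entirely when it is empty."""
    if not records:
        # Still honor persist_empty_tables so the table schema exists,
        # but never run a zero-record insertion.
        if target.persist_empty_tables:
            target.upsert_table_schema(table)
        return 0
    target.upsert_table_schema(table)
    target.insert_records(table, records)
    return len(records)
```

With persist_empty_tables enabled, calling write_batch with an empty list touches only the schema step; with it disabled, nothing runs at all. Non-empty batches go through both steps as before.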