improvement(connectors): audit and harden all 30 knowledge base connectors #3603
waleedlatif1 merged 6 commits into staging
PR Summary
Medium Risk
Overview
Improves connector correctness and efficiency:
Updates auth behavior and metadata: fixes Notion refresh-token exchange to use Basic Auth + JSON body, adjusts required OAuth scopes (Slack private channels, Microsoft Teams channel lookup) and trims Salesforce provider scopes, and adds/extends Last Modified/Created tagging for GitHub, Google Calendar, and Google Sheets.
Also adds a new …
Written by Cursor Bugbot for commit 441d2a6.
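To make the Notion refresh-token change in the summary above concrete, here is a minimal sketch of a request built with Basic Auth and a JSON body. The helper name `buildNotionRefreshRequest` and the `TokenRequest` shape are hypothetical and are not taken from this PR; the sketch only builds the request and does not send it.

```typescript
// Hypothetical shape for illustration; not the connector's actual types.
interface TokenRequest {
  url: string
  init: { method: string; headers: Record<string, string>; body: string }
}

// Builds (but does not send) a refresh-token request:
// client credentials go in a Basic Auth header, grant parameters in a JSON body.
function buildNotionRefreshRequest(
  clientId: string,
  clientSecret: string,
  refreshToken: string
): TokenRequest {
  const basic = Buffer.from(`${clientId}:${clientSecret}`, 'utf8').toString('base64')
  return {
    url: 'https://api.notion.com/v1/oauth/token',
    init: {
      method: 'POST',
      headers: {
        Authorization: `Basic ${basic}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        grant_type: 'refresh_token',
        refresh_token: refreshToken,
      }),
    },
  }
}
```

The returned `url` and `init` could then be passed to `fetch(url, init)` or to a retry wrapper.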
Greptile Summary
This PR is a broad audit and hardening pass across all 30 knowledge base connectors, fixing a range of correctness, security, and performance issues found by cross-referencing each connector against its service's official API docs. Key changes and their impact:
Confidence Score: 4/5
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Connector sync triggered] --> B{listDocuments}
B -->|cursor / pages| C[External API call\nwith fetchWithRetry]
C --> D[Store result in syncContext\ne.g. cloudId / portalId / siteId]
D --> E[Build ExternalDocument list]
E --> F{hasMore?}
F -->|yes| B
F -->|no| G[Sync engine: per-document refresh]
G --> H{getDocument}
H --> I{syncContext hit?}
I -->|yes - HubSpot · Jira · SF · SP · Slack| J[Reuse cached ID\nno extra API call]
I -->|no - Google Sheets| K[Extra Drive API call\nfor modifiedTime each time]
J --> L[Fetch document content]
K --> L
L --> M[Build ExternalDocument\nwith tags & contentHash]
style K fill:#ffe0e0,stroke:#cc0000
style I fill:#e0f0ff
|
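The "syncContext hit?" branch in the flowchart can be sketched roughly as follows. `getCachedId`, the `SyncContext` alias, and the `lookup` callback are hypothetical names for illustration; the connectors' real helpers may look different.

```typescript
// Hypothetical per-sync cache shape; the real syncContext type may differ.
type SyncContext = Record<string, unknown>

// Returns a cached ID (e.g. cloudId / portalId / siteId) when one is present,
// otherwise performs the lookup once and stores the result for later calls.
async function getCachedId(
  syncContext: SyncContext | undefined,
  key: string,
  lookup: () => Promise<string>
): Promise<string> {
  const cached = syncContext?.[key]
  if (typeof cached === 'string') return cached

  const id = await lookup()
  if (syncContext) syncContext[key] = id
  return id
}
```

With a shared `syncContext`, `listDocuments` pays for the lookup once and each later `getDocument` call reuses the stored value, which matches the "no extra API call" path in the diagram.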
…orce
- HubSpot: revert to Search API (POST /search) to restore lastmodifieddate DESCENDING sorting
- Salesforce: restore ArticleBody field and add it to HTML_FIELDS for proper stripping
- Jira: add zero-remaining guard to prevent requesting 0 maxResults
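The Jira zero-remaining guard from the commit above could look something like this. `nextMaxResults` and its parameters are hypothetical, and treating a cap of 0 as "no cap" is an assumption of this sketch, not something the PR states.

```typescript
// Decides how many results to request for the next page.
// Returns null when the cap has been reached, so the caller can stop paging
// instead of sending a request with maxResults=0.
// Assumption: maxItems <= 0 means "no cap".
function nextMaxResults(maxItems: number, fetchedSoFar: number, pageSize = 50): number | null {
  if (maxItems <= 0) return pageSize

  const remaining = maxItems - fetchedSoFar
  if (remaining <= 0) return null // zero-remaining guard

  return Math.min(pageSize, remaining)
}
```

A caller would check for `null` before building the request and report `hasMore: false` in that case.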
…Version field
ArticleBody is not a standard field on KnowledgeArticleVersion per Salesforce API docs. Article body content lives in custom fields on org-specific __kav objects. Including ArticleBody in the SOQL query would cause runtime errors.
|
@greptile |
|
@cursor review |
- OneDrive: use Buffer.subarray for byte-accurate truncation instead of
character-count slice
- Reddit: deduplicate comment extraction — fetchPostComments now calls
extractComments instead of duplicating the logic
- Webflow: replace crude value.includes('<') with regex /<[a-z][^>]*>/i
to avoid false positives on plain text containing '<'
- Jira: add response.ok check in getJiraCloudId before parsing JSON to
surface real HTTP errors instead of misleading "No Jira resources found"
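Two of the fixes listed above, the OneDrive byte-accurate truncation and the Webflow HTML check, can be sketched as small helpers. The function names `truncateToBytes` and `looksLikeHtml` are hypothetical; only `Buffer.subarray` and the regex `/<[a-z][^>]*>/i` come from the commit message.

```typescript
// Truncates by UTF-8 byte length rather than by character count, so multi-byte
// text cannot exceed the byte limit. Note: if the cut lands in the middle of a
// multi-byte character, decoding replaces the partial sequence with U+FFFD.
function truncateToBytes(text: string, maxBytes: number): string {
  const buf = Buffer.from(text, 'utf8')
  if (buf.length <= maxBytes) return text
  return buf.subarray(0, maxBytes).toString('utf8')
}

// Matches "<" only when it is immediately followed by a letter and later closed
// by ">", so plain text such as "a < b" is not mistaken for HTML.
function looksLikeHtml(value: string): boolean {
  return /<[a-z][^>]*>/i.test(value)
}
```

For comparison, `text.slice(0, maxBytes)` counts UTF-16 code units, so a string of accented or CJK characters could be up to several times larger than the intended byte limit.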
|
Addressed both issues from the review summary:
Both fixed in 441d2a6. |
|
@greptile |
|
@cursor review |
|
@cursor review |
…Outlook URL encoding
- Jira: replace bare fetch() with fetchWithRetry in downloadJiraAttachments for retry logic on transient errors and rate limits
- Outlook: use URLSearchParams in validateConfig $search URL construction to match buildInitialUrl and produce RFC 3986 compliant encoding
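As an illustration of the Outlook change above, a `$search` URL built with `URLSearchParams` might look like this. `buildOutlookSearchUrl` is a hypothetical helper, and the `/me/messages` path and `$top` parameter are assumptions for the sketch; the real `validateConfig` and `buildInitialUrl` are not shown in this conversation.

```typescript
// Builds a Microsoft Graph messages URL with a quoted $search term.
// URLSearchParams handles the escaping, so quotes, ampersands, and spaces in
// the query cannot break the URL the way manual string concatenation can.
// This sketch does not escape double quotes inside the query itself.
function buildOutlookSearchUrl(query: string, top = 50): string {
  const params = new URLSearchParams()
  params.set('$search', `"${query}"`)
  params.set('$top', String(top))
  return `https://graph.microsoft.com/v1.0/me/messages?${params.toString()}`
}
```

Parsing the result with `new URL(url).searchParams.get('$search')` returns the original quoted query, which is a quick way to confirm the encoding round-trips.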
|
Addressed the review findings: Fixed:
Won't fix (design limitations, not bugs): |
|
@greptile |
…ctors (simstudioai#3603)
* improvement(connectors): audit and harden all 30 knowledge base connectors
* fix(oauth): update Notion test to match Basic Auth + JSON body config
* fix(connectors): address PR review comments for hubspot, jira, salesforce
- HubSpot: revert to Search API (POST /search) to restore lastmodifieddate DESCENDING sorting
- Salesforce: restore ArticleBody field and add it to HTML_FIELDS for proper stripping
- Jira: add zero-remaining guard to prevent requesting 0 maxResults
* fix(salesforce): revert ArticleBody — not a standard KnowledgeArticleVersion field
ArticleBody is not a standard field on KnowledgeArticleVersion per Salesforce API docs. Article body content lives in custom fields on org-specific __kav objects. Including ArticleBody in the SOQL query would cause runtime errors.
* fix(connectors): address second round of PR review comments
- OneDrive: use Buffer.subarray for byte-accurate truncation instead of character-count slice
- Reddit: deduplicate comment extraction — fetchPostComments now calls extractComments instead of duplicating the logic
- Webflow: replace crude value.includes('<') with regex /<[a-z][^>]*>/i to avoid false positives on plain text containing '<'
- Jira: add response.ok check in getJiraCloudId before parsing JSON to surface real HTTP errors instead of misleading "No Jira resources found"
* fix(jira,outlook): replace raw fetch in downloadJiraAttachments, fix Outlook URL encoding
- Jira: replace bare fetch() with fetchWithRetry in downloadJiraAttachments for retry logic on transient errors and rate limits
- Outlook: use URLSearchParams in validateConfig $search URL construction to match buildInitialUrl and produce RFC 3986 compliant encoding
Summary
Type of Change
Testing
Tested via TypeScript compilation and lint. All 30 connectors validated against API docs.
Checklist