oracledb_cdc: support cache resources for internal transaction buffer #4409
josephwoodward wants to merge 1 commit into
Conversation
```go
st.Events[i] = serializedDMLEvent{
	Operation:     ev.Operation,
	Schema:        ev.Schema,
	Table:         ev.Table,
	SQLRedo:       ev.SQLRedo,
	Data:          encodeMap(ev.Data),
	OldValues:     encodeMap(ev.OldValues),
	Timestamp:     ev.Timestamp,
	TransactionID: ev.TransactionID,
```
int64 values lose precision when round-tripping through JSON.
In encodeVal, an int64 is stored as typedVal{T: typeInt64, V: int64Val} and marshaled as a plain JSON number. When unmarshaled into typedVal{V any}, encoding/json decodes numbers into float64 by default — so on the decode side tv.V is always a float64. Casting float64 back to int64 silently loses precision for any value larger than 2^53 (~9.0e15).
This path is reachable in practice: OracleValueConverter.ConvertValue (internal/impl/oracledb/logminer/sqlredo/valueconverter.go:80) parses bare numeric SQL literals via strconv.ParseInt(str, 10, 64), so an Oracle NUMBER primary key or ID column larger than 2^53 ends up in DMLEvent.Data as an int64. When the buffered transaction is later read back from the cache and committed, the value emitted downstream is no longer the value Oracle wrote.
Suggested fix: decode the JSON using a json.Decoder with UseNumber() (or store int64 as a string in the envelope) so the integer can be reconstructed without going through float64.
```go
txn, err := unmarshalTransaction(data)
if err != nil {
	c.log.Errorf("Failed to deserialize transaction %s: %v", txnID, err)
	return nil
```
No tests for ConnectCacheResource.
This PR adds ~349 lines of new code, including a non-trivial type-tagged JSON serialization layer (encodeVal/decodeVal/encodeMap/decodeMap/marshalTransaction/unmarshalTransaction) plus the cache-backed TransactionCache lifecycle (StartTransaction / AddEvent / Commit / Rollback and the discarded set / metrics interactions), but no unit or integration tests exercise any of it — internal/impl/oracledb/logminer/logminer_test.go and internal/impl/oracledb/integration_test.go make no reference to ConnectCacheResource or the new transaction_cache field.
Per the project's test patterns, please add at least:
- Unit tests for round-tripping `Transaction`/`DMLEvent` through `marshalTransaction`/`unmarshalTransaction`, covering each value type produced by `OracleValueConverter` (`string`, `int64`, `json.Number`, `[]byte`, `time.Time`, nil) — this would also surface the precision issue flagged separately.
- Behavioral tests that drive `StartTransaction`/`AddEvent`/`CommitTransaction`/`RollbackTransaction` against a real `service.Cache` (a memory cache via `service.MockResources` works) to validate the `discarded` tracking and the max-events discard path.
Hi @josephwoodward — a few concerns worth surfacing on this change, ordered by severity.

P0 — Correctness
Currently OracleDB CDC buffers transactions via an internal, in-memory cache. It does this because it has to wait to see if the transaction rows are committed or rolled back before it can decide how to process them.
This change makes that internal buffer configurable via Connect's cache resources, reducing the connector's memory footprint and improving its reliability for workloads with long-running transactions.
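A sketch of how the new field might be wired up in a Connect config. The `transaction_cache` field name comes from this PR; the cache label, the choice of a `redis` cache, and its settings are illustrative assumptions:

```yaml
cache_resources:
  - label: txn_buffer
    redis:
      url: redis://localhost:6379

input:
  oracledb_cdc:
    # ... existing connection settings ...
    transaction_cache: txn_buffer
```

Any configured cache resource should work here; a durable backend (e.g. Redis) trades the in-memory buffer's footprint for external storage when transactions run long.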