This Flux sql.to() function did not report any errors in V2.4, but the data was not written to PGSQL
it is working in old V2.0.7.
import “sql”
data = from(bucket: “SEMW”)
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r[“_measurement”] == “Welding”)
|> filter(fn: (r) => r[“_field”] == “WeldResult”)
|> aggregateWindow(every: 1m, fn: last, createEmpty: false)
|> yield(name: “last”)
data
|> sql.to(
driverName: “postgres”,
dataSourceName: “postgresql://postgres:e82e0c77a0eb@10.177.108.247:5432/analytic_data?sslmode=disable”,
table: “espc.sample_data”)
Anaisdg
December 27, 2022, 7:05pm
2
Hello @FrankSONG ,
Welcome!
Hmm that seems odd.
@scott do you know anything about this?
The addition of quotes was added to Flux v0.150.0 via bugfix [1]. This was done with security in mind, particularly injection vulnerabilities.
The goal is to make sure identifiers being added to our SQL statements are quoted and escaped.
influxdata:master
← influxdata:fix/quote-db-idents
opened 10:21PM - 07 Dec 21 UTC
The aim here is to ensure all _identifiers_ (ex: table/column names) being added… to our SQL statements are _quoted and escaped_ to help mitigate risk of SQL injection.
Quoting and escaping happens in a couple places:
- in `sql.to` itself where we generate-then-exec SQL statements
- in each driver's "translate column" function
The translate column function's job is to produce appropriate DDL for use in a `CREATE TABLE` statement and as such, the content needs to be formatted appropriately to contain characters that would otherwise be invalid (not necessarily malicious).
Table names are not technically a part of the attack surface for Flux since it is specified as a string argument to `sql.to` putting it into the "trusted input" category; a string present in the flux program source.
Column names are _technically a risk_ since they are derived from the field names on incoming records from arbitrary sources.
The database integration tests have been updated to purposefully require quoting as a way to exercise the new code changes (the table name and new column has a space in it).
These changes have been _manually tested_ against BigQuery and Snowflake. _A write-up on the process for manually testing will land in the internal docs soon._
## Caveats
### Prior quoted identifiers
Folks who used `sql.to` prior to this change and happened to need to quote their identifiers themselves will need to remove _their quoting_ to avoid their quotes being escaped and included in the identifier name by _our quoting_.
### ~~Snowflake's~~ Identifier Casing
Snowflake has a quirk when it comes to identifiers. If the identifier is a bare
word (ie, _not quoted_) then it'll be automatically uppercased.
This means bare word identifiers, regardless of how they are cased, match since
they are all normalized to uppercase.
Since we're aiming to quote everything we generate in Flux (previously this was
not the case), this means callers will need to match the case of the objects in
their database.
In the case of a _new table_ being named in `sql.to`, flux will issue SQL to
create the table based on the incoming record type. The identifiers in the DDL
will (now) be quoted in this situation and so their casing will be preserved.
Assuming the identifier wasn't _already_ uppercased, this technically adds a new
requirement to now _always quote the identifier_ when it appears in queries.
I suspect it is what it is - if we need to quote all identifiers to mitigate
risk of injection, then Snowflake users will just need to be aware of the quirk
and take it into account.
> SAP HANA has a similar behavior, but continues to use the established practice of forcing identifiers to UPPERCASE (which comes with some problems of its own, noted via comments in the diff).
The big breaking change in this diff comes from the behavior changes we see in case sensitivity, and this isn't limited to Snowflake alone. Almost all of the supported DB engines (except perhaps Vertica) normalize bare word identifiers to lower case. A few normalize to upper case. Regardless of which way they normalize, it's still an opportunity for a casing mismatch once the identifiers are being quoted.
### Done checklist
- [ ] docs/SPEC.md updated **N/A**
- [x] Test cases written
1 Like
@Anaisdg Please look at the last message