sql.to() function not working in V2.4.0

FrankSONG · December 24, 2022, 2:22pm

This Flux sql.to() function did not report any errors in V2.4, but the data was not written to PGSQL
it is working in old V2.0.7.

import “sql”
data = from(bucket: “SEMW”)
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r[“_measurement”] == “Welding”)
|> filter(fn: (r) => r[“_field”] == “WeldResult”)
|> aggregateWindow(every: 1m, fn: last, createEmpty: false)
|> yield(name: “last”)

data
|> sql.to(
driverName: “postgres”,
dataSourceName: “postgresql://postgres:e82e0c77a0eb@10.177.108.247:5432/analytic_data?sslmode=disable”,
table: “espc.sample_data”)

Anaisdg · December 27, 2022, 7:05pm

Hello @FrankSONG,
Welcome!
Hmm that seems odd.
@scott do you know anything about this?

FrankSONG · December 28, 2022, 1:56am

The addition of quotes was added to Flux v0.150.0 via bugfix [1]. This was done with security in mind, particularly injection vulnerabilities.
The goal is to make sure identifiers being added to our SQL statements are quoted and escaped.

github.com/influxdata/flux

fix(stdlib/sql): quote db identifiers

influxdata:master ← influxdata:fix/quote-db-idents

opened 10:21PM - 07 Dec 21 UTC

onelson

+491 -348

The aim here is to ensure all _identifiers_ (ex: table/column names) being added… to our SQL statements are _quoted and escaped_ to help mitigate risk of SQL injection. Quoting and escaping happens in a couple places: - in `sql.to` itself where we generate-then-exec SQL statements - in each driver's "translate column" function The translate column function's job is to produce appropriate DDL for use in a `CREATE TABLE` statement and as such, the content needs to be formatted appropriately to contain characters that would otherwise be invalid (not necessarily malicious). Table names are not technically a part of the attack surface for Flux since it is specified as a string argument to `sql.to` putting it into the "trusted input" category; a string present in the flux program source. Column names are _technically a risk_ since they are derived from the field names on incoming records from arbitrary sources. The database integration tests have been updated to purposefully require quoting as a way to exercise the new code changes (the table name and new column has a space in it). These changes have been _manually tested_ against BigQuery and Snowflake. _A write-up on the process for manually testing will land in the internal docs soon._ ## Caveats ### Prior quoted identifiers Folks who used `sql.to` prior to this change and happened to need to quote their identifiers themselves will need to remove _their quoting_ to avoid their quotes being escaped and included in the identifier name by _our quoting_. ### ~~Snowflake's~~ Identifier Casing Snowflake has a quirk when it comes to identifiers. If the identifier is a bare word (ie, _not quoted_) then it'll be automatically uppercased. This means bare word identifiers, regardless of how they are cased, match since they are all normalized to uppercase. Since we're aiming to quote everything we generate in Flux (previously this was not the case), this means callers will need to match the case of the objects in their database. In the case of a _new table_ being named in `sql.to`, flux will issue SQL to create the table based on the incoming record type. The identifiers in the DDL will (now) be quoted in this situation and so their casing will be preserved. Assuming the identifier wasn't _already_ uppercased, this technically adds a new requirement to now _always quote the identifier_ when it appears in queries. I suspect it is what it is - if we need to quote all identifiers to mitigate risk of injection, then Snowflake users will just need to be aware of the quirk and take it into account. > SAP HANA has a similar behavior, but continues to use the established practice of forcing identifiers to UPPERCASE (which comes with some problems of its own, noted via comments in the diff). The big breaking change in this diff comes from the behavior changes we see in case sensitivity, and this isn't limited to Snowflake alone. Almost all of the supported DB engines (except perhaps Vertica) normalize bare word identifiers to lower case. A few normalize to upper case. Regardless of which way they normalize, it's still an opportunity for a casing mismatch once the identifiers are being quoted. ### Done checklist - [ ] docs/SPEC.md updated **N/A** - [x] Test cases written

FrankSONG · December 28, 2022, 1:56am

@Anaisdg Please look at the last message

Topic		Replies	Views
Flux and sql.from	2	430	May 5, 2021
"to" function doesn't work at all Fluxlang influxdb	2	460	June 30, 2020
Sending data to SQL server Fluxlang influxdb , flux	3	480	August 25, 2021
Mqtt.to() not working in v2.5.1 InfluxDB 2 influxdb , influxql	2	360	March 27, 2023
How to get more information out of an error?	2	411	May 10, 2022

sql.to() function not working in V2.4.0

Related topics