To fully understand what is happening, it is necessary to explain the internals of the WriteApi implementation. It works with two internal buffered structures, the write buffer and the retry buffer. The description below is extracted from reading the code.
The write buffer contains lines that were not yet sent to InfluxDB:
- this buffer is required to optimize communication with InfluxDB
- lines are appended to this buffer by every writeXYZ function of the API; data points are converted to line-protocol lines when they are added to the buffer
- the maximum capacity of the buffer is 1000 lines or 50 MB; when the buffer is full, the lines are sent to InfluxDB
- a timer is scheduled right after the buffer becomes non-empty to send the buffer contents after 60 seconds by default; the timer is canceled every time the data is sent
- the buffer can be flushed programmatically to send its data to InfluxDB by writeApi.flush; writeApi.close also calls writeApi.flush (see the sketch after this list)
- the buffer becomes empty right before its content is sent to InfluxDB
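A minimal sketch of how the write buffer is typically driven from calling code, assuming the JavaScript client @influxdata/influxdb-client; the url, token, org and bucket are placeholders and the option values are the defaults described above (verify them against your client version):

```js
import {InfluxDB, Point} from '@influxdata/influxdb-client'

const influxDB = new InfluxDB({url: 'http://localhost:8086', token: 'my-token'})

const writeApi = influxDB.getWriteApi('my-org', 'my-bucket', 'ms', {
  batchSize: 1000, // send the write buffer after 1000 lines (default)
  flushInterval: 60_000, // or at most 60 seconds after it becomes non-empty (default)
})

async function writeSample() {
  // every writePoint/writeRecord call converts the data to a protocol line
  // and appends it to the write buffer
  writeApi.writePoint(new Point('temperature').tag('room', 'kitchen').floatField('value', 21.5))
  // send whatever is buffered right now
  await writeApi.flush()
  // close() also flushes the write buffer and cancels the internal timers
  await writeApi.close()
}

writeSample().catch((e) => console.error('write failed', e))
```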
The retry buffer contains lines that failed to be written to InfluxDB:
- lines are added to the retry buffer after sending the lines to InfluxDB fails and the resulting error is recoverable (networking/unrecognized errors, write HTTP status code >= 429)
- the retry buffer keeps up to 32_000 lines by default; whenever this maximum size is exceeded, the oldest lines are removed so that the newer lines fit in, an error message is written to the log when this happens, and the removed lines are lost
- when lines are added to the retry buffer, an internal entry is created that accompanies these lines with an expiration time after which the lines are no longer retried (180 seconds by default) and the count of already failed retries of these lines (initially 0)
- the retry buffer entry is scheduled for retrying
- the InfluxDB server can inform about the delay that is required before resending the failed lines; if it is found (a Retry-After HTTP header is present in the response), the send operation is initially scheduled to happen at that time, otherwise the next execution time is scheduled with a variant of an exponential backoff algorithm that randomizes the next send operation with respect to already failed retries (bigger delay after more failures)
- out of the box, the retry operation can happen at most 5 times, with delays varying from 5 to 125 seconds (plus a random jitter of 200 ms); lines are not retried more than 180 seconds after they first failed
- before the send operation is executed, the lines are removed from the retry buffer; they can be re-added to the retry buffer if sending of the retried lines fails again
- the internal retry mechanism is quite complicated; it can be turned off by setting maxRetries to 0 in writeOptions (passed when creating the WriteApi), the calling code can then retry the failed writes on its own (see the sketch after this list)
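A sketch of the retry-related writeOptions discussed above; the values shown are the defaults as described in this section, and setting maxRetries to 0 turns the internal retry mechanism off (the option names should be verified against your client version):

```js
import {InfluxDB} from '@influxdata/influxdb-client'

const influxDB = new InfluxDB({url: 'http://localhost:8086', token: 'my-token'})

const writeApi = influxDB.getWriteApi('my-org', 'my-bucket', 'ms', {
  maxBufferLines: 32_000, // retry buffer capacity, the oldest lines are dropped above this
  maxRetries: 5, // retry a failed batch at most 5 times (0 disables internal retries)
  maxRetryTime: 180_000, // lines are not retried more than 180 seconds after they first failed
  minRetryDelay: 5_000, // first retry roughly 5 seconds after the failure
  maxRetryDelay: 125_000, // exponential backoff is capped at 125 seconds
  retryJitter: 200, // plus a random jitter of up to 200 ms
})
```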
The send operation sends the supplied lines and handles write failures with the retry buffer; its implementation is sendBatch:
- if the send operation succeeds, writeSuccess is called if configured in the writeOptions
- if the send operation fails (for whatever reason):
  - the writeFailed function is called (when configured); this callback can return a Promise to inform that it provides a failure recovery on its own, otherwise a default failure recovery is applied (a sketch of these callbacks follows this list):
    - if the lines have already expired or exceeded the maximum retry count, an error is written to the log (customizable with the setLogger function) and the lines are lost
    - otherwise, a warning about the failed write attempt is written to the log and the failed lines are added to the retry buffer
  - when writeApi.flush was called programmatically, the calling code is informed about the failure by a rejected Promise
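A sketch of the writeSuccess and writeFailed callbacks mentioned above; returning a Promise from writeFailed tells the client that the callback handles the failure on its own, so the default recovery (retry buffer, logging) is skipped. The logging here is only illustrative:

```js
import {InfluxDB} from '@influxdata/influxdb-client'

const influxDB = new InfluxDB({url: 'http://localhost:8086', token: 'my-token'})

const writeApi = influxDB.getWriteApi('my-org', 'my-bucket', 'ms', {
  writeSuccess: (lines) => {
    console.log(`wrote ${lines.length} lines`)
  },
  writeFailed: (error, lines, attempt, expires) => {
    console.error(`write of ${lines.length} lines failed (attempt ${attempt}, expires ${expires}):`, error)
    // return a Promise here to take over failure recovery (e.g. persist the lines somewhere),
    // or return nothing to let the default retry-buffer recovery run
    return undefined
  },
})
```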
The node.js process keeps running as long as there are tasks in the event loop. The WriteApi implementation creates two timers that create such tasks:
- a timer that writes the lines that were not sent yet, expires at most after 60 seconds
- a timer that retries the lines that failed to write, cleared after the failed lines are either successfully written or expire
Both timers are canceled by calling writeApi.close or writeApi.dispose. An unfinished networking communication with InfluxDB can also block the node.js process from exiting. The networking timeout is set to 10 seconds by default.
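In a short-lived process such as an AWS Lambda handler, awaiting close() before returning makes sure that neither timer keeps the event loop alive; a rough sketch under that assumption (the handler and connection details are hypothetical):

```js
import {InfluxDB, Point} from '@influxdata/influxdb-client'

export const handler = async () => {
  const influxDB = new InfluxDB({url: 'http://localhost:8086', token: 'my-token', timeout: 10_000})
  const writeApi = influxDB.getWriteApi('my-org', 'my-bucket', 'ms')
  try {
    writeApi.writePoint(new Point('invocation').intField('count', 1))
    // flush the write buffer, stop retrying and cancel both timers so the process can exit
    await writeApi.close()
  } catch (e) {
    // close() rejects when the remaining lines could not be written
    console.error('some lines were not written', e)
  }
}
```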
Now I can try to answer your specific questions:
Does writePoints() keep the node.js process active indefinitely as it tries and keeps retrying the writes to InfluxDB's API (at least until the lambda itself times out)?
An unflushed write buffer or entries in the retry buffer keep the node.js process running. The write buffer becomes empty after at most 60 seconds (by default) and then it can no longer keep the process running. The retry buffer can take much more time to become empty; its timer keeps the process running until all possible retries are executed.
If I don't call WriteApi.close or flush after writePoints, is the process never going to exit?
It will, but it can take minutes to happen, depending on the content of the retry buffer.
And, most confusing, by calling close() am I flushing away points that may not have been written yet and therefore I’m losing data… without even knowing what I’ve lost?
Out of the box, close sends the lines that were not yet written to InfluxDB and then terminates all retries of the lines that already failed. You can lose the data that were not yet written (close then returns a rejected Promise) and you will lose all the lines that already failed to write.
Or, by calling close(), does the process stay active and keep retrying the writes and then just resolves the Promise once its successfully complete?
No, close will stop retrying. The returned Promise informs about the result of the lines that were not written yet. You can call writeApi.flush(true) to try to flush both the write and retry buffers; the lines in the retry buffer can then still be lost depending on the settings of maxRetries (5) and maxRetryTime (180 seconds) in writeOptions, see the sketch below.
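A sketch of flushing both buffers before shutdown, as suggested above; whether the lines from the retry buffer still make it depends on the maxRetries and maxRetryTime settings:

```js
import {InfluxDB} from '@influxdata/influxdb-client'

const writeApi = new InfluxDB({url: 'http://localhost:8086', token: 'my-token'})
  .getWriteApi('my-org', 'my-bucket')

async function shutdown() {
  try {
    // flush(true) tries to flush the retry buffer as well, not only the write buffer
    await writeApi.flush(true)
  } catch (e) {
    console.error('flush failed, some lines may be lost', e)
  } finally {
    // stop retrying and cancel the internal timers
    await writeApi.close()
  }
}

shutdown().catch((e) => console.error('shutdown failed', e))
```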
Or is there a timeout in which, if points are unsuccessfully written, close() aborts the process and resolves the Promise anyway?
Close should let the process exit right after it sends the lines that were not sent yet.
And, if the process is aborted, how do I keep track of what points where NOT successfully written?
Your implementation can be informed about failed and successful writes via the writeFailed and writeSuccess callbacks. You are also free to implement a retry mechanism on your own; it might be a better fit for your case.
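For example, a sketch that disables internal retries and records the failed line-protocol lines so the calling code knows exactly what was not written; where to persist or how to re-send the collected lines is up to your application:

```js
import {InfluxDB, Point} from '@influxdata/influxdb-client'

const failedLines = []

const writeApi = new InfluxDB({url: 'http://localhost:8086', token: 'my-token'})
  .getWriteApi('my-org', 'my-bucket', 'ms', {
    maxRetries: 0, // handle retries in the calling code instead of the client
    writeFailed: (_error, lines) => {
      // remember what was not written so it can be persisted or re-sent later
      failedLines.push(...lines)
      return Promise.resolve() // the failure is handled here, skip the default recovery
    },
  })

export async function writeTemperature(value) {
  writeApi.writePoint(new Point('temperature').floatField('value', value))
  await writeApi.flush()
  if (failedLines.length > 0) {
    console.warn(`${failedLines.length} lines were not written`)
  }
}
```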