Influx closing the io stream for a http code 400 before my program can read the error message


#1

I have a java program inserting data, and it’s been fairly well tuned (using it constantly for 2 years), and can handle different error messages from influx when there’s an error inserting data. I’m using the http /write api endpoint by the way.

If we don’t get a 204 response code, We make an attempt to retrieve an error message which usually is sent back.

However, I have a particular data set insert which is causing a 400 code, but the server is closing the io error stream, thus causing an ioexception in my java program. “java.io.IOException: stream is closed” to be exact

relevant java code:
if (responseCode != 204) {
//get the response from the server.
Loggable logger = LoggingHelper.getLogger(CLASS_NAME, “sendDataToInflux”, LoggingReferences.S_OPERATION);
BufferedReader in = new BufferedReader(new InputStreamReader(con.getErrorStream(), “UTF-8”));
String inputLine;
StringBuilder response = new StringBuilder();

                while ((inputLine = in.readLine()) != null) {    <<<<---------- this line, trying to do the in.readLine is causing io exception. 
                    response.append(inputLine);
                }
               
                in.close();
                .... Here we check the error message, log it, etc. 
               }

I can go ahead and add some additional try catch logic in this area for reading the error, but i’d really like to know why it closed the stream. When I run the same data through a second time, I don’t get this problem. I MUST delete the database, and perform the insert again to get this error.

I say all that to simply ask, is there a way to log an 404 response messages in the log on the server side?
This is all I’m getting for that insert in question:
Jul 5 16:06:15 perflocal2 influxd: [httpd] 127.0.0.1 - - [05/Jul/2018:16:06:14 -0400] “POST /write?db=myDB&precision=ms&rp=RP_4+Day%28s%29 HTTP/1.1” 400 72 “-” “Java/1.8.0_65” deb10c3d-808e-11e8-8695-000000000000 1395364

As an aside, I had the exact “stream is closed” errors when my insert was too big, but I got around that by setting max-body-size = 0 which seemed to do the trick.

So, hopefully someone can help point me in the direction of how to enable some logging on the server side so I can figure out why this insert is returning a 400 code.


#2

I think this has something to do with the size of the insert, as it initially did when I first added the max-body-size = 0, but this must be something else in influx breaking. I noticed that the error occurred on the very first batch write of the inserts, which I have configured for 5 seconds. This was about 1.5MB of text line feed protocol insert (I turned on trace logging and captured the insert).
I reduced my batch delay from 5 seconds to 1 second, and it no longer causes the influx insert to fail.
I feel like there’s a bug in here somewhere about boundary conditions, or when performing very large batch inserts.
Anyone have ideas?


#3

Could this be due to the time that it takes to perform the write itself? The larger size of the insert means its going to take more time for the write to be transmitted and land…right?

I’ve seen this happen in a couple of different dimensions.

  1. if the database is sitting behind a firewall/load balancer – and the timeout on the fw/lb is set too low. If you are using an OSS instance, I’m assuming this isn’t going to be your challenge.
  2. Play with the write-timeout — increase it from 10s to something larger to see if this has an impact:
    https://docs.influxdata.com/influxdb/v1.5/administration/config/#write-timeout-10s

#4

After adding additional logging on my size to see how big the data was we were putting into the http stream before sending it (calling getResponseCode), I was noticing that some of the sizes were identical immediately next to one another. It looked like I was somehow attempting to send the data twice. More precisely, I was attempting to re-use a httpConnection object, which is a big no-no in Java (new one must be created, after getResponseCode is called) So, after some more debugging, and logging, I re-worked my send thread for all my http connection objects to handle what looks to be a coincidental timing/size type issue.
Lesson learned is be absolutely sure that there’s not a chance of re-using the same http connection object again. The streams are closed, so you’ll get “stream is closed” error messages. The reason the influx logs didn’t really have much in them, was because it wasn’t actually getting a send. my error code 400 was just from data points being outside the retention policy, so they were dropped. So, it was a partial batch insert, which is exactly what it should be doing in that use case.

Thanks to influx for the great response and reaching out to investigate this issue. Hopefully, this can help someone out with a similar issue in the future.