Influxdb as a database has some significant problems

I’ve been using influxdb for about 2 years now and as I get deeper into it, I’m getting really frustrated. As a user that has been collecting data for some time, I now need to look at how to manage all that data because just magically increasing storage capacity isn’t always the solution. Do I look at downsampling or deletion or …

This has led me down the path of looking at how to downsample data and, after running into some problems, looking at the open issues. There’s a shocking trail of broken behaviour that has me questioning whether my choice to use influxdb was sound.

Problems that keep popping up in issues:

  • Dropped measurements don’t always disappear. Sometimes they do, with better odds if you restart influxdb. There’s a constant “try the latest version” as if that will have some magic, without any hint from those suggesting it about why the latest version should be better.
  • Deleting data may or may not work, even when the query matches times in the database.
  • Doing a select usually returns correct data, but sometimes it does not; it depends on how many influxdb specials are used with it. Using “select into” seems to be a recipe for random behaviour. Then there are long-standing, unresolved problems with the use of fill().

These are all fundamental database operations and they do not function with 100% reliability. If mysql was this unreliable people would abandon it in droves for postgres (or at least face massive market suspicion).

The only database operation that seems to work with any amount of reliability is INSERT.

Of course, I’m using it for free, so what should I expect? A product that works well enough to make me want to buy support for it and convince management that it is worth money. After all, that is what they’re selling: that plus the cluster/cloud service.

Let me rephrase that: they’re attempting to sell a database that has bugs opened against it that demonstrate a failure to deliver basic database functionality.

But flux. I don’t really care about flux if long-standing bugs aren’t fixed, and closing them with a “stale” bot isn’t the answer. That’s just sweeping the dirt under the carpet and hoping nobody goes looking.

Do I jump ship and if so where to? Or stick around and hope that someone can find all the holes to stop the ship sinking?

Hi @dataMechanic,

Sorry to hear you’re getting frustrated, but I certainly hope you found your rant therapeutic :grinning:

InfluxDB is OSS, it will definitely have bugs. It’s also the market-leading TSDB, which is a new category with new problems; those that MySQL won’t have, because it’s a whole new paradigm.

That being said, we don’t like bugs either, even if they’re par for the course with OSS and a category-building database. We run our own software at a very large scale and we want it to work well for us and everyone else.

What I’d suggest, if you’re willing, is that you post links to the GitHub issues you’re hitting and I’ll try to get you a solution, an explanation, or maybe some tips to try and bypass the pain.

While I may not be able to help with everything, I’ll certainly do my best :+1:

Cheers,
David

Hi @dataMechanic, welcome to the community. Sorry to hear you are getting frustrated.

I found three questions in your post …

1. Of course, I’m using it for free, so what should I expect?
answer: you can expect a lot of InfluxDB users ready to help where they can.

2. Do I jump ship and if so where to?
answer: no, do not jump until you are convinced nobody can help you solve your problem(s).

3. Or stick around and hope that someone can find all the holes to stop the ship sinking?
answer: yes, please stick around to find out if the community can help, and to realize that there is no ship sinking.


Hi @dataMechanic,

  1. What kind of data are you storing?
  2. What’s its resolution (how many samples every x seconds/minutes)?
  3. How much data is stored per day?
  4. How big is your current setup (in terms of disk space)?
  5. Do you need full resolution of your data regardless of its age?
    5a) Did you look into continuous queries?
  6. What’s the retention time of your data?
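For readers unfamiliar with the downsampling idea behind questions 5 and 5a, here is a minimal sketch of what it means conceptually: averaging raw points into fixed-width time buckets. This is plain Python and not InfluxDB’s implementation; the function name `downsample` and the sample data are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets.

    `points` is an iterable of (unix_seconds, value) pairs; this layout is
    an assumption made for the example, not an InfluxDB data format.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    # One averaged point per bucket, ordered by time.
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical raw samples, reduced to one point per minute.
raw = [(0, 1.0), (10, 3.0), (60, 5.0), (70, 7.0), (125, 9.0)]
print(downsample(raw, 60))
```

Keeping only the averaged points for older data is what trades resolution for disk space; whether that is acceptable depends on the answer to question 5.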

Do you understand how shards work in InfluxDB and how and when data is “declared” as obsolete so it gets deleted? I stumbled upon this article a while ago, which really helped me understand how things work and what to expect from InfluxDB.
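As a rough illustration of the shard point, and assuming the documented 1.x behaviour that data is dropped a whole shard group at a time once the entire group falls outside the retention period, the sketch below shows why individual points can outlive the nominal retention time. The dictionary layout, the function name `expired_shard_groups`, and the dates are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def expired_shard_groups(shard_groups, retention, now):
    """Return the shard groups whose entire time range is past retention.

    Assumption: a group is only removable once its *end* time is older than
    now - retention, so points near the start of a surviving group may be
    older than the retention period.
    """
    cutoff = now - retention
    return [group for group in shard_groups if group["end"] <= cutoff]


def day(d):
    # Helper for the example: midnight UTC on a day in September 2019.
    return datetime(2019, 9, d, tzinfo=timezone.utc)


# Hypothetical one-day shard groups under a 7-day retention period.
groups = [
    {"name": "sg1", "start": day(9), "end": day(10)},
    {"name": "sg2", "start": day(10), "end": day(11)},
]
now = datetime(2019, 9, 17, 18, 0, tzinfo=timezone.utc)
expired = expired_shard_groups(groups, timedelta(days=7), now)
print([group["name"] for group in expired])
```

Here `sg2` survives even though it holds points from the first 18 hours of 10 September that are already more than 7 days old, because the group as a whole has not yet expired.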

Can you please be more specific:
Is there a way to reproduce the problems for someone willing to help?
Can you point to any bug reports you filed?

André

Let’s be more specific.

influxdb is an open source project. The source code is available on GitHub, but the instructions on how to build influxdb do not work. When the git repository is cloned, there are no self-contained instructions that work to build it on the latest released version of CentOS. Oh, it’s written in Go, so there is no “./configure && make && make install” available. That’s NOT my problem. Neither is there a “.src.rpm” file that I can use with rpmbuild.

Summary: influxdb is open source, but it fails at the most basic aspect of being open source: allowing anyone who wants to build it from source themselves to do so, with whatever changes they desire. Finally, Ikea has some competition for flat-pack construction difficulty.

Issues. If you look at the influxdb issues on the web you could be forgiven for thinking that the project has been abandoned. Most of the issues reported don’t even show any sign of being triaged. Has the project run so low on funds that they can’t even pay someone to triage bugs? To improve the issue situation there is (or was?) an automated bot that would close reported issues that had been open for too long. I don’t know about you but I get really upset when a vendor closes problems I report to them just because they’ve been open too long. It’s a great way to make your metrics look good but it doesn’t improve the overall quality of the product.

Summary: the influxdb project is failing to adequately respond to open issues. Crying wolf isn’t a problem here because nobody is listening anyway.

Pull requests on GitHub. There is actually a good amount of activity here: there isn’t a long queue of pull requests, nor are they very old. This suggests that there is good involvement by the developers at influxdb with others who want to improve the database (once the code assignment is agreed).

Summary: the pull request queue indicates continued community involvement and code changes don’t rot. The lights are on in the back room and steam is coming out the roof.

Error messages. Today I was trying to work out why I was getting this error:
2019/09/17 18:06:24 unable to convert values according to TypesDB: len(args) = 2, want 3
My hat goes off to the programmer who wrote that error message, because it ranks among the most useless error messages I’ve ever come across. Up there with “Connection Refused” errors that don’t tell you the IP/TCP port or “File Not Found” errors that don’t mention the path name. But it appears that this isn’t an influxdb problem (although a patch could be included to fix it when influxdb is built):


and following that it seems that collectd is worse still, “don’t send patches, do pull requests.” The influxdb issue did get closed by a human that timed it out rather than a bot, but there was no “Is it ok to close this”, rather just a “No feedback, closing.” Ouch.

Summary: influxdb requires lots of other packages to build and run successfully, but it itself contains no patches for the other elements that build from source, meaning that it is only as strong as the weakest link in its chain. Humans are being replaced by emotionless bots at influxdb.
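To make the complaint about context-free error messages concrete, here is a small sketch of a length check that reports what was being parsed, what was received, and what was expected. It is not the code that produced the message quoted above; the function `parse_values`, the type name, and the field names are all hypothetical.

```python
def parse_values(type_name, args, expected_fields):
    """Pair incoming values with their expected field names.

    On a count mismatch, the error names the type, shows the values that
    arrived, and lists the fields that were expected, instead of reporting
    bare lengths.
    """
    if len(args) != len(expected_fields):
        raise ValueError(
            f"type {type_name!r}: got {len(args)} values {list(args)!r}, "
            f"expected {len(expected_fields)} ({', '.join(expected_fields)})"
        )
    return dict(zip(expected_fields, args))


# Hypothetical type with three fields, fed only two values.
try:
    parse_values("example_type", [1.5, 2.5], ["a", "b", "c"])
except ValueError as err:
    message = str(err)
    print(message)
```

With the type name and offending values in the message, whoever reads the log can at least tell which input to go and look at.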

It’s 3am and I need to be up at 7am.