InfluxDB OSS 2.0 General Availability Roadmap

Introduction

This is a message for our community of InfluxDB Open Source users and contributors, from InfluxData engineering. InfluxData is the company behind the popular InfluxDB Time Series Database.

Over the past year, we have focused heavily on building a competitive Database as a Service platform (DBaaS) using the InfluxDB 2.0 codebase. While this resulted in a lot of open source code being written and committed to our repos on GitHub, it didn’t provide the same level of results for packaging, polishing, and releasing the open source code on its own.

In this post we would like to correct that by:

  1. Announcing our intention to make InfluxDB 2.0 Open Source “generally available” as soon as possible.
  2. Restating our commitment to our Open Source community, and explaining why Open Source is important to us.
  3. Briefly reviewing the major changes and improvements in InfluxDB 2.0 OSS.
  4. Explaining what we learned about the OSS vs. Cloud Storage Engine and what we plan to do about it.
  5. Scheduling a live Town Hall event where we will discuss this work and answer questions from the community.

If you have any feedback, comments or questions about this roadmap, please leave them in a comment to this post and we will do our best to answer them either here or in the upcoming Town Hall.

General Availability

General Availability, or “GA” as we say, means removing the beta tag from InfluxDB 2.0 OSS. At that point, the engineering team is signalling that we believe OSS is ready for production use cases, and we are willing to stand behind it for production workloads. As you may or may not know, our multi-tenant InfluxDB Cloud 2 commercial offering has been GA for almost a year now, so a lot of the code in OSS has been battle tested in real production scenarios already. However, as discussed below, there are some features and other issues that we will address before declaring InfluxDB 2.0 OSS “GA.”

Importance of an Open Source InfluxDB

First, we want to reiterate, publicly, that an open source version of InfluxDB is an invaluable part of our story to developers, and that the successful adoption of a fully open source InfluxDB 2.0 is critical to our success as a product, community, and as a company.

Our open source user community is orders of magnitude larger than our commercial one; it always has been and, so long as we continue to develop compelling technology, we expect it always will be. This is good for InfluxData and the InfluxDB community because the more people who run our code in different environments, with different workloads, under different use cases, the more our code gets tested, the more bugs get squashed, and the more our overall quality and reliability improve. Open source users improve the product, a better product gets us more customers, and more customers let us continue investing in open source. It’s a virtuous circle.

Our roots as a company and as developers are in open source. With InfluxDB, we have provided an open source, functional, competitive time series database. We want to expand on that with open APIs and libraries that developers can use to build their applications on top of InfluxDB as a platform. We also want to keep the open source and DBaaS options API compatible to support the kind of hybrid deployments needed for emerging IoT and Edge Computing workloads. As we look forward, we see the opportunity to connect the various InfluxDB editions (OSS, Enterprise and Cloud) together in ways that will make it even easier to address these kinds of use cases.

All of this needs to be done in the open, using an open source development model and released under a true open source license.

Major Developments in InfluxDB 2.0

Combining Chronograf and Kapacitor into InfluxDB

One of the biggest changes in 2.0 is a single executable that combines the InfluxDB time series database engine with a user interface (UI) for exploring data and building dashboards, plus the ability to execute scripts and trigger alerts. Previously, these features had been developed separately as Chronograf and Kapacitor, each with their own unique way of doing things and needing a coordinated deployment in the form of a TICK stack. This separation, which was largely a result of how the components were developed as the company grew, proved contentious and wasn’t providing the user experience that we wanted.

In InfluxDB 2.0, all of these capabilities are now included within a single binary, simplifying the configuration and installation on your end and allowing for better integration and faster development on our end. InfluxDB will still integrate with third-party components and software too, and we are working hard to continue to expand integration support based on community driven requests.

The Flux Language

In the past, InfluxQL was the language used for querying data and building dashboards, while TICKscript was used for tasks, alerts, and more sophisticated handling of time series data. Both of these languages had their drawbacks, and neither was capable of handling all the needs of developers in our community. Taking all of these inputs into account, including the work to combine the visual and processing parts of our stack, we decided to develop a new language that could deliver many of the longstanding feature requests we simply couldn’t address within InfluxQL and eliminate the need to learn two separate languages.

Flux is the result of combining these different approaches, taking what worked well from each in a way that allowed for maximum flexibility and reuse. With InfluxDB 2.0 you can use the same Flux script to power your alert checks as you do to build your dashboard. It offers a richer set of language features to support deep data analysis, calculations, joining data from multiple measurements and even from non-InfluxDB sources. Flux also allows for extensions in the form of libraries written either in pure Flux or in Go, giving developers the freedom to add whatever features they need.

As Paul outlined in his original vision, Flux will be both the most powerful language for data transformation, as well as the easiest way to access data from disparate data sources. The essentials of the Flux standard library are in place, it is the primary language for working with data within InfluxDB 2.0, and we have just finished our first round of performance optimizations which have been deployed into InfluxDB Cloud. Flux has an LSP, a Visual Studio Code Plugin, and a unit testing library as well.

After we GA, we still have a lot of work ahead of us. We are eager to deliver:

  • Performance improvements implemented for InfluxDB Cloud ported to OSS
  • More and better conceptual documentation
  • Support for many more data sources using the from and to heuristics. We currently have sql.from()/sql.to() and csv.from(), but we want to create many more
  • Support for custom libraries and shared libraries
  • Continuing to deliver more and more performance improvements

We will keep supporting Flux in OSS 2.0 in the same way we do in InfluxDB Cloud, and hope to add a standalone experience (using Flux without InfluxDB at all) as well. This unlocks the ability for developers to use Flux as a data manipulation language on its own.

Templates and Stacks

By combining the UI, actions, and scripting language of InfluxDB into a unified platform, InfluxDB 2.0 gives you the ability to package up and deploy collections of InfluxDB resources as reusable, installable packages. These packages, called InfluxDB Templates, contain everything you need to deploy your monitoring setup at scale: Dashboards, Queries, Tasks, Alerts, even your Telegraf configurations. Stacks allow you to iteratively improve and deploy updates to your Templates in a GitOps-friendly way.

We’ve taken this one step further and we’ve begun collecting InfluxDB Templates from our community, as well as those we built for ourselves, making them available for everyone to use in our Community Templates repository. There you can find Templates that cover a variety of common use cases across a variety of domains, distilling the best practices from experts in those systems, and making them available to you with a one-command or single-click installation process.

Storage

Early on, InfluxDB 2.0 introduced a new TSM format that we called TSM 2.0. This format was intended to be used in both OSS 2.0 and Cloud 2.0. However, in practice we found that running the storage engine in a Kubernetes cluster as a scalable SaaS was so different from managing an instance in a more traditional manner (bare metal, VM, container, etc.) that it was not useful to maintain the same engine for both. Because the existing 1.8 storage engine has years of optimizations for its use case, we have decided that it is best for community and enterprise users to stick with that tried and true storage engine in OSS 2.0.

As a result, when InfluxDB OSS 2.0 goes GA it will use the same TSM format as InfluxDB 1.8. This also provides a much more streamlined upgrade path for our existing community members. However, if you are a current user of InfluxDB 2.0 OSS beta 16 or earlier, you will be impacted. As we complete this change to InfluxDB OSS 2.0, you will not be able to simply install and upgrade, as your existing data will be in an incompatible format. We understand that this is a potentially disruptive experience for those of you who have participated in the Alpha and Beta program. However, you should be able to extract and migrate your data, and we plan to provide specific tools and instructions to help you do this.

Please note that we are currently updating the master branch of the OSS code base with the 1.8 storage engine. At that point, if you build from master, your existing data will be incompatible. Of course, you can continue to use your current build until you are ready to run through the data migration steps.

API Compatibility in InfluxDB 2.0

InfluxDB Cloud already exposes both the 1.x read and write APIs to provide compatibility with existing applications, dashboards and more.

In order to support folks who have existing code bases of queries written in InfluxQL, we have added the InfluxQL Query API to Cloud 2.0. We are turning our attention to adding this same API compatibility to InfluxDB OSS 2.0 as well, and InfluxDB 2.0 OSS will not go GA without such support. This means that you will be able to use InfluxQL through the read compatibility API and technologies that leverage this API.

However, it is important to note that InfluxQL support is currently only planned to be available via the API. So, for instance, it will not be supported as a language for defining Tasks or dashboard cells, and it cannot be used within the 2.0 Data Explorer. It does, however, support existing Chronograf and Grafana users and their associated dashboards, meaning that existing dashboards and other queries can continue to function as-is. A simple change in endpoint and security credentials allows this to be a nearly drop-in replacement (see more details below).
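To make the change in endpoint and security credentials more concrete, here is a minimal Python sketch that builds (without sending) the same InfluxQL request for a 1.x instance and for a 2.0 instance. The host names, the token value, and the use of an `Authorization: Token ...` header on the `/query` endpoint are assumptions for illustration only; the 1.x compatibility API docs remain the reference for the exact authentication scheme.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_influxql_request(base_url, database, influxql, token=None):
    """Build (but do not send) a GET request for a 1.x-style /query endpoint."""
    params = urlencode({"db": database, "q": influxql})
    request = Request(f"{base_url.rstrip('/')}/query?{params}", method="GET")
    if token is not None:
        # Assumption: the 2.0 instance accepts a token in the Authorization header.
        request.add_header("Authorization", f"Token {token}")
    return request


query = 'SELECT mean("usage_idle") FROM "cpu" WHERE time > now() - 1h GROUP BY time(5m)'

# Existing 1.x setup: hypothetical host, no token.
old = build_influxql_request("http://influx1.example.com:8086", "telegraf", query)

# Hypothetical 2.0 host: same database name and query, new endpoint and credentials.
new = build_influxql_request(
    "http://influx2.example.com:8086", "telegraf", query, token="my-token"
)
```

Note that the query string is identical in both requests; only the base URL and the credentials differ, which is the sense in which the compatibility API is meant to be nearly drop-in.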

Upgrade Plan

While we would like the upgrade process from InfluxDB 1.x to 2.0 to be as effortless as possible, this is a major release that introduces a lot of changes and new features. In other words, we don’t expect upgrades to be seamless, though we absolutely expect them to be worth it. Our responsibility to the community is to ensure we describe those rough edges so that everyone understands where they are, what needs to be done about them, and when the right time to upgrade is based on timing and availability.

Our first commitment is to keep supporting InfluxDB 1.x, which includes security and bug fixes as well as incremental feature additions. In fact, some of the major 2.0 developments have already been added to the 1.x release, such as Flux language support and 2.0 compatible APIs, and many of you are already using them! So if you are currently using InfluxDB 1.x you can feel confident in continuing to use it until the upgrade path to 2.0 is clear for you to follow.

With that in mind, we want to support users in upgrading to the 2.0 release right away. We are targeting the most common features and use cases of InfluxDB 1.x so those users can start taking advantage of the new release as soon as possible. As mentioned above, starting with Beta 17 InfluxDB 2.0 will use the same storage engine as InfluxDB 1.8, so migrating your data should be straightforward, and both the 1.x read (via InfluxQL) and write APIs should “just work” from your external applications. Our plan is that with each major release of InfluxDB 2.x, more and more users will be able to migrate from their existing instances with minimal changes.

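In terms of specific seams that we are aware of now, there are a few things to point out. These include:

  • Continuous Queries
  • Supported Protocols
  • Platform Availability
  • Kapacitor
  • Dashboards, Client Libraries and other 1.x Read/Write API usage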

Let’s tackle each of these in turn.

Continuous Queries

InfluxDB 1.x includes a capability called continuous queries (CQs). This is typically used for downsampling data from one database and retention policy to another. CQs worked well under conditions where the amount of downsampling required was modest and the number of CQs that you needed to execute was small. At scale, the recommendation was to leverage Kapacitor (see below) for downsampling. In InfluxDB 2.0, there is a native task subsystem which allows you to define downsampling tasks (along with a much, much richer set of recurring actions). However, it will require you to redefine your downsampling CQs using Flux. If you are relying upon CQs today, we recommend extracting those from your InfluxDB 1.x instance (using SHOW CONTINUOUS QUERIES) prior to upgrade. If you need help understanding how to convert these to Flux, please reach out via our community channels and/or have a look at our documentation for tips.
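As a rough illustration of what redefining a CQ in Flux can look like, the Python sketch below renders the text of a Flux downsampling task from a handful of parameters. The CQ shown in the comment, the bucket names, and the org name are hypothetical, and the generated task is one plausible translation rather than an official conversion; real CQs with tags, multiple fields, or offsets will need more care.

```python
# Hypothetical 1.x continuous query being translated:
#   CREATE CONTINUOUS QUERY "cq_cpu_5m" ON "telegraf" BEGIN
#     SELECT mean("usage_idle") INTO "telegraf"."downsampled"."cpu"
#     FROM "cpu" GROUP BY time(5m)
#   END


def render_downsample_task(name, every, source_bucket, dest_bucket, org,
                           measurement, field, fn="mean"):
    """Render the Flux source of a task that mirrors a simple downsampling CQ."""
    lines = [
        # Task options: the task runs once per interval, like the CQ did.
        f'option task = {{name: "{name}", every: {every}}}',
        "",
        f'from(bucket: "{source_bucket}")',
        # Only look at data written since the previous run.
        "    |> range(start: -task.every)",
        f'    |> filter(fn: (r) => r._measurement == "{measurement}" and r._field == "{field}")',
        f"    |> aggregateWindow(every: {every}, fn: {fn})",
        f'    |> to(bucket: "{dest_bucket}", org: "{org}")',
    ]
    return "\n".join(lines)


flux = render_downsample_task(
    name="cq_cpu_5m",
    every="5m",
    source_bucket="telegraf",               # assumed bucket name
    dest_bucket="telegraf_downsampled",     # assumed bucket name
    org="example-org",                      # assumed organization name
    measurement="cpu",
    field="usage_idle",
)
print(flux)
```

The printed Flux can be pasted into a new task and adjusted by hand; generating it from parameters is mainly useful if you have many similar CQs to carry over.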

Supported Protocols

There are a number of “native” protocols that InfluxDB 1.x supports. These include:

  • CollectD
  • Graphite
  • OpenTSDB
  • Prometheus
  • UDP

For InfluxDB 2.0 OSS GA, none of these will be supported directly. However, for write compatibility, you can leverage Telegraf as an intermediary to translate from the source protocol to InfluxDB 2.0. Outside of the Prometheus remote write support, we do not currently plan to support any of these other protocols directly. Of course, we are open to community feedback, so please let us know your thoughts!

Platform Availability

Initially the various Linux flavors will be made available as part of InfluxDB 2.0 OSS GA. Windows and ARM packaging is planned, but will arrive after the initial GA launch. If you are using Windows or ARM, you’ll want to hold off on the upgrade process until we make these platforms available.

Kapacitor

If you are using Kapacitor, you can continue to do so with InfluxDB OSS 2.0. In all cases, you will need to update the Kapacitor configuration to use the appropriate security credentials per the 1.x compatibility API docs (essentially using the appropriate user and token to allow access).

As Kapacitor supports two different styles of TICKscripts, it is important to understand how each continues to work with InfluxDB 2.0 OSS.

Batch-style TICKscripts

If you have Batch-style TICKscripts, these will work unchanged via the 1.x compatibility APIs provided as part of InfluxDB 2.0 OSS.

Stream-style TICKscripts

InfluxDB 2.0 OSS does not provide a subscription API like InfluxDB 1.x does. However, best practice is to write data directly to BOTH Kapacitor and InfluxDB. (See dual writes using Telegraf.) Doing so will allow stream tasks to continue to function, unchanged.

If you use other mechanisms to feed data in, Telegraf could be used as an intermediary layer to feed both Kapacitor and InfluxDB. There are other architectures and mechanisms possible, but the bottom line is that by feeding the data directly to both Kapacitor and InfluxDB, you can continue to use stream tasks with InfluxDB 2.0 OSS.
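For readers whose data does not flow through Telegraf, the dual-write idea can be sketched in a few lines of Python. The example below only builds the two HTTP requests and does not send them. The host names, ports, token, and database names are hypothetical, and it assumes the InfluxDB side is reached through a 1.x-style `/write` endpoint with a token header while Kapacitor is reached through its `/kapacitor/v1/write` endpoint; verify both against the docs for the versions you run.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_dual_write_requests(line_protocol, influx_url, kapacitor_url,
                              database, retention_policy, token):
    """Build one write request for InfluxDB and one for Kapacitor (not sent)."""
    body = line_protocol.encode("utf-8")
    params = urlencode({"db": database, "rp": retention_policy})

    # Assumption: InfluxDB 2.0 OSS is written to via the 1.x-style /write endpoint.
    influx = Request(f"{influx_url}/write?{params}", data=body, method="POST")
    influx.add_header("Authorization", f"Token {token}")

    # Kapacitor receives the identical line protocol payload for stream tasks.
    kapacitor = Request(
        f"{kapacitor_url}/kapacitor/v1/write?{params}", data=body, method="POST"
    )
    return [influx, kapacitor]


point = "cpu,host=server01 usage_idle=97.5 1598313600000000000"
requests = build_dual_write_requests(
    point,
    influx_url="http://influx2.example.com:8086",      # hypothetical host
    kapacitor_url="http://kapacitor.example.com:9092",  # hypothetical host
    database="telegraf",
    retention_policy="autogen",
    token="my-token",
)
```

Each request could then be passed to `urllib.request.urlopen`. A production setup would also need retries and a decision about what to do when only one of the two writes succeeds, which is part of why Telegraf is the suggested path.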

Over the longer term the idea is to use the underlying and native Task subsystem as part of InfluxDB 2.0 OSS for the batch style tasks. But, depending on how many tasks you currently have, it may take some time to translate these. So, we believe we have provided a means for you to selectively translate these and take your time to do so while continuing to leverage what you’ve already created. We are continuing to expand the number of notification endpoints (aka Alert Handlers in Kapacitor) that can be called via InfluxDB 2.0 and if you have a favorite that you don’t see, please let us know.

If you are a heavy Kapacitor user and have follow-up questions, please just reach out through our community channels!

Dashboards, Client Libraries and other 1.x Read/Write API usage

Finally, with the 1.x Compatibility APIs delivered as part of InfluxDB 2.0 OSS, you can continue to leverage existing dashboarding tools such as Chronograf and Grafana – unchanged. However, you will need to re-configure the data sources to point at your new InfluxDB 2.0 OSS instance with the appropriate security credentials.

This also means that you can use all of your other existing technologies integrated with either the read or write APIs. The only change required is to adjust the endpoint and security credentials. This includes everything from Telegraf agents to Client Libraries and custom code that you may have written. Of course, over time, we anticipate that you’ll want to take advantage of the 2.0 APIs, but this approach creates as seamless an experience as we could think of to allow you to get going with InfluxDB 2.0.

Town Hall

We will be holding a live event on August the 25th (start time TBD), to give you the opportunity to hear more details directly from engineering managers, and especially to ask questions. The sessions are designed to be interactive, with a short discussion or presentation followed by taking questions from the community.

The Town Hall will be a YouTube live-stream for you to watch, and we will use the #community channel in our Slack to let you chat with the speakers and ask your questions. Questions will be answered as part of the live-stream, and a recording of the video will be published to our YouTube channel after the event.

Speakers

  • Michael Hall - Introduction to the event, share logistics & schedule
  • Ryan Betts - Overall goals of the OSS 2.0 release, how the team is organized and working
  • Barbara Nelson - New UI combining Chronograf and Kapacitor
  • Nathaniel Cook - Flux language, why it was made, what it can do now, where it’s going
  • Sam Dillard - Templates & Stacks, building apps on InfluxDB Platform
  • Rick Spencer - Cloud 2, its relation to OSS 2, and Migrating from 1.x to 2.0 OSS

Watch

These topics and more were covered by Tim Hall during our monthly Community Office Hours on August 12, 2020.
