Onvo DocumentationReducing Data Usage

Reducing data usage for your dashboards

When creating dashboards for your users, you would want to reduce the amount of data consumed by your dashboards. This is especially important if the dashboards are reading data from a database whose egress usage is metered like in the case of Google Big Query. This is important for your users as this will reduce the amount of data they have to download as well as reduce the amount of time it takes to refresh dashboards leading to a better experience.

We will break down this process into the following methods:


Dimensionality reduction
Reduce the number of columns fetched
Row filtering
Reduce the number of rows fetched
Pre-aggregation
Aggregating raw data based on business logic
Aggregation functions
Simple aggregation functions built into the query language
Reduce dashboard refresh intervals
Reduce how often data is refreshed
Reduce copies of dashboards
Reduce number of dashboards overall

Each of these could reduce data usage and maybe you might have to apply multiple methods to work for your use case.

Dimensionality reduction

This method involves selecting just the columns from a datasource needed to build a dashboard. For example, if your original datasource had 250 columns and you only need 25 to build dashboards, thats data savings of about 90%.

Dimensionality reduction

For a practical example of an SQL datasource, instead of

SELECT * FROM table

you can do

SELECT col1, col2, col3 FROM table

Row filtering

This method involves filtering out rows from a datasource that are not needed to build a dashboard. For example, if your original datasource had 10,000 rows and you only need 2,000 to build dashboards, thats data savings of about 80%.

Row filtering

For a practical example of an SQL datasource, instead of

SELECT * FROM table

you can do

SELECT * FROM table WHERE col1 = 'value' and col2 = 'value'

This filtering could be based on columns like dates or to filter out null and blank values.

Pre-aggregation

This method involves aggregating data based on business logic before it is sent to the dashboard. For example, if your original datasource had rows of minute by minute data you only need daily averages to build dashboards, thats data savings of about 98%.

This would mean you would have some system setup that would regularly aggregate the data and store that in a secondary table. This secondary table would then be connected to the dashboard instead of the primary raw data.

Aggregation functions

This method involves using aggregation functions in the query language to reduce the amount of data sent to the dashboard. This is a simpler version of pre-aggregation using functions like AVG, SUM, MIN, MAX, COUNT, COUNT_DISTINCT and MEDIAN. For example, in the previous example, instead of writing code to save a copy of the aggregated data to a secondary table you could do something like this:

SELECT 
  DATE(datetime) AS date,
  AVG(value) AS average
FROM 
  table
GROUP BY 
  date
ORDER BY 
  date;

This would mean less code that you would have to write as well as less storage needed, however you would be limited by the functions available.

Reduce dashboard refresh intervals

This method involves reducing the frequency at which dashboards are refreshed. This is quite easy to setup on Onvo AI and can be done by going to the dashboard settings and changing the refresh interval. For example instead of refreshing your dashboard every 5 minutes as is the default, refreshing it once an hour would reduce the amount of data sent to the dashboard by about 90%.

Reducing dashboard refresh intervals

Reduce copies of dashboards

This method involves reducing the number of dashboards that are created. For every embed user setup, Onvo AI creates a copy of the dashboard for them. This is useful for when every user would need a separate copy of a dashboard to edit.

However, if you have a dashboard that is read-only and the data on it is the sample for multiple users, like all the users in an organization, you can reduce the data usage substantially by creating a single dashboard for the organization instead of creating a separate dashboard for each user.

For example if you have an organization with 10 users, this method would reduce the data usage by about 90%.

Reduce copies of dashboards