Reducing data usage for your dashboards
When creating dashboards for your users, you would want to reduce the amount of data consumed by your dashboards. This is especially important if the dashboards are reading data from a database whose egress usage is metered like in the case of Google Big Query. This is important for your users as this will reduce the amount of data they have to download as well as reduce the amount of time it takes to refresh dashboards leading to a better experience.
We will break down this process into the following methods:
Each of these could reduce data usage and maybe you might have to apply multiple methods to work for your use case.
Dimensionality reduction
This method involves selecting just the columns from a datasource needed to build a dashboard. For example, if your original datasource had 250 columns and you only need 25 to build dashboards, thats data savings of about 90%.
For a practical example of an SQL datasource, instead of
SELECT * FROM table
you can do
SELECT col1, col2, col3 FROM table
Row filtering
This method involves filtering out rows from a datasource that are not needed to build a dashboard. For example, if your original datasource had 10,000 rows and you only need 2,000 to build dashboards, thats data savings of about 80%.
For a practical example of an SQL datasource, instead of
SELECT * FROM table
you can do
SELECT * FROM table WHERE col1 = 'value' and col2 = 'value'
This filtering could be based on columns like dates or to filter out null and blank values.
Pre-aggregation
This method involves aggregating data based on business logic before it is sent to the dashboard. For example, if your original datasource had rows of minute by minute data you only need daily averages to build dashboards, thats data savings of about 98%.
This would mean you would have some system setup that would regularly aggregate the data and store that in a secondary table. This secondary table would then be connected to the dashboard instead of the primary raw data.
Aggregation functions
This method involves using aggregation functions in the query language to reduce the amount of data sent to the dashboard. This is a simpler version of pre-aggregation using functions like AVG
, SUM
, MIN
, MAX
, COUNT
, COUNT_DISTINCT
and MEDIAN
.
For example, in the previous example, instead of writing code to save a copy of the aggregated data to a secondary table you could do something like this:
SELECT
DATE(datetime) AS date,
AVG(value) AS average
FROM
table
GROUP BY
date
ORDER BY
date;
This would mean less code that you would have to write as well as less storage needed, however you would be limited by the functions available.
Reduce dashboard refresh intervals
This method involves reducing the frequency at which dashboards are refreshed. This is quite easy to setup on Onvo AI and can be done by going to the dashboard settings and changing the refresh interval. For example instead of refreshing your dashboard every 5 minutes as is the default, refreshing it once an hour would reduce the amount of data sent to the dashboard by about 90%.
Reduce copies of dashboards
This method involves reducing the number of dashboards that are created. For every embed user setup, Onvo AI creates a copy of the dashboard for them. This is useful for when every user would need a separate copy of a dashboard to edit.
However, if you have a dashboard that is read-only and the data on it is the sample for multiple users, like all the users in an organization, you can reduce the data usage substantially by creating a single dashboard for the organization instead of creating a separate dashboard for each user.
For example if you have an organization with 10 users, this method would reduce the data usage by about 90%.