Aggregation

What is data aggregation?

Aggregation is a process in which the relationship between one or more entities is treated as a single entity. One of the purpose of data aggregation is to be able to obtain more information about a particular group of entities based on specific variables. Data aggregation is often used when users want to create reports, because it offers the possibility to present the data from the database in a way that it’s suitable for human analysis. 

In SQL languages the data aggregation is implemented through functions that return a single value, obtained by groping together the values of multiple rows. Usually this single value has a more significant meaning for the user. 

Aggregate functions

The aggregate functions are: SUM, AVG, STD, MIN, MAX, COUNT and CONCAT.

Aggregate functionMeaning
SUM()Returns the sum
AVG()Returns the average
STD()Returns the population standard deviation
MIN()Returns the minimum value
MAX()Returns the maximum value
COUNT()Returns a count of the number of rows returned
CONCAT()Returns a concatenated string

Examples

SUM

Let us imagine we have an entity called invoice, that has a Parent-Child relationship with the entity invoice_items. Each item has a property called amount that tells us the amount price for each item. In order to find out the total amount of an invoice we will need to use the SUM aggregate function.

In order to see all of the line items we can run the following FETCH query:

FETCH invoice (key, number AS "Invoice number"){
    invoice_item TO id_invoice (amout AS "Amount")
}

With the result of the query being:

keyInvoice numberAmount
91334000
913388
1010299
10102100
10102201

However we want to calculate the total amount of the invoice not just see the amount of each line. To do this we would have to re-write the query as this:

FETCH invoice (key GROUP BY, number AS "Invoice number") {
    invoice_item TO id_invoice (SUM(amount) AS "Invoice amount")
}

Now that we grouped the invoices based on the primary key of the entity, the FETCH query will automatically add the values in the property amount for each of the grouped items, and our result will look like this:

keyInvoice numberInvoice amount
91334088 (4000+88)
10102400 (99+100+201)

AVG

Let’s take another example where we have the entity student with the properties first_name, last_name, grade; and we want to find out the average grade of the class.

First to see each individual grade for each student we would run the following query:

FETCH student (key, first_name AS "First name", last_name AS "Last name", grade AS "Grade")

The result of the query will look like this:

keyFirst nameLast nameGrade
1JohnChristian7
2JasonBourne6
3AlbertCalvin9
4BradleyCooper4
5HannahBlake9

In order to find out the average grade of the class we simply have to run the following query:

FETCH student (AVG(grade) AS "Average Grade")

This will return the result:

Average Grade
7

STD

Based on the previous example we can calculate the standard deviation as follows:

FETCH student (STD(grade) AS "Standard Deviation")

This will return the result:

Standard Deviation
2

MIN and MAX

Let us find the lowest and highest grade from the previous example. To do this we would run the following query:

FETCH student (MIN(grade) AS "Min Grade", MAX(grade) AS "Max Grade")

This will return the results:

Min GradeMax Grade
49

COUNT

We can use the COUNT() function to find out how many students there are with the query:

FETCH student (COUNT(key) AS "Number of students")

This will return the result:

Number of students
5

CONCAT

The CONCAT() function is a group concatenation of a single property. The result will always return under the following format: “value1,value2,value3,…,valueN” with the values separated by a comma.

In case we want a list with all of the first names of the students we will run the following query:

FETCH student (CONCAT(first_name) AS "First name")

This will return the result:

First name
John,Jason,Albert,Bradley,Hannah