Aggregate

A Callable that returns aggregated values of columns. This is similar to a GROUP BY clause combined with an aggregate function in SQL queries.

Definition

class controller.callable.transformer.Aggregate(self, group_by: list[ColumnReference], values: list[ColumnReference], aggregation_function: AggregationFunction, output_names: list[str])

Callable name:

aggregate

Callable type:

Transformer

Parameters:
  • group_by (list[ColumnReference]) – A list of column references whose values are used to group the rows.

  • values (list[ColumnReference]) – A list of column references whose values are aggregated within each group defined by the group_by columns.

  • aggregation_function (AggregationFunction) – The function used to summarize the values.

  • output_names (list[str]) – A list of names for the returned columns of aggregated values, one for each entry in values.

Parameters

group_by

A list of column references whose values are used to group the rows.

Type:

list[ColumnReference]

Required:

True

Choices:

All columns in the target DataFrame

values

A list of column references whose values are aggregated within each group defined by the group_by columns.

Type:

list[ColumnReference]

Required:

True

Choices:

All columns in the target DataFrame

aggregation_function

The function used to summarize the values.

Type:

AggregationFunction

Required:

True

Choices:

All available names of aggregation functions

output_names

A list of names for the returned columns of aggregated values, one for each entry in values.

Type:

list[str]

Required:

True

Output

Returns a DataFrame containing columns of aggregated (grouped) values.

  • The resulting columns are named by pairing the output_names parameter with the values parameter by position. That is, the first item in the output_names list names the aggregate of the first item in the values list, and so on.
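The positional pairing described above can be illustrated with a minimal Python sketch. The column references and output names here are hypothetical; only the pairing rule comes from this page. Note that Python's built-in zip stops at the shorter list, so the sketch assumes both lists have the same length. Whether the Callable itself validates this is not stated here.

```python
# Hypothetical column references and output names, for illustration only.
values = ["$answers.score", "$answers.duration"]
output_names = ["mean_score", "mean_duration"]

# Pair each values entry with the output name at the same position.
name_for_column = dict(zip(values, output_names))

for column, name in name_for_column.items():
    print(f"{column} -> {name}")
```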

Note

The original input data is not preserved; only the aggregated result is returned.

Example configuration

{
    "name": "foo",
    "callable": "aggregate",
    "params": {
        "group_by": ["$assignment1_answers.question.title"],
        "values": ["$assignment1_answers.question_response.score"],
        "aggregation_function": "mean",
        "output_names": ["assignment1_mean_score"]
    }
}
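
As a rough illustration of what a configuration like this computes, the following pure-Python sketch groups rows by one column and takes the mean of another. It is not the library's implementation: the aggregate function, the input rows, and the shortened column names title and score are all hypothetical stand-ins, and keeping the group key in each output row is an assumption of this sketch.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical input rows; "title" and "score" stand in for the
# column references used in the configuration above.
rows = [
    {"title": "Q1", "score": 1.0},
    {"title": "Q1", "score": 0.0},
    {"title": "Q2", "score": 1.0},
    {"title": "Q2", "score": 1.0},
]


def aggregate(rows, group_by, values, aggregation_function, output_names):
    """Group rows by the group_by columns and summarize each values column."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in group_by)
        groups[key].append(row)

    result = []
    for key, members in groups.items():
        # Assumption: the group key is kept alongside the aggregated values.
        out = dict(zip(group_by, key))
        # Name each aggregated column by position, as described under Output.
        for column, name in zip(values, output_names):
            out[name] = aggregation_function([m[column] for m in members])
        result.append(out)
    return result


result = aggregate(rows, ["title"], ["score"], mean, ["assignment1_mean_score"])
for row in result:
    print(row)
```

In this sketch the input rows do not appear in the result, which matches the note that the original input data is not preserved.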