Guide to Using the API

The best way to explore the capabilities of the API is to use the automatically generated OpenAPI interface at http://localhost:8080/v1/ui/. (In this section, all links assume the API has been deployed at the URL http://localhost:8080, which is the default during development.)

Some more specific examples based on the sample datasets provided by protari-sample are described below.

All requests can include an Authorization header to access datasets with authorization requirements, eg. with the database auth interface:

curl -X GET --header 'Accept: application/json' --header 'Authorization: Protari key=abc123' 'http://localhost:8080/v1/datasets/'

To use Protari with OpenID Connect, obtain a token from the authentication provider and supply it in place of TOKEN in the following:

curl -X GET --header 'Accept: application/json' --header 'Authorization: Bearer TOKEN' 'http://localhost:8080/v1/datasets/'

The sample datasets do not require any authorization keys.
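The same headers can be prepared from Python. The following is a minimal sketch that only constructs the request objects and does not send them. The key abc123 and the TOKEN placeholder are copied from the curl examples above, and build_request is a hypothetical helper name, not part of Protari.

```python
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # default development URL used in this guide


def build_request(path, authorization=None):
    """Build (but do not send) a GET request, optionally with an Authorization header."""
    headers = {"Accept": "application/json"}
    if authorization is not None:
        headers["Authorization"] = authorization
    return urllib.request.Request(BASE_URL + path, headers=headers, method="GET")


# Database auth interface (key value copied from the curl example above).
key_request = build_request("/datasets/", "Protari key=abc123")
# OpenID Connect: replace TOKEN with a real token from your provider.
bearer_request = build_request("/datasets/", "Bearer TOKEN")
# The sample datasets need no Authorization header at all.
anonymous_request = build_request("/datasets/")
```

Passing any of these objects to urllib.request.urlopen would send the request to a running server.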

/about

The http://localhost:8080/v1/about endpoint returns just the "header" property that is included at every endpoint:

{
  "header": {
    "api_version": "1.0.0",
    "base_url": "https://protari.example.com/v1",
    "organization": {
      "name": "Sample",
      "title": "Sample organization"
    },
    "prepared_at": "2018-07-01T00:44:53.742560Z",
    "terms": "Insert some terms that you must agree to here."
  }
}

/datasets/

The http://localhost:8080/v1/datasets/ endpoint returns a JSON-formatted list of the available datasets that are visible to all users, with high-level metadata for each.

Eg.

{
  "header": {
    "api_version": "1.0.0",
    "organization": {
      "name": "Sample",
      "title": "Sample organization"
    },
    "prepared_at": "2018-07-04T04:51:03.694901Z",
    "terms": "Insert some terms that you must agree to here."
  },
  "datasets": [
    {
      "description": "Dummy data set.",
      "name": "sample",
      "notes": "This is just sample data. None of this data is real.",
      "title": "Dummy data set",
      "unit": {
        "plural": "people",
        "singular": "person"
      }
    },
    {
      "description": "Dummy longitudinal data set.",
      "name": "sample_longitudinal",
      "notes": "This is just sample data. None of this data is real.",
      "title": "Dummy longitudinal data set",
      "unit": {
        "plural": "people",
        "singular": "person"
      }
    }
  ]
}

The name fields above are the names you can use to refer to datasets in subsequent API calls.
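As a small illustration, the dataset names can be pulled out of the response with the standard json module. The response text below is a trimmed copy of the example output above, not a live API call.

```python
import json

# Trimmed copy of the /datasets/ response shown above.
response_text = """
{
  "datasets": [
    {"name": "sample", "title": "Dummy data set"},
    {"name": "sample_longitudinal", "title": "Dummy longitudinal data set"}
  ]
}
"""

payload = json.loads(response_text)
# These names are what later endpoints expect in place of {dataset_name}.
dataset_names = [dataset["name"] for dataset in payload["datasets"]]
print(dataset_names)
```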

/datasets/{dataset_name}

Place a specific dataset name at the end of the http://localhost:8080/v1/datasets/ endpoint to request the full metadata (including the accessible fields and values) for that dataset.

The output for http://localhost:8080/v1/datasets/sample is:

{
  "description": "This dataset contains sample data.",
  "query_classes": {
    "aggregation": {
      "allowed_functions": [
        {
          "name": "count",
          "parameter_types": [
            {
              "description": "A field whose sentinel values should be excluded from the count.",
              "is_required": false
            }
          ]
        },
        {
          "name": "mean",
          "parameter_types": [
            {
              "description": "The field whose non-sentinel values are to be averaged.",
              "is_required": true,
              "requires_numeric_data": true
            }
          ]
        },
        {
          "name": "sum",
          "parameter_types": [
            {
              "description": "The field whose non-sentinel values are to be summed.",
              "is_required": true,
              "requires_numeric_data": true
            }
          ]
        }
      ],
      "field_exclusion_rules": [
        {
          "limit": 4
        }
      ]
    }
  },
  "fields": [
    {
      "has_numeric_data": false,
      "name": "SEX",
      "title": "Sex",
      "type": "string",
      "values": [
        {
          "name": "F",
          "title": "Female"
        },
        {
          "name": "M",
          "title": "Male"
        },
        {
          "is_sentinel": true,
          "name": "",
          "title": "Not stated"
        }
      ]
    },
    {
      "has_numeric_data": false,
      "name": "COUNTRY",
      "title": "Country of Birth",
      "type": "integer",
      "values": [
        {
          "name": "1000",
          "title": "Oceania",
          "values": [
            {
              "name": "1100",
              "title": "Australia (including External Territories)",
              "values": [
                {
                  "name": "1101",
                  "title": "Australia"
                },
                {
                  "name": "1102",
                  "title": "Norfolk Island"
                },
                {
                  "name": "1199",
                  "title": "Australian External Territories"
                }
              ]
            },
            {
              "name": "1201",
              "title": "New Zealand"
            },
            {
              "name": "1300",
              "title": "Melanesia",
              "values": [
                {
                  "name": "1301",
                  "title": "New Caledonia"
                },
                {
                  "name": "1302",
                  "title": "Papua New Guinea"
                },
                {
                  "name": "1303",
                  "title": "Solomon Islands"
                },
                {
                  "name": "1304",
                  "title": "Vanuatu"
                }
              ]
            }
          ]
        },
        {
          "name": "9999",
          "title": "Other"
        },
        {
          "is_sentinel": true,
          "name": "null",
          "title": "Not stated"
        }
      ]
    },
    {
      "fields": [
        {
          "description": "Type of dwelling of usual residence",
          "has_numeric_data": false,
          "name": "DWELL",
          "title": "Dwelling Type",
          "type": "string",
          "values": [
            {
              "name": "001",
              "title": "House"
            },
            {
              "name": "002",
              "title": "Apartment"
            },
            {
              "name": "003",
              "title": "Other"
            }
          ]
        },
        {
          "can_be_above_maximum": true,
          "can_be_below_minimum": false,
          "exclusive_maximum": 8,
          "has_numeric_data": true,
          "maximum": 7,
          "minimum": 0,
          "name": "ROOMS",
          "title": "Number of Bedrooms",
          "type": "integer",
          "values": [
            {
              "is_sentinel": true,
              "name": "null",
              "title": "Not stated"
            },
            {
              "is_sentinel": true,
              "name": "999",
              "title": "Not applicable"
            },
            {
              "is_numeric": true,
              "name": "0",
              "title": "None (including bedsits)",
              "value": 0
            },
            {
              "is_numeric": true,
              "name": "1",
              "title": "1",
              "value": 1
            },
            {
              "is_numeric": true,
              "name": "2",
              "title": "2",
              "value": 2
            },
            {
              "is_numeric": true,
              "name": "3",
              "title": "3",
              "value": 3
            },
            {
              "is_numeric": true,
              "name": "4",
              "title": "4",
              "value": 4
            },
            {
              "is_numeric": true,
              "name": "5",
              "title": "5",
              "value": 5
            },
            {
              "is_numeric": true,
              "name": "6",
              "title": "6",
              "value": 6
            },
            {
              "is_numeric": true,
              "name": "7",
              "title": "7",
              "value": 7
            },
            {
              "exclusions": [
                999
              ],
              "is_numeric": true,
              "minimum": 8,
              "name": ">=8",
              "title": ">=8"
            }
          ]
        }
      ],
      "title": "Dwelling"
    },
    {
      "description": "Postcode of usual residence",
      "has_numeric_data": false,
      "name": "POSTCODE",
      "region_type": "poa",
      "title": "Postcode",
      "type": "string"
    },
    {
      "can_be_above_maximum": true,
      "can_be_below_minimum": true,
      "decimal_places": 2,
      "exclusive_maximum": 100,
      "has_numeric_data": true,
      "maximum": 99.99,
      "minimum": 0,
      "name": "AMOUNT",
      "precision": 0.01,
      "title": "Dollar Amount",
      "type": "number",
      "values": [
        {
          "exclusive_maximum": 0,
          "is_numeric": true,
          "maximum": -0.01,
          "name": "<0",
          "title": "<0"
        },
        {
          "exclusive_maximum": 10,
          "is_numeric": true,
          "maximum": 9.99,
          "minimum": 0,
          "name": "0-9.99",
          "title": "0-9.99"
        },
        {
          "exclusive_maximum": 20,
          "is_numeric": true,
          "maximum": 19.99,
          "minimum": 10,
          "name": "10-19.99",
          "title": "10-19.99"
        },
        {
          "exclusive_maximum": 30,
          "is_numeric": true,
          "maximum": 29.99,
          "minimum": 20,
          "name": "20-29.99",
          "title": "20-29.99"
        },
        {
          "exclusive_maximum": 40,
          "is_numeric": true,
          "maximum": 39.99,
          "minimum": 30,
          "name": "30-39.99",
          "title": "30-39.99"
        },
        {
          "exclusive_maximum": 50,
          "is_numeric": true,
          "maximum": 49.99,
          "minimum": 40,
          "name": "40-49.99",
          "title": "40-49.99"
        },
        {
          "exclusive_maximum": 60,
          "is_numeric": true,
          "maximum": 59.99,
          "minimum": 50,
          "name": "50-59.99",
          "title": "50-59.99"
        },
        {
          "exclusive_maximum": 70,
          "is_numeric": true,
          "maximum": 69.99,
          "minimum": 60,
          "name": "60-69.99",
          "title": "60-69.99"
        },
        {
          "exclusive_maximum": 80,
          "is_numeric": true,
          "maximum": 79.99,
          "minimum": 70,
          "name": "70-79.99",
          "title": "70-79.99"
        },
        {
          "exclusive_maximum": 90,
          "is_numeric": true,
          "maximum": 89.99,
          "minimum": 80,
          "name": "80-89.99",
          "title": "80-89.99"
        },
        {
          "exclusive_maximum": 100,
          "is_numeric": true,
          "maximum": 99.99,
          "minimum": 90,
          "name": "90-99.99",
          "title": "90-99.99"
        },
        {
          "is_numeric": true,
          "minimum": 100,
          "name": ">=100",
          "title": ">=100"
        }
      ]
    }
  ],
  "header": {
    "api_version": "1.0.0",
    "base_url": "http://localhost:8080/v1",
    "organization": {
      "name": "Sample",
      "title": "Sample organization"
    },
    "prepared_at": "2018-07-07T00:51:43.661882Z",
    "terms": "Insert some terms that you must agree to here."
  },
  "name": "sample",
  "notes": "This is just sample data. None of this data is real.",
  "title": "Sample dataset",
  "unit": {
    "plural": "people",
    "singular": "person"
  }
}

The field names and value names are the names you can use to refer to fields and values in subsequent API calls.

You can flatten the fields hierarchy, the value hierarchy, or both, by passing flatten=fields, flatten=values or flatten=all respectively.

To suppress all values from the output, pass show_values=false. (In this case flatten=values has no effect.)
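If you work with the unflattened metadata instead, the field hierarchy can be walked on the client side. The sketch below assumes that a group is any entry carrying its own "fields" list, as the Dwelling group does in the sample output. iter_field_names is a hypothetical helper, and the input is a trimmed copy of the metadata above.

```python
def iter_field_names(fields):
    """Yield the name of every queryable field, descending into field groups."""
    for field in fields:
        if "fields" in field:
            # A group such as "Dwelling" has no name of its own, only child fields.
            yield from iter_field_names(field["fields"])
        else:
            yield field["name"]


# Trimmed version of the "fields" list from the /datasets/sample output above.
sample_fields = [
    {"name": "SEX"},
    {"name": "COUNTRY"},
    {"title": "Dwelling", "fields": [{"name": "DWELL"}, {"name": "ROOMS"}]},
    {"name": "POSTCODE"},
    {"name": "AMOUNT"},
]

field_names = list(iter_field_names(sample_fields))
```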

/datasets/{dataset_name}/fields/{field_name}

Use this endpoint to get metadata (including the allowed values) about a specific field. You can optionally pass flatten=values as for the previous endpoint.

Eg. The output for http://localhost:8080/v1/datasets/sample/fields/SEX is:

{
  "field": {
    "has_numeric_data": false,
    "name": "SEX",
    "title": "Sex",
    "type": "string",
    "values": [
      {
        "name": "F",
        "title": "Female"
      },
      {
        "name": "M",
        "title": "Male"
      },
      {
        "is_sentinel": true,
        "name": "",
        "title": "Not stated"
      }
    ]
  },
  "header": {
    "api_version": "1.0.0",
    "base_url": "http://localhost:8080/v1",
    "organization": {
      "name": "Sample",
      "title": "Sample organization"
    },
    "prepared_at": "2018-07-15T22:49:23.680344Z",
    "terms": "Insert some terms that you must agree to here."
  }
}
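One possible use of this metadata is to build a lookup from value names to titles and to pick out the sentinel values. The sketch below works on a copy of the "field" object shown above and assumes, as in that output, that is_sentinel is simply absent on ordinary values.

```python
# The "field" object from the /datasets/sample/fields/SEX output above.
sex_field = {
    "name": "SEX",
    "values": [
        {"name": "F", "title": "Female"},
        {"name": "M", "title": "Male"},
        {"is_sentinel": True, "name": "", "title": "Not stated"},
    ],
}

# Map each allowed value name to its title, e.g. for labelling results.
titles = {value["name"]: value["title"] for value in sex_field["values"]}
# Names of the values flagged as sentinels.
sentinel_names = [
    value["name"] for value in sex_field["values"] if value.get("is_sentinel", False)
]
```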

/datasets/{dataset_name}/aggregation

The http://localhost:8080/v1/datasets/{dataset_name}/aggregation URL is the workhorse of the API, letting you ask for aggregated data from the dataset.

With no further parameters, the API returns a perturbed count of all records in the dataset in JSON format. For a longitudinal dataset the count is broken down by year, eg. http://localhost:8080/v1/datasets/sample_longitudinal/aggregation returns:

{
  "fields": [
    {
      "as_string": "YEAR",
      "is_longitudinal": true,
      "name": "YEAR",
      "title": "Year"
    },
    {
      "as_string": "perturbed count",
      "function": {
        "name": "count"
      },
      "is_result": true,
      "is_perturbed": true,
      "name": "perturbed count",
      "title": "perturbed count"
    }
  ],
  "header": {
    "api_version": "1.0.0",
    "organization": {
      "name": "Sample",
      "title": "Sample organization"
    },
    "prepared_at": "2018-07-08T22:55:36.355887Z",
    "terms": "Insert some terms that you must agree to here."
  },
  "query": {
    "as_string": "function=count",
    "function": [
      {
        "as_string": "count",
        "type": {
          "name": "count",
          "parameter_types": [
            {
              "description": "A field whose sentinel values should be excluded from the count.",
              "is_required": false
            }
          ]
        }
      }
    ],
    "dataset": {
      "description": "Description of dummy longitudinal data set.",
      "name": "sample_longitudinal",
      "notes": "This is just sample data. None of this data is real.",
      "title": "Dummy longitudinal data set",
      "unit": {
        "plural": "people",
        "singular": "person"
      }
    }
  },
  "values": [
    [
      2000,
      199
    ],
    [
      2005,
      199
    ]
  ]
}
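In this output each entry of "values" is one row, and its cells appear to follow the order of the "fields" list. Under that assumption, the rows can be turned into dictionaries as in the following sketch, which uses a trimmed copy of the response above.

```python
# Trimmed copy of the aggregation response above.
response = {
    "fields": [{"name": "YEAR"}, {"name": "perturbed count", "is_result": True}],
    "values": [[2000, 199], [2005, 199]],
}

column_names = [field["name"] for field in response["fields"]]
# Pair each cell with the field in the same position (an assumption based on
# the sample output, where YEAR comes first in both lists).
rows = [dict(zip(column_names, row)) for row in response["values"]]
```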

You can request this in CSV format by appending /csv to the URL, with the result:

YEAR,perturbed count
2000,199
2005,199
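The CSV output can be read with Python's csv module. This sketch parses the text shown above rather than fetching it from a server.

```python
import csv
import io

csv_text = "YEAR,perturbed count\n2000,199\n2005,199\n"

reader = csv.DictReader(io.StringIO(csv_text))
# CSV cells arrive as strings, so convert the count explicitly.
count_by_year = {row["YEAR"]: int(row["perturbed count"]) for row in reader}
```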

The SDMX-JSON format is also available by appending /sdmx-json.

To make more specific queries, you simply add query parameters to the URL, starting with ? and separating them with &, eg.

http://localhost:8080/v1/datasets/sample_longitudinal/aggregation?group_by=DWELL&where=SEX=M

The results are confidentialised according to the dataset's configuration.
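Such URLs can also be assembled programmatically. The following is a minimal sketch using urllib.parse.urlencode; aggregation_url is a hypothetical helper, and keeping "=", ";", ":" and "," unencoded is a readability choice made here, not a documented requirement of the API.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080/v1"


def aggregation_url(dataset_name, output_format=None, **params):
    """Build an aggregation URL; output_format may be None, "csv" or "sdmx-json"."""
    url = f"{BASE_URL}/datasets/{dataset_name}/aggregation"
    if output_format:
        url += f"/{output_format}"
    if params:
        # Leave the separators used by the query syntax readable instead of
        # percent-encoding them.
        url += "?" + urlencode(params, safe="=;:,")
    return url


url = aggregation_url("sample_longitudinal", group_by="DWELL", where="SEX=M")
csv_url = aggregation_url("sample_longitudinal", "csv")
```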

group by

Use this to show the results broken down according to the value in a particular field (or fields). Supply as a comma-separated list of field names, eg.

http://localhost:8080/v1/datasets/sample_longitudinal/aggregation/csv?group_by=DWELL,SEX

returns:

DWELL,SEX,YEAR,perturbed count
001,F,2000,18
001,F,2005,32
001,M,2000,18
001,M,2005,27
001,,2000,15
001,,2005,18
002,F,2000,25
002,F,2005,21
002,M,2000,24
002,M,2005,24
002,,2000,24
002,,2005,15
003,F,2000,30
003,F,2005,21
003,M,2000,24
003,M,2005,15
003,,2000,18
003,,2005,21

The first data row tells you there were approximately 18 females dwelling in houses (DWELL "001") in 2000, etc.

Notice that the API has automatically grouped by the longitudinal field, YEAR.

You can also group by numeric fields, eg.

http://localhost:8080/v1/datasets/sample/aggregation/csv?group_by=AMOUNT

returns:

AMOUNT,perturbed count
0-9.99,31
10-19.99,15
20-29.99,25
30-39.99,21
40-49.99,15
50-59.99,25
60-69.99,18
70-79.99,15
80-89.99,18
90-99.99,21

and

http://localhost:8080/v1/datasets/sample/aggregation/csv?group_by=ROOMS

returns:

ROOMS,perturbed count
1,15
2,18
3,25
4,43
5,36
6,9
7,12
>=8,34
,6

Custom groups

You can specify custom groups for the group by fields using colons to separate groups and semi-colons to separate values within the groups, eg.

http://localhost:8080/v1/datasets/sample/aggregation?group_by=POSTCODE:6722;6725:6429;6710

This returns three perturbed counts: those in postcodes 6722 or 6725; those in postcodes 6429 or 6710; and those who live in any other postcode:

{
  ...
  "values": [
    [
      "6722;6725",
      70
    ],
    [
      "6429;6710",
      58
    ],
    [
      "other",
      74
    ]
  ]
}
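The custom group syntax is easy to generate from nested lists. The sketch below follows the rule stated above (colons between groups, semi-colons within a group); custom_group_by is a hypothetical helper name.

```python
def custom_group_by(field_name, groups):
    """Join groups with colons and the values inside each group with semi-colons."""
    group_strings = [";".join(str(value) for value in group) for group in groups]
    return ":".join([field_name] + group_strings)


group_by = custom_group_by("POSTCODE", [["6722", "6725"], ["6429", "6710"]])
```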

For integer categorical fields, you can replace sequences such as 0;1;2;3 with 0-3. The COUNTRY field on the sample dataset is a good example:

http://localhost:8080/v1/datasets/sample/aggregation?group_by=COUNTRY:1101-1201

From the output, you will see that Protari has treated this as the same as 1101;1102;1199;1201:

{
  "fields": [
    {
      "as_string": "COUNTRY:1101-1201",
      "custom_groups": {
        "else": {
          "name": "other"
        },
        "groups": [
          {
            "name": "1101-1201",
            "values": [
              {
                "as_string": "1101"
              },
              {
                "as_string": "1102"
              },
              {
                "as_string": "1199"
              },
              {
                "as_string": "1201"
              }
            ]
          }
        ]
      },
      "name": "COUNTRY",
      "title": "Country of Birth"
    },
    {
      "as_string": "perturbed count",
      "function": {
        "name": "count"
      },
      "is_perturbed": true,
      "is_result": true,
      "name": "perturbed count",
      "title": "perturbed count"
    }
  ],
  "values": [
    [
      "1101-1201",
      129
    ],
    [
      "other",
      72
    ]
  ],
  ...
}
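To make the equivalence concrete, here is a client-side sketch of the same expansion. It is an illustration only, not Protari's implementation, and it assumes that only childless values take part in the range, which matches the output above (1100 is not listed). expand_range is a hypothetical helper.

```python
def expand_range(range_text, leaf_names):
    """Return the leaf value names whose integer value lies within "low-high" (inclusive)."""
    low_text, high_text = range_text.split("-")
    low, high = int(low_text), int(high_text)
    return [name for name in leaf_names if name.isdigit() and low <= int(name) <= high]


# Childless COUNTRY values from the /datasets/sample metadata.
country_leaves = ["1101", "1102", "1199", "1201", "1301", "1302", "1303", "1304", "9999", "null"]

expanded = expand_range("1101-1201", country_leaves)
```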

You can even refer to hierarchical values in custom groups, eg. group_by=COUNTRY:1100:1201;1300.

On range data fields, you can use the full range syntax to define the custom groups, eg:

http://localhost:8080/v1/datasets/sample/aggregation?group_by=ROOMS:0:1;2:3-6:>=7

returns:

{
  "fields": [
    {
      "as_string": "ROOMS:0:1;2:3-6:>=7",
      "custom_groups": {
        "else": {
          "name": "other"
        },
        "groups": [
          {
            "name": "0",
            "values": [
              {
                "as_string": "0",
                "value": 0
              }
            ]
          },
          {
            "name": "1;2",
            "values": [
              {
                "as_string": "1",
                "value": 1
              },
              {
                "as_string": "2",
                "value": 2
              }
            ]
          },
          {
            "name": "3-6",
            "values": [
              {
                "as_string": "3-6",
                "exclusive_maximum": 7,
                "maximum": 6,
                "minimum": 3
              }
            ]
          },
          {
            "name": ">=7",
            "values": [
              {
                "as_string": ">=7",
                "exclusions": [
                  999
                ],
                "minimum": 7
              }
            ]
          }
        ]
      },
      "name": "ROOMS",
      "title": "Number of Bedrooms"
    },
...
  "values": [
    [
      "0",
      12
    ],
    [
      "1;2",
      32
    ],
    [
      "3-6",
      111
    ],
    [
      ">=7",
      34
    ],
    [
      "other",
      15
    ]
  ]
}

Custom groups cannot overlap (eg. group_by=ROOMS:3-6:>=5 will raise an error).
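If you generate custom groups in code, you may want to check for overlaps before sending the request. The following sketch does this for non-negative integer fields such as ROOMS, handling only the forms "4", "3-6" and ">=7". It is a hypothetical client-side check, and the server's own validation may differ in detail.

```python
import math


def parse_interval(text):
    """Turn "4", "3-6" or ">=7" into an inclusive (minimum, maximum) pair."""
    if text.startswith(">="):
        return (int(text[2:]), math.inf)
    if "-" in text:
        low, high = text.split("-")
        return (int(low), int(high))
    return (int(text), int(text))


def has_overlap(groups):
    """Return True if any two values, within or across groups, share a number."""
    intervals = sorted(
        parse_interval(value) for group in groups for value in group.split(";")
    )
    # After sorting by minimum, an overlap exists exactly when some interval
    # starts at or before the previous interval's maximum.
    return any(
        current[0] <= previous[1] for previous, current in zip(intervals, intervals[1:])
    )


ok = has_overlap("0:1;2:3-6:>=7".split(":"))
clash = has_overlap("3-6:>=5".split(":"))
```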

where

You can restrict the query to records meeting a condition with the "where" parameter. For string fields, only equality is supported (you can include more than one value by separating them with semi-colons). For numeric data fields, equality and ranges are both supported. You can include multiple conditions by separating them with commas.

Eg. to count the total number of males, and return a csv file:

http://localhost:8080/v1/datasets/sample_longitudinal/aggregation/csv?where=SEX=M

which returns:

YEAR,perturbed count
2000,67
2005,66

You can request records with a null or empty value as follows:

http://localhost:8080/v1/datasets/sample_longitudinal/aggregation?where=SEX=

Some examples including multiple values and ranges are:

http://localhost:8080/v1/datasets/sample_longitudinal/aggregation?where=ROOMS=1;3-5;>=7
http://localhost:8080/v1/datasets/sample_longitudinal/aggregation?where=SEX=M;F
http://localhost:8080/v1/datasets/sample_longitudinal/aggregation?where=ROOMS=1-5,DWELL=001

Ranges are inclusive of both ends. The where clause in the first of these examples is returned with the result as follows:

    "where": [
      {
        "as_string": "ROOMS=1;3-5;>=7",
        "condition_values": {
          "equals": [
            {
              "value": "1"
            },
            {
              "as_string": "3-5",
              "exclusive_maximum": 6,
              "maximum": 5,
              "minimum": 3
            },
            {
              "as_string": ">=7",
              "minimum": 7
            }
          ]
        },
        "name": "ROOMS",
        "title": "Number of Rooms",
        "values_as_received": {
          "equals": [
            {
              "value": "1"
            },
            {
              "as_string": "3-5",
              "exclusive_maximum": 6,
              "maximum": 5,
              "minimum": 3
            },
            {
              "as_string": ">=7",
              "minimum": 7
            }
          ]
        }
      }
    ]
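The following sketch shows how the returned "equals" list could be interpreted on the client side, treating both ends of each range as inclusive as described above. matches is a hypothetical helper, and the sketch ignores any "exclusions" entries since none appear in this where clause.

```python
def matches(value, equals):
    """Return True if value satisfies any entry of a condition_values "equals" list."""
    for entry in equals:
        if "value" in entry:
            # Single values are reported as strings, e.g. {"value": "1"}.
            if value == int(entry["value"]):
                return True
        else:
            # Range entries: both bounds are inclusive, and either may be missing.
            above_min = "minimum" not in entry or value >= entry["minimum"]
            below_max = "maximum" not in entry or value <= entry["maximum"]
            if above_min and below_max:
                return True
    return False


# The "equals" list returned for where=ROOMS=1;3-5;>=7 (see above).
equals = [
    {"value": "1"},
    {"as_string": "3-5", "exclusive_maximum": 6, "maximum": 5, "minimum": 3},
    {"as_string": ">=7", "minimum": 7},
]

selected = [rooms for rooms in range(0, 10) if matches(rooms, equals)]
```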

You can use floating-point "number" fields in a where clause too, as long as they align with the allowed intervals, eg.

http://localhost:8080/v1/datasets/sample/aggregation?where=AMOUNT=0-49.99

If you use a range that is not aligned with the allowed intervals, eg.

http://localhost:8080/v1/datasets/sample/aggregation?where=AMOUNT=0-13

the result is an error with a suggestion for an allowed query:

{
  "detail": "Upper bound 13 not allowed - try 9.99 or 19.99 instead.",
  "status": 400,
  "title": "",
  "type": "about:blank"
}
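A client can avoid this error by checking a range against the intervals in the dataset metadata first. The sketch below is a hypothetical pre-check, not the logic Protari uses to produce its message. It takes the inclusive upper bounds of the bounded AMOUNT intervals from the /datasets/sample output and suggests the nearest allowed bounds.

```python
import bisect

# Inclusive upper bounds of the bounded AMOUNT intervals in the dataset metadata.
allowed_upper_bounds = [9.99, 19.99, 29.99, 39.99, 49.99, 59.99, 69.99, 79.99, 89.99, 99.99]


def nearest_allowed_upper_bounds(upper):
    """Return [] if upper is already allowed, else the closest allowed bounds around it."""
    if upper in allowed_upper_bounds:
        return []
    position = bisect.bisect_left(allowed_upper_bounds, upper)
    # The slice also copes with bounds below the first or above the last entry.
    return allowed_upper_bounds[max(position - 1, 0):position + 1]


suggestions = nearest_allowed_upper_bounds(13)
```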

For integer categorical fields, you can also replace sequences such as 0;1;2;3 with 0-3 (as noted earlier for custom groups). The COUNTRY field on the sample dataset is a good example:

http://localhost:8080/v1/datasets/sample/aggregation?where=COUNTRY=1101-1201

This returns (note the 1101-1201 has been expanded to the equivalent 1101;1102;1199;1201):

{
  ...
  "query": {
    ...
    "where": [
      {
        "as_string": "COUNTRY=1101-1201",
        "condition_values": {
          "equals": [
            {
              "as_string": "1101"
            },
            {
              "as_string": "1102"
            },
            {
              "as_string": "1199"
            },
            {
              "as_string": "1201"
            }
          ]
        },
        "name": "COUNTRY",
        "title": "Country of Birth",
        "values_as_received": {
          "equals": [
            {
              "as_string": "1101-1201"
            }
          ]
        }
      }
    ]
  },
  "values": [
    [
      129
    ]
  ]
}

Using where with hierarchical values

When you use a hierarchical value in a where clause, the API will expand it into its child values. To see this in action, try this query:

http://localhost:8080/v1/datasets/sample/aggregation?where=COUNTRY=1100

You will receive a result like:

{
  "fields": [
    {
      "as_string": "perturbed count",
      "function": {
        "name": "count"
      },
      "is_result": true,
      "is_perturbed": true,
      "name": "perturbed count",
      "title": "perturbed count"
    }
  ],
  "header": {
    "api_version": "1.0.0",
    "base_url": "http://localhost:8080/v1",
    "organization": {
      "name": "Sample",
      "title": "Sample organization"
    },
    "prepared_at": "2018-07-10T01:24:28.903917Z",
    "terms": "Insert some terms that you must agree to here."
  },
  "query": {
    "as_string": "function=count&where=COUNTRY=1100",
    "function": [
      {
        "as_string": "count",
        "type": {
          "name": "count",
          "parameter_types": [
            {
              "description": "A field whose sentinel values should be excluded from the count.",
              "is_required": false
            }
          ]
        }
      }
    ],
    "dataset": {
      "description": "Description of dummy data set.",
      "name": "sample",
      "notes": "This is just sample data. None of this data is real.",
      "title": "Dummy data set",
      "unit": {
        "plural": "people",
        "singular": "person"
      }
    },
    "where": [
      {
        "as_string": "COUNTRY=1100",
        "condition_values": {
          "equals": [
            {
              "value": "1101"
            },
            {
              "value": "1102"
            },
            {
              "value": "1199"
            }
          ]
        },
        "name": "COUNTRY",
        "title": "Country of Birth",
        "values_as_received": {
          "equals": [
            {
              "value": "1100"
            }
          ]
        }
      }
    ]
  },
  "values": [
    [
      109
    ]
  ]
}

In the returned where clause, you can see that the API has converted the field value 1100 into its child values 1101, 1102 and 1199. That's because 1100 is configured as a "parent" value. The received query values (here, only 1100) are also returned in values_as_received.
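The expansion of a parent value can be reproduced from the dataset metadata. The following is a client-side sketch, not Protari's implementation; leaf_names and expand are hypothetical helpers, and the hierarchy is a trimmed copy of the COUNTRY values in the /datasets/sample output.

```python
def leaf_names(value):
    """Return the names of all childless descendants of a value (or the value itself)."""
    children = value.get("values")
    if not children:
        return [value["name"]]
    return [name for child in children for name in leaf_names(child)]


def expand(values, name):
    """Find the value called name anywhere in the hierarchy and expand it to leaves."""
    for value in values:
        if value["name"] == name:
            return leaf_names(value)
        found = expand(value.get("values", []), name)
        if found:
            return found
    return []


# Part of the COUNTRY hierarchy from the /datasets/sample metadata.
country_values = [
    {"name": "1000", "values": [
        {"name": "1100", "values": [{"name": "1101"}, {"name": "1102"}, {"name": "1199"}]},
        {"name": "1201"},
    ]},
    {"name": "9999"},
]

children = expand(country_values, "1100")
```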

function

So far, all the requests have returned counts. You can also request means and sums by using the function parameter, eg.

http://localhost:8080/v1/datasets/sample_longitudinal/aggregation/csv?function=mean ROOMS

which returns

YEAR,perturbed mean ROOMS
2000,4.36
2005,4.32

In this result, sentinel and "null" values of ROOMS have been excluded. (Eg. 999 is a "sentinel" value because it corresponds to an unknown number of rooms. Including it in the mean would skew the results.)
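Note that the function value in the URL above contains a space. Many clients encode this for you, but when building the URL in code it is safer to percent-encode it yourself. A minimal sketch with the standard library, assuming the server decodes %20 in the usual way:

```python
from urllib.parse import quote, urlencode

# quote_via=quote encodes the space as %20 (the default, quote_plus, would use "+").
query = urlencode({"function": "mean ROOMS"}, quote_via=quote)
url = "http://localhost:8080/v1/datasets/sample_longitudinal/aggregation/csv?" + query
```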

You can see the list of excluded values (sentinels and null values) in the JSON output of:

http://localhost:8080/v1/datasets/sample_longitudinal/aggregation?function=mean ROOMS

  "query": {
    "as_string": "function=mean ROOMS",
    "function": [
      {
        "as_string": "mean ROOMS",
        "parameters": [
          {
            "as_string": "ROOMS",
            "condition_values": {
              "not_equals": [
                {
                  "value": ""
                },
                {
                  "value": "999"
                }
              ]
            },
            "name": "ROOMS",
            "title": "Number of Rooms"
          }
        ],
        "type": {
          "name": "mean",
          "parameter_types": [
            {
              "description": "The field whose non-sentinel values are to be averaged.",
              "is_required": true,
              "requires_numeric_data": true
            }
          ]
        }
      }
    ],
    ...
  }

The result has also been rounded to an appropriate number of decimal places (in this case because ROOMS is an integer, but more generally based on the field's decimal_places).
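To see why the exclusion matters, consider the toy calculation below. The values are hypothetical and no perturbation is applied; None stands for a null and 999 for the sentinel.

```python
# Hypothetical raw ROOMS values; None stands for a null and 999 is the sentinel.
rooms_values = [3, 4, 999, None, 5, 4]
excluded = {None, 999}

# Mean over valid values only, as the API does for function=mean ROOMS.
valid = [value for value in rooms_values if value not in excluded]
mean_rooms = round(sum(valid) / len(valid), 2)

# For comparison: leaving the sentinel in skews the mean badly.
non_null = [value for value in rooms_values if value is not None]
naive_mean = round(sum(non_null) / len(non_null), 2)
```

Here mean_rooms is 4.0, while naive_mean is 203.0 because the single 999 dominates the sum.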

You can also exclude a field's sentinel and null values from counts, by including the field as an argument to count, eg.:

http://localhost:8080/v1/datasets/sample_longitudinal/aggregation?function=count ROOMS
http://localhost:8080/v1/datasets/sample_longitudinal/aggregation?function=count SEX

giving counts of 176 in 2000 and 174 in 2005 for people with valid ROOMS values, and counts of 139 in 2000 and 144 in 2005 for people with a stated SEX. Recall the total count is 199 in each year.

Requesting Multiple Functions

Optionally, the data custodian may allow users to request more than one function at a time (by default you cannot). Be aware that in this case, sentinel and null values for all function fields are excluded for all the results.

So, for example:

http://localhost:8080/v1/datasets/sample_longitudinal/aggregation/csv?function=mean ROOMS,count SEX

returns

YEAR,perturbed mean ROOMS,perturbed count SEX
2000,4.339,125
2005,4.276,128

ie. 125 for the count in 2000 and 128 in 2005 - these are the records that have both a valid ROOMS value and a stated SEX.

Because this is potentially misleading, max_functions defaults to 1 if it is not included in the global settings.
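The effect can be illustrated with a handful of hypothetical records (no perturbation applied). Requesting count SEX on its own excludes only records with an unstated SEX, whereas combining it with mean ROOMS also drops records whose ROOMS value is null or a sentinel.

```python
# Hypothetical records; None marks a null ROOMS, 999 and "" are the sentinels.
records = [
    {"ROOMS": 3, "SEX": "F"},
    {"ROOMS": 999, "SEX": "M"},   # sentinel ROOMS
    {"ROOMS": 5, "SEX": ""},      # SEX not stated
    {"ROOMS": 4, "SEX": "M"},
    {"ROOMS": None, "SEX": "F"},  # null ROOMS
]


def rooms_valid(record):
    return record["ROOMS"] not in (None, 999)


def sex_stated(record):
    return record["SEX"] != ""


# function=count SEX on its own: only the SEX exclusion applies.
count_sex_alone = sum(1 for record in records if sex_stated(record))

# function=mean ROOMS,count SEX: both exclusions apply to both results.
both = [record for record in records if rooms_valid(record) and sex_stated(record)]
count_sex_combined = len(both)
mean_rooms_combined = sum(record["ROOMS"] for record in both) / len(both)
```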

totals

Add totals=true to retrieve all the cross-totals in the one request. This is only available with the standard JSON format. Eg.

http://localhost:8080/v1/datasets/sample_longitudinal/aggregation?group_by=DWELL,SEX&totals=true

also returns the results of grouping by DWELL on its own, SEX on its own, and the result with no group by parameter. If you think of the result as a two-dimensional table with DWELL values down and SEX values across, with each entry in the table being a count, then the first totals are the column totals, the second are row totals and the final is the grand total in the bottom right corner.