Unpacking Spatial Aggregation: The Important Frontier

Spatial aggregation is one of the techniques used to protect personal data, for example when citizens disclose their information during a census.

In many countries around the world, disputes between fishermen and marine conservation activists regarding the impact of fishing on the wider marine environment are becoming far too common.

On one side, the activists argue that fishing activities endanger the marine environment. On the other, the fishermen argue that such lobbying derails economic growth and could leave thousands of families that depend on fishing without a means of earning a living.

In any dispute, more often than not, a viable compromise comes up in the conversation. In this context, that compromise is spatial management of marine fisheries, in the form of prohibiting certain gear types in specific areas or limiting the use of heavy towed gear in sensitive benthic habitats. Such efforts mitigate the unwanted environmental impacts of fishing.

Protecting the marine environment (or rather proving that fishing is harmful) is just one of the ways spatial aggregation is applied.

But first things first. What is spatial aggregation, why is it important, and what are some of the other ways it is being used?

An illustration of Spatial Aggregation. Spatial Aggregation is being applied in different sectors including urban transport planning & environment protection. Image Source {ResearchGate}

Spatial Aggregation

Spatial aggregation is defined by IBM as the conglomeration of all data points for a group of resources during a specified amount of time. The conglomeration of data in Group Time Series reports is categorized under the spatial aggregation banner.

Spatial aggregation computes statistics in the areas where an input layer overlaps a boundary layer. To make this easier to understand, I will draw your attention to a good example of spatial aggregation being applied.

A business analyst working for an association of colleges applies spatial aggregation when researching which colleges have the highest Return on Investment (ROI) in regions or counties that are home to high-value institutions. In short, spatial aggregation can be used to find colleges with above-average ROI.
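To make the college example more concrete, here is a minimal Python sketch. The county rectangles, college coordinates, and ROI figures are all invented for illustration, and real boundary layers are irregular polygons handled by GIS software rather than simple rectangles.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical boundary layer: each county is an axis-aligned rectangle
# (min_x, min_y, max_x, max_y). Real counties are irregular polygons.
COUNTIES = {
    "North": (0, 5, 10, 10),
    "South": (0, 0, 10, 5),
}

# Hypothetical input layer: (college name, x, y, ROI in percent).
COLLEGES = [
    ("Alpha College", 2, 8, 12.0),
    ("Beta College", 7, 6, 8.0),
    ("Gamma College", 3, 2, 5.0),
    ("Delta College", 8, 1, 9.0),
]


def county_of(x, y):
    """Return the name of the first county whose rectangle contains the point."""
    for name, (min_x, min_y, max_x, max_y) in COUNTIES.items():
        if min_x <= x < max_x and min_y <= y < max_y:
            return name
    return None


def mean_roi_by_county(colleges):
    """Aggregate college ROI values into one mean per county."""
    groups = defaultdict(list)
    for _name, x, y, roi in colleges:
        county = county_of(x, y)
        if county is not None:
            groups[county].append(roi)
    return {county: mean(values) for county, values in groups.items()}


def above_average(colleges):
    """List colleges whose ROI exceeds the overall mean ROI."""
    overall = mean([roi for _name, _x, _y, roi in colleges])
    return [name for name, _x, _y, roi in colleges if roi > overall]


print(mean_roi_by_county(COLLEGES))  # {'North': 10.0, 'South': 7.0}
print(above_average(COLLEGES))       # ['Alpha College', 'Delta College']
```

The first function plays the role of the overlap test between the two layers, the second does the aggregation, and the third answers the analyst's question about above-average ROI.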

How Spatial Aggregation Works

It’s actually simpler than one would think. For area and line features, just calculate the weighted mean using the equation below.

$$\bar{x}_w = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i}$$

How Spatial Aggregation Works: Using the weighted mean formula. {Image Source: ArcGIS Insights}

Where:

$w_i$ = weights

$x_i$ = observations

$N$ = number of observations

The weighted mean here is the sum of all the observations multiplied by their respective weights, divided by the sum of all weights.
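The weighted mean described above can be sketched in a few lines of Python. The function name and the sample numbers are my own, and using overlap areas as the weights is an assumption made for illustration.

```python
def weighted_mean(observations, weights):
    """Sum of observation * weight, divided by the sum of the weights."""
    if len(observations) != len(weights):
        raise ValueError("observations and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(x * w for x, w in zip(observations, weights)) / total_weight


# Hypothetical example: three land parcels overlap one boundary area.
# Observations are population densities; weights are the overlapping areas.
densities = [100.0, 200.0, 400.0]
overlap_areas = [1.0, 2.0, 1.0]

print(weighted_mean(densities, overlap_areas))  # 225.0
```

The parcel with the largest overlap pulls the result towards its own value, which is why the answer (225.0) differs from the plain average of the three densities (about 233.3).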

[You can refer to ArcGIS Insights to run the Spatial Aggregation capability on maps with two layers. One is an area layer containing the boundaries to be used for aggregation, such as census tracts, counties, or police districts, while the other is the layer to be aggregated.]

Use Cases of Spatial Aggregation

Data Privacy

In a 2019 study titled Privacy of energy consumption data of a household in a smart grid, published on ScienceDirect, authors Sandhya Armoogum and Vandana Bassoo observe that spatial aggregation is applied in privacy preservation.

The authors give the example of smart grids during the collection and transmission of customers’ data. In this scenario, data collected from multiple households may be combined so that an attacker cannot infer the energy consumption of an individual household from the total.
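Here is a minimal sketch of that idea in Python. The household IDs and readings are invented, and the function is only my illustration of the principle, not the authors' protocol.

```python
# Hypothetical hourly readings in kWh, keyed by household ID.
readings_kwh = {
    "house_01": 1.2,
    "house_02": 0.4,
    "house_03": 2.1,
    "house_04": 0.9,
}


def aggregate_neighbourhood(readings):
    """Return only the household count and the combined consumption."""
    total = round(sum(readings.values()), 3)
    return {"households": len(readings), "total_kwh": total}


report = aggregate_neighbourhood(readings_kwh)
print(report)  # {'households': 4, 'total_kwh': 4.6}
```

Only the total leaves the neighbourhood, so the individual readings never appear in the report. Note that a total over very few households can still reveal a lot, so the size of the group matters.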

A graphic illustration of a smart grid. Image Source {ScienceDirect}

The authors also give the example of temporal aggregation being applied. Data is collected over a period of time and conglomerated by the smart meter before transmission to the utility centre.
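Temporal aggregation can be sketched in the same spirit. The 15-minute interval and the readings below are assumptions for illustration; the study does not prescribe them.

```python
# Hypothetical readings from one smart meter: (minutes since midnight, kWh).
interval_readings = [
    (0, 0.10), (15, 0.12), (30, 0.08), (45, 0.10),
    (60, 0.30), (75, 0.25), (90, 0.20), (105, 0.25),
]


def aggregate_by_hour(readings):
    """Sum the interval readings that fall within each hour."""
    hourly = {}
    for minute, kwh in readings:
        hour = minute // 60
        hourly[hour] = hourly.get(hour, 0.0) + kwh
    return {hour: round(total, 3) for hour, total in hourly.items()}


print(aggregate_by_hour(interval_readings))  # {0: 0.4, 1: 1.0}
```

The utility centre would receive one figure per hour instead of eight fine-grained readings, which hides short-lived spikes such as a kettle being switched on.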

They also point out that aggregation is used to preserve the privacy of customer usage and billing data.

“To further preserve customer usage and billing data, privacy-preserving data aggregations have been developed which relies on cryptographic algorithms like homomorphic encryption. A homomorphic encryption scheme is a cryptosystem, which allows computations to be performed on the encrypted data without the need to decrypt the data to process it. The result of such computations when decrypted produces the same value as when performing the mathematical operation on the unencrypted data.”

STUDY: Privacy of energy consumption data of a household in a smart grid
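To show what "computations on encrypted data" can look like, here is a toy Python sketch. The quoted study does not name a specific scheme in the passage above, so I have picked the Paillier cryptosystem, a well-known additively homomorphic scheme, as an assumption. The primes are tiny and the code is for illustration only; it is nowhere near secure enough for real use.

```python
import math
import random

# Toy key generation with tiny primes. Real keys use far larger primes.
P, Q = 1009, 1013
N = P * Q
N_SQ = N * N
G = N + 1
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAMBDA, -1, N)  # modular inverse, needs Python 3.8 or newer


def encrypt(message):
    """Encrypt an integer 0 <= message < N with fresh randomness."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(G, message, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(ciphertext):
    """Recover the plaintext integer from a ciphertext."""
    u = pow(ciphertext, LAMBDA, N_SQ)
    return ((u - 1) // N * MU) % N


def add_encrypted(ciphertexts):
    """Multiply ciphertexts, which adds the underlying plaintexts."""
    product = 1
    for c in ciphertexts:
        product = (product * c) % N_SQ
    return product


# Hypothetical household readings in watt-hours.
readings_wh = [120, 340, 95]
encrypted = [encrypt(m) for m in readings_wh]
encrypted_total = add_encrypted(encrypted)

print(decrypt(encrypted_total))  # 555
```

The aggregator only ever multiplies ciphertexts, yet the key holder decrypts the sum of the readings (120 + 340 + 95 = 555), which matches the property described in the quote.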

Environment

A.M. Bento, writing in the Encyclopedia of Energy, Natural Resource, and Environmental Economics (2013), argues that the level of spatial aggregation at which an analysis is performed affects how accurately it can guide environmental policy.

He hinges his argument on the effects of the 1990 Clean Air Act Amendments (CAAA) on the ambient concentration of particulates. This, he says, provides tangible evidence that the range of values found in earlier work is likely linked to the level of spatial aggregation at which the analysis is performed.

“There is convincing evidence that, by relying on more aggregate analysis, prior estimates may have ‘averaged out’ the true effects of the regulation, often underestimating the real impact of the regulation.”

A.M. Bento in the Encyclopedia of Energy, Natural Resource, and Environmental Economics

He adds that the aggregation problem is particularly severe if air quality managers concentrate their efforts on the parts of counties assumed to be “dirtier” and reduce ambient concentrations by larger margins there than in areas deemed to be “cleaner”.
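A small worked example shows how a county-wide average can "average out" a strong local effect. The area names and numbers are invented purely for illustration and are not taken from Bento's work.

```python
from statistics import mean

# Hypothetical change in particulate concentration (micrograms per cubic
# metre) per monitoring area within one county, after a regulation.
# Negative values mean cleaner air.
changes = {
    "dirty_area_1": -10.0,
    "dirty_area_2": -8.0,
    "clean_area_1": 0.0,
    "clean_area_2": 0.0,
    "clean_area_3": -2.0,
}

county_average = mean(list(changes.values()))
dirty_average = mean(
    [value for area, value in changes.items() if area.startswith("dirty")]
)

print(county_average)  # -4.0
print(dirty_average)   # -9.0
```

Analysed at county level, the regulation appears to cut concentrations by 4 units. Analysed at the finer level, the targeted areas improved by 9 units, more than double the aggregate figure.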

Landfill in Bulgaria

A glimpse at the polluted air and piles of trash and rubbish in one of the slums of Bulgaria. Image Source: {Niklas Liniger via Unsplash}

Geographic Masking

Spatial aggregation is one technique for applying a geographic mask when personal information in census data is disclosed.

Geographic masks are methods used to protect privacy when publishing sensitive data in maps. They are, however, grossly underutilized in research.

Author G Rushton in the International Encyclopedia of Human Geography published in 2009 states that, in such scenarios, individual data is geocoded to established geographic entities.

The situation varies. In the UK, it might be a census ward while, in the US, it might be a tract number.
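As a rough sketch of masking by aggregation, the Python snippet below publishes only a count per tract instead of individual locations. The records are invented, and the small-count suppression threshold is my own assumption, added because it is a common disclosure-control step; it is not taken from Rushton's text.

```python
from collections import Counter

# Hypothetical individual records: (record ID, census tract of the address).
records = [
    ("r1", "tract_A"), ("r2", "tract_A"), ("r3", "tract_A"),
    ("r4", "tract_A"), ("r5", "tract_A"), ("r6", "tract_B"),
    ("r7", "tract_B"), ("r8", "tract_C"),
]

MIN_COUNT = 3  # assumed disclosure threshold, not taken from the source


def mask_by_tract(rows, min_count=MIN_COUNT):
    """Publish counts per tract; suppress tracts with too few records."""
    counts = Counter(tract for _record_id, tract in rows)
    return {
        tract: (count if count >= min_count else None)
        for tract, count in sorted(counts.items())
    }


print(mask_by_tract(records))
# {'tract_A': 5, 'tract_B': None, 'tract_C': None}
```

The published table contains no record IDs or addresses, and tracts with very few records are withheld altogether.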

British citizens going about their daily activities on a busy street. Image Source {The BBC}

“Studies of relationships between environmental exposures and health, for example, often rely on estimates of exposure to harmful contaminants made at particular locations and estimates of the locations of people close to these locations.”

GM Rushton in the International Encyclopedia of Human Geography

“If information on exposures are averages for areas and information on health outcomes are also averages for areas, it may be impossible to determine whether the exposures at particular locations led to the poor health outcomes at those locations because the size of the geographic mask conceals a relationship that really exists.”

GM Rushton in the International Encyclopedia of Human Geography

Urban Transport Planning

In a 2012 paper titled Effects of Spatial Aggregation Level on an Urban Transportation Planning Model, published in the Korean Society of Civil Engineers (KSCE) Journal of Civil Engineering, researchers Jea-Ho Jeon, Seung-Young Kho, Je Jin Park, and Dong Kyu Kim observe that spatial aggregation is one of the techniques used to forecast travel demand from transport data. Those forecasts, in turn, give planners the information they need to compare the merits and demerits of each available option and to plan for the future.

Bus making its way on a street. Spatial aggregation is used in urban transport planning. Image Source {Kirkai via Unsplash}

The authors set out to analyze how the spatial aggregation of Traffic Analysis Zone (TAZ) systems and network models affects traffic assignment results in urban transport planning, using data from Seoul.

The authors note that the traditional household surveys and land-use inventories often used to collect traffic and socio-economic data are expensive. Hence, they propose two new aggregation levels for planning assessments of higher-class roads.

They stress that matching the TAZ structure with that of the network model is paramount, arguing that the linkage between the two has not been explored adequately in practice or in the literature.

They conclude that a mismatch in aggregation levels between a network model and a TAZ structure may lead to inaccurate results. The errors, they say, are caused by the shift from lower-class to higher-class roads.
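To give a feel for why the zone level matters, here is a small Python sketch of my own. It is not the authors' model: the zones and trip counts are invented, and it relies on the general observation that trips starting and ending in the same zone are typically not loaded onto the modelled road network.

```python
from collections import defaultdict

# Hypothetical trips between four fine zones: (origin, destination) -> trips.
fine_trips = {
    ("z1", "z2"): 100,
    ("z1", "z3"): 50,
    ("z2", "z4"): 30,
    ("z3", "z4"): 80,
}

# Hypothetical coarser TAZ system: each fine zone maps to one larger zone.
coarse_zone = {"z1": "A", "z2": "A", "z3": "B", "z4": "B"}


def aggregate_trips(trips, zone_map):
    """Re-express a trip table at the coarser zone level."""
    coarse = defaultdict(int)
    for (origin, destination), count in trips.items():
        coarse[(zone_map[origin], zone_map[destination])] += count
    return dict(coarse)


coarse_trips = aggregate_trips(fine_trips, coarse_zone)
intrazonal = sum(n for (o, d), n in coarse_trips.items() if o == d)

print(coarse_trips)  # {('A', 'A'): 100, ('A', 'B'): 80, ('B', 'B'): 80}
print(intrazonal)    # 180
```

With the fine zones, all 260 trips cross a zone boundary. After aggregation, 180 of them fall inside a single zone, so a network model that still represents the smaller roads would see far less traffic on them than actually exists.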

Age Like Fine Wine

As technology continues to advance, spatial aggregation will advance with it and will be put to work on more problems in the future.

From being used to formulate environmental policy accurately to stopping thieves from stealing personal data, spatial aggregation’s usefulness cannot be overstated in the current times.

I guess that, as a technological field, spatial aggregation will age like fine wine.