First-, Second- And Third-party Data For Big Data-Driven Climate Action

What are the data source origins? First-, second- and third-party data. First-party data is the data you collect directly from your audience. Second-party data is the data you buy from a partner company that collects it from its own audience. Third-party data is mainly data collected from various data owners by an aggregator and sold as a packaged data set. Climate action initiatives are no strangers to data-driven research, so what are the data sources for climate action? We look at climate action as a use case for these three data source origins.

Big Data can be used for Climate Action

During the last ice age, Akimiski Island in Canada's James Bay lay beneath vast glaciers that pressed down with immense force. As the climate changed and the ice retreated, Akimiski began a gradual rebound. The island's slow but steady increase in elevation is recorded along its naturally terraced edges where the coastline seems etched with bathtub rings, the result of the rising landmass and wave action at previous sea levels.
What are the data sources used for climate action?
Image Source: USGS

In Part 3 of our Big Data-driven sustainable development article series, we found that climate action, which is Sustainable Development Goal number 13 (SDG 13), is one of the ideas most commonly associated with sustainability.

There are so many examples of Big Data-driven climate action initiatives that climate change research could be one of the next big things in Big Data and Artificial Intelligence (AI).

The numerous Big Data-driven climate action initiatives suggest that Big Data is useful for assessing environmental risks and deciding on the next steps in protecting the environment.

First-party data for Climate Action

First-party data
First-party data is your company’s own data. It is the perfect starting point for Data Science analysis.

First-party data is basically your own data, which you have direct access to.

Your company collects the data directly from your audience (site visitors, social media followers, customers, etc.).

In the case of climate action, first-party data is usually the field data collected by the environmental research team from the research subject itself.

Second-party data for Climate Action

Second-party data
Second-party data is the data you buy from someone you have an agreement with. Data Scientists can merge these data with your own and leverage additional value.

Second-party data is your partner company’s first-party data.

When you collaborate with another company, the partner company gathers its own data and sells that data directly to your company.

Hence, second-party data is the same kind of data as first-party data, except that it originates from someone else you have agreed to partner with.

Third-party data for Climate Action

Third-party data
Third-party data is the data you buy from an aggregator who buys data from various data owners and compiles them into a package. Data Scientists can match the data and “personas” in the aggregator’s data set with the company’s own data to learn more about potential/new customers.

If the data is from neither your organisation nor your partner organisation, but is instead from an external organisation, it is considered third-party data.

Third-party data is data collected by a company that has no direct connection to your company.

Third-party data is also data collected by an aggregator from various sources and sold as a package.

The aggregators pay data owners for their first-party data, compile them into one large data set and sell it as third-party data.

The most obvious examples of third-party data in climate action are environmental data repositories and geospatial datasets.

Climate action use case for data sources

In an attempt to make this article clearer than an unpolluted river, here is an example for climate action.

Let’s say that your research team investigates the deforestation caused by a slaughterhouse.

The team collects data by flying the team’s own drones over the targeted areas, going undercover to take images and videos, interviewing witnesses, collecting samples of evidence, etc.

The images, videos, testimonials, samples and other pieces of relevant information collected by your team are your first-party data.

The data collected by your team are then complemented with satellite data from your satellite imagery partner, Planet Labs.

The satellite data is your second-party data.

On top of that, your team consults data repositories for third-party data to gain a deeper understanding of the situation.

Real-life examples of first-, second- and third-party data in climate action initiatives

However, it isn’t so black and white in real life.

It seems that using publicly available third-party data is the most common choice for many Big Data-driven climate action initiatives, as we’ll see in the examples below.

Google Earth Engine

Google Earth Engine Timelapse
Earth Engine’s Timelapse brings users back through time to display climate change temporally.

Google Earth Engine combines publicly available satellite imagery and geospatial datasets to detect changes on Earth’s surface.

The Timelapse feature of Earth Engine shows how some parts of Earth have changed over the past 35 years, such as the drying of the Aral Sea in Central Asia, bushfires in Australia and deforestation in Bolivia.

The Earth Engine team collaborates with Google Cloud “to bring the Landsat and Sentinel-2 [satellite data] collections to Google Cloud Storage as part of the Google Cloud public data program.”

This means that the satellite data is first-party data for the Landsat and Sentinel-2 teams.

Meanwhile, the satellite data is technically second-party data for Earth Engine and Google Cloud.

And if you’re an external individual, business, government agency or environmental conservation organisation using the Earth Engine tool, the satellite data is third-party data for you.

Surging Seas

Surging Seas is an interactive platform of tools developed by Climate Central to provide data on rising sea levels, coastal flood risk, tides, storms, and tsunamis, mainly throughout the USA.

Judging by its peer-reviewed research papers, methods and reports, Surging Seas’ research approach seems to rely mostly on third-party data, since it derives its data from official current and historical datasets.

The Surging Seas team then analyses and aggregates the data into the Surging Seas tools used by third-party users like city planners, government agencies and businesses.

Global Forest Watch

Global Forest Watch map screenshot
For WRI’s GFW, checking global deforestation rates is easier with Big Data.
Image Source: Techlogger

World Resources Institute’s (WRI) Global Forest Watch (GFW) combines the latest technology with partnerships to enhance forest information for monitoring efforts.

A working paper by the University of Oxford’s Smith School of Enterprise and the Environment says that the WRI team partnered with Google to cut costs, run algorithms, and add cloud technology for GFW.

GFW also combines satellite technology and crowdsourcing to generate a mapping application that is used by non-profit organisations, governments, etc.

For WRI, the technology from Google is second-party technology while the satellite and crowdsourced data and technology are third-party.

How the data sources fit into Big Data-driven Climate Action

To tie the data sources together for climate action: first-party data is the field data collected by the environmental research team from the research subject itself.

Second-party data is the data of the research team’s partner. It can be the partner’s field data or satellite data.

The partner’s proprietary technology used to obtain and analyse data can even be considered as a second-party utility.

Meanwhile, third-party data is usually the publicly available datasets uploaded by various organisations that the research team comes across while searching for any supporting background information.

However, the climate action use cases for the data source origins may run contrary to what we said about complementary data source origins in the e-commerce use case.

In e-commerce, the data sources fill each other’s gaps like a jigsaw puzzle, whereby e-commerce businesses can reap their combined benefits by using all three data sources.

In climate action, the research project teams don’t necessarily have to rely on all three sources to study climate change.

As the real-life examples show, using third-party data is popular among those studying the climate, especially if the user is an administrative figure or business owner looking to make a difference.

Perhaps the decision to use first-, second- and/or third-party data depends on the research questions and the availability of data that can answer these questions.

Whether the data is considered as first-, second- or third-party data depends on who collects and owns the data directly and who uses the data.
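
The rule above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the Dataset class, the data_origin function and the organisation names are hypothetical labels made up for this sketch, and a real project would also need to account for licences and data-sharing agreements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    collector: str  # organisation that collected and owns the data (hypothetical label)


def data_origin(dataset: Dataset, user: str, partners: frozenset = frozenset()) -> str:
    """Classify a dataset from the point of view of the organisation using it."""
    if dataset.collector == user:
        # The user collected the data itself.
        return "first-party"
    if dataset.collector in partners:
        # The data comes directly from a partner the user has an agreement with.
        return "second-party"
    # Anyone else: an external organisation or an aggregator.
    return "third-party"


# Assumed example, loosely modelled on the Earth Engine case described earlier.
satellite = Dataset(name="satellite imagery", collector="Landsat team")

print(data_origin(satellite, user="Landsat team"))
print(data_origin(satellite, user="Earth Engine", partners=frozenset({"Landsat team"})))
print(data_origin(satellite, user="City planner"))
```

The same dataset gets a different label for each user, which is the point of the paragraph above: the label describes the relationship between collector and user, not the data itself.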

Crucial things to keep in mind

What are the data source origins?


– First-party data
– Second-party data
– Third-party data

What is first-party data and why is it important?

First-party data is the data you collect directly from your audience.

First-party data is very valuable and reliable because it is readily available, accurate, relevant, secure and cost-effective.

What is second-party data and why is it important?

Second-party data is the data you buy from a partner company that collects it from its own audience.

Second-party data can show more information about your audience outside of your first-party dataset.

High-quality, precise and transparent second-party data gives access to niche data, allows for price negotiations and opens up opportunities to build a long-term business relationship.

What is third-party data and why is it important?

Third-party data is mainly data collected from various data owners by an aggregator (who may have no connection with you) and sold as a packaged data set.

Third-party data fills the gaps in first- and second-party data by providing larger data sets of external audience segments and events.

What is the relationship between the data source types?

The data source types are complementary to each other, but climate change research does not necessarily follow this relationship.
