What are the data source origins? First-, second- and third-party data. First-party data is the data you collect directly from your audience. Second-party data is the data you buy from a partner company that collects it from its own audience. Third-party data is mainly data collected from various data owners by an aggregator and sold as a packaged data set. Climate action initiatives are no stranger to data-driven research, so what are the data sources for climate action? We take a look at climate action as a use case for these three data source origins.
- Big Data can be used for Climate Action
- First-party data for Climate Action
- Second-party data for Climate Action
- Third-party data for Climate Action
- Climate action use case for data sources
- Real-life examples of first-, second- and third-party data in climate action initiatives
- How the data sources fit into Big Data-driven Climate Action
- Crucial things to keep in mind
Big Data can be used for Climate Action
In Part 3 of our Big Data-driven sustainable development article series, we found that climate action, which is Sustainable Development Goal number 13 (SDG 13), is one of the most common themes in sustainability work.
There are so many examples of Big Data-driven climate action initiatives that climate change research could become one of the next big things in Big Data and Artificial Intelligence (AI).
First-party data for Climate Action
First-party data is your own data, to which you have direct access.
Your company collects the data directly from your audience (site visitors, social media followers, customers, etc).
In the case of climate action, first-party data is usually the field data collected by the environmental research team from the research subject itself.
Second-party data for Climate Action
Second-party data is your partner company’s first-party data.
When you collaborate with another company, the partner company gathers its own data and sells that data directly to your company.
Third-party data for Climate Action
If the data comes from neither your organisation nor your partner organisation, but from an external organisation, it is considered third-party data.
Third-party data is collected by a company that has no direct connection to your company.
Typically, that company is an aggregator: it pays data owners for their first-party data, compiles the data into one large data set and sells it as third-party data.
Climate action use case for data sources
In an attempt to make this article clearer than an unpolluted river, here is an example for climate action.
Let’s say that your research team investigates the deforestation caused by a slaughterhouse.
The team collects data by flying the team’s own drones over the targeted areas, going undercover to take images and videos, interviewing witnesses, collecting samples of evidence, etc.
The images, videos, testimonials, samples and other pieces of relevant information collected by your team are your first-party data.
The data collected by your team is then complemented with satellite data from your satellite imagery partner, Planet Labs.
The satellite data is your second-party data.
On top of those, your team refers to data repositories for third-party data for a deeper understanding of the situation.
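The hypothetical deforestation investigation above can be sketched as a small inventory in which each data set carries its origin. This is only an illustrative sketch: the `Dataset` class, its field names and the entries are assumptions made up for this example, not part of any real tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """One data set used in the investigation (hypothetical structure)."""
    name: str
    collected_by: str
    origin: str  # "first-party", "second-party" or "third-party"


# Illustrative inventory based on the example in the text.
inventory = [
    Dataset("drone imagery", "own research team", "first-party"),
    Dataset("witness interviews", "own research team", "first-party"),
    Dataset("evidence samples", "own research team", "first-party"),
    Dataset("satellite imagery", "satellite imagery partner", "second-party"),
    Dataset("public repository data", "external organisations", "third-party"),
]

# Count how many data sets come from each origin.
origin_counts = Counter(d.origin for d in inventory)

for origin, count in sorted(origin_counts.items()):
    print(f"{origin}: {count}")
```

Keeping the origin next to each data set makes it easy to see how much of an investigation rests on your own field work and how much on partner or external data.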
Real-life examples of first-, second- and third-party data in climate action initiatives
However, it isn’t so black and white in real life.
It seems that using publicly available third-party data is the most common choice for many Big Data-driven climate action initiatives, as we’ll see in the examples below.
Google Earth Engine
Google Earth Engine combines publicly available satellite imagery and geospatial datasets to detect changes on Earth’s surface.
The Timelapse feature of Earth Engine shows how some parts of Earth have changed over the past 35 years, such as the drying of the Aral Sea in Central Asia, bushfires in Australia and deforestation in Bolivia.
Timelapse draws on imagery from the Landsat and Sentinel-2 satellite programmes, which means that the satellite data is first-party data for the Landsat and Sentinel-2 teams.
Meanwhile, the satellite data is technically second-party data for Earth Engine and Google Cloud.
And if you’re an external individual, business, government agency or environmental conservation organisation using the Earth Engine tool, the satellite data is third-party data for you.
Surging Seas
Surging Seas is an interactive platform developed by Climate Central to provide data on rising sea levels, coastal flood risk, tides, storms, and tsunamis, mainly throughout the USA.
Judging by its peer-reviewed research papers, methods and reports, Surging Seas seems to rely mostly on third-party data, since it derives its data from official current and historical datasets.
Global Forest Watch
World Resources Institute’s (WRI) Global Forest Watch (GFW) combines the latest technology with partnerships to enhance forest information for monitoring efforts.
A working paper by the University of Oxford’s Smith School of Enterprise and the Environment says that the WRI team partnered with Google to cut costs, run algorithms, and add cloud technology for GFW.
GFW also combines satellite technology and crowdsourcing to generate a mapping application that is used by non-profit organisations, governments, etc.
For WRI, the technology from Google is second-party technology while the satellite and crowdsourced data and technology are third-party.
How the data sources fit into Big Data-driven Climate Action
To tie the data sources together for climate action, first-party data is the field data collected by the environmental research team from the research subject itself.
Second-party data is the data of the research team’s partner. It can be the partner’s field data or satellite data.
The partner’s proprietary technology used to obtain and analyse data can even be considered a second-party utility.
Meanwhile, third-party data is usually the publicly available datasets uploaded by various organisations that the research team comes across while searching for any supporting background information.
However, the climate action use cases for the data source origins might run contrary to what we said about complementary data source origins in the e-commerce use case.
In e-commerce, the data sources fill each other’s gaps like a jigsaw puzzle, whereby e-commerce businesses can reap their combined benefits by using all three data sources.
In climate action, the research project teams don’t necessarily have to rely on all three sources to study climate change.
The real-life examples show that third-party data is popular with anyone studying the climate, especially if the user is an administrative figure or business owner looking to make a difference.
The decision to use first-, second- and/or third-party data probably depends on the research questions and on the availability of data that can answer them.
Whether the data counts as first-, second- or third-party data depends on who collects and owns the data directly and who uses the data.
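The rule that the label depends on who collects the data and who uses it can be sketched as a small function. This is a minimal sketch under stated assumptions: `classify_origin`, the organisation names and the `partners` mapping are hypothetical and only mirror the Earth Engine example from this article.

```python
def classify_origin(collector, user, partners):
    """Label a data set from the point of view of `user`.

    collector: organisation that collected and owns the data
    user:      organisation that is using the data
    partners:  dict mapping an organisation to the set of organisations
               it obtains data from directly (hypothetical structure)
    """
    if user == collector:
        return "first-party"
    if collector in partners.get(user, set()):
        return "second-party"
    return "third-party"


# Illustrative relationships only, following the Earth Engine example.
partners = {
    "Earth Engine": {"Landsat", "Sentinel-2"},
}

for user in ["Landsat", "Earth Engine", "Conservation organisation"]:
    label = classify_origin("Landsat", user, partners)
    print(f"Landsat imagery is {label} data for {user}")
```

The same data set gets three different labels here because only the user changes, which is the point made above: the origin is a property of the relationship between collector and user, not of the data itself.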
Crucial things to keep in mind
What are the data source origins?
– First-party data
– Second-party data
– Third-party data
What is first-party data and why is it important?
First-party data is the data you collect directly from your audience.
First-party data is very valuable and reliable because it is readily available, accurate, relevant, secure and cost-effective.
What is second-party data and why is it important?
Second-party data is the data you buy from a partner company that collects it from its own audience.
Second-party data can show more information about your audience outside of your first-party dataset.
High-quality, precise and transparent second-party data gives access to niche data, allows for price negotiations and opens up opportunities to build a long-term business relationship.
What is third-party data and why is it important?
Third-party data is mainly data collected from various data owners by an aggregator (who may have no connection with you) and sold as a packaged data set.
Third-party data fills the gaps in first- and second-party data by providing larger data sets of external audience segments and events.
What is the relationship between the data source types?
The data source types are complementary to each other, but climate change research does not necessarily follow this relationship.