You’ll Know About Data Marketplaces After Reading This

Here is a brief introduction to data marketplaces for beginners.

What is a data marketplace?

As the term suggests, a data marketplace is where data sets of various types and from various sources are bought and sold. There are three main types of data marketplaces:

  • Personal

Consumers sell their own data about anything (food preferences, locations, content preferences, et cetera) via personal data marketplaces.

  • Business-to-business (B2B)

Data from various companies is curated and aggregated on a B2B data marketplace, where data users can buy it to gain additional insights.

  • Internet of Things (IoT)

IoT device owners can sell real-time sensor data from their IoT devices to data users who are seeking additional insights.

Here are some examples of platforms that have data marketplaces:

Datum is a blockchain-based data storage and monetisation app which facilitates the decentralised storage and exchange of data using crypto tokens.

Ocean Protocol is also a blockchain-based data exchange system using crypto tokens.

According to the About page of Streamr’s website, Streamr is “a distributed open-source software project with contributors in a range of timezones from Boston to Wrocław, Helsinki, Zug, London and Melbourne.” It is also a decentralised blockchain-based platform for real-time data storage and exchange.

Wikipedia describes IOTA as “an open-source distributed ledger and cryptocurrency designed for the Internet of Things (IoT).” IOTA launched its IoT data marketplace in 2017.

From the examples, we can see that some data marketplaces are run using blockchain technology and cryptocurrencies to enable the decentralisation of data exchanges.

How is it different from a data lake?

A data lake is a company’s internal platform for storing and using data sets. It is limited to the first-party data that the company collects and, possibly, the second-party data that it buys from a partner company.

A data marketplace, by contrast, is an external source of a wider range of third-party data sets, catalogued to make the required data easier to find and access. This brings us to the importance of data marketplaces.
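To make the idea of a catalogue a little more concrete, here is a minimal sketch of how listings might be described and searched. It is only an illustration under stated assumptions: the Listing fields, the search function and the provider names are all hypothetical and do not come from any real marketplace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    """One catalogue entry; every field name here is hypothetical."""
    title: str
    provider: str
    category: str      # e.g. "personal", "b2b" or "iot"
    file_format: str   # e.g. "csv" or "json"
    tags: frozenset    # lowercase keywords describing the data set


def search(catalogue, keyword, category=None):
    """Return listings whose title or tags mention the keyword."""
    keyword = keyword.lower()
    results = []
    for item in catalogue:
        # Skip listings outside the requested marketplace type, if one is given.
        if category is not None and item.category != category:
            continue
        if keyword in item.title.lower() or keyword in item.tags:
            results.append(item)
    return results


# Made-up listings, one for each of the three marketplace types.
catalogue = [
    Listing("City traffic sensor feed", "ExampleSensors", "iot", "json",
            frozenset({"traffic", "real-time"})),
    Listing("Retail footfall by postcode", "ExampleRetailData", "b2b", "csv",
            frozenset({"retail", "footfall"})),
    Listing("Food preference survey", "ExamplePanel", "personal", "csv",
            frozenset({"food", "survey"})),
]

matches = search(catalogue, "retail", category="b2b")
for listing in matches:
    print(listing.title, "from", listing.provider)
```

The point of the sketch is that each data set carries descriptive metadata (type, format, tags), so a buyer can narrow the catalogue down without opening the data itself.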

Why do we need data marketplaces?

“It’s logical that with so much data ripping around and no signs of slowing, opportunities are emerging for whole new marketplaces that can bring in, organize and make the data available for third party consumption. The sheer volume of data calls out for curation.”

The rise of big data marketplaces (2015), an op-ed by David Knight, a contributor to Network World.

The world is drowning in such a vast ocean of data that companies don’t know what to do with the large amounts they collect. That’s why data needs to be curated so that users can more easily access high-quality data sets in specific formats, saving data scientists a lot of time and effort on data preparation.

But a more important reason for data marketplaces to exist is the role of third-party data. While first- and second-party data are great sources of information, they still have gaps that need to be filled by third-party data.

A recap of first-, second-, and third-party data

  • First-party data is the data you collect directly from your audience.

Benefits: readily available, accurate, relevant, secure and cost-effective.

  • Second-party data is the data you buy from a partner company that collects from its audience.

Benefits: access to niche data outside of your first-party dataset, high-quality, precise, transparent, allows for price negotiations and opportunities for a long-term business relationship.

  • Third-party data is mainly data collected from various data owners by an aggregator (who may have no connection with you) and sold as a packaged data set. Data marketplaces fall under this category because their data sets are aggregated from various data sources.

Benefits: fills the gaps in first- and second-party data by providing larger data sets on external audience segments and events.

Unlike first- and second-party data, third-party data lets a company widen its horizons and take into account external events, such as demographic, geographic, economic and political changes, that may or may not affect it.

This is especially useful for companies looking to enter new markets, enhance their product line, check on the competition, design premium services, or pursue any venture they are unfamiliar with.

Third-party data fills the gaps of first- and second-party data.

For example, suppose you are a retail business owner who wants to expand your customer base. You want to know who else might buy your products, and whether any social media trends could affect your business and how. You can’t get this information from your own customer database, so you have to browse the data marketplaces for data sets that cover it.
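The retail scenario can be sketched in a few lines of code. This is a toy example with made-up records: the field names, postcodes and figures are assumptions for illustration, and the "penetration" measure (known customers divided by households) is just one simple way to spot areas where the first-party data has gaps.

```python
# First-party data: customers the retailer already knows (made-up records).
customers = [
    {"customer_id": 1, "postcode": "A1", "total_spend": 120.0},
    {"customer_id": 2, "postcode": "B2", "total_spend": 80.0},
    {"customer_id": 3, "postcode": "A1", "total_spend": 200.0},
]

# Third-party data: a hypothetical purchased data set of area-level figures.
area_profiles = {
    "A1": {"households": 5000, "median_income": 42000},
    "B2": {"households": 12000, "median_income": 38000},
    "C3": {"households": 9000, "median_income": 45000},
}


def customers_per_area(records):
    """Count known customers in each postcode."""
    counts = {}
    for record in records:
        postcode = record["postcode"]
        counts[postcode] = counts.get(postcode, 0) + 1
    return counts


def market_gaps(records, profiles):
    """List areas in the third-party set, least-penetrated first."""
    counts = customers_per_area(records)
    gaps = []
    for postcode, profile in profiles.items():
        known = counts.get(postcode, 0)
        gaps.append({
            "postcode": postcode,
            "known_customers": known,
            "households": profile["households"],
            "penetration": known / profile["households"],
        })
    return sorted(gaps, key=lambda gap: gap["penetration"])


gaps = market_gaps(customers, area_profiles)
for gap in gaps:
    print(gap["postcode"], gap["known_customers"], gap["households"])
```

Area C3 never appears in the customer database, so only the third-party data set reveals that it exists as a potential market.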

You may now be wondering what’s in it for the data sellers. The answer is simply monetisation: data marketplaces give data owners the opportunity to make money from their data.

The main problem with data marketplaces

However, the benefits of data marketplaces come at a cost. This rarely discussed issue was raised last year by the data exchange platform Harbr, which said that data has several characteristics that do not sit well in a marketplace setting, where heavy standardisation is required. According to Harbr, data is:

  • Adaptable, to the point that it can no longer be traced back to its original form.
  • Easily replaceable.
  • Complex in its information, formats, delivery mechanisms, terms and pricing.
  • Raw, as in the data still needs to be processed for use.
  • Easy to copy or steal.

While these characteristics are worth considering, this is just what one data exchange platform said, so let’s take it with a pinch of salt.

An interesting use case by Streamr

Streamr, the blockchain-based real-time data exchange platform mentioned earlier, collaborated with Bosch and Riddle & Code in 2019 to collect and share real-time electric vehicle data.

The pilot project relies on sensors, connected devices and software inside a Jaguar I-PACE electric vehicle, which collect car IoT data and send it to the Streamr Marketplace in real time.

The aggregated car data can then be sold to highway agencies, smart cities, car manufacturers, insurance companies and other drivers for the following purposes:

  • Road infrastructure decisions
  • Car design decisions
  • Car insurance policies
  • Safety measures
  • Journey optimisation
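As a rough illustration of what "aggregated car data" could mean, the sketch below averages speed readings per road segment. It is not Streamr's API or the pilot project's actual data format: the record fields, vehicle names and values are all made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Made-up sensor readings; the field names are assumptions for illustration.
readings = [
    {"vehicle": "car-1", "segment": "M1-north", "speed_kmh": 96.0},
    {"vehicle": "car-2", "segment": "M1-north", "speed_kmh": 104.0},
    {"vehicle": "car-1", "segment": "ring-road", "speed_kmh": 48.0},
]


def average_speed_by_segment(rows):
    """Group readings by road segment and average the reported speeds."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["segment"]].append(row["speed_kmh"])
    return {segment: mean(speeds) for segment, speeds in grouped.items()}


summary = average_speed_by_segment(readings)
for segment, speed in summary.items():
    print(segment, speed)
```

Note that the summary no longer contains individual vehicle identifiers. A buyer such as a highway agency would only need the per-segment figures.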

You can watch a nice video about it on Streamr’s website.


A data marketplace, of which there are three types, is an external source of curated third-party data sets that can enhance users’ data analysis by supplementing their internal and partner data sets. It’s interesting to note that some data marketplaces rely on blockchain technology to function.

Despite the issues surrounding the characteristics of data versus the standardised marketplace setting, data marketplaces make these data sets more searchable and accessible to users amidst the ever-expanding universe of Big Data.