OUR MISSION OF
RADICAL TRANSPARENCY

Where We Get Our Data

Data. It’s everywhere. But which data can you trust, and which will lead you down the dark path toward despair and demise? We scrape data from “trusted” sites on the web and compare the sources to one another, looking for inconsistencies that help us gauge things like trustworthiness and confidence.
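To make the idea of comparing sources a little more concrete, here is a minimal Python sketch. It is not our actual scoring method: the site names, the scores, the `flag_inconsistent_sources` helper and the 15-point threshold are all hypothetical, chosen only to show how one source that disagrees with the rest can be spotted.

```python
from statistics import median


def flag_inconsistent_sources(scores_by_source, max_deviation=15):
    """Return the sources whose score is far from the median of all sources.

    `max_deviation` is an assumed, illustrative threshold (in score points).
    """
    middle = median(list(scores_by_source.values()))
    return sorted(
        source
        for source, score in scores_by_source.items()
        if abs(score - middle) > max_deviation
    )


# Hypothetical review scores (0-100) for one product from four sites.
scores = {"site_a": 82, "site_b": 85, "site_c": 40, "site_d": 88}

print(flag_inconsistent_sources(scores))
```

The median is used here rather than the mean so that a single extreme score does not drag the reference point toward itself.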

EXTRACTING DATA FROM THE WEB IS DECEPTIVELY SIMPLE.

It’s estimated there are currently 6 billion indexable pages and 1.2 million terabytes of accessible information available at any given time. And it’s growing exponentially every day.

The internet is enormous. It uses a wide range of technologies to deliver even the simplest web page to your browser. I won’t pretend to explain how all of these technologies work. I do, however, intend to give you the big picture of how we harvest data and how we turn it into meaningful information.

You will be happy to know we gainfully employ a team of humans (yes, real people) who manually collect data. Scraping with machines can be difficult, time-consuming and, most importantly, error-prone. Humans make fewer mistakes at this kind of work, although as individuals we all still make preventable errors every day.

WHAT IS SCRAPING?

In a nutshell, scraping is the harvesting of data from a web page or similar resource. It is sometimes referred to as ‘web scraping’, ‘web harvesting’ or ‘web data extraction’.
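As a rough illustration of what harvesting data from a web page looks like, here is a small Python sketch that uses only the standard library. The HTML snippet, the `product-title` class name and the `TitleExtractor` helper are invented for this example; it is a sketch of the general idea, not a description of our tooling.

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collects the text of <h2> elements with the (hypothetical) class 'product-title'."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None.
        classes = (dict(attrs).get("class") or "").split()
        if tag == "h2" and "product-title" in classes:
            self._in_title = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.titles[-1] += data


def extract_titles(html):
    parser = TitleExtractor()
    parser.feed(html)
    parser.close()
    return [title.strip() for title in parser.titles]


# Made-up page content standing in for a downloaded web page.
SAMPLE_HTML = """
<html><body>
  <h2 class="product-title">Acme Blender 3000</h2>
  <p>Some marketing copy.</p>
  <h2 class="product-title featured">Acme Toaster</h2>
  <h2 class="other">Not a product</h2>
</body></html>
"""

print(extract_titles(SAMPLE_HTML))
```

Real pages are messier than this sample, which is part of why machine scraping is error-prone.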

HOW DO WE USE IT?

PRICE MONITORING

We look for price changes and sales, as well as new, refurbished and discontinued products, and more.
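One simple way to picture price monitoring is as a comparison between two snapshots of the same catalog taken at different times. The Python sketch below assumes hypothetical product IDs and prices (in cents), and it assumes that a product missing from the newer snapshot can be treated as discontinued, which a real pipeline would need to verify.

```python
def diff_snapshots(previous, current):
    """Compare two {product_id: price_in_cents} snapshots and label each change."""
    changes = {}
    for product_id in sorted(previous.keys() | current.keys()):
        old = previous.get(product_id)
        new = current.get(product_id)
        if old is None:
            changes[product_id] = "new"
        elif new is None:
            # Assumption: absent from the latest snapshot means discontinued.
            changes[product_id] = "discontinued"
        elif new < old:
            changes[product_id] = "price drop"
        elif new > old:
            changes[product_id] = "price increase"
        # Unchanged prices are left out of the result.
    return changes


# Hypothetical snapshots from two consecutive days.
yesterday = {"blender": 9999, "toaster": 3950, "kettle": 2400}
today = {"blender": 7999, "toaster": 3950, "mixer": 14900}

print(diff_snapshots(yesterday, today))
```

Prices are kept as integer cents so that comparisons are not affected by floating-point rounding.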

NEWS AGGREGATION

We use sentiment analysis of news coverage as an alternative data source for the products and services featured on the site.

SOCIAL MEDIA

We look for social signals and influencer activity, which includes looking at follower growth and other mechanics.

REVIEW AGGREGATION

We extract reviews from a range of websites related to products, services and more.

SEARCH ENGINE RESULTS

We monitor search engine results page (SERP) activity, including videos, images and marketplaces.

HOW DO WE DO IT?

PUBLIC DATA VS PRIVATE DATA

We’ve been throwing the word “data” around a lot. Everyone has it, and it hasn’t always been positive. So let’s get ahead of some questions and make things clear.

HERE’S WHAT WE’RE NOT DOING

If the content is behind a paywall and not publicly available, we don’t collect it. It’s private and we treat it as such. However, if the data is accessible through public means, such as the schema data that is often collected by Google, then we’ll collect that data.
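Schema data of this kind is often published inside a page as a JSON-LD block using the schema.org vocabulary. The Python sketch below shows one way such a block could be read with the standard library. The sample page, the product values and the `JsonLdExtractor` helper are hypothetical; this is an illustration of the idea rather than our collection code.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the whole page.


def extract_json_ld(html):
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Made-up page with a schema.org Product description.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product",
 "name": "Acme Blender 3000",
 "aggregateRating": {"@type": "AggregateRating",
                     "ratingValue": "4.4", "reviewCount": "89"}}
</script>
</head><body><h1>Acme Blender 3000</h1></body></html>
"""

products = extract_json_ld(SAMPLE_HTML)
print(products[0]["name"], products[0]["aggregateRating"]["ratingValue"])
```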

HERE’S WHAT WE ARE DOING

Generally speaking, most review websites are public. Some are paywalled, and some require an email sign-up. Either way, we don’t collect all of their content and then republish it. We summarize the findings of publications and use their scores and test data to speed up and simplify the customer journey, saving you, the reader (or consumer), hours of research before you buy your next product.

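Public data is the name of the game. If it’s public, we treat it as legal to collect. That position is supported by the US Court of Appeals for the Ninth Circuit in hiQ Labs, Inc. v. LinkedIn Corp., which found that scraping publicly accessible data likely does not violate the Computer Fraud and Abuse Act.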

If it’s not public, it’s not legal to scrape. That’s it. We have no use for personally identifiable information. We’re helping you make decisions on products by giving you facts about products, testers, testing criteria and markets. We’re not trying to figure out whether or not you like the color red.