Our True Score system identifies the true crème de la crème of products on the web. Behind every True Score is a careful synthesis of the most trusted expert and customer ratings.
Now let’s talk experts and who’s “trusted.” The promise of any publication should be to deliver accurate, relevant, and useful content that helps readers make informed decisions. That’s where the actual reviewers come in: they should pledge to provide honest, properly tested reviews backed by their own quantitative test results.
By testing the testers, we make sure that promise is kept. Spoiler alert: there are a lot of broken promises out there. To test the testers, we use our own quantitative Trust Score system, which leaves no room for personal bias.
A Publication Trust Score assesses how trustworthy an expert site is, and it serves as the bedrock for weighting the Expert Scores that help determine True Scores. Expert sites are complex and differ widely in the range and type of content they feature, so to score as thoroughly and accurately as possible, we first calculate a publication’s Trust Score.
A Trust Score is a weighted score composed of two parts: a General Trust Score and a Category-Specific Trust Score. A dedicated researcher calculates these scores by evaluating a wide variety of aspects of a site to determine how trustworthy the publication is as a whole and within a specific product category. Once the evaluation is complete, we reach out to the publication, inform them of their Trust Scores, and ask if we missed anything.
These questions are based on Google’s review criteria, which give creators direction on how to structure an in-depth review. We explain the Publication Trust Score – Version 1.5 criteria in the following tables:
These criteria assess foundational aspects of each publication, such as an About Us page, sponsorship/paid-promotion disclosure, and a scoring system used on every review regardless of product category. We’ve provided a sample of Trust Score criteria below across all our criteria categories; we’ll provide the entire list on another page.
These criteria assess specific categories on each publication, such as how experienced an author is in the specific category, what type of media they present in their content, and how in-depth the product testing is. We’ve provided a sample of Trust Score criteria below across all our criteria categories; we’ll provide the entire list on another page.
NOTE: All our categories will eventually be evaluated using our 2.0 criteria.
Before we can calculate a product’s True Score, we need the Expert and Customer Scores. So once all of the above criteria are evaluated, we calculate the Trust Score:
We believe that category-specific review content is the core of all these publications, which is why we weigh that score at 80% of the Trust Score. General Trust makes up the other 20%, and the two values are added together to become the Trust Score. We then average all the Trust Scores for a single publication, and that number becomes the Publication Trust Score.
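The weighting described above can be sketched in a few lines of Python. The scores below are hypothetical placeholders, not real research data:

```python
# Sketch of the 20/80 Trust Score blend and the Publication Trust Score
# average. All numbers here are hypothetical examples.
def trust_score(general_trust: float, category_trust: float) -> float:
    """Blend the two components: 20% General Trust, 80% Category-Specific."""
    return 0.20 * general_trust + 0.80 * category_trust

def publication_trust_score(trust_scores: list[float]) -> float:
    """Average a publication's per-category Trust Scores."""
    return sum(trust_scores) / len(trust_scores)

# Hypothetical publication researched in two categories:
monitors = trust_score(general_trust=70.0, category_trust=85.0)   # ≈ 82.0
keyboards = trust_score(general_trust=70.0, category_trust=75.0)  # ≈ 74.0
print(round(publication_trust_score([monitors, keyboards]), 2))   # ≈ 78.0
```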
The Publication Trust Score determines the weight of the publication’s review score when calculating the Expert Score for a specific product. In essence, the higher the Publication Trust Score, the more weight that publication’s review has in our overall True Score for that product.
In this scenario, five publications have their own reviews and scores for the Dell S2721HGF monitor. They’ve become Expert Sources for calculating our Expert Score for that model thanks to their numerical scores for the product along with their passing Trust Scores. We do not include any publication that earned a Trust Score under 60% in the Expert Score calculation.
| Publication | Dell S2721HGF Monitor |
| --- | --- |
We then have to convert their review scores to our own scoring system, which uses a 1-100 logarithmic scale:
Through our Trust Score research, we’ve found that the Publication Trust Scores (PTS) for the four sites are:
| Publication | Dell S2721HGF Monitor |
| --- | --- |
| Publication Trust Score (PTS) | |
The publications with higher Trust Scores get higher weights in the score. With a weighted average, we can give greater weight in the calculation to the publications with the highest Trust Scores versus those with lower scores.
The number of publications used in our Expert Score calculation may differ from product to product, since some products are reviewed by more publications than others.
To figure out the weighting distribution, we use a Rubric Weighting system. A Rubric Weight is simply the Trust Score converted from a percentage to a decimal. Here are some examples:
| Trust Score | Rubric Weight |
| --- | --- |
The final formula is a weighted average: multiply each publication’s converted score by its Rubric Weight, add those products together, and divide by the sum of the Rubric Weights. For the Dell S2721HGF:

266.89 / 3.54 = 75.39% = Expert Score
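The weighted average above can be sketched in Python. This assumes a Rubric Weight is the Trust Score divided by 100; the publication scores below are hypothetical placeholders rather than our actual research data:

```python
# Sketch of the Expert Score weighted average. Each review is a pair of
# (converted review score on the 1-100 scale, Publication Trust Score %).
# All figures below are hypothetical examples.
def expert_score(reviews: list[tuple[float, float]]) -> float:
    weighted_sum = sum(score * (pts / 100) for score, pts in reviews)
    total_weight = sum(pts / 100 for _, pts in reviews)
    return weighted_sum / total_weight

# Four hypothetical qualifying publications (score, Trust Score %):
reviews = [(82.0, 92.0), (75.0, 85.0), (70.0, 78.0), (78.0, 95.0)]
print(round(expert_score(reviews), 2))  # ≈ 76.54
```

Publications below the 60% Trust Score cutoff would simply be dropped from the list before this calculation runs.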
The Expert Score is then used alongside the Customer Score to calculate the Dell S2721HGF monitor’s True Score.
The second component of a True Score is the Customer Score, which is calculated by taking a weighted average of three large online retailers’ average customer scores for a single product. Let’s continue using the Dell S2721HGF monitor’s data as an example.
The Customer Score formula is:
Here’s a table of the customer score data of the Dell S2721HGF monitor:
| Online Retailer | Dell S2721HGF |
| --- | --- |
| Average Customer Score | |
| Number of Customer Reviews | |
We need to convert the Customer Scores to the 0-100 logarithmic scale first.
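As a sketch of the calculation, here is one way the Customer Score weighted average could work, assuming each retailer’s average score (already converted to the 0-100 scale) is weighted by its number of customer reviews. The retailer figures below are hypothetical placeholders, not real data:

```python
# Hypothetical Customer Score weighted average. Each retailer is a pair of
# (average customer score converted to 0-100, number of customer reviews).
def customer_score(retailers: list[tuple[float, int]]) -> float:
    weighted_sum = sum(score * count for score, count in retailers)
    total_reviews = sum(count for _, count in retailers)
    return weighted_sum / total_reviews

# Three hypothetical retailers (converted score, review count):
retailers = [(94.0, 1200), (96.0, 800), (95.0, 500)]
print(round(customer_score(retailers), 2))  # ≈ 94.84
```

Weighting by review count means a retailer with thousands of ratings influences the Customer Score more than one with only a handful.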
This is our initial method of calculating True Scores, which uses a 75-25 weighting between the Expert and Customer Scores. This version is live on the site but is in the process of being updated to version 2.0, which uses a probabilistic model.
Experts receive 75% of the weight. We value the customer’s voice and their input on long-term product usage, but the prevalence of fake customer reviews is concerning, which is why the Customer Score is given only 25%.
Here’s the True Score formula:
True Score = (Expert Score x 0.75) + (Customer Score x 0.25)
We have all the components now, so the final calculation is very simple.
True Score = (75.39 x 0.75) + (95.05 x 0.25)
True Score = 80.31 ≈ 80%
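Putting the components together, the final blend can be verified in a few lines of Python:

```python
# The 75-25 True Score blend, using the Expert and Customer Scores
# computed above for the Dell S2721HGF monitor.
def true_score(expert: float, customer: float) -> float:
    return expert * 0.75 + customer * 0.25

result = true_score(75.39, 95.05)  # 56.5425 + 23.7625 = 80.305
print(round(result))  # 80
```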
The Trust Scores are a work in progress. Having completed Phase 1 of Publication Trust Score research, we’re now working within Phase 2, which will build on the current Trust Scores and assess new criteria:
- How many products have they tested?
- How many Performance Criteria do they evaluate in reviews?
- Have they reviewed five of the top brands of a specific category?
- Do they review any newcomer brands’ products in a category?
- Do they cover how a product has evolved from previous models?
Future phases are to come so that we can further refine the Publication Trust Scores.
Gadget Review’s criteria were carefully selected to match up with Google’s published criteria on what makes up a quality review.
March 6, 2023: Version 2.0 Category Trust Score Created
Note: While it may seem odd that our v2.0 was created before v1.5, this is not an error! The intent of our v2.0 confidence research was to create a new and improved process with the Category side of our criteria, but it is a time-consuming, top-to-bottom process.
The General side of the criteria remained the same as it was in v1.0 aside from adjusted point values. We wanted to avoid leaving the gaps v1.0’s category research had, however, so v1.5 was created after the fact to update existing research with new data from some of the most important parts established in v2.0.
As of this writing (July 12, 2023), a majority of our current research uses the v1.5 model, but will be updated over time to v2.0.
- The biggest change in this update is the addition of our Custom Questions to break out what were previously just two generic queries regarding quantitative and benchmarked testing.
- In this system of “Custom Questions” (or CQs for short), for each category (e.g. best TVs, best gaming mice, etc.) we research leading reviews and buying guides from top authorities in the category like RTINGS, PCMag and GearLab to see what the experts test for. We also search ecommerce listings and Google keywords to see what consumers care about most. Informed by this research, we brainstorm 3-10 questions that outline what we should be looking for in each review or buying guide to prove that a given publication really knows their stuff. These fulfill the same purpose as the questions they replaced, in a more granular and specialized manner.
- We try our best to keep these CQs focused on quantitative data that can be measured in hard numbers, concrete and defined units. However, for certain categories such as blenders or VPNs, the nature of the products and services in question means that the best we can reasonably do is to scrutinize what is reported on qualitatively, such as the texture of a blender’s smoothie or the user experience and security features of a VPN, in service of the same goal to determine who is and isn’t a trustworthy expert in the consumer space.
- Previously, we were checking for real-world photos of products in either buying guides or product reviews. As of v2.0, we now check buying guides and reviews separately, as many publications will satisfy this question in the latter but not the former.
- Formerly, great consideration was granted to how long the author of a given buying guide had been writing in the industry. Obviously, a well-tenured writing staff is generally a vote of confidence for a publication, but we found that during the research process, scores for the same site could vary wildly based on which article happened to be picked for research. A few steps were taken to rectify this problem:
- Where before we had multiple incremental questions checking for a tenure ranging between 3 months and 10 years, we now only make one single check for an author who’s been writing for the category for at least one year.
- Using formulaically assembled Google search links, we have standardized the way we search for articles on each site, helping reduce variance in the research process.
- Instead of just one product review, we try to look for three in the category from each site, and separately denote whether each claims to test or not.
- In addition to checking for the presence of a testing methodology in the category, we also check if it was marked with the date it was published or last updated. Many of the categories we cover include emerging or otherwise fast-moving technologies that improve near-daily, and so in the world of tech reviews it’s important to keep your information up to date. This is also a factor of transparency, which feeds into a publication’s overall trustworthiness. A testing methodology with no listed date on it is a hit to its veracity.
- The score values of each criterion have been adjusted.
- Along the same lines of our confidence research for publications, we implemented a similar process for reviewing YouTube videos and channels to be able to cover videos that appear in Google searches.
May 23, 2023: Version 1.5 Category Trust Score Created
- Score weights adjusted to reflect our focus on evidence-based testing, both visual and reported through text, data, charts, and graphs.
- Point values of Phase 1.5 Research were converted to a 100-point scale.
- Qualification Category of Performance changed from 22% to 10%
- Visual Evidence Category of Performance changed from 33% to 30%
- Methodology Category of Performance changed from 22% to 20%
- Testing Proof Category of Performance changed from 22% to 40%
- Removed questions from the Testing Proof section that are replaced and covered in greater detail by our new custom questions segment.
- Quantitative product test results and/or category scores in PR and/or BG
- BONUS: Quantitative benchmark (comparative) product tests in PR and/or BG exist
- Added a question to Methodology to help show the focus of our research: getting to the bottom of who does and doesn’t test.
- Do they claim to test?
- Added questions to Testing Proof designed to look for proof that any claims of testing a publication makes are properly supported:
- Do they provide correct units of measurement to help support that they actually tested?
- Did the reviewer demonstrate that the product was tested in a realistic usage scenario?
- (e.g. assessing a pair of headphones’ sound quality with different genres of music, riding an e-bike on various inclines, blending ice in a blender, etc.)
- Custom Questions (Up to 10 per category)
- The criteria for this and all following custom questions are determined per category, relating to relevant aspects that should be tested to demonstrate technical understanding of the products; these are often tests of performance criteria by way of measurement or use case, such as testing for brightness, how well something crushes ice, or how much debris a vacuum picks up
- These questions are connected to performance criteria/categories of performance
- Each category receives a different number of custom questions, but the sum point value of all of the custom questions is always the same, regardless of whether there are 3 or 10
- Based on the custom questions and your own assessment, is their claim to test truthful?