by Juan Robin II
Have you been involved in SEO for more than a couple of years? You probably remember Google’s Toolbar PageRank.
Here’s what it looked like:
It showed the Google PageRank of every page you visited on a logarithmic scale from 0–10.
But even before Google officially removed support for Toolbar PageRank in 2016, they had already stopped updating it years earlier. For this reason, some SEOs view PageRank as an outdated and irrelevant metric that has no place in modern‐day SEO.
Here’s a comment I found on another article about PageRank that sums up this way of thinking:
Pretty brutal. But here’s the thing: PageRank still plays a vital role in Google’s ranking algorithm.
How do I know this? Google said so.
DYK that after 18 years we’re still using PageRank (and 100s of other signals) in ranking?
— Gary “鯨理” Illyes (@methode) February 9, 2017
(Gary Illyes works for Google. So that tweet is straight from the horse’s mouth, so to speak.)
But this year‐old tweet isn’t my only evidence. Just a month ago, Gary Illyes spoke at a conference I attended in Singapore (here’s me with him!). In his talk, he reminded the audience that PageRank is still a part of their algorithm; it’s just that the public score (i.e., Toolbar PageRank) no longer exists.
With that in mind, the aim of this post is threefold:
- To set the record straight about the importance and relevance of PageRank in 2018;
- To explain the basics of the PageRank formula;
- To discuss other similar metrics that exist today, which may make suitable replacements for the deprecated public PageRank “score.”
What is Google PageRank?
PageRank (PR) is a mathematical formula that judges the “value of a page” by looking at the quantity and quality of other pages that link to it. Its purpose is to determine the relative importance of a given webpage in a network (i.e., the World Wide Web).
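Google’s co‐founders, Sergey Brin and Larry Page, stated their motivation in the original PageRank paper: “Our main goal is to improve the quality of web search engines.”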
That brings us to an important point: Search engines weren’t always as efficient as Google is today. Early search engines like Yahoo and AltaVista didn’t work very well at all. The relevance of their search results left a lot to be desired.
Here’s what Sergey and Larry said about the state of search engines in their original paper:
“Anyone who has used a search engine recently can readily testify that the completeness of the index is not the only factor in the quality of search results. “Junk results” often wash out any results that a user is interested in.”
PageRank aimed to solve this problem by making use of the “citation (link) graph of the web,” which the duo described as “an important resource that has largely gone unused in existing web search engines.”
The idea was inspired by the way scientists gauge the “importance” of scientific papers: by looking at the number of other scientific papers referencing them. Sergey and Larry took this concept and applied it to the web by tracking references (links) between web pages.
It was so effective that it became the foundation of the search engine we now know as Google, and it still is.
How does Google PageRank work?
Here’s the full PageRank formula (and explanation) from the original paper, published in 1998:
We assume page A has pages T1…Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. There are more details about d in the next section. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:
PR(A) = (1‐d) + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn))
Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages’ PageRanks will be one.
Confused? Let’s simplify.
Google takes into account three factors when calculating the PageRank of a web page, which are:
- The quantity and quality of inbound linking pages;
- The number of outbound links on each linking page;
- The PageRank of each linking page.
Let’s say that page C has two links: one from page A and one from page B. Page A is stronger than page B, and also has fewer outgoing links. Feed this information into the PageRank algorithm, and you get the PageRank of page C.
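To make that concrete, here is a minimal Python sketch of the formula applied to page C. The PageRank values and link counts for pages A and B are invented for illustration (real PageRank values are not public), and the function name `pagerank_of_page` is hypothetical.

```python
def pagerank_of_page(linking_pages, d=0.85):
    """Apply PR(A) = (1 - d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)).

    linking_pages: list of (pagerank, outbound_link_count) tuples,
    one tuple per page that links to the target page.
    """
    return (1 - d) + d * sum(pr / out_links for pr, out_links in linking_pages)


# Assumed, made-up inputs: page A is stronger than page B
# and also has fewer outgoing links.
page_a = (4.0, 2)  # PageRank 4.0, shared between 2 outgoing links
page_b = (1.0, 5)  # PageRank 1.0, shared between 5 outgoing links

pr_c = pagerank_of_page([page_a, page_b])
print(f"PR(C) = {pr_c:.2f}")  # 0.15 + 0.85 * (2.0 + 0.2) = 2.02
```

Page A contributes 2.0 of the 2.2 “votes” in this example, which is why a link from a strong page with few outgoing links counts for so much.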
The PageRank formula also has a so‐called “damping factor” which simulates the probability of a random user continuing to click on links as they browse the web. This is perceived to decrease with each link click.
Think of it like this: The probability of you clicking a link on the first page you visit is reasonably high. But the likelihood of you then clicking a link on the next page is slightly lower, and so on and so forth.
With that in mind, the total “vote” of a page is multiplied by the “damping factor” (generally assumed to be 0.85) with each iteration of the PageRank algorithm.
If the BBC links to a page via four “link‐hops,” the value of that link would be “damped down” to such an extent that the final page would hardly feel the benefit. But if they link to that same page via only two link‐hops, that link will have a strong influence on the page.
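A rough way to see this effect is to follow one “vote” along a chain of links. The sketch below is a simplification, not Google’s implementation: it assumes every page in the chain has the same number of outgoing links (`links_per_page`, a made‐up parameter) and applies the 0.85 damping factor once per hop.

```python
def value_after_hops(start_value, hops, links_per_page=1, d=0.85):
    """Value that reaches the end of a chain of `hops` links.

    At every hop the value is damped by d and split equally between
    the outgoing links of the linking page (both are assumptions
    taken from the original PageRank formula).
    """
    return start_value * (d / links_per_page) ** hops


# Hypothetical chain in which every page has 10 outgoing links.
for hops in (1, 2, 4):
    print(hops, value_after_hops(1.0, hops, links_per_page=10))
```

Damping alone still leaves roughly half of the value after four hops (0.85 to the power of 4 is about 0.52). It is the split between outgoing links at every hop that shrinks the value much further.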
You might be wondering:
“What if we didn’t know the PageRank of page A or page B?”
This would be like asking the following question:
If Sergey gives half of his money to Larry, how much money does Larry have?
You can’t answer this question because a vital piece of information is missing: The amount of money Sergey had in the first place.
It’s a crude analogy, yes, but it relates to the PageRank algorithm because to calculate the PageRank of every other page in the network, you first need to know the PageRank of at least one page, right?
So how does Google overcome this problem?
Here’s another excerpt from the original PageRank paper:
PageRank or PR(A) can be calculated using a simple iterative algorithm and corresponds to the principal eigenvector of the normalized link matrix of the web.
Sound like gobbledygook?
It basically means that Google’s PageRank algorithm can calculate the PR of a page without knowing the definitive PageRank of the linking pages. This is because PageRank isn’t really an absolute “score,” but rather a relative measure of a webpage’s quality compared to every other page on the link graph (i.e., web).
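As a rough illustration of that “simple iterative algorithm,” the Python sketch below starts every page with an arbitrary guess and reapplies the formula until the numbers settle. The three‐page link graph is invented, and the code assumes every page has at least one outgoing link. It is a toy version, not how Google computes PageRank at web scale.

```python
def iterate_pagerank(links, d=0.85, start=1.0, iterations=200):
    """Repeatedly apply the original PageRank formula to a small graph.

    links: dict mapping each page to the list of pages it links to.
    start: initial guess given to every page (its value should not matter).
    """
    pr = {page: start for page in links}
    for _ in range(iterations):
        new_pr = {}
        for page in links:
            # Sum PR(T)/C(T) over every page T that links to this page.
            votes = sum(
                pr[source] / len(targets)
                for source, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - d) + d * votes
        pr = new_pr
    return pr


# Hypothetical link graph: A links to C, B links to A and C, C links to A and B.
graph = {"A": ["C"], "B": ["A", "C"], "C": ["A", "B"]}

from_low_guess = iterate_pagerank(graph, start=1.0)
from_high_guess = iterate_pagerank(graph, start=50.0)
for page in graph:
    print(page, round(from_low_guess[page], 4), round(from_high_guess[page], 4))
```

Both runs settle on the same values, and page C, which receives all of A’s vote plus half of B’s, comes out on top. The starting guess only affects how many iterations are needed.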
Read this article if you want to geek out and learn more.
Why did Google remove the public PageRank score?
Here’s what a Google spokesperson said in 2016:
As the Internet and our understanding of the Internet have grown in complexity, the Toolbar PageRank score has become less useful to users as a single isolated metric. Retiring the PageRank display from Toolbar helps avoid confusing users and webmasters about the significance of the metric.
But there was almost certainly another contributing factor to the decision: link spam.
It’s fair to say that SEOs have long been obsessed with PageRank as a ranking factor, perhaps because the so‐called “toolbar PageRank” offered a visible gauge, quite literally, as to the rank worthiness of a webpage.
No such visual gauge existed for any other ranking factors, which made it seem like PageRank was the only factor that mattered. As a result, people soon started buying and selling “high PR” links. It became a huge industry, and still is.
If you’re wondering how link sellers build these “high PR” links in the first place, there are many ways. In the mid‐2000s, one of the primary acquisition tactics was to leave blog comments.
For Google, this was a big problem. Links were originally a good judge of quality because they were given out naturally to deserving pages. Unnatural links made their algorithm less effective at discerning the high‐quality pages from the low‐quality ones.
The introduction of “nofollow”
In 2005, Google partnered with other major search engines to introduce the “nofollow” attribute. That solved blog comment spam by allowing webmasters to stop the transfer of PageRank via specific links (e.g., blog comments).
Here’s an excerpt from Google’s official statement on the introduction of “nofollow”:
If you’re a blogger (or a blog reader), you’re painfully familiar with people who try to raise their own websites’ search engine rankings by submitting linked blog comments like “Visit my discount pharmaceuticals site.” This is called comment spam, we don’t like it either, and we’ve been testing a new tag that blocks it. From now on, when Google sees the attribute (rel=“nofollow”) on hyperlinks, those links won’t get any credit when we rank websites in our search results.
Nowadays, almost all content management systems (CMSs) “nofollow” blog comment links by default.
But as Google solved one problem, another problem was made accidentally worse.
The original PageRank formula states that PageRank is divided equally between the outgoing links on a webpage. So if the PageRank of a page is y and the page has ten outgoing links, the amount of PageRank transferred via each link is y/10.
But what happens if you add a “nofollow” attribute to 9 of those 10 links? Surely it stops the flow of PageRank to nine of those pages, leaving the full PageRank value to be transferred via only one link on the page, right?
Initially, yes, this was the case, and webmasters soon began selectively adding the ‘nofollow’ attribute to links they deemed less important (e.g., outgoing links). This allowed them to effectively “sculpt” the flow of PageRank around their site.
For example, if they had a page with a PageRank score of 7 (according to the public PR score on the toolbar), and they wanted to boost the “power” of a specific page, they would just link to it from the high PR page and “nofollow” all the other links on the page. That way, the maximum amount of PageRank would be sent to their page of choice.
Google made changes to this in 2009. Here’s an excerpt from Matt Cutts’ blog post on the matter:
So what happens when you have a page with “ten PageRank points” and ten outgoing links, and five of those links are nofollowed? […] Originally, the five links without nofollow would have flowed two points of PageRank each […] More than a year ago, Google changed how the PageRank flows so that the five links without nofollow would flow one point of PageRank each.
Here’s an illustration of the difference:
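In code, the arithmetic from Matt Cutts’ example looks roughly like this. It is only a sketch of the “ten PageRank points” thought experiment in the quote; the function and parameter names are invented, and the real calculation is not public.

```python
def points_per_followed_link(page_points, total_links, nofollowed_links,
                             nofollow_counts_in_split):
    """PageRank "points" flowing through each followed link on a page.

    nofollow_counts_in_split=False models the original behaviour, where
    points were shared between followed links only. True models the
    behaviour described in 2009, where nofollowed links still take a share.
    """
    followed_links = total_links - nofollowed_links
    if followed_links == 0:
        return 0.0  # nothing flows if every link is nofollowed
    divisor = total_links if nofollow_counts_in_split else followed_links
    return page_points / divisor


before = points_per_followed_link(10, 10, 5, nofollow_counts_in_split=False)
after = points_per_followed_link(10, 10, 5, nofollow_counts_in_split=True)
print(before, after)  # 2.0 1.0
```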
We don’t know if this is still how the ‘nofollow’ maths works. Google made this change nine years ago. Things may be different now. It’s possible that other factors (e.g., the position of a link on a page) now also influence how much value a given link transfers.
But what we do know for sure is that adding the “nofollow” attribute to some links won’t help to funnel more “link juice” towards the rest of the links on the page.
Google (slowly) axes the public PageRank score
Shortly after changing the way PR is passed between so‐called ‘dofollow’ and ‘nofollow’ links on a page, Google removed PageRank data from Webmaster Tools.
Then, in 2014, support for the public PageRank metric took another blow when Google’s John Mueller stated that people should stop using PageRank as it would no longer be updated.
“I wouldn’t use PageRank or links as a metric. We’ve last updated PageRank more than a year ago (as far as I recall) and have no plans to do further updates. Think about what you want users to do on your site, and consider an appropriate metric for that.”
In 2016, Toolbar PageRank was officially axed.
This move made the buying and selling of “high PR links” more difficult as there was no way to find out the “true” PageRank of a webpage.
Is there a suitable replacement for the public PageRank score?
No replica of PageRank exists. Period.
But there are a few similar metrics around, one of which is Ahrefs’ URL Rating (UR).
Moz and Majestic also have some proprietary metrics that work in a similar way to PageRank. Feel free to check out the documentation on their creators’ websites to learn more. In this article, however, we’ll only be talking about Ahrefs’ URL Rating (UR) because it’s a metric that we fully understand and trust.
What is URL Rating?
“Ahrefs’ URL Rating (UR) is a metric that shows how strong a backlink profile of a *target URL* is on a scale from 1 to 100.”
How do you see the URL Rating of a page? Just paste it into Site Explorer.
Or use Ahrefs’ SEO toolbar.
How is URL Rating (UR) similar to PageRank?
We want to be transparent here, so it’s important to note that while we calculate URL Rating (UR) in a similar way to the original version of Google PageRank, it’s not the same. Nobody outside of Google knows how the PageRank formula has developed over the years.
But we do know that URL Rating (UR) is comparable to the original Google PageRank formula in the following ways:
- We count links between pages;
- We respect the “nofollow” attribute;
- We have a “damping factor”;
- We crawl the web far and wide (which is a critical component when calculating an accurate link‐based metric).
Remember: This is how URL Rating (UR) compares to the original PageRank formula. Google has almost certainly iterated and improved upon their formula in the 21 years since its inception.
How do we know this? Well, for a start, it’s a reasonable assumption to make. We know Google hasn’t stood still all this time because their search results are by far the best of any search engine.
But here’s a quote from Matt Cutts, which I found, once again, in his 2009 blog post on PageRank sculpting:
“Even when I joined the company in 2000, Google was doing more sophisticated link computation than you would observe from the classic PageRank papers. If you believe that Google stopped innovating in link analysis, that’s a flawed assumption. Although we still refer to it as PageRank, Google’s ability to compute reputation based on links has advanced considerably over the years.”
How does URL Rating (UR) differ from Google PageRank?
Google has filed many patents over the years, which are publicly accessible. But nobody, not even Bill Slawski, knows which factors are part of the live algorithm or how much weight they each receive.
This fact alone makes it very difficult to know how URL Rating (UR) differs from the current iteration of Google PageRank because we don’t fully understand how Google judges the value of a link in 2018.
Even when it comes to seemingly basic things, like the way links get counted, things aren’t as straightforward as you might assume. To illustrate, take a look at this image:
This is a great test when interviewing SEOs.
Ahrefs’ crawler counts eight links to page B, but not every crawler works the same way.
We have no clue how Google counts them.
Furthermore, the actual counting of links is only one part of the equation. When you start calculating how much value each of those links passes, the complexity reaches a whole new level.
Here are some other questions we don’t know the answers to:
1. Does the transfer of PageRank vary according to the location of the link on the page?
Google’s reasonable surfer patent indicates that this may be the case.
In particular, it’s thought that links higher up in the document may transfer more PageRank than those lower down. Same goes for links in the sidebar vs. links in the main content.
Bill Slawski lists some other features that Google may use to evaluate the importance of a link in his analysis here.
2. Do internal links transfer PageRank in the same way as external links?
Google’s reasonable surfer patent does give some indication that this may be the case.
Bill Slawski also talks about this in his analysis of the patent.
However, there is no definitive answer to this question. Just because it exists in a Google patent doesn’t mean that it’s part of the live algorithm. Google has filed a lot of patents over the years.
3. Does the first link from a site transfer more value than any subsequent links from the same site?
Bill Slawski states that subsequent links from the same site “might possibly be ignored when scores for pages are calculated.”
We also found a clear positive correlation between the number of unique referring domains and organic traffic when we analysed nearly 1 BILLION webpages.
Honestly, we could list unknowns like this all day. (If you’re interested, this article from Moz talks about more reasons why all links may not be created equal.)
Should you use URL Rating (UR) as a PageRank alternative?
URL Rating (UR) is a decent replacement metric for PageRank because it has a lot in common with the original PageRank formula.
However, it’s not a panacea. We know for a fact that URL Rating (UR) doesn’t take into account as many factors as the modern‐day iteration of Google PageRank.
So, our advice is to use it, but not to rely on it entirely. Always review link targets manually (that means visiting the actual page) before pursuing a link.
How to preserve (and boost) your PageRank
Before I start with this section, I want to stress an important point:
This is not about optimizing for PageRank or URL Rating (UR). That way of thinking often leads to poor decision making. The real task is to make sure that you’re not losing or wasting PageRank on your site.
For that, there are three areas to focus on:
- Internal links: How you link the pages together on your website affects the flow of “authority” or “link juice” around your site.
- External links: Both URL Rating (UR) and PageRank effectively share authority between all outbound links on a page. But this doesn’t mean you should delete or “nofollow” external links. (Keep reading.)
- Backlinks: Backlinks bring so‐called “link juice” into your site, which you should carefully preserve.
Let’s look at each of these individually.
Internal links
Backlinks aren’t always within your control. People can link to any page on your site they choose, and they can use whatever anchor text they like.
But internal links are different. You have full control over them.
Seriously: Internal linking is a topic large enough to warrant an article of its own (let us know if you want us to write this!), but here are a few internal linking best practices to get you started:
1. Keep important content as close to your homepage as possible
Your homepage is almost certainly the strongest page on your site.
Don’t believe me? Do this:
Site Explorer > enter your domain > Best by Links
I’ll bet that your homepage is at the top of the list.
This is almost always the case for two reasons:
- Most backlinks will point to your homepage: Just look at the referring domains column on that report. You’ll most likely see that the number of links to your homepage is the highest of all pages on your site.
- Most sites link back to their homepage from all other pages: See the Ahrefs logo in the top left‐hand corner of this page? It links to our homepage. And it exists on all pages on our site. Most sites have a similar structure.
So the closer a page is to your homepage (in terms of the internal linking structure), the more “authority” it will receive. That’s why it pays to place important content as close to the homepage as possible.
Once you’ve done that, go to:
Site Audit > select project > select crawl > Data Explorer
Look at the “Depth” column, which tells you how many clicks away each page is from the homepage (assuming that’s where you started your crawl).
You can even sort the “Depth” column in descending order to see pages that are super far away from the homepage.
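If you want to sanity‐check depth on a small site, click depth is just the shortest path from the homepage. The sketch below runs a breadth‐first search over an invented internal link map. It illustrates the idea only and is not how Site Audit computes its “Depth” column.

```python
from collections import deque


def click_depths(internal_links, homepage):
    """Return the minimum number of clicks from the homepage to each page.

    internal_links: dict mapping a URL path to the paths it links to.
    Pages that cannot be reached from the homepage are left out.
    """
    depths = {homepage: 0}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in internal_links.get(page, []):
            if target not in depths:  # first visit is the shortest route
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site structure.
site = {
    "/": ["/blog/", "/pricing/"],
    "/blog/": ["/blog/seo-tips/", "/"],
    "/blog/seo-tips/": ["/blog/pagerank/"],
}

depths = click_depths(site, "/")
# Deepest pages first, like sorting the "Depth" column in descending order.
for page, depth in sorted(depths.items(), key=lambda item: -item[1]):
    print(depth, page)
```

In this made‐up structure, the PageRank guide sits three clicks from the homepage, so adding a link to it from the homepage or the blog index would bring it closer.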
But let’s face it, you can’t link to every page from your homepage, right?
The good news is that your homepage is not the only high‐value page on a site capable of transferring authority to other pages. If you’re desperate to send more “link juice” to a specific page on your site, do this:
- Use the Best by Links report to find the most high‐authority pages on your site;
- Link to the page you’re trying to ‘boost’ from any relevant high‐UR pages.
For example, looking at the Best by Links report for the Ahrefs blog, I see that our list of SEO tips has a high UR.
I also know that we mention PageRank in this article…
… so this is an entirely relevant, high‐UR page from which we could link to this very page.
Here’s a quick trick for finding the most relevant high‐UR pages from which to add internal links to newly‐published blog posts.
Go to Google and use the following search operator:
site:yourdomain.com “topic of the page we want to link to internally”
For example, if we wanted to find internal link opportunities for this page, we could search:
This unveils all our blog posts that mention the word “PageRank,” of which there are 22.
But which of these pages would give the most powerful internal links?
Let’s use Chris Ainsworth’s SERP scraper to scrape the results, then paste them into Ahrefs’ Batch Analysis tool and sort by URL Rating (UR).
Cool. Now we have a list of the most authoritative pages that mention the word “PageRank.” We can add internal links to this guide from a few of these pages, like so:
Internal link to this post from our list of SEO tips.
2. Fix “orphan” pages
PageRank flows throughout a site via internal and external links, which means that “link juice” can only reach a page if it’s actually linked‐to from one or more pages on the site.
If a page doesn’t have any inlinks, it’s referred to as an orphan page.
To find such pages, you first need a list of all the web pages on your site. Doing this can be a little tricky, but extracting the pages from your sitemap will often do the trick. You may also be able to download a full list of web pages generated by your CMS.
Once you have that, crawl your website in Ahrefs’ Site Audit tool, then go to:
Site Audit > Data Explorer > Is valid (200) internal HTML page = Yes
Export this report, which will contain all the URLs found on your site during the crawl.
Now, compare the URLs in this report with the full list of pages on your site. Any pages that the crawl did not uncover are most likely orphan pages.
You should fix such pages by either removing them (if they’re unimportant) or adding internal links to them (if they are important).
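The comparison between your full URL list and the crawl export can be done in a spreadsheet, or with a few lines of Python like the sketch below. The two URL lists are placeholders for your sitemap export and your crawl export, and the trailing‐slash normalization is an assumption that may not suit every site.

```python
def find_orphan_candidates(all_site_urls, crawled_urls):
    """Return URLs that exist on the site but were not found by the crawl."""

    def normalize(url):
        # Treat "/page" and "/page/" as the same URL (an assumption).
        return url.strip().rstrip("/")

    crawled = {normalize(url) for url in crawled_urls}
    return sorted(
        url for url in all_site_urls if normalize(url) not in crawled
    )


# Placeholder data standing in for a sitemap export and a crawl export.
sitemap_urls = [
    "https://example.com/",
    "https://example.com/blog/",
    "https://example.com/old-landing-page/",
]
crawl_urls = [
    "https://example.com",
    "https://example.com/blog",
]

orphans = find_orphan_candidates(sitemap_urls, crawl_urls)
print(orphans)  # ['https://example.com/old-landing-page/']
```

Treat the result as a list of candidates: a page can also be missing from the crawl because it is blocked or redirected, so check each one before removing it.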
External links
Many people feel that linking out to external resources (i.e., web pages on other sites) will somehow hurt their rankings.
That is not true. External links won’t hurt you, so you should not be worried about linking to other sites. We regularly link out to useful resources from the Ahrefs Blog, and our traffic is consistently rising.
It is true that the more links you have on a page, the less “value” each link will transfer. But we’re pretty sure that in 2018, calculating the value of each link on a page is not as straightforward as it was back in the late 1990s, when the original PageRank patent was filed.
So, while you can hoard links and not link out to anyone, that doesn’t mean that Google will reward you for doing so. Not linking out to any external resources whatsoever looks seriously fishy and manipulative for a start, and we know Google doesn’t respect that kind of practice.
Bottom line? External links exist because they serve a purpose; they point readers to resources that add to the conversation. You should, therefore, link out whenever it is helpful to do so.
Here are a few external linking best practices to follow:
1. Don’t “nofollow” external links unless you need to
Here’s what Google says about “nofollow” links:
In general, we don’t follow them. This means that Google does not transfer PageRank or anchor text across these links.
Some websites (Forbes, HuffPo, etc.) now “nofollow” all of their external links by default.
Is this good practice? Not at all.
Most of these websites chose to implement such an editorial policy because some of their writers were secretly selling links from their articles. Not wanting to encourage such practice, a blanket ban on “dofollow” external links ensued.
But chances are you don’t have this problem. Hopefully, you run a quality website and carefully vet any guest submissions. In which case, there’s no need to “nofollow” all your external links. It just doesn’t make sense to do so.
So, you should only “nofollow” external links when:
- Linking out to questionable pages: In this case, you might want to question whether you should be linking to that resource at all;
- Linking out from a “sponsored post:” Sponsored posts are paid for, which means that any links within the post are effectively paid links. This is exactly what the “nofollow” attribute is for.
2. Fix broken external links
Broken external links contribute to a bad user experience. Here’s what happens when a reader clicks such a link:
These links also ‘waste’ PageRank.
Think about it: The link has no value to anyone, yet it dilutes the value of the rest of the links on that page.
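Under the original formula, where a page’s PageRank is divided equally between its outgoing links, the waste is easy to estimate. The sketch below is illustrative arithmetic with made‐up numbers, and the function name is hypothetical. Nobody outside Google knows how link value is calculated today.

```python
def wasted_pagerank(page_pagerank, total_links, broken_links):
    """PageRank assigned to broken links under an equal split."""
    if total_links == 0:
        return 0.0
    per_link = page_pagerank / total_links
    return per_link * broken_links


# Hypothetical page: 10 "points" of PageRank, 20 outgoing links, 4 of them broken.
wasted = wasted_pagerank(10, 20, 4)
print(wasted)  # 2.0
```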
How do you fix these? You first need to find them.
Read this post to learn everything you need to know: How to Find and Fix Broken Links (to Reclaim Valuable “Link Juice”)
Backlinks
Backlinks boost the PageRank of the linked‐to page. For example, backlinko.com links to our on‐page SEO guide and thus increases its PR.
But as discussed earlier, not all backlinks are created equal. Google looks at hundreds of factors to determine the real value of a backlink.
That said, here are a few useful hacks to get the most out of your backlinks:
1. Focus on building links from high‐UR pages
PageRank flows between pages, not domains.
A link from a high‐authority page on a low‐authority website will be worth more than a link from a low‐authority page on a high‐authority website.
So when vetting link prospects in Site Explorer, we recommend sorting by URL Rating (UR):
If you found your prospects elsewhere (e.g., via a Google scrape), it’s worth pasting them into our Batch Analysis tool to check the URL Rating of each page.
You can use a tool like URL Profiler to pull URL Rating—and other Ahrefs metrics—for thousands of pages in one go.
2. Fix broken pages that waste “link juice”
Backlinks boost not only the “authority” of the page they point to, but also that of every internally‐linked page on the site. The reason is that PageRank flows from page to page via internal links.
But if you have backlinks pointing to a broken page, any “link juice” is effectively wasted because it has nowhere to flow from there.
You should, therefore, fix any broken pages with backlinks pointing to them. You can find such pages by adding a “404 not found” filter to the Best by Links report.
Site Explorer > enter your domain > Best by Links > add a 404 filter
This shows you all the broken pages on your site, plus the number of links they each have.
Learn more about finding and fixing these issues here.
3. Don’t get blinded by “authority;” context matters too
PageRank is important, but so is the context of a link.
What do I mean by this? Imagine that you run a cat blog, and you write a blog post about how your cat has scratched the seats of your beautiful new BMW. In the post, you link to a relevant product page on the official BMW website. Is this link irrelevant because it comes from a cat blog?
No. It’s still perfectly legit and relevant. However, it may have less “value” in the eyes of Google than a link from a well‐known auto blogger, who wrote an entire article about that particular BMW model.
In all honesty, if I had to choose which of these two pages would provide the best link for BMW…
… I would have a seriously hard time deciding.
These two pages aren’t real. I made them up. 🙂
Most SEOs never think about Google PageRank for obvious reasons: it’s old, and there’s no way to see the PageRank for a page anymore, even if you wanted to.
But it’s important to remember that the PageRank formula is at the heart of many of today’s SEO best practices. It’s the reason why backlinks matter, and it’s why SEO professionals still pay so much attention to internal linking.
That’s not to say that you should obsess over, or even try to optimize for, PageRank directly. You shouldn’t. But understand that whenever you build links, work on your internal linking structure, or vet your external links, what you’re actually doing is indirectly optimizing for PageRank.